Setup

Get started with ResearchArcade in minutes. This page covers installation, configuration, and backend selection.

Overview

ResearchArcade provides a unified graph-based interface for academic data from ArXiv and OpenReview. This guide walks through installing and configuring ResearchArcade and starting it with either the CSV or the SQL (PostgreSQL) backend.

Environment Requirements

Python

Python ≥ 3.9 (tested on 3.12)

PostgreSQL

PostgreSQL ≥ 14 (for SQL backend only)

Conda

Conda ≥ 22.0 (recommended)

API Keys

Semantic Scholar API key (optional)
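
The Python requirement above can be checked before installing anything. The following is a minimal sketch using only the standard library; the helper name `meets_minimum_python` is illustrative and is not part of ResearchArcade.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the requirements above


def meets_minimum_python(version_info=None, minimum=MIN_PYTHON):
    """Return True if the given (or currently running) interpreter meets the minimum version."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the (major, minor) pair.
    return tuple(version_info[:2]) >= tuple(minimum)


if meets_minimum_python():
    print("Python version OK")
else:
    print("Python %d.%d or newer is required" % MIN_PYTHON)
```

The PostgreSQL and Conda versions can be checked from the shell with `psql --version` and `conda --version`.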

Installation

Step 1: Create Environment

# Create a new conda environment
conda create -n research_arcade python=3.12
conda activate research_arcade

Step 2: Install Dependencies

# Install required libraries
pip install -r requirements.txt

Step 3: Configure Environment Variables

Copy the template file and configure your API keys and database settings:

cp .env.template .env

Edit the .env file to add your database settings and, optionally, your Semantic Scholar API key.
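
To confirm that the values in .env are readable, a small parser can help. This is a sketch under assumptions: it handles only plain `KEY=VALUE` lines, the function name `read_env_file` is hypothetical, and it does not reflect how ResearchArcade itself loads the file.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Ignore empty lines, comments, and anything that is not a KEY=VALUE pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and any surrounding single or double quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

Calling `read_env_file()` from the project root returns a dictionary of the settings. The actual variable names are defined in .env.template, so check that file rather than assuming any particular key.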

Backend Selection

ResearchArcade supports two backend types: CSV for simple file-based storage and PostgreSQL for scalable database operations.
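
One way to switch between the two backends without editing code is to derive the constructor arguments from a settings mapping. The sketch below is illustrative only: the function name and the setting names (`DB_TYPE`, `CSV_DIR`, `DB_HOST`, `DB_NAME`, `DB_USER`, `DB_PASSWORD`, `DB_PORT`) are assumptions, not names defined by ResearchArcade. The config keys it produces are the ones used in the backend examples in this guide.

```python
def build_backend_settings(env):
    """Return (db_type, config) built from a mapping of settings.

    The setting names read here are hypothetical; adapt them to your .env file.
    """
    db_type = env.get("DB_TYPE", "csv").lower()
    if db_type == "csv":
        return "csv", {"csv_dir": env.get("CSV_DIR", "./data/")}
    if db_type == "sql":
        return "sql", {
            "host": env.get("DB_HOST", "localhost"),
            "dbname": env["DB_NAME"],        # required for SQL
            "user": env["DB_USER"],          # required for SQL
            "password": env["DB_PASSWORD"],  # required for SQL
            "port": env.get("DB_PORT", "5432"),
        }
    raise ValueError("Unsupported backend type: %r" % db_type)


# Usage (requires ResearchArcade to be installed):
# import os
# from research_arcade import ResearchArcade
# db_type, config = build_backend_settings(os.environ)
# research_arcade = ResearchArcade(db_type=db_type, config=config)
```

Missing SQL credentials raise a `KeyError` early instead of failing later during connection.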

CSV Backend

Best for: Quick experimentation, small to medium datasets, portability

from research_arcade import ResearchArcade

research_arcade = ResearchArcade(
    db_type="csv",
    config={"csv_dir": "/path/to/csv/data/"}
)

SQL Backend

Best for: Large-scale datasets, complex queries, production use

from research_arcade import ResearchArcade

research_arcade = ResearchArcade(
    db_type="sql",
    config={
        "host": "localhost",
        "dbname": "conference_db",
        "user": "username",
        "password": "password",
        "port": "5432"
    }
)

SQL Backend Requirements: Make sure PostgreSQL is installed and running before initializing ResearchArcade with the SQL backend.
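
A quick way to see whether a server is listening before initializing the SQL backend is a plain TCP connection test. This is a rough sketch using only the standard library; the function name `postgres_port_is_open` is hypothetical. It confirms only that something accepts connections on the port, not that it is PostgreSQL or that the credentials are valid.

```python
import socket


def postgres_port_is_open(host="localhost", port=5432, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `postgres_port_is_open("localhost", "5432")` accepts the port as a string, matching the config shown above. If it returns False, start PostgreSQL or check the host and port settings before constructing ResearchArcade.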