ResearchArcade: Graph Interface for Academic Tasks

University of Illinois Urbana–Champaign

Abstract

Academic research generates diverse data sources. As researchers increasingly use machine learning to assist research tasks, a crucial question arises: Can we build a unified data interface to support the development of machine learning models for various academic tasks? Models trained on such a unified interface can better support human researchers throughout the research process and eventually accelerate knowledge discovery. In this work, we introduce ResearchArcade, a graph-based interface that connects multiple academic data sources, unifies task definitions, and supports a wide range of base models to address key academic challenges. ResearchArcade utilizes a coherent multi-table format with graph structures to organize data from different sources, including academic corpora from ArXiv and peer reviews from OpenReview, while capturing information with multiple modalities, such as text, figures, and tables. ResearchArcade also preserves temporal evolution at both the manuscript and community levels, supporting the study of paper revisions as well as broader research trends over time. Additionally, ResearchArcade unifies diverse academic task definitions and supports various models with distinct input requirements. Our experiments across six academic tasks demonstrate that combining cross-source and multi-modal information enables a broader range of tasks, while incorporating graph structures consistently improves performance over baseline methods. This highlights the effectiveness of ResearchArcade and its potential to advance research progress.

1. Introduction

Diverse research tasks demand access to comprehensive data from multiple sources, and a variety of models are employed to accomplish these tasks. Building a unified data interface for academic tasks is challenging due to the diverse, relational nature of academic data sourced from platforms like ArXiv and OpenReview, which spans multiple modalities such as text, visuals, and tables. Managing these complexities requires a flexible framework that can evolve with ongoing research. Additionally, defining academic tasks and accommodating various models, such as Large Language Models (LLMs) and Graph Neural Networks (GNNs), adds further complexity in terms of data preprocessing and model-specific interfaces.

In this paper, we propose ResearchArcade, a graph-based interface that links diverse academic data sources, unifies task definitions, and supports various base models to address important academic tasks. ResearchArcade offers four core features: Multi-Source, Multi-Modal, Highly Structural and Heterogeneous, and Dynamically Evolving. We unify diverse academic tasks within the academic graphs in ResearchArcade, enabling easy formulation of new tasks for both predictive and generative models. The structured knowledge in ResearchArcade can also be exported to standardized formats like CSV and JSON, ensuring seamless integration with models such as LLMs and GNNs.

To demonstrate the key advantages of ResearchArcade, we define six academic tasks: figure/table insertion, paragraph generation, revision retrieval, revision generation, acceptance prediction, and rebuttal generation. Extensive experiments show that models benefit from the multi-source, multi-modal, heterogeneous, and dynamic information in ResearchArcade.

2. ResearchArcade

Figure 1. ResearchArcade uses a multi-table format with graph structures to collect data from different sources with multiple modalities.

ResearchArcade integrates data from multiple sources, such as research papers from ArXiv and peer reviews from OpenReview, and handles multi-modal information like text, figures, and tables. These entities are organized in a multi-table format, where tables are treated as nodes and edges in a graph, enabling efficient management of relational and heterogeneous academic data. Additionally, ResearchArcade tracks academic evolution at both microscopic and macroscopic scales: it preserves paper revisions over time, while its extensible framework allows continuous data updates to analyze research trends.
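To make the multi-table representation concrete, the following minimal Python sketch organizes node and edge tables with pandas and exports them to CSV/JSON; the table names, columns, and file names are illustrative assumptions rather than the released ResearchArcade schema.

import pandas as pd

# Node tables: one table per entity type, keyed by an id column.
papers = pd.DataFrame([
    {"paper_id": "p1", "title": "Graph Interfaces for Research",
     "source": "arxiv", "version": 2},
])
figures = pd.DataFrame([
    {"figure_id": "f1", "paper_id": "p1", "modality": "image",
     "caption": "Overview of the framework."},
])
reviews = pd.DataFrame([
    {"review_id": "r1", "paper_id": "p1", "source": "openreview", "rating": 6},
])

# Edge tables: each row is a typed relation between two node ids.
edges = pd.DataFrame([
    {"src": "p1", "dst": "f1", "type": "paper_contains_figure"},
    {"src": "r1", "dst": "p1", "type": "review_of_paper"},
])

# Export to standardized formats for downstream LLMs and GNNs.
papers.to_csv("papers.csv", index=False)
edges.to_json("edges.json", orient="records")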


Figure 2. ResearchArcade unifies the academic task definitions in a two-step scheme.

ResearchArcade unifies the academic task definitions in the following two steps: (1) identifying the target entity and (2) retrieving the neighborhood of the target entity. Six academic tasks are defined based on the two-step scheme.
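As a rough illustration of this two-step scheme (an assumption-laden sketch, not the benchmark code), the snippet below picks a target node and gathers its k-hop neighborhood from a toy edge list:

from collections import defaultdict, deque

# Toy heterogeneous edge list: (source id, destination id, edge type).
edges = [
    ("p1", "f1", "paper_contains_figure"),
    ("r1", "p1", "review_of_paper"),
    ("p1", "p1_v2", "revised_to"),
]

# Undirected adjacency list over the graph.
adj = defaultdict(list)
for src, dst, etype in edges:
    adj[src].append(dst)
    adj[dst].append(src)

def khop_neighborhood(target, k):
    # Breadth-first search up to k hops around the target entity.
    seen, frontier = {target}, deque([(target, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen - {target}

# Step 1: identify the target entity (e.g., a paper node for acceptance prediction).
# Step 2: retrieve its neighborhood (reviews, figures, earlier revisions) as context.
print(khop_neighborhood("p1", k=2))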

Task definition grid for the six academic tasks supported by ResearchArcade.

3. Experiments

3.1 Main Results

Main experiment results across tasks comparing model families.

ResearchArcade is General. It supports diverse tasks by integrating multi-modal data from ArXiv and OpenReview and converting it into formats like CSV or JSON. Predictive tasks are handled by EMB-based, GNN-based, and GWM-based models, while generative tasks are handled by LLM-based models. The data quality in ResearchArcade is validated by the observation that smaller LLMs approach the performance of larger ones, particularly in tasks like Revision Generation and Rebuttal Generation.

ResearchArcade Models Dynamic Evolution. It captures dynamic evolution at both intra-paper and inter-paper levels by integrating temporal data from ArXiv and OpenReview. It excels in tasks like Revision Retrieval and Revision Generation, where GNN-based and GWM-based models outperform EMB-based models, showcasing the framework's effectiveness in modeling manuscript evolution. Incorporating OpenReview rebuttal data significantly improves performance, while the Acceptance Prediction task highlights the difficulty of predicting research trends, with accuracy barely exceeding random chance.
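To illustrate what revision retrieval involves, here is a hypothetical sketch that matches each paragraph of a new revision to its closest counterpart in the previous revision using cosine similarity over simple bag-of-words vectors; the real benchmark would use learned embeddings and the actual paragraph tables.

import numpy as np

old_pars = ["we propose a graph interface", "experiments on six tasks"]
new_pars = ["we introduce a graph based interface", "results on six academic tasks"]

# Shared vocabulary and simple bag-of-words vectors (a stand-in for embeddings).
vocab = {tok: i for i, tok in enumerate(
    sorted({t for p in old_pars + new_pars for t in p.split()}))}

def bow(text):
    vec = np.zeros(len(vocab))
    for tok in text.split():
        vec[vocab[tok]] += 1.0
    return vec

old_mat = np.stack([bow(p) for p in old_pars])
new_mat = np.stack([bow(p) for p in new_pars])

# Cosine similarity between every new paragraph and every old paragraph.
sims = (new_mat @ old_mat.T) / (
    np.linalg.norm(new_mat, axis=1, keepdims=True)
    * np.linalg.norm(old_mat, axis=1, keepdims=True).T + 1e-9)

for i, p in enumerate(new_pars):
    print(p, "->", old_pars[int(sims[i].argmax())])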

Relational Graph Structure Delivers Consistent Gains. Graph-based models (GNN and GWM) outperform non-graph models (EMB and MLP) with performance improvements of 7.7%, 67%, and 7.2% in Figure/Table Insertion, Revision Retrieval, and Acceptance Prediction, respectively. Multi-hop aggregation further boosts performance in Acceptance Prediction, where 3-hop aggregation increases accuracy to 0.55, surpassing the MLP baseline. However, additional hops show limited or negative benefits in tasks like Figure/Table Insertion due to graph sparsity.
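The multi-hop effect can be pictured with a generic mean-aggregation step over a row-normalized adjacency matrix; this toy message-passing sketch is not the specific GNN or GWM architecture evaluated above.

import numpy as np

# Adjacency matrix with self-loops for a 4-node toy graph.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
A_hat = A / A.sum(axis=1, keepdims=True)  # row-normalized mean aggregation

X = np.random.randn(4, 8)                 # initial node features

def aggregate(X, hops):
    # Propagate features through `hops` rounds of neighbor averaging.
    H = X
    for _ in range(hops):
        H = A_hat @ H
    return H

H3 = aggregate(X, hops=3)  # node representations after 3-hop aggregation
print(H3.shape)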

3.2 Ablation Study

Ablation study chart highlighting modality contributions.

Multi-Modal Information Is Critical. In tasks like Rebuttal Generation and Paragraph Generation, incorporating visual and tabular data improves understanding and leads to significant performance gains. For instance, revision generation scores increase from 0.693 to 0.717 for the larger model, while paragraph generation improves from 0.259 to 0.272, demonstrating the critical role of multi-modal information.
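A hedged sketch of how such multi-modal context might be serialized for a text-only LLM: interleave the target paragraph with figure captions and linearized table rows retrieved from its graph neighborhood. The field names, template, and numbers below are illustrative placeholders.

paragraph = "Our method improves revision quality by conditioning on reviewer feedback."
figure_captions = ["Figure: score distribution before and after revision."]
table_rows = [  # toy numbers for illustration only
    {"setting": "text only", "score": 0.26},
    {"setting": "text + figures/tables", "score": 0.27},
]

def linearize_table(rows):
    # Flatten table rows into "key: value" lines a text-only LLM can read.
    return "\n".join(", ".join(f"{k}: {v}" for k, v in row.items()) for row in rows)

prompt = "\n\n".join([
    "Paragraph under revision:\n" + paragraph,
    "Relevant figure captions:\n" + "\n".join(figure_captions),
    "Relevant table content:\n" + linearize_table(table_rows),
    "Task: generate the revised paragraph.",
])
print(prompt)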

BibTeX

@inproceedings{researcharcade2025,
  title     = {ResearchArcade: Graph Interface for Academic Tasks},
  author    = {You, Jiaxuan and others},
  booktitle = {Proceedings of ...},
  year      = {2025},
  url       = {https://github.com/ulab-uiuc/research_arcade}
}