Import from JSON

Load structured and nested data from JSON files with support for complex hierarchies.

Overview

This tutorial covers importing data from JSON (JavaScript Object Notation) files into your ResearchArcade graph database. You'll learn how to handle nested structures, map JSON objects to graph entities, process arrays and hierarchies, validate schemas, and efficiently import both simple and complex JSON documents.

JSON File Structure

Basic JSON Format

The examples in this tutorial import JSON shaped like the following, where a top-level key holds an array of records:

{
  "papers": [
    {
      "paper_id": 1,
      "title": "Deep Learning Fundamentals",
      "authors": [
        {"name": "Smith, J.", "affiliation": "MIT"},
        {"name": "Jones, A.", "affiliation": "Stanford"}
      ],
      "year": 2023,
      "abstract": "This paper explores...",
      "citations": [2, 3, 5]
    },
    {
      "paper_id": 2,
      "title": "Graph Neural Networks",
      "authors": [
        {"name": "Brown, K.", "affiliation": "CMU"}
      ],
      "year": 2024,
      "abstract": "We present a novel...",
      "citations": [1]
    }
  ]
}

Nested vs Flat JSON

Work with both flat and deeply nested JSON structures:

# Code example placeholder
# Add your Python/API code here for handling nested structures
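
One way to bring nested and flat inputs onto the same footing is to flatten nested keys into dotted paths before mapping them to properties. The sketch below uses only the Python standard library; the `flatten` helper and the `publication` field are illustrative assumptions, not part of ResearchArcade.

```python
from typing import Any


def flatten(obj: Any, prefix: str = "", sep: str = ".") -> dict[str, Any]:
    """Collapse nested dicts and lists into a single-level dict.

    Keys are joined with `sep`; list items use their index as the key part.
    """
    flat: dict[str, Any] = {}
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        # A bare scalar has nothing to flatten.
        return {prefix: obj}
    for key, value in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, (dict, list)) and value:
            flat.update(flatten(value, path, sep))
        else:
            # Scalars and empty containers are stored as-is.
            flat[path] = value
    return flat


# Hypothetical nested record, loosely based on the paper example above.
nested = {
    "paper_id": 1,
    "authors": [{"name": "Smith, J.", "affiliation": "MIT"}],
    "publication": {"year": 2023},
}
flat = flatten(nested)
```

Here `flat` contains keys such as `authors.0.name` and `publication.year`, so a flat record and a nested record can share one mapping step.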

JSON Lines Format

Import newline-delimited JSON (JSONL/NDJSON) for streaming large datasets:

# Code example placeholder
# Add your Python/API code here for JSONL format
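
In JSON Lines, each line is a complete JSON value, so a reader can yield records one at a time. A minimal standard-library sketch follows; `iter_jsonl` is a hypothetical helper name, and an in-memory `StringIO` stands in for an open file.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# StringIO stands in for open("papers.jsonl", encoding="utf-8").
sample = io.StringIO(
    '{"paper_id": 1, "title": "Deep Learning Fundamentals"}\n'
    "\n"
    '{"paper_id": 2, "title": "Graph Neural Networks"}\n'
)
records = list(iter_jsonl(sample))
```

Because the generator holds only one line at a time, memory use stays roughly constant regardless of file size.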

Importing Nodes from JSON

Basic Node Import

Load simple JSON objects as graph nodes:

# Code example placeholder
# Add your Python/API code here for basic node import
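
This page does not show the ResearchArcade import API, so the sketch below stops at the step before it: turning parsed JSON objects into plain node dictionaries. The `to_nodes` helper and the `id`/`label`/`properties` layout are assumptions chosen for illustration.

```python
import json

document = json.loads("""
{
  "papers": [
    {"paper_id": 1, "title": "Deep Learning Fundamentals", "year": 2023,
     "authors": [{"name": "Smith, J.", "affiliation": "MIT"}]},
    {"paper_id": 2, "title": "Graph Neural Networks", "year": 2024,
     "authors": [{"name": "Brown, K.", "affiliation": "CMU"}]}
  ]
}
""")


def to_nodes(records, label, id_key):
    """Build one node dict per record, keeping only scalar fields."""
    nodes = []
    for record in records:
        # Nested values (lists, dicts) are left for a separate step.
        props = {
            key: value
            for key, value in record.items()
            if not isinstance(value, (dict, list))
        }
        nodes.append(
            {"id": f"{label}:{record[id_key]}", "label": label, "properties": props}
        )
    return nodes


nodes = to_nodes(document["papers"], label="Paper", id_key="paper_id")
```

The resulting list can then be handed to whatever node-creation call your ResearchArcade client provides.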

Mapping JSON Keys to Properties

Define how JSON fields correspond to node attributes:

# Code example placeholder
# Add your Python/API code here for property mapping
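
A mapping can be expressed as a dictionary from JSON key to property name. In this sketch, `FIELD_MAP`, `map_properties`, and the target name `publication_year` are hypothetical; only the source keys come from the example document above.

```python
# JSON key -> node property name (target names are illustrative).
FIELD_MAP = {
    "paper_id": "id",
    "title": "title",
    "year": "publication_year",
}


def map_properties(record, field_map):
    """Rename mapped keys and drop everything not listed in field_map."""
    return {
        target: record[source]
        for source, target in field_map.items()
        if source in record
    }


paper = {
    "paper_id": 2,
    "title": "Graph Neural Networks",
    "year": 2024,
    "abstract": "We present a novel...",
}
properties = map_properties(paper, FIELD_MAP)
```

Keeping the mapping in data rather than code makes it easy to document and to reuse across files.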

Handling Arrays of Objects

Import multiple entities from JSON arrays:

# Code example placeholder
# Add your Python/API code here for array processing

Handling Nested Structures

Creating Relationships from Nested Objects

Transform nested JSON into nodes and edges:

# Code example placeholder
# Add your Python/API code here for nested relationships
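
For the paper example, each object in `authors` can become its own node, with an edge back to the containing paper. The sketch below assumes author names are unique enough to serve as identifiers and uses a made-up edge type, `WRITTEN_BY`; both are illustrative choices.

```python
def extract_authors(paper):
    """Return (author_nodes, edges) for one paper record."""
    paper_id = f"paper:{paper['paper_id']}"
    nodes, edges = [], []
    for position, author in enumerate(paper.get("authors", [])):
        # Assumption: the name is a usable key for an author.
        author_id = f"author:{author['name']}"
        nodes.append(
            {"id": author_id, "label": "Author", "properties": dict(author)}
        )
        edges.append(
            {
                "source": paper_id,
                "target": author_id,
                "type": "WRITTEN_BY",  # hypothetical edge type
                "properties": {"position": position},
            }
        )
    return nodes, edges


paper = {
    "paper_id": 1,
    "title": "Deep Learning Fundamentals",
    "authors": [
        {"name": "Smith, J.", "affiliation": "MIT"},
        {"name": "Jones, A.", "affiliation": "Stanford"},
    ],
}
author_nodes, authorship_edges = extract_authors(paper)
```

Storing the array index as `position` preserves author order, which is otherwise lost once the list becomes a set of edges.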

Flattening Hierarchies

Convert hierarchical JSON into graph structure:

# Code example placeholder
# Add your Python/API code here for flattening hierarchies
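
A tree-shaped document can be walked once to emit a node per level and an edge to each parent. The sketch assumes a hypothetical input shape with `name` and `children` keys and an illustrative `CHILD_OF` edge type; adjust both to your data.

```python
def hierarchy_to_graph(root):
    """Walk a {"name", "children"} tree and return (nodes, edges)."""
    nodes, edges = [], []
    stack = [(root, None)]
    while stack:
        item, parent_id = stack.pop()
        # Path-style IDs keep same-named nodes in different branches distinct.
        node_id = item["name"] if parent_id is None else f"{parent_id}/{item['name']}"
        nodes.append({"id": node_id, "name": item["name"]})
        if parent_id is not None:
            edges.append({"source": node_id, "target": parent_id, "type": "CHILD_OF"})
        # Push in reverse so children are visited in document order.
        for child in reversed(item.get("children", [])):
            stack.append((child, node_id))
    return nodes, edges


# Hypothetical taxonomy used only for illustration.
taxonomy = {
    "name": "Computer Science",
    "children": [
        {"name": "Machine Learning", "children": [{"name": "Deep Learning"}]},
        {"name": "Databases"},
    ],
}
tree_nodes, tree_edges = hierarchy_to_graph(taxonomy)
```

An explicit stack avoids Python's recursion limit on very deep hierarchies.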

Preserving JSON Structure as Properties

Store complex nested data as node properties:

# Code example placeholder
# Add your Python/API code here for preserving structure
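
When a nested value does not need to be queried as graph structure, one option is to serialize it to a JSON string and keep it as a single property. Whether ResearchArcade can store nested values natively is not stated here, so this sketch assumes only scalar properties are available; `preserve_nested` is a hypothetical helper.

```python
import json


def preserve_nested(record):
    """Keep scalars as-is and serialize dicts/lists to JSON strings."""
    props = {}
    for key, value in record.items():
        if isinstance(value, (dict, list)):
            # sort_keys gives a stable string for comparisons and diffs.
            props[key] = json.dumps(value, ensure_ascii=False, sort_keys=True)
        else:
            props[key] = value
    return props


paper = {
    "paper_id": 1,
    "title": "Deep Learning Fundamentals",
    "authors": [{"name": "Smith, J.", "affiliation": "MIT"}],
}
props = preserve_nested(paper)
restored_authors = json.loads(props["authors"])
```

The original structure can be recovered with `json.loads`, at the cost of not being able to traverse it as nodes and edges.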

Importing Edges from JSON

Explicit Relationship Format

Import edges defined explicitly in JSON:

# Code example placeholder
# Add your Python/API code here for explicit edges

Implicit Relationships from References

Create edges from ID references in nested objects:

# Code example placeholder
# Add your Python/API code here for implicit relationships

Many-to-Many Relationships

Expand arrays of ID references into one edge per referenced entity:

# Code example placeholder
# Add your Python/API code here for many-to-many relationships
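
In the paper example, each `citations` array lists the IDs of cited papers, which gives a many-to-many relationship between papers. The sketch below expands those arrays into edge pairs and separates references whose target is not in the file. The function name and the decision to collect dangling references rather than drop them are assumptions.

```python
def citation_edges(papers):
    """Return (edges, dangling) as lists of (citing_id, cited_id) pairs."""
    known = {paper["paper_id"] for paper in papers}
    edges, dangling = [], []
    for paper in papers:
        for cited in paper.get("citations", []):
            pair = (paper["paper_id"], cited)
            if cited in known:
                edges.append(pair)
            else:
                # Target is not in this file; report it instead of guessing.
                dangling.append(pair)
    return edges, dangling


papers = [
    {"paper_id": 1, "citations": [2, 3, 5]},
    {"paper_id": 2, "citations": [1]},
]
edges, dangling = citation_edges(papers)
```

With the two records from the example document, papers 3 and 5 are cited but not defined, so those references end up in `dangling` for later resolution.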

Schema Validation

JSON Schema Validation

Validate JSON structure against a schema before import:

# Code example placeholder
# Add your Python/API code here for JSON schema validation

Custom Validation Rules

Implement domain-specific validation logic:

# Code example placeholder
# Add your Python/API code here for custom validation
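
Domain rules that a structural schema cannot easily express can be written as a function that returns a list of problems. The specific rules below (integer IDs, non-empty titles, a plausible year range) are illustrative assumptions, not requirements stated by ResearchArcade.

```python
def is_int(value):
    """True for real integers; bool is excluded even though it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def validate_paper(record):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not is_int(record.get("paper_id")):
        errors.append("paper_id must be an integer")
    title = record.get("title")
    if not isinstance(title, str) or not title.strip():
        errors.append("title must be a non-empty string")
    year = record.get("year")
    # year is optional, but must be plausible when present (assumed range).
    if year is not None and not (is_int(year) and 1900 <= year <= 2100):
        errors.append("year must be an integer between 1900 and 2100")
    return errors


good = validate_paper({"paper_id": 1, "title": "Graph Neural Networks", "year": 2024})
bad = validate_paper({"paper_id": "1", "title": " ", "year": 1500})
```

Returning every problem at once, rather than raising on the first, gives more useful error reports for large files.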

Handling Invalid Data

Manage validation errors and malformed JSON:

# Code example placeholder
# Add your Python/API code here for handling invalid data

Data Transformation

Custom Field Transformations

Apply transformations during import:

# Code example placeholder
# Add your Python/API code here for field transformations

Type Coercion

Convert JSON types to appropriate database types:

# Code example placeholder
# Add your Python/API code here for type coercion
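
JSON has no date type, and exported files often carry numbers as strings, so a per-field converter table is a simple way to normalize types. In this sketch the `published` field and the `COERCIONS` table are hypothetical; the target database types are assumed to correspond to Python `int` and `date`.

```python
from datetime import date

# Field name -> converter (illustrative; extend to match your schema).
COERCIONS = {
    "paper_id": int,
    "year": int,
    "published": date.fromisoformat,
}


def coerce(record, coercions):
    """Return a copy of record with listed fields converted; nulls are kept."""
    out = dict(record)
    for field, convert in coercions.items():
        if out.get(field) is not None:
            out[field] = convert(out[field])
    return out


raw = {"paper_id": "1", "year": "2023", "published": "2023-05-01", "title": "x"}
typed = coerce(raw, COERCIONS)
```

A value that cannot be converted raises `ValueError`, which the caller can catch and log alongside the offending record.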

Computed Properties

Generate additional properties from JSON data:

# Code example placeholder
# Add your Python/API code here for computed properties

Advanced Import Features

Incremental Updates

Update existing nodes with new JSON data:

# Code example placeholder
# Add your Python/API code here for incremental updates
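
Incremental updates usually follow an upsert pattern: look up each incoming record by its identifier, then update or create. The sketch uses a plain dictionary as a stand-in for the graph store, since the ResearchArcade update call is not shown on this page.

```python
def upsert(store, records, id_key="paper_id"):
    """Update existing entries and create missing ones.

    `store` is a dict keyed by ID (a stand-in for the database).
    Returns (created, updated) counts.
    """
    created = updated = 0
    for record in records:
        key = record[id_key]
        if key in store:
            # Incoming fields overwrite; fields absent from the update survive.
            store[key].update(record)
            updated += 1
        else:
            store[key] = dict(record)
            created += 1
    return created, updated


store = {1: {"paper_id": 1, "title": "Deep Learning Fundamentals", "year": 2023}}
incoming = [
    {"paper_id": 1, "abstract": "This paper explores..."},
    {"paper_id": 2, "title": "Graph Neural Networks"},
]
counts = upsert(store, incoming)
```

Reporting the created and updated counts makes it easy to spot an import that unexpectedly created duplicates.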

Conditional Import Logic

Apply conditional rules during import:

# Code example placeholder
# Add your Python/API code here for conditional import

Merging Multiple JSON Sources

Combine data from multiple JSON files:

# Code example placeholder
# Add your Python/API code here for merging sources
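
When several files describe the same entities, records can be merged by identifier before import. The precedence rule in this sketch (later sources win, but a null never overwrites an existing value) is an assumption; pick whatever rule matches your data.

```python
def merge_sources(*sources, id_key="paper_id"):
    """Merge lists of records by ID; later non-null values take precedence."""
    merged = {}
    for records in sources:
        for record in records:
            target = merged.setdefault(record[id_key], {})
            for key, value in record.items():
                if value is not None:
                    target[key] = value
    # dicts preserve insertion order, so first-seen IDs come first.
    return list(merged.values())


# Parsed contents of two hypothetical files.
metadata = [{"paper_id": 1, "title": "Deep Learning Fundamentals", "year": None}]
details = [
    {"paper_id": 1, "year": 2023, "title": None},
    {"paper_id": 2, "title": "Graph Neural Networks"},
]
combined = merge_sources(metadata, details)
```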

Performance Optimization

Streaming Large JSON Files

Process large files without loading them entirely into memory:

# Code example placeholder
# Add your Python/API code here for streaming

Batch Processing

Import JSON records in optimized batches:

# Code example placeholder
# Add your Python/API code here for batch processing
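
Batching groups records so that each write to the database carries many records instead of one. The generator below is a standard-library sketch; the batch size is an arbitrary example, and the write call itself is left out because it depends on the ResearchArcade client.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items until the input is exhausted."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while batch := list(islice(iterator, size)):
        yield batch


records = [{"paper_id": n} for n in range(1, 6)]
batches = list(batched(records, 2))
```

Because `batched` accepts any iterable, it composes with a streaming reader so that only one batch is in memory at a time.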

Parallel Import

Use parallel processing for faster imports:

# Code example placeholder
# Add your Python/API code here for parallel import
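
A simple form of parallel import splits the input into chunks and hands each chunk to a worker. The sketch below uses a thread pool and only parses the chunks; `load_chunk` is a hypothetical stand-in for "parse and write one chunk". Threads tend to help when the per-chunk work is I/O-bound, such as network writes; CPU-bound parsing may be better served by processes.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def load_chunk(lines):
    """Parse one chunk of JSON Lines text (stand-in for parse + write)."""
    return [json.loads(line) for line in lines if line.strip()]


def parallel_load(chunks, max_workers=4):
    """Process chunks concurrently and return records in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order the chunks were submitted.
        results = pool.map(load_chunk, chunks)
        return [record for chunk in results for record in chunk]


chunks = [
    ['{"paper_id": 1}', '{"paper_id": 2}'],
    ['{"paper_id": 3}'],
]
loaded = parallel_load(chunks)
```

If edges reference nodes in other chunks, import all nodes first and the edges in a second pass so that no worker depends on another's output.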

Error Handling

Parse Errors

Handle malformed JSON gracefully:

# Code example placeholder
# Add your Python/API code here for parse error handling
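
Python's `json` module raises `json.JSONDecodeError` on malformed input, and the exception carries the position of the problem. The sketch parses line-delimited input, collects the good records, and records where each bad line failed; the shape of the error dictionaries is an illustrative choice.

```python
import json


def parse_lines(lines):
    """Parse each line as JSON; return (records, errors) without raising."""
    records, errors = [], []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # msg and colno are attributes of JSONDecodeError.
            errors.append({"line": number, "column": exc.colno, "message": exc.msg})
    return records, errors


lines = ['{"paper_id": 1}', '{"paper_id": 2', "not json"]
records, errors = parse_lines(lines)
```

Collecting errors instead of stopping at the first one lets a single pass report every malformed line in the file.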

Missing Required Fields

Manage records with missing mandatory data:

# Code example placeholder
# Add your Python/API code here for missing fields
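
One approach is to partition records into those that can be imported and those that need attention, keeping the list of missing fields with each rejected record. Which fields count as mandatory is an assumption here (`paper_id` and `title`).

```python
REQUIRED = ("paper_id", "title")  # assumed mandatory fields


def split_by_required(records, required=REQUIRED):
    """Return (complete, rejected); rejected items are (record, missing_fields)."""
    complete, rejected = [], []
    for record in records:
        # A key that is absent or explicitly null counts as missing.
        missing = [field for field in required if record.get(field) is None]
        if missing:
            rejected.append((record, missing))
        else:
            complete.append(record)
    return complete, rejected


records = [
    {"paper_id": 1, "title": "Deep Learning Fundamentals"},
    {"paper_id": 2, "title": None},
    {"title": "Untitled draft"},
]
complete, rejected = split_by_required(records)
```

The rejected list can be written to a log or a separate file for correction and re-import.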

Transaction Rollback

Implement rollback for failed imports:

# Code example placeholder
# Add your Python/API code here for rollback
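
The all-or-nothing idea behind rollback can be shown with an in-memory stand-in: take a snapshot, apply the import, and restore the snapshot if anything fails. A real database supplies its own transaction mechanism; the `transaction` context manager below is a hypothetical illustration of the pattern, not a ResearchArcade API.

```python
import copy
from contextlib import contextmanager


@contextmanager
def transaction(store):
    """Restore `store` (a dict) to its prior state if the block raises."""
    snapshot = copy.deepcopy(store)
    try:
        yield store
    except Exception:
        store.clear()
        store.update(snapshot)
        raise


store = {1: {"title": "Deep Learning Fundamentals"}}
try:
    with transaction(store) as tx:
        tx[2] = {"title": "Graph Neural Networks"}
        tx[3] = {"title": None}
        if tx[3]["title"] is None:
            raise ValueError("record 3 has no title")
except ValueError:
    pass  # the import failed; store is back to its original contents
```

After the failed block, the store holds only the original record, so a partially applied import never becomes visible.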

Best Practices

  • Validate JSON against a schema before attempting large imports
  • Use consistent identifier fields across all JSON objects
  • Stream large JSON files instead of loading them fully into memory
  • Define clear mapping rules for nested structures
  • Implement comprehensive error logging
  • Use transactions to maintain data consistency
  • Test import logic on sample data first
  • Document your JSON-to-graph mapping schema
  • Handle null values and missing keys explicitly (JSON has no undefined value)
  • Consider using JSON Lines format for very large datasets

Common Import Scenarios

Importing Academic Papers with Nested Metadata

# Code example placeholder
# Import complex paper records with authors, sections, and citations

Building Knowledge Graphs from Structured Data

# Code example placeholder
# Create entity-relationship graphs from JSON

Importing Hierarchical Taxonomies

# Code example placeholder
# Load nested category structures

JSON vs Other Formats

| Feature | JSON | CSV | API |
| --- | --- | --- | --- |
| Nested structures | ✓ Native support | ✗ Requires flattening | ✓ Full support |
| Human readable | ✓ Very readable | ✓ Easy to read | ✓ Readable |
| File size | Medium | Small | N/A (network) |
| Schema flexibility | ✓ Very flexible | ✗ Fixed columns | ✓ Flexible |
| Best for | Complex data exports | Tabular data | Real-time data |