Orchestra-Research/Agent-Native-Research-Artifact

Agent-Native Research Artifact (ARA)

License: MIT · Agent Skills · arXiv

A protocol that recasts the primary research object from narrative document to machine-executable knowledge package — so AI agents can navigate, reproduce, and extend published research without re-discovering every dead end.

Legacy PDF vs ARA

Publishing compiles a rich research object into a lossy narrative (left); ARA preserves the original as a high-fidelity, machine-executable knowledge package (right).


The Problem

Research produces a branching knowledge object — months of hypotheses tested and rejected, implementation tricks discovered through trial and error, design alternatives weighed. Publishing compiles this into a linear narrative, discarding everything that doesn't fit the final story.

This was tolerable when every consumer was human. It is not when AI agents routinely read papers to reproduce experiments and extend published methods.

Reproduction information gap

The numbers:

  • Only 45.4% of 8,921 reproduction requirements from 23 ICML 2024 papers are fully specified in their PDFs (PaperBench)
  • Failed agent runs account for 90.2% of total dollar cost across 24,008 runs on RE-Bench — agents without prior failure records rediscover every dead end independently

What is ARA?

ARA organizes research into four interlocking layers:

artifact/
  PAPER.md                    # Root manifest + layer index (~200 tokens)
  logic/                      # Cognitive layer — What & Why
    problem.md                #   Observations → gaps → key insight
    claims.md                 #   Falsifiable assertions with proof refs
    concepts.md               #   Formal definitions
    experiments.md            #   Declarative experiment plans
    solution/
      architecture.md         #   System design + component graph
      algorithm.md            #   Math + pseudocode
      constraints.md          #   Boundary conditions
      heuristics.md           #   Implementation tricks + rationale
    related_work.md           #   Typed dependency graph
  src/                        # Physical layer — How
    configs/                  #   Hyperparameters with rationale
    environment.md            #   Dependencies, hardware, seeds
  trace/                      # Exploration graph — Journey
    exploration_tree.yaml     #   Research DAG with typed nodes + dead ends
  evidence/                   # Raw proof
    tables/                   #   Exact result tables
    figures/                  #   Extracted data points
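
The layout above can be checked mechanically. The following is a minimal sketch, not part of the ARA specification: the helper name `missing_paths` is hypothetical, and treating every entry in the tree as required is an assumption made only for illustration.

```python
from pathlib import Path

# Entries copied from the directory tree above. Treating all of them as
# required is an assumption of this sketch, not a rule stated by ARA.
EXPECTED = [
    "PAPER.md",
    "logic/problem.md",
    "logic/claims.md",
    "logic/concepts.md",
    "logic/experiments.md",
    "logic/solution/architecture.md",
    "logic/solution/algorithm.md",
    "logic/solution/constraints.md",
    "logic/solution/heuristics.md",
    "logic/related_work.md",
    "src/configs",
    "src/environment.md",
    "trace/exploration_tree.yaml",
    "evidence/tables",
    "evidence/figures",
]


def missing_paths(artifact_dir: str) -> list[str]:
    """Return the expected entries that do not exist under artifact_dir."""
    root = Path(artifact_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_paths("./my-artifact")` returns an empty list when every entry from the tree is present, and otherwise the relative paths that are absent.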

Cross-layer bindings

Cross-layer forensic bindings thread claims in /logic to code in /src and evidence in /evidence. Dead-end nodes (×) in the exploration graph preserve failure modes.
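
One way to picture these bindings is as references that must resolve to real files. The sketch below assumes a hypothetical in-memory form of a claim: the `Claim` class, its `code_refs` and `evidence_refs` fields, and the `dangling_refs` helper are illustrative names, not the ARA schema. It lists the references that point at nothing.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Claim:
    """Hypothetical in-memory claim; the field names are assumptions."""
    claim_id: str
    text: str
    code_refs: list[str] = field(default_factory=list)      # paths under src/
    evidence_refs: list[str] = field(default_factory=list)  # paths under evidence/


def dangling_refs(artifact_dir: str, claims: list[Claim]) -> dict[str, list[str]]:
    """Map each claim id to the referenced paths that do not exist on disk."""
    root = Path(artifact_dir)
    report: dict[str, list[str]] = {}
    for claim in claims:
        broken = [
            ref
            for ref in claim.code_refs + claim.evidence_refs
            if not (root / ref).exists()
        ]
        if broken:
            report[claim.claim_id] = broken
    return report
```

A claim whose references all resolve does not appear in the returned mapping, so an empty result means every binding in the checked set points at an existing file.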

Key design principles

  • Progressive disclosure — PAPER.md (~200 tokens) tells agents whether the artifact is relevant. Deeper files load on demand.
  • Cross-layer binding — Claims reference experiments, experiments reference evidence, heuristics reference code. Everything is linked.
  • Dead ends preserved — Failed approaches and rejected alternatives are first-class nodes in the exploration graph, preventing agents from rediscovering known failures.
  • Provenance tracking — Every entry carries a tag (user, ai-suggested, ai-executed, user-revised) distinguishing human-confirmed facts from AI inferences.
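
The last two principles can be sketched together. The four tag values below come from the list above; everything else is hypothetical, including the `Node` fields and the decision to count `user` and `user-revised` as human-confirmed, which is an assumption of this sketch and not a rule stated by ARA.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    # The four tag values come from the provenance-tracking principle.
    USER = "user"
    AI_SUGGESTED = "ai-suggested"
    AI_EXECUTED = "ai-executed"
    USER_REVISED = "user-revised"


# Assumption of this sketch: entries written or revised by the user count
# as human-confirmed; the other two tags count as AI inferences.
HUMAN_CONFIRMED = {Provenance.USER, Provenance.USER_REVISED}


@dataclass
class Node:
    """Hypothetical exploration-graph node; the field names are assumptions."""
    node_id: str
    summary: str
    provenance: Provenance
    dead_end: bool = False


def known_dead_ends(nodes: list[Node]) -> list[str]:
    """Summaries of failed approaches that an agent can skip."""
    return [n.summary for n in nodes if n.dead_end]


def human_confirmed(nodes: list[Node]) -> list[Node]:
    """Nodes whose provenance tag marks them as confirmed by a person."""
    return [n for n in nodes if n.provenance in HUMAN_CONFIRMED]
```

An agent planning a new experiment could call `known_dead_ends` first and drop any candidate that matches a recorded failure before spending compute on it.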

Skills

This repository ships three open-source agent skills that work with ARA:

| Skill | Description | Invoke |
|---|---|---|
| compiler | Compiles papers, repos, notes, or any research input into a structured ARA artifact | /compiler <path> |
| research-manager | End-of-turn recorder that captures decisions, experiments, and dead ends with provenance tags | /research-manager |
| rigor-reviewer | ARA Seal Level 2 semantic review that scores six dimensions of epistemic rigor | /rigor-reviewer <artifact_dir> |

Compiler

ARA Compiler

Converts ANY research input into a complete ARA artifact. Accepts PDFs, GitHub repos, experiment logs, code directories, raw notes, or combinations. Follows a 4-stage epistemic protocol:

  1. Semantic Deconstruction — extract raw knowledge atoms
  2. Cognitive Mapping — map to claims, concepts, experiments
  3. Physical Stubbing — generate configs and code stubs
  4. Exploration Graph Extraction — reconstruct the research DAG
/compiler path/to/paper.pdf
/compiler https://github.com/org/repo
/compiler path/to/paper.pdf path/to/code/ --output ./my-artifact/

See skills/compiler/SKILL.md for the full specification.

Research Manager (Live Capture)

Research Manager lifecycle

An end-of-turn recorder that runs after every turn and writes research-significant events into the ara/ artifact via a three-stage pipeline (Context Harvester → Event Router → Maturity Tracker). Trace events (decisions, experiments, dead ends, pivots) are recorded immediately; knowledge events (claims, heuristics, concepts, constraints) are staged and crystallize only on closure signals — so research knowledge accrues as a side-effect of ordinary development.
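
The split between immediate and staged recording can be sketched as a small routing rule. This is a toy model only: the `Recorder` class, its method names, and the choice to promote all staged events on a single closure signal are hypothetical, and the skill's actual behavior is defined in its SKILL.md.

```python
from dataclasses import dataclass, field

# Event kinds named in the paragraph above.
TRACE_KINDS = {"decision", "experiment", "dead_end", "pivot"}
KNOWLEDGE_KINDS = {"claim", "heuristic", "concept", "constraint"}


@dataclass
class Recorder:
    """Toy model of the routing rule; not the skill's actual implementation."""
    trace: list[tuple[str, str]] = field(default_factory=list)      # written at once
    staged: list[tuple[str, str]] = field(default_factory=list)     # waiting for closure
    knowledge: list[tuple[str, str]] = field(default_factory=list)  # crystallized

    def record(self, kind: str, text: str) -> None:
        """Route one event: trace kinds are stored now, knowledge kinds are staged."""
        if kind in TRACE_KINDS:
            self.trace.append((kind, text))
        elif kind in KNOWLEDGE_KINDS:
            self.staged.append((kind, text))
        else:
            raise ValueError(f"unknown event kind: {kind}")

    def on_closure_signal(self) -> None:
        """Promote every staged knowledge event, then clear the staging area."""
        self.knowledge.extend(self.staged)
        self.staged.clear()
```

In this model a dead end is visible in `trace` as soon as it is recorded, while a claim stays in `staged` and reaches `knowledge` only after `on_closure_signal()` is called.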

/research-manager

See skills/research-manager/SKILL.md for the full specification.

Rigor Reviewer (ARA Seal Level 2)

A semantic epistemic review that assumes Level 1 structural validation has passed, then scores six dimensions — evidence relevance, falsifiability, scope calibration, and more — producing a level2_report.json with severity-ranked findings and an overall recommendation.

/rigor-reviewer path/to/artifact/

See skills/rigor-reviewer/SKILL.md for the full specification.


Install

npx @orchestra-research/ara-skills

Auto-detects Claude Code, Cursor, Gemini CLI, OpenCode, Codex, and Hermes, then prompts for skills, agents, and install scope (global vs. local).

Non-interactive

# All three skills, every detected agent, user-level
npx @orchestra-research/ara-skills install --all

# One skill, one agent
npx @orchestra-research/ara-skills install --skill compiler --agent claude-code

# Into the current project (.claude/skills, .cursor/skills, …) instead of $HOME
npx @orchestra-research/ara-skills install --all --local

# List / update / remove
npx @orchestra-research/ara-skills list
npx @orchestra-research/ara-skills update
npx @orchestra-research/ara-skills uninstall --skill rigor-reviewer

Full CLI reference: packages/ara-skills/.


Compatibility

These skills follow the Agent Skills open standard. The installer auto-detects Claude Code, Cursor, Gemini CLI, OpenCode, Codex, and Hermes (see Install).


Citation

If you use ARA in your research, please cite:

@article{ara2026,
  title        = {The Last Human-Written Paper: Agent-Native Research Artifacts},
  author       = {Liu, Jiachen and Pei, Jiaxin and Huang, Jintao and Si, Chenglei and Qu, Ao and Tang, Xiangru and Lu, Runyu and Chen, Lichang and Bai, Xiaoyan and Zheng, Haizhong and Chen, Carl and Chen, Zhiyang and Ye, Haojie and Fu, Yujuan and He, Zexue and Jin, Zijian and Zhang, Zhenyu and Sun, Shangquan and Harmon, Maestro and Wang, John Dianzhuo and Zeng, Jianqiao and Sun, Jiachen and Wu, Mingyuan and Zhou, Baoyu and You, Chenyu and Lu, Shijian and Qiu, Yiming and Lai, Fan and Yuan, Yuan and Li, Yao and Hong, Junyuan and Zhu, Ruihao and Chen, Beidi and Pentland, Alex and Chen, Ang and Chowdhury, Mosharaf and Zhang, Zechen},
  year         = {2026},
  eprint       = {2604.24658},
  archivePrefix= {arXiv},
  url          = {https://arxiv.org/abs/2604.24658}
}

Contributing

See CONTRIBUTING.md for how to add or improve skills, or contribute ARA artifacts to ara-output/.

License

MIT
