DORA: Multi-Agent LLM-Based Research Exploration System

A system for autonomously exploring research directions using multiple specialized LLM agents with stage-parallel execution.

Overview

This system uses 6 LLM agents organized in a hierarchical structure to iteratively explore, refine, and evaluate research ideas:

  • 3 Intern Agents: Explore and generate novel research ideas
  • 2 Mentor Agents: Evaluate and refine ideas from interns
  • 1 Supervisor Agent: Provides final judgment and actionable insights

Each iteration follows this workflow:

  1. All 3 interns run in parallel (exploration)
  2. Both mentors run in parallel (evaluation & refinement)
  3. Supervisor produces final evaluation
  4. Iteration is marked complete only after supervisor finishes

The system can run multiple iterations; with iteration feedback enabled (see "Agent memory isolation" below), each iteration builds on feedback from the previous one.

Synchronization behavior:

  • Interns run together, then wait for mentors and supervisor to finish the cycle
  • Mentors start only after all interns finish, then wait for supervisor
  • Supervisor starts only after both mentors finish
  • All agents move to waiting state until the next iteration starts
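The stage ordering above can be sketched with asyncio, where each `asyncio.gather` call acts as the barrier between stages. This is a minimal illustration, not the orchestrator's actual code: `run_agent` and `run_iteration` are hypothetical names, and the agent call is a stub rather than a real LLM request.

```python
import asyncio
from typing import Dict, List


async def run_agent(name: str, inputs: List[str]) -> str:
    """Hypothetical stand-in for one LLM call; returns a labelled string."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"{name} processed {len(inputs)} input(s)"


async def run_iteration(goal: str) -> Dict[str, List[str]]:
    # Stage 1: all interns run concurrently; gather() returns when all finish.
    interns = await asyncio.gather(
        *(run_agent(f"intern_{i}", [goal]) for i in range(1, 4))
    )
    # Stage 2: mentors start only after every intern has finished.
    mentors = await asyncio.gather(
        *(run_agent(f"mentor_{i}", list(interns)) for i in range(1, 3))
    )
    # Stage 3: the supervisor starts only after both mentors have finished.
    supervisor = await run_agent("supervisor", list(mentors))
    return {
        "interns": list(interns),
        "mentors": list(mentors),
        "supervisor": [supervisor],
    }


result = asyncio.run(run_iteration("example goal"))
```

Because each stage awaits the previous one in full, the iteration can only be marked complete once the supervisor's result is available.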

Agent memory isolation:

  • By default, agents are stateless across iterations (ENABLE_ITERATION_FEEDBACK = False in config.py)
  • This prevents cross-iteration carry-over unless explicitly enabled
  • To enable iterative refinement memory, set ENABLE_ITERATION_FEEDBACK = True
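One way such a flag could gate carry-over is at prompt construction time. The sketch below is an assumption about how this might look: `build_intern_prompt` is a hypothetical helper, and only the `ENABLE_ITERATION_FEEDBACK` name comes from config.py.

```python
from typing import Optional

ENABLE_ITERATION_FEEDBACK = False  # default described above


def build_intern_prompt(
    goal: str,
    previous_feedback: Optional[str],
    enable_feedback: bool = ENABLE_ITERATION_FEEDBACK,
) -> str:
    """Compose an intern prompt; prior feedback is included only when enabled."""
    parts = [f"Research goal:\n{goal}"]
    if enable_feedback and previous_feedback:
        parts.append(f"Feedback from previous iteration:\n{previous_feedback}")
    return "\n\n".join(parts)
```

With the default setting, feedback passed to the helper is dropped, so each iteration starts from the goal alone.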

Prerequisites

  • Python 3.8+
  • Google AI Studio API key(s)
  • Research papers (optional but recommended)

Installation

1. Install Dependencies

cd /path/to/DORA  # your local clone of this repository
pip install -r requirements.txt

2. Set Up API Keys

You can provide API keys in two ways:

Option A: Environment Variables (Recommended)

export GOOGLE_AI_KEY_1="your-first-api-key"
export GOOGLE_AI_KEY_2="your-second-api-key"
export GOOGLE_AI_KEY_3="your-third-api-key"
# etc...

Or add to ~/.bashrc or ~/.zshrc:

# ~/.bashrc
export GOOGLE_AI_KEY_1="your-first-api-key"
export GOOGLE_AI_KEY_2="your-second-api-key"
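For reference, numbered variables like these could be collected in Python roughly as follows. This is a sketch under assumptions: `load_api_keys` is a hypothetical helper, and stopping at the first missing index is one possible policy, not necessarily what main.py does.

```python
import os
from typing import List, Mapping


def load_api_keys(
    env: Mapping[str, str] = os.environ,
    prefix: str = "GOOGLE_AI_KEY_",
) -> List[str]:
    """Collect GOOGLE_AI_KEY_1, GOOGLE_AI_KEY_2, ... until the first gap."""
    keys = []
    index = 1
    while env.get(f"{prefix}{index}"):
        keys.append(env[f"{prefix}{index}"])
        index += 1
    return keys
```

Under this policy, setting GOOGLE_AI_KEY_1 and GOOGLE_AI_KEY_3 without GOOGLE_AI_KEY_2 would yield only the first key, so numbering the variables consecutively is the safe choice.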

Option B: Command Line Arguments

python main.py --api-keys "key1" "key2" "key3"

3. Prepare Your Research Goal

Edit goal.txt with your research objective:

Research Goal: [Your research question or objective]

Key aspects to consider:
- [Aspect 1]
- [Aspect 2]
- ...

Desired outcome: [What you hope to achieve]

4. Add Research Papers (Optional)

Place research papers in the papers/ folder:

  • Text files: .txt format (extracted paper content)
  • PDF files: .pdf format (requires PyPDF2)
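A minimal sketch of reading both formats from the folder is shown below. `load_papers` is a hypothetical helper, not the project's loader; it uses PyPDF2's `PdfReader` and `extract_text`, and imports PyPDF2 lazily so that .txt-only setups do not need it installed.

```python
from pathlib import Path
from typing import Dict


def load_papers(folder: str = "papers") -> Dict[str, str]:
    """Return {filename: text} for every .txt and .pdf file in the folder."""
    papers = {}
    for path in sorted(Path(folder).glob("*")):
        suffix = path.suffix.lower()
        if suffix == ".txt":
            papers[path.name] = path.read_text(encoding="utf-8", errors="replace")
        elif suffix == ".pdf":
            # Imported here so plain-text workflows work without PyPDF2.
            from PyPDF2 import PdfReader

            reader = PdfReader(str(path))
            # extract_text() can return None or "" for image-only pages.
            papers[path.name] = "\n".join(
                page.extract_text() or "" for page in reader.pages
            )
    return papers
```

PDF text extraction is often lossy for multi-column layouts and equations, which is why pre-extracted .txt files tend to give cleaner input.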

Usage

Makefile Workflow (Recommended)

# Install dependencies
make install

# Interactive setup
make setup

# Run exploration (default ITERATIONS=3)
make run

# Run with custom iterations
make run ITERATIONS=5

# Resume from iteration 3
make resume RESUME=3

# htop-like live monitor
make monitor

# Quick progress and outputs
make progress
make logs

Run Default Configuration (3 Iterations)

python main.py

Run Custom Number of Iterations

python main.py --iterations 5

Resume from a Previous Iteration

python main.py --resume 3

This will:

  1. Load outputs from iteration 2
  2. Feed them as feedback to agents
  3. Continue from iteration 3
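The first of these steps could look like the sketch below, assuming the outputs/ layout described under Output Structure. `load_previous_outputs` and the `AGENTS` list are hypothetical names used for illustration only.

```python
from pathlib import Path
from typing import Dict

# Agent folder names as they appear in the outputs/ tree.
AGENTS = ["intern_1", "intern_2", "intern_3", "mentor_1", "mentor_2", "supervisor"]


def load_previous_outputs(resume_from: int, root: str = "outputs") -> Dict[str, str]:
    """Read iteration (resume_from - 1) outputs for each agent that has one."""
    previous = resume_from - 1
    outputs = {}
    for agent in AGENTS:
        path = Path(root) / agent / f"iteration_{previous}.txt"
        if path.is_file():
            outputs[agent] = path.read_text(encoding="utf-8")
    return outputs
```

Calling `load_previous_outputs(3)` would read each agent's `iteration_2.txt`, skipping agents whose file is missing.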

View Logs

tail -f agent_system.log

Interactive Visualizer (htop-like)

Run a live terminal dashboard in a separate terminal while main.py is running:

python monitor.py

Optional refresh interval:

python monitor.py --refresh 0.5

Press q to exit the visualizer.

Output Structure

Results are organized as follows:

outputs/
├── intern_1/
│   ├── iteration_1.txt
│   ├── iteration_2.txt
│   └── iteration_3.txt
├── intern_2/
│   └── ...
├── intern_3/
│   └── ...
├── mentor_1/
│   └── ...
├── mentor_2/
│   └── ...
├── supervisor/
│   ├── iteration_1.txt
│   ├── iteration_2.txt
│   └── iteration_3.txt
└── checkpoint_iteration_*.txt

Configuration

Edit config.py to customize:

Model Settings

  • MODEL_NAME: Model to use ("gemini-1.5-flash" or "gemini-1.5-pro")
  • TEMPERATURE: Creativity level (0.0-1.0, default 0.7)
  • MAX_TOKENS: Maximum response length (default 2000)

Execution Settings

  • DEFAULT_ITERATIONS: Number of iterations (default 3)
  • INTERN_AGENTS: Number of interns (default 3)
  • MENTOR_AGENTS: Number of mentors (default 2)

Rate Limiting

  • REQUEST_TIMEOUT: API timeout in seconds (default 30)
  • RETRY_ATTEMPTS: Failed request retries (default 3)
  • RETRY_DELAY: Delay between retries (default 2s)
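How the retry settings might interact is sketched below. This is an illustration under assumptions: `call_with_retries` is a hypothetical helper, RETRY_ATTEMPTS is treated as the total number of attempts, and the delay is fixed rather than exponential.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

RETRY_ATTEMPTS = 3   # assumed to mean total attempts
RETRY_DELAY = 2.0    # seconds between attempts


def call_with_retries(
    request: Callable[[], T],
    attempts: int = RETRY_ATTEMPTS,
    delay: float = RETRY_DELAY,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request() until it succeeds or the attempts are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)  # wait before the next attempt
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the waiting behaviour can be replaced in tests; in normal use the default `time.sleep` applies.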

Logging

  • VERBOSE_LOGGING: Enable debug logging (default True)
  • SAVE_INTERMEDIATE_RESULTS: Save all outputs (default True)
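Taken together, the settings above could be laid out as in the sketch below. The names and defaults come from the lists above, but the file layout is a guess: which model is the default is not stated, and `validate_config` is a hypothetical helper that the project may not provide.

```python
# Illustrative config.py sketch; not the project's actual file.
MODEL_NAME = "gemini-1.5-flash"   # assumption: the default model is not stated
TEMPERATURE = 0.7                 # 0.0-1.0
MAX_TOKENS = 2000                 # maximum response length

DEFAULT_ITERATIONS = 3
INTERN_AGENTS = 3
MENTOR_AGENTS = 2

REQUEST_TIMEOUT = 30              # seconds
RETRY_ATTEMPTS = 3
RETRY_DELAY = 2                   # seconds

VERBOSE_LOGGING = True
SAVE_INTERMEDIATE_RESULTS = True


def validate_config() -> list:
    """Return a list of problems found; an empty list means the values look sane."""
    problems = []
    if not 0.0 <= TEMPERATURE <= 1.0:
        problems.append("TEMPERATURE must be between 0.0 and 1.0")
    if MAX_TOKENS <= 0:
        problems.append("MAX_TOKENS must be positive")
    if min(DEFAULT_ITERATIONS, INTERN_AGENTS, MENTOR_AGENTS) < 1:
        problems.append("iteration and agent counts must be at least 1")
    if REQUEST_TIMEOUT <= 0 or RETRY_ATTEMPTS < 0 or RETRY_DELAY < 0:
        problems.append("timeout and retry settings must not be negative")
    return problems
```

Running a check like this at startup would catch an out-of-range value before any API calls are made.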

System Architecture

Agent Types

Intern Agent

  • Role: Exploration and idea generation
  • Input: Goal, papers, feedback from previous iteration
  • Output: 3-5 novel research ideas with feasibility assessment
  • Iteration: Gets refined feedback from mentors and supervisor

Mentor Agent

  • Role: Evaluation and synthesis
  • Input: All intern outputs from current iteration
  • Output: Refined, synthesized research directions
  • Quality Check: Merges ideas, removes redundancies, assesses rigor

Supervisor Agent

  • Role: Final judgment and direction setting
  • Input: Both mentor outputs
  • Output: Top priorities with actionable recommendations
  • Assessment: Identifies hallucinations, inconsistencies

Feedback Loop

Iteration N:
  Interns (input: Goal, Papers, Iteration N-1 Feedback)
    ↓ (outputs)
  Mentors (input: All Intern outputs)
    ↓ (outputs)
  Supervisor (input: Both Mentor outputs)
    ↓ (outputs) → Becomes feedback for Iteration N+1

Advanced Usage

Using Multiple API Keys

The system automatically load-balances across API keys using round-robin:

export GOOGLE_AI_KEY_1="key1"
export GOOGLE_AI_KEY_2="key2"
export GOOGLE_AI_KEY_3="key3"
python main.py

With 3 keys and 6 agents, a strict round-robin in agent order assigns:

  • Intern 1 → Key 1
  • Intern 2 → Key 2
  • Intern 3 → Key 3
  • Mentors 1-2 → Key 1 and Key 2
  • Supervisor → Key 3

This distributes load and allows higher parallel throughput.
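A plain round-robin assignment can be sketched with `itertools.cycle`. `assign_keys` is a hypothetical helper, and the agent order used here (interns, then mentors, then supervisor) is an assumption; the orchestrator's real ordering may differ.

```python
from itertools import cycle
from typing import Dict, List


def assign_keys(agents: List[str], keys: List[str]) -> Dict[str, str]:
    """Pair each agent with the next key, wrapping around when keys run out."""
    if not keys:
        raise ValueError("at least one API key is required")
    # zip() stops once every agent has a key; cycle() repeats keys endlessly.
    return dict(zip(agents, cycle(keys)))


agents = ["intern_1", "intern_2", "intern_3", "mentor_1", "mentor_2", "supervisor"]
assignment = assign_keys(agents, ["key1", "key2", "key3"])
```

In this ordering the three interns, which run in parallel, each get a different key, which is where spreading the load matters most.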

Custom Paper Processing

Papers are automatically summarized in agent prompts. To improve summarization:

  1. Create clean .txt files with extracted content
  2. Provide 1-2 page summaries instead of full papers
  3. Use clear headers for sections (e.g., "Abstract:", "Key Findings:")

Scoring System (Optional)

To enable optional scoring of ideas, set in config.py:

USE_SCORING = True
SCORING_METRICS = ["novelty", "feasibility", "impact", "grounded"]

The supervisor will assess ideas on these metrics.

Troubleshooting

API Key Issues

Error: "No API keys found"

Solution:

export GOOGLE_AI_KEY_1="your-key"
python main.py

Missing Goal File

Error: "Goal file not found"

Solution:

echo "Your research goal here" > goal.txt

API Timeouts

Solution: Increase REQUEST_TIMEOUT in config.py or reduce MAX_TOKENS.

Memory Issues with Large Papers

Solution:

  • Provide paper summaries (5-10 pages) instead of full papers
  • Reduce MAX_TOKENS in config.py
  • Use fewer papers per iteration

Rate Limiting

Error: "API rate limit exceeded"

Solution:

  • Add more API keys to GOOGLE_AI_KEY_* environment variables
  • Increase RETRY_DELAY in config.py
  • Reduce INTERN_AGENTS or MENTOR_AGENTS

Resuming Interrupted Runs

If a run is interrupted:

  1. Check the log to identify the last completed iteration
  2. Resume from the next iteration:
    python main.py --resume 3  # Resume from iteration 3

The system will load outputs from iteration 2 and use them as feedback.
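As an alternative to reading the log, the next iteration number can be inferred from the output files. The sketch below assumes that a supervisor file for iteration N exists only once that iteration has completed; `next_iteration_to_run` is a hypothetical helper, not part of the project.

```python
import re
from pathlib import Path


def next_iteration_to_run(root: str = "outputs") -> int:
    """Return one more than the highest iteration with a supervisor output."""
    supervisor_dir = Path(root) / "supervisor"
    if not supervisor_dir.is_dir():
        return 1  # nothing has completed yet
    pattern = re.compile(r"iteration_(\d+)\.txt")
    completed = []
    for path in supervisor_dir.glob("iteration_*.txt"):
        match = pattern.fullmatch(path.name)
        if match:
            completed.append(int(match.group(1)))
    return max(completed, default=0) + 1
```

The returned value is the number to pass to `--resume`.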

Example Workflow

# Setup
export GOOGLE_AI_KEY_1="key1"
export GOOGLE_AI_KEY_2="key2"
export GOOGLE_AI_KEY_3="key3"

# Create goal
echo "Explore efficient attention mechanisms for long-sequence processing" > goal.txt

# Add papers
cp my_papers/*.pdf papers/
cp my_papers/*.txt papers/

# Run 3 iterations
python main.py --iterations 3

# Monitor progress
tail -f agent_system.log

# View final results
cat outputs/supervisor/iteration_3.txt

Output Interpretation

Intern Output

Look for:

  • Novel ideas grounded in papers
  • Clear descriptions with feasibility assessment
  • Identification of knowledge gaps

Mentor Output

Look for:

  • Synthesis of related ideas
  • Critical assessment of each idea
  • Top 2-3 most promising directions
  • Actionable next steps

Supervisor Output

This is the most important output. Look for:

  • Top 3-5 priority research directions
  • Quality assessment (Novelty, Feasibility, Impact)
  • Specific recommendations for next iteration
  • Overall research direction assessment

System Limitations

  1. Hallucinations: LLMs can generate plausible-sounding but incorrect ideas

    • Supervisor explicitly checks for this
    • Ground everything in provided papers
  2. Paper Summarization: System auto-summarizes papers for brevity

    • Provide extracted text rather than PDFs when possible
    • Consider 5-10 page summaries of key papers
  3. Context Window: Limited token budget per request

    • MAX_TOKENS controls output length, not input
    • System automatically limits paper summaries
  4. Non-determinism: Same inputs may produce different outputs across runs

    • Set TEMPERATURE to 0.0 for more deterministic results

Best Practices

  1. Clear Goal: Write a specific, well-defined research objective
  2. Curated Papers: Select 5-10 most relevant papers
  3. Iterative Refinement: Run 3-5 iterations for best results
  4. Multiple Keys: Use 2-3 API keys for better parallelization
  5. Monitoring: Watch logs for convergence and quality improvement
  6. Validation: Manually validate supervisor recommendations

Performance Tips

  • Speed: Use gemini-1.5-flash instead of gemini-1.5-pro
  • Quality: Use gemini-1.5-pro for better analysis
  • Cost: Limit to 1-2 iterations per run
  • Parallelism: Provide multiple API keys

Citation

If you use this system in your research, please cite:

@software{multi_agent_research_2024,
  title={Multi-Agent LLM-Based Research Exploration System},
  author={Your Name},
  year={2024},
  url={https://your-repo-url}
}

License

[Specify your license here]

Support

For issues or questions:

  1. Check agent_system.log for error details
  2. Review configuration in config.py
  3. Ensure API keys are valid
  4. Check that goal.txt is properly formatted

Future Enhancements

  • Web UI for monitoring and interaction
  • Real-time result visualization
  • Integration with research databases (arXiv, Scholar, etc.)
  • Automated paper downloading
  • Scoring and ranking system
  • Export to research formats (LaTeX, Markdown)
  • Integration with citation managers
  • Collaborative multi-user mode
