A sophisticated system for autonomously exploring research directions using multiple specialized LLM agents with stage-parallel execution.
This system uses 6 LLM agents organized in a hierarchical structure to iteratively explore, refine, and evaluate research ideas:
- 3 Intern Agents: Explore and generate novel research ideas
- 2 Mentor Agents: Evaluate and refine ideas from interns
- 1 Supervisor Agent: Provides final judgment and actionable insights
Each iteration follows this workflow:
- All 3 interns run in parallel (exploration)
- Both mentors run in parallel (evaluation & refinement)
- Supervisor produces final evaluation
- Iteration is marked complete only after supervisor finishes
The system can run multiple iterations. When ENABLE_ITERATION_FEEDBACK is set to True, each iteration builds on feedback from the previous one.
Synchronization behavior:
- Interns run together, then wait for mentors and supervisor to finish the cycle
- Mentors start only after all interns finish, then wait for supervisor
- Supervisor starts only after both mentors finish
- All agents move to waiting state until the next iteration starts
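The stage ordering above can be sketched with a thread pool per stage. This is a minimal illustration, not the orchestrator's actual code: run_stage, run_iteration, and fake_agent are hypothetical names, and fake_agent stands in for a real LLM call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stage(agent_fn, names, payload):
    """Run every agent in `names` in parallel and return only after all of
    them have finished, which acts as the barrier between stages."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        futures = {name: pool.submit(agent_fn, name, payload) for name in names}
        return {name: future.result() for name, future in futures.items()}


def fake_agent(name, payload):
    # Hypothetical placeholder for a real LLM request.
    return f"{name} processed {len(payload)} input(s)"


def run_iteration(goal):
    # Stage 1: all interns run together.
    interns = run_stage(fake_agent, ["intern_1", "intern_2", "intern_3"], [goal])
    # Stage 2: mentors start only after every intern has finished.
    mentors = run_stage(fake_agent, ["mentor_1", "mentor_2"], list(interns.values()))
    # Stage 3: the supervisor starts only after both mentors have finished.
    supervisor = run_stage(fake_agent, ["supervisor"], list(mentors.values()))
    return {**interns, **mentors, **supervisor}
```

Because each stage blocks on future.result() for all of its agents, a later stage cannot start early, and the iteration is complete only when the supervisor call returns.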
Agent memory isolation:
- By default, agents are stateless across iterations (ENABLE_ITERATION_FEEDBACK = False in config.py)
- This prevents cross-iteration carry-over unless explicitly enabled
- To enable iterative refinement memory, set ENABLE_ITERATION_FEEDBACK = True
- Python 3.8+
- Google AI Studio API key(s)
- Research papers (optional but recommended)
cd /home/kislay/Documents/research
pip install -r requirements.txt
You can provide API keys in two ways:
export GOOGLE_AI_KEY_1="your-first-api-key"
export GOOGLE_AI_KEY_2="your-second-api-key"
export GOOGLE_AI_KEY_3="your-third-api-key"
# etc...
Or add to ~/.bashrc or ~/.zshrc:
# ~/.bashrc
export GOOGLE_AI_KEY_1="your-first-api-key"
export GOOGLE_AI_KEY_2="your-second-api-key"
Or pass the keys on the command line:
python main.py --api-keys "key1" "key2" "key3"
Edit goal.txt with your research objective:
Research Goal: [Your research question or objective]
Key aspects to consider:
- [Aspect 1]
- [Aspect 2]
- ...
Desired outcome: [What you hope to achieve]
Place research papers in the papers/ folder:
- Text files: .txt format (extracted paper content)
- PDF files: .pdf format (requires PyPDF2)
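As a rough sketch of how the two formats could be read, the function below collects text from the papers/ folder. load_papers is a hypothetical helper, not the system's loader, and the PDF branch assumes PyPDF2 2.x or later (the PdfReader API); PDFs are skipped when PyPDF2 is not installed.

```python
from pathlib import Path


def load_papers(folder="papers"):
    """Return {filename: text} for the .txt and .pdf files found in `folder`."""
    papers = {}
    root = Path(folder)
    if not root.is_dir():
        return papers
    for path in sorted(root.iterdir()):
        suffix = path.suffix.lower()
        if suffix == ".txt":
            papers[path.name] = path.read_text(encoding="utf-8", errors="replace")
        elif suffix == ".pdf":
            try:
                from PyPDF2 import PdfReader  # optional dependency
            except ImportError:
                continue  # skip PDFs when PyPDF2 is unavailable
            reader = PdfReader(str(path))
            # extract_text() may return None for pages without a text layer.
            papers[path.name] = "\n".join(
                page.extract_text() or "" for page in reader.pages
            )
    return papers
```

Files with any other extension are ignored, which keeps stray notes or images in papers/ out of the prompts.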
# Install dependencies
make install
# Interactive setup
make setup
# Run exploration (default ITERATIONS=3)
make run
# Run with custom iterations
make run ITERATIONS=5
# Resume from iteration 3
make resume RESUME=3
# htop-like live monitor
make monitor
# Quick progress and outputs
make progress
make logs
python main.py
python main.py --iterations 5
python main.py --resume 3
This will:
- Load outputs from iteration 2
- Feed them as feedback to agents
- Continue from iteration 3
tail -f agent_system.log
Run a live terminal dashboard in a separate terminal while main.py is running:
python monitor.py
Optional refresh interval:
python monitor.py --refresh 0.5
Press q to exit the visualizer.
Results are organized as follows:
outputs/
├── intern_1/
│ ├── iteration_1.txt
│ ├── iteration_2.txt
│ └── iteration_3.txt
├── intern_2/
│ └── ...
├── intern_3/
│ └── ...
├── mentor_1/
│ └── ...
├── mentor_2/
│ └── ...
├── supervisor/
│ ├── iteration_1.txt
│ ├── iteration_2.txt
│ └── iteration_3.txt
└── checkpoint_iteration_*.txt
Edit config.py to customize:
- MODEL_NAME: Model to use ("gemini-1.5-flash" or "gemini-1.5-pro")
- TEMPERATURE: Creativity level (0.0-1.0, default 0.7)
- MAX_TOKENS: Maximum response length (default 2000)
- DEFAULT_ITERATIONS: Number of iterations (default 3)
- INTERN_AGENTS: Number of interns (default 3)
- MENTOR_AGENTS: Number of mentors (default 2)
- REQUEST_TIMEOUT: API timeout in seconds (default 30)
- RETRY_ATTEMPTS: Failed request retries (default 3)
- RETRY_DELAY: Delay between retries (default 2s)
- VERBOSE_LOGGING: Enable debug logging (default True)
- SAVE_INTERMEDIATE_RESULTS: Save all outputs (default True)
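For orientation, the documented defaults can be collected in one place and sanity-checked before a run. This is a sketch under assumptions: the real config.py presumably uses module-level constants rather than a dict, the MODEL_NAME default is a guess because no default is stated here, and validate is a hypothetical helper.

```python
DEFAULTS = {
    "MODEL_NAME": "gemini-1.5-flash",  # assumption: the default model is not documented
    "TEMPERATURE": 0.7,
    "MAX_TOKENS": 2000,
    "DEFAULT_ITERATIONS": 3,
    "INTERN_AGENTS": 3,
    "MENTOR_AGENTS": 2,
    "REQUEST_TIMEOUT": 30,   # seconds
    "RETRY_ATTEMPTS": 3,
    "RETRY_DELAY": 2,        # seconds
    "VERBOSE_LOGGING": True,
    "SAVE_INTERMEDIATE_RESULTS": True,
}


def validate(settings):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not 0.0 <= settings["TEMPERATURE"] <= 1.0:
        problems.append("TEMPERATURE must be between 0.0 and 1.0")
    for key in ("MAX_TOKENS", "DEFAULT_ITERATIONS", "INTERN_AGENTS",
                "MENTOR_AGENTS", "REQUEST_TIMEOUT"):
        if settings[key] < 1:
            problems.append(f"{key} must be at least 1")
    for key in ("RETRY_ATTEMPTS", "RETRY_DELAY"):
        if settings[key] < 0:
            problems.append(f"{key} must not be negative")
    return problems
```

Calling validate({**DEFAULTS, "TEMPERATURE": 1.5}) reports the out-of-range temperature, so a bad edit is caught before any API request is made.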
Intern Agents:
- Role: Exploration and idea generation
- Input: Goal, papers, feedback from previous iteration
- Output: 3-5 novel research ideas with feasibility assessment
- Iteration: Gets refined feedback from mentors and supervisor
Mentor Agents:
- Role: Evaluation and synthesis
- Input: All intern outputs from current iteration
- Output: Refined, synthesized research directions
- Quality Check: Merges ideas, removes redundancies, assesses rigor
Supervisor Agent:
- Role: Final judgment and direction setting
- Input: Both mentor outputs
- Output: Top priorities with actionable recommendations
- Assessment: Identifies hallucinations, inconsistencies
Iteration N:
Interns (input: Goal, Papers, Iteration N-1 Feedback)
↓ (outputs)
Mentors (input: All Intern outputs)
↓ (outputs)
Supervisor (input: Both Mentor outputs)
↓ (outputs) → Becomes feedback for Iteration N+1
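The hand-off between iterations in the diagram above can be sketched as a loop in which the supervisor output of iteration N becomes the feedback for iteration N+1. The names run_iterations and stub_iteration are hypothetical, and the enable_feedback flag mirrors ENABLE_ITERATION_FEEDBACK only by assumption.

```python
def run_iterations(goal, iterations, enable_feedback, run_iteration):
    """Run `iterations` cycles and return the supervisor output of each.

    `run_iteration(n, goal, feedback)` is expected to run interns, mentors,
    and supervisor for iteration `n` and return the supervisor output.
    """
    feedback = None
    history = []
    for n in range(1, iterations + 1):
        supervisor_output = run_iteration(n, goal, feedback)
        history.append(supervisor_output)
        # Carry the supervisor output forward only when feedback is enabled;
        # otherwise every iteration starts without prior context.
        feedback = supervisor_output if enable_feedback else None
    return history


def stub_iteration(n, goal, feedback):
    # Hypothetical stand-in for a full intern -> mentor -> supervisor cycle.
    return f"iter {n} (feedback: {feedback})"
```

With enable_feedback=False the second call receives None as feedback, which corresponds to the stateless behavior described for the default configuration.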
The system automatically load-balances across API keys using round-robin:
export GOOGLE_AI_KEY_1="key1"
export GOOGLE_AI_KEY_2="key2"
export GOOGLE_AI_KEY_3="key3"
python main.py
With 3 keys and 6 agents, the orchestrator will assign:
- Interns 1-2 → Key 1
- Intern 3 → Key 2
- Mentors 1-2 → Key 3 and Key 1
- Supervisor → Key 2
This distributes load and allows higher parallel throughput.
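A generic round-robin assignment can be sketched as follows. load_keys and assign_keys are hypothetical helpers, and reading contiguously numbered GOOGLE_AI_KEY_* variables is an assumption about how keys are discovered. Note that a strict round-robin over the agents in the order intern_1 to supervisor yields Key 1, Key 2, Key 3, Key 1, Key 2, Key 3, which is not the mapping listed above; the orchestrator's own ordering is not reproduced here.

```python
import os
from itertools import cycle


def load_keys(env=None):
    """Collect GOOGLE_AI_KEY_1, GOOGLE_AI_KEY_2, ... until one is missing."""
    env = os.environ if env is None else env
    keys = []
    index = 1
    while env.get(f"GOOGLE_AI_KEY_{index}"):
        keys.append(env[f"GOOGLE_AI_KEY_{index}"])
        index += 1
    return keys


def assign_keys(agents, keys):
    """Assign keys to agents in order, wrapping around when keys run out."""
    if not keys:
        raise ValueError("No API keys found")
    return dict(zip(agents, cycle(keys)))
```

For example, assign_keys(["intern_1", "intern_2", "intern_3", "mentor_1", "mentor_2", "supervisor"], load_keys()) spreads six agents evenly over three keys, two agents per key.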
Papers are automatically summarized in agent prompts. To improve summarization:
- Create clean .txt files with extracted content
- Provide 1-2 page summaries instead of full papers
- Use clear headers for sections (e.g., "Abstract:", "Key Findings:")
To enable optional scoring of ideas, set in config.py:
USE_SCORING = True
SCORING_METRICS = ["novelty", "feasibility", "impact", "grounded"]
The supervisor will assess ideas on these metrics.
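If per-metric scores are extracted from the supervisor output, one simple way to rank ideas is by their mean score. This is only an illustration: rank_ideas is a hypothetical helper, the equal weighting and the numeric scale are assumptions, and the system's own scoring logic may differ.

```python
SCORING_METRICS = ["novelty", "feasibility", "impact", "grounded"]


def rank_ideas(scores):
    """Rank ideas by the unweighted mean of their metric scores.

    `scores` maps an idea name to {metric: number}; every metric in
    SCORING_METRICS must be present for each idea.
    """
    ranked = []
    for idea, by_metric in scores.items():
        missing = [m for m in SCORING_METRICS if m not in by_metric]
        if missing:
            raise ValueError(f"{idea} is missing scores for: {', '.join(missing)}")
        mean = sum(by_metric[m] for m in SCORING_METRICS) / len(SCORING_METRICS)
        ranked.append((idea, mean))
    # Highest mean first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

Requiring every metric for every idea avoids silently favoring ideas that were scored on fewer dimensions.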
Error: "No API keys found"
Solution:
export GOOGLE_AI_KEY_1="your-key"
python main.py
Error: "Goal file not found"
Solution:
echo "Your research goal here" > goal.txt
Solution: Increase REQUEST_TIMEOUT in config.py or reduce MAX_TOKENS.
Solution:
- Provide paper summaries (5-10 pages) instead of full papers
- Reduce MAX_TOKENS in config.py
- Use fewer papers per iteration
Error: "API rate limit exceeded"
Solution:
- Add more API keys to GOOGLE_AI_KEY_* environment variables
- Increase RETRY_DELAY in config.py
- Reduce INTERN_AGENTS or MENTOR_AGENTS
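To show how RETRY_ATTEMPTS and RETRY_DELAY interact, here is a minimal retry wrapper. call_with_retries is a hypothetical helper, and it treats the attempt count as the total number of tries, which is an assumption about how the system counts retries.

```python
import time


def call_with_retries(request, attempts=3, delay=2.0, sleep=time.sleep):
    """Call `request()` up to `attempts` times, waiting `delay` seconds
    between failed tries, and re-raise the last error if all tries fail."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            if attempt < attempts:
                sleep(delay)  # no wait after the final failed attempt
    raise last_error
```

A larger delay gives a rate-limited key more time to recover between tries, at the cost of a slower iteration when many requests fail.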
If a run is interrupted:
- Check the log to identify the last completed iteration
- Resume from the next iteration:
python main.py --resume 3 # Resume from iteration 3
The system will load outputs from iteration 2 and use them as feedback.
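As an alternative to reading the log, the last completed iteration can be estimated from the output folder. This sketch assumes that a supervisor file named iteration_N.txt exists only for finished iterations, which follows from the supervisor running last but is not guaranteed after a crash mid-write; last_completed_iteration is a hypothetical helper.

```python
import re
from pathlib import Path


def last_completed_iteration(outputs_dir="outputs"):
    """Return the highest N with outputs/supervisor/iteration_N.txt, or 0."""
    supervisor_dir = Path(outputs_dir) / "supervisor"
    numbers = []
    if supervisor_dir.is_dir():
        for path in supervisor_dir.iterdir():
            match = re.fullmatch(r"iteration_(\d+)\.txt", path.name)
            if match:
                numbers.append(int(match.group(1)))
    return max(numbers, default=0)
```

If this returns 2, the next run would be started with python main.py --resume 3.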
# Setup
export GOOGLE_AI_KEY_1="key1"
export GOOGLE_AI_KEY_2="key2"
export GOOGLE_AI_KEY_3="key3"
# Create goal
echo "Explore efficient attention mechanisms for long-sequence processing" > goal.txt
# Add papers
cp my_papers/*.pdf papers/
cp my_papers/*.txt papers/
# Run 3 iterations
python main.py --iterations 3
# Monitor progress
tail -f agent_system.log
# View final results
cat outputs/supervisor/iteration_3.txt
In intern outputs, look for:
- Novel ideas grounded in papers
- Clear descriptions with feasibility assessment
- Identification of knowledge gaps
In mentor outputs, look for:
- Synthesis of related ideas
- Critical assessment of each idea
- Top 2-3 most promising directions
- Actionable next steps
The supervisor output is the most important one. Look for:
- Top 3-5 priority research directions
- Quality assessment (Novelty, Feasibility, Impact)
- Specific recommendations for next iteration
- Overall research direction assessment
- Hallucinations: LLMs can generate plausible-sounding but incorrect ideas
  - Supervisor explicitly checks for this
  - Ground everything in provided papers
- Paper Summarization: System auto-summarizes papers for brevity
  - Provide extracted text rather than PDFs when possible
  - Consider 5-10 page summaries of key papers
- Context Window: Limited token budget per request
  - MAX_TOKENS controls output length, not input
  - System automatically limits paper summaries
- Determinism: Same inputs may produce different outputs across runs
  - Set TEMPERATURE to 0.0 for more deterministic results
- Clear Goal: Write a specific, well-defined research objective
- Curated Papers: Select 5-10 most relevant papers
- Iterative Refinement: Run 3-5 iterations for best results
- Multiple Keys: Use 2-3 API keys for better parallelization
- Monitoring: Watch logs for convergence and quality improvement
- Validation: Manually validate supervisor recommendations
- Speed: Use gemini-1.5-flash instead of gemini-1.5-pro
- Quality: Use gemini-1.5-pro for better analysis
- Cost: Limit to 1-2 iterations per run
- Parallelism: Provide multiple API keys
If you use this system in your research, please cite:
@software{multi_agent_research_2024,
title={Multi-Agent LLM-Based Research Exploration System},
author={Your Name},
year={2024},
url={https://your-repo-url}
}
[Specify your license here]
For issues or questions:
- Check agent_system.log for error details
- Review configuration in config.py
- Ensure API keys are valid
- Check that goal.txt is properly formatted
- Web UI for monitoring and interaction
- Real-time result visualization
- Integration with research databases (arXiv, Scholar, etc.)
- Automated paper downloading
- Scoring and ranking system
- Export to research formats (LaTeX, Markdown)
- Integration with citation managers
- Collaborative multi-user mode