| title | Lead Optimization Agent |
|---|---|
| emoji | 🧬 |
| colorFrom | green |
| colorTo | blue |
| sdk | streamlit |
| sdk_version | 1.35.0 |
| app_file | app.py |
| pinned | false |
| short_description | AI agent for iterative drug lead optimization with RDKit |
Try the live demo on HuggingFace Spaces
Drug lead optimization is one of the slowest and most expensive phases of drug discovery. A medicinal chemist starts with a promising molecule and must iteratively explore chemical space: proposing analogues, synthesizing them, measuring ADMET properties (absorption, distribution, metabolism, excretion, toxicity), and deciding whether to continue or pivot.
This process is:
- Manual and slow — each analogue requires expert chemical intuition to design
- Disconnected — scoring tools, structure editors, and decision notes live in separate places
- Opaque — it's hard to audit why a particular direction was pursued across dozens of iterations
- Expensive to iterate — without fast local scoring, every feedback loop requires wet-lab or commercial prediction services
What is missing is a lightweight tool that closes the loop between AI-assisted structural reasoning, fast local property scoring, and scientist oversight in a single interface.
Lead Optimization Agent is an AI-native iterative sandbox for medicinal chemistry exploration. Given a starting molecule and a plain-English optimization goal, the agent proposes one structural change at a time, scores each candidate locally using RDKit, and surfaces results for the scientist to review, accept, or redirect.
The scientist stays in control. The agent accelerates the ideation and scoring cycle.
Scientist defines goal (brief)
        │
        ▼
Agent proposes structural edit ←────────────────────┐
        │                                           │
        ▼                                           │
RDKit scores candidate locally                      │
(QED, BBB, CNS MPO, solubility, SA score)           │
        │                                           │
        ▼                                           │
Attempt logged with rationale + property delta      │
        │                                           │
        ▼                                           │
Scientist reviews: accept / redirect / stop ────────┘
        │
        ▼
Best candidate surfaced with full audit trail
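The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in app.py: `propose_edit`, `score`, and `review` are hypothetical callables standing in for the Claude call, the RDKit scoring, and the scientist's decision, and `objective` is an assumed single property used to pick the best candidate.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Attempt:
    """One logged iteration: candidate, rationale, scores, and deltas."""
    smiles: str
    rationale: str
    scores: Dict[str, float]
    delta: Dict[str, float]


@dataclass
class Run:
    """Audit trail for a whole optimization run."""
    start_smiles: str
    start_scores: Dict[str, float]
    attempts: List[Attempt] = field(default_factory=list)


def optimize(start_smiles, propose_edit, score, review, objective, max_iters=10):
    """Propose -> score -> log -> review loop; returns (run, best SMILES)."""
    run = Run(start_smiles, score(start_smiles))
    current = start_smiles
    best, best_value = start_smiles, run.start_scores[objective]
    for _ in range(max_iters):
        # Hypothetical agent call: returns a new SMILES and a rationale.
        candidate, rationale = propose_edit(current, score(current))
        scores = score(candidate)
        # Deltas are always measured against the starting molecule.
        delta = {k: scores[k] - run.start_scores[k] for k in scores}
        attempt = Attempt(candidate, rationale, scores, delta)
        run.attempts.append(attempt)
        if scores[objective] > best_value:
            best, best_value = candidate, scores[objective]
        decision = review(attempt)  # "accept", "redirect", or "stop"
        if decision == "stop":
            break
        if decision == "accept":
            current = candidate
        # "redirect": keep the current molecule and ask for a different edit
    return run, best
```

Every attempt is appended to the run before the review step, so rejected candidates stay in the audit trail alongside accepted ones.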
Agent (Claude via Anthropic API) — handles the chemistry reasoning layer:
- reads the current molecule and property scores
- proposes the next structural edit and explains the logic
- does not perform property calculations
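One way to keep the agent confined to reasoning is to ask for a structured reply and validate it before any scoring happens. The sketch below is an assumption about how such an exchange could look: the prompt wording, the JSON keys (`smiles`, `rationale`), and both function names are hypothetical, and the actual call to the Anthropic API is left out.

```python
import json


def build_prompt(goal: str, smiles: str, scores: dict) -> str:
    """Compose the text sent to the model: goal, current molecule, local scores."""
    score_lines = "\n".join(
        f"- {name}: {value:.2f}" for name, value in sorted(scores.items())
    )
    return (
        f"Optimization goal: {goal}\n"
        f"Current molecule (SMILES): {smiles}\n"
        f"Locally computed properties:\n{score_lines}\n"
        "Propose exactly one structural change. Do not estimate properties yourself.\n"
        'Reply with JSON only: {"smiles": "...", "rationale": "..."}'
    )


def parse_proposal(reply: str) -> dict:
    """Extract and validate the JSON object from the model's reply text."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    proposal = json.loads(reply[start:end + 1])
    missing = {"smiles", "rationale"} - set(proposal)
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return proposal
```

The proposed SMILES would then be handed to the local scoring code, so any numbers shown to the scientist come from RDKit rather than from the model.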
Local scoring (RDKit in agent_utils.py) — deterministic, fast, no API call:
- QED (drug-likeness)
- Lipinski rule-of-five summary
- BBB permeability heuristic
- CNS MPO score
- Aqueous solubility estimate
- GI absorption heuristic
- Structural alerts
- Synthetic accessibility (SA) heuristic
This separation means chemistry scoring is reproducible and free; the LLM is used only for reasoning about what change to try next.
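As an illustration of what deterministic local scoring can look like, the sketch below computes a few descriptors with RDKit and derives a Lipinski rule-of-five summary from them. It is not the code in agent_utils.py: the function names and the returned dictionary layout are assumptions. The thresholds are the standard rule-of-five cut-offs (molecular weight ≤ 500, logP ≤ 5, at most 5 H-bond donors, at most 10 H-bond acceptors), and treating a single violation as a pass is one common reading of the rule.

```python
def compute_descriptors(smiles: str) -> dict:
    """Compute basic descriptors with RDKit.

    RDKit is imported lazily so the rule logic below can be used
    (and tested) on precomputed values without RDKit installed.
    """
    from rdkit import Chem
    from rdkit.Chem import QED, Crippen, Descriptors, Lipinski

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # RDKit returns None for unparsable SMILES
        raise ValueError(f"invalid SMILES: {smiles!r}")
    return {
        "mw": Descriptors.MolWt(mol),
        "logp": Crippen.MolLogP(mol),
        "hbd": Lipinski.NumHDonors(mol),
        "hba": Lipinski.NumHAcceptors(mol),
        "qed": QED.qed(mol),
    }


def lipinski_summary(desc: dict) -> dict:
    """Check the four rule-of-five criteria and list any violations."""
    checks = {
        "mw <= 500": desc["mw"] <= 500,
        "logp <= 5": desc["logp"] <= 5,
        "hbd <= 5": desc["hbd"] <= 5,
        "hba <= 10": desc["hba"] <= 10,
    }
    violations = [name for name, ok in checks.items() if not ok]
    # Assumption: at most one violation still counts as passing.
    return {"violations": violations, "passes": len(violations) <= 1}
```

Because both functions are pure and take no network input, the same SMILES always yields the same summary, which is what makes the scoring side reproducible.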
The Streamlit app presents each iteration as a candidate card with:
- 2D structure with the changed region highlighted
- plain-English change summary
- per-property deltas vs. the starting molecule
The app also has two tabs: Candidate Journey for attempt-by-attempt review, and Performance Overview with trajectory plots and a start-vs-best radar chart.
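The per-property deltas on each card amount to a subtraction against the starting molecule's scores. A minimal sketch, assuming scores are held in plain dictionaries; the function name and the `higher_is_better` set used to label a change as an improvement are hypothetical.

```python
def property_deltas(start: dict, candidate: dict, higher_is_better: set) -> dict:
    """Return {property: (delta, verdict)} for properties present in both dicts."""
    result = {}
    for name in start.keys() & candidate.keys():
        delta = candidate[name] - start[name]
        if delta == 0:
            verdict = "unchanged"
        elif (delta > 0) == (name in higher_is_better):
            # Moved in the desired direction for this property.
            verdict = "improved"
        else:
            verdict = "worsened"
        result[name] = (round(delta, 3), verdict)
    return result
```

Which direction counts as better depends on the optimization brief, which is why it is passed in rather than hard-coded.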
git clone https://github.com/mondalsou/lead_optimization_agent.git
cd lead_optimization_agent
pip install -r requirements.txt
# if RDKit fails via pip: conda install -c conda-forge rdkit
export ANTHROPIC_API_KEY=sk-ant-...
streamlit run app.py
Open http://localhost:8501, pick a preset or paste your own SMILES, write the optimization brief, and click Run Optimisation.
lead_optimization_agent/
├── app.py              # Streamlit UI + agent orchestration
├── agent_utils.py      # RDKit scoring, SMILES validation, helpers
├── requirements.txt
├── candidates.json
├── saved_runs/         # Local run persistence (JSON)
└── notebooks/
    ├── 01_admet_tool.ipynb
    ├── 02_agent_loop.ipynb
    └── 03_visualization.ipynb
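The tree lists saved_runs/ as JSON-based local persistence. A minimal sketch of how a run could be written and reloaded, assuming one file per run; the file-naming scheme, the record fields, and both function names are hypothetical rather than taken from the repository.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_run(run: dict, directory: str = "saved_runs") -> Path:
    """Write one run to <directory>/run_<UTC timestamp>.json and return the path."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    path = folder / f"run_{stamp}.json"
    path.write_text(json.dumps(run, indent=2), encoding="utf-8")
    return path


def load_run(path) -> dict:
    """Read a saved run back into a dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

Plain JSON keeps each run readable in a text editor, which suits an audit trail that a scientist may want to inspect outside the app.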
Presets:
- Atenolol → Brain Penetration
- Aspirin → CNS Drug Profile
- Ibuprofen → Aqueous Solubility
- Custom molecule
- Heuristic prototyping tool, not a validated drug-discovery platform
- Property outputs are local approximations, not experimental measurements
- Agent suggestions should be reviewed by a domain expert before serious decisions