SynBO: Synthetic Bayesian Optimization for Reaction Condition Screening

SynBO (Synthetic Bayesian Optimization) is a reaction optimization framework that uses Bayesian optimization to recommend which reaction conditions to test next, so that good conditions are found with as few experiments as possible.

🚩 Getting Started with SynBO & AutoClaw

1. Download AutoClaw. Download and set up AutoClaw from the official release page: https://autoglm.z.ai/autoclaw/

2. Install SynBO Skills. Install the SynBO skill package via SkillHub: https://skillhub.cn/skills/synbo. When you start the installation, give AutoClaw the following prompt so that setup completes correctly:

   Please check if the SkillHub store is already installed. If not, follow the guide at https://skillhub.cn/install/skillhub.md to install only the SkillHub CLI, then install the SynBO skill. If it is already installed, proceed directly to install the SynBO skill.

3. Optimize Reaction Conditions. Launch AutoClaw and prompt it to optimize your reaction conditions using the integrated SynBO skills. Example usage:
```
Please optimize the reaction condition for [insert your reaction name].
```
AutoClaw will then guide you through the SynBO optimization process, providing recommendations and insights to improve your reaction outcomes.

📁 Example Project: Cobalt-Catalyzed Asymmetric Reaction

The examples/ directory contains a complete, runnable example of optimizing a cobalt-catalyzed reaction with 5 reagent types and 2 objectives (yield and enantiomeric excess, ee):

examples/
├── optimization_settings.json          # Optimization goals & settings
├── rxn_space/                          # Reaction space definitions
│   ├── alkali.csv                      #   9 alkali/additive options
│   ├── cobalt_catalyst.csv             #   8 Co-catalyst candidates
│   ├── organo_catalyst.csv             #   9 organocatalyst candidates
│   ├── oxidant.csv                     #   9 oxidant options
│   └── solvent.csv                     # 10 solvent options
├── descriptors/                        # RDKit molecular descriptors
│   ├── alkali_RDKit.csv
│   ├── cobalt_catalyst_RDKit.csv
│   ├── organo_catalyst_RDKit.csv
│   ├── oxidant_RDKit.csv
│   └── solvent_RDKit.csv
└── results/                            # Example optimization outputs
    ├── batch-0_20260420.csv            # Initial sampling results
    ├── batch-0_20260420.xlsx
    ├── batch-1_20260420.csv            # 1st optimization round results
    └── batch-1_20260420.xlsx

Reaction space size: 9 × 8 × 9 × 9 × 10 = 58,320 possible combinations
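The count is the product of the number of options per reagent type. A quick check in Python, using the counts from the directory listing above:

```python
from math import prod

# Number of options per reagent type, taken from the rxn_space/ listing above.
option_counts = {
    "alkali": 9,
    "cobalt_catalyst": 8,
    "organo_catalyst": 9,
    "oxidant": 9,
    "solvent": 10,
}

# Every combination picks exactly one option per reagent type,
# so the size of the space is the product of the counts.
space_size = prod(option_counts.values())
print(f"{space_size:,} possible combinations")  # 58,320 possible combinations
```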

Step-by-Step Workflow

1. Define Your Reaction Space

Create CSV files for each reagent/condition type under rxn_space/. Each file must contain SMILES and name columns:

# rxn_space/solvent.csv
SMILES,name
ClCCl,DCM
CC#N,CH3CN
C1CCOC1,THF
...
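Before generating descriptors, it can help to confirm that each file really has the two required columns. The sketch below uses only the standard library; `check_rxn_space_csv` is a hypothetical helper, not part of SynBO, and the duplicate-name check is an assumption (it seems sensible because later steps index descriptors by name).

```python
import csv
import io

REQUIRED_COLUMNS = ("SMILES", "name")  # column names required in rxn_space/*.csv

def check_rxn_space_csv(handle):
    """Return the rows of a reaction-space CSV, raising ValueError on problems."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required column(s): {missing}")
    rows = list(reader)
    # Assumption: reagent names should be unique within one file.
    names = [row["name"] for row in rows]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise ValueError(f"duplicate reagent names: {duplicates}")
    return rows

# In practice, pass open("rxn_space/solvent.csv", newline="") instead.
sample = io.StringIO("SMILES,name\nClCCl,DCM\nCC#N,CH3CN\nC1CCOC1,THF\n")
solvents = check_rxn_space_csv(sample)
print([row["name"] for row in solvents])  # ['DCM', 'CH3CN', 'THF']
```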

2. Generate Molecular Descriptors

python scripts/get_desc.py --input rxn_space/solvent.csv --smiles-col 'SMILES' --name-col 'name'

Repeat for each reagent type. Outputs go to descriptors/{reagent}_RDKit.csv.
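To avoid typing the command once per reagent type, the calls can be generated in a loop. This is a minimal sketch that only builds and prints the commands, using the flags shown above; `build_desc_commands` is a hypothetical helper, and it assumes every CSV uses the column names SMILES and name.

```python
from pathlib import Path

def build_desc_commands(rxn_space_dir):
    """Build one get_desc.py command (as an argument list) per CSV file."""
    commands = []
    for csv_path in sorted(Path(rxn_space_dir).glob("*.csv")):
        commands.append([
            "python", "scripts/get_desc.py",
            "--input", csv_path.as_posix(),
            "--smiles-col", "SMILES",
            "--name-col", "name",
        ])
    return commands

for cmd in build_desc_commands("rxn_space"):
    print(" ".join(cmd))
    # To execute instead of print: subprocess.run(cmd, check=True)
```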

3. Define Optimization Goals

Create optimization_settings.json:

{
    "reagent_types": ["alkali", "cobalt_catalyst", "organo_catalyst", "oxidant", "solvent"],
    "opt_metrics": ["yield", "ee"],
    "opt_direct_info": [
        {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0},
        {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0}
    ]
}
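A small consistency check can catch typos in this file before a run. The sketch below is not SynBO's own validation; it assumes that the entries of opt_direct_info pair up with opt_metrics by position, that "min" is accepted alongside "max", and that weights should be positive.

```python
import json

def validate_settings(settings):
    """Raise ValueError if the settings are internally inconsistent."""
    metrics = settings["opt_metrics"]
    infos = settings["opt_direct_info"]
    if len(metrics) != len(infos):
        raise ValueError("opt_direct_info needs exactly one entry per metric")
    for metric, info in zip(metrics, infos):
        if info["opt_direct"] not in ("max", "min"):  # "min" is an assumption
            raise ValueError(f"{metric}: unknown direction {info['opt_direct']!r}")
        low, high = info["opt_range"]
        if not low < high:
            raise ValueError(f"{metric}: opt_range must be [low, high] with low < high")
        if info["metric_weight"] <= 0:  # assumption: weights are positive
            raise ValueError(f"{metric}: metric_weight must be positive")
    return True

# In practice: settings = json.load(open("optimization_settings.json"))
settings = json.loads("""
{
    "reagent_types": ["alkali", "cobalt_catalyst", "organo_catalyst", "oxidant", "solvent"],
    "opt_metrics": ["yield", "ee"],
    "opt_direct_info": [
        {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0},
        {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0}
    ]
}
""")
print(validate_settings(settings))  # True
```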

4. Initialize — Generate First Batch

CLI:

python scripts/initialize.py --project-dir examples --batch-size 8 --sampling-method lhs

Python API:

from synbo import ReactionOptimizer
from synbo.utils import load_desc_dict

desc_dict, condition_dict = load_desc_dict(
    reagent_types=["alkali", "cobalt_catalyst", "organo_catalyst", "oxidant", "solvent"],
    desc_dir="examples/descriptors",
    name_suffix="_RDKit",
    index_col="name",
    return_condition_dict=True,
)

optimizer = ReactionOptimizer(
    opt_metrics=["yield", "ee"],
    opt_type="init",
    random_seed=42,
    save_dir="examples/results",
)
optimizer.load_rxn_space(condition_dict)
optimizer.load_desc(desc_dict)
optimizer.initialize(batch_size=8, sampling_method="lhs")
optimizer.save_results(filetype="excel")
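For intuition about what the lhs option aims for, here is a generic Latin hypercube sketch over categorical indices. It is an illustration only and makes no claim about SynBO's implementation, which may, for example, sample in descriptor space; `lhs_categorical` is a hypothetical name.

```python
import random

def lhs_categorical(option_counts, batch_size, seed=42):
    """Latin hypercube sample of category indices, one tuple per experiment."""
    rng = random.Random(seed)
    columns = []
    for n_options in option_counts:
        # Draw one point inside each of batch_size equal strata of [0, 1).
        points = [(k + rng.random()) / batch_size for k in range(batch_size)]
        # Shuffle so that strata are paired at random across reagent types.
        rng.shuffle(points)
        # Map each point to a category index, clamped to the last option.
        columns.append([min(int(p * n_options), n_options - 1) for p in points])
    return list(zip(*columns))

# Option counts for alkali, cobalt_catalyst, organo_catalyst, oxidant, solvent.
counts = [9, 8, 9, 9, 10]
batch = lhs_categorical(counts, batch_size=8)
for row in batch:
    print(row)
```

Because each reagent type is stratified separately, the 8 experiments spread across the options of every type instead of clustering, which is the property that makes this kind of sampling attractive for a first batch.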

5. Run Experiments & Record Results

Run the recommended experiments in the lab. Fill in the yield and ee columns in the output file (replace [exp_data] with actual measurements).
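Before moving on to the next round, it is worth checking that no placeholder was left behind. A minimal sketch with the standard library; `unfilled_rows` is a hypothetical helper, and the sample columns are an abridged, illustrative layout rather than the exact output format.

```python
import csv
import io

PLACEHOLDER = "[exp_data]"

def unfilled_rows(handle, metric_columns=("yield", "ee")):
    """Return 1-based data-row numbers whose metric cells are blank or placeholders."""
    pending = []
    for row_number, row in enumerate(csv.DictReader(handle), start=1):
        values = [(row.get(col) or "").strip() for col in metric_columns]
        if any(value in ("", PLACEHOLDER) for value in values):
            pending.append(row_number)
    return pending

# Illustrative content only; in practice open the batch CSV from results/.
sample = io.StringIO(
    "batch,alkali,solvent,yield,ee\n"
    "0,DBU,THF,41.0,77.5\n"
    "0,DBU,DCM,[exp_data],[exp_data]\n"
    "0,DBU,CH3CN,35.2,\n"
)
print(unfilled_rows(sample))  # [2, 3]
```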

6. Optimize — Get the Next Batch

CLI:

python scripts/optimize.py --project-dir examples --batch-size 5

Python API:

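from synbo import ReactionOptimizer
from synbo.utils import get_prev_rxn

# condition_dict and desc_dict are the objects loaded in step 4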

prev_data = get_prev_rxn("examples/results", "batch-*.csv")

optimizer = ReactionOptimizer(
    opt_metrics=["yield", "ee"],
    opt_type="auto",
    random_seed=42,
    save_dir="examples/results",
)
optimizer.load_rxn_space(condition_dict)
optimizer.load_desc(desc_dict)
optimizer.load_prev_rxn(prev_data)
optimizer.optimize(batch_size=5)
optimizer.save_results(filetype="excel")

7. Repeat Steps 5–6 Until Satisfactory Results

📊 Jupyter Notebook Demo

An interactive Jupyter notebook demonstrating the full optimization workflow with visualizations is available at examples/demo_optimization.ipynb. It covers:

- Loading the example reaction space and descriptors
- Running initialization and optimization rounds
- Visualizing the Pareto front (yield vs ee trade-off)
- Tracking optimization progress with Hypervolume metrics
- Interpreting explore vs exploit recommendations

Run it: jupyter notebook examples/demo_optimization.ipynb
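For readers new to the term, the Pareto front is the set of results that no other result beats on both objectives at once. The sketch below is independent of SynBO and uses made-up (yield, ee) pairs purely for illustration.

```python
def pareto_front(points):
    """Return the non-dominated points when every objective is maximized."""
    front = []
    for p in points:
        # q dominates p if q is at least as good everywhere and differs from p.
        dominated = any(
            q != p and all(q_i >= p_i for q_i, p_i in zip(q, p))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Illustrative (yield, ee) pairs, not experimental data.
results = [(62.0, 85.0), (70.0, 60.0), (55.0, 90.0), (50.0, 80.0), (70.0, 55.0)]
print(pareto_front(results))  # [(62.0, 85.0), (70.0, 60.0), (55.0, 90.0)]
```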

🚀 Quick Start

Installation

# Optional: create and activate a dedicated environment (requires conda)
conda create -n synbo python=3.13
conda activate synbo

pip install synbo

Minimal Python Example

from synbo import ReactionOptimizer

optimizer = ReactionOptimizer(
    opt_metrics=['yield', 'ee'],
    opt_type='auto',
    random_seed=42
)

# Load reaction space
optimizer.load_rxn_space({
    'catalyst': ['Pd(OAc)2', 'Pd(PPh3)4', 'Pd2(dba)3'],
    'solvent': ['THF', 'Dioxane', 'Toluene', 'DMF', 'MeCN'],
    'base': ['Cs2CO3', 'K2CO3', 'NaOEt', 'DBU'],
    'temperature': [25, 50, 80, 100]
})

# Use OneHot encoding (auto-generated when no descriptors provided)
optimizer.load_desc()

# Initial sampling
optimizer.run(batch_size=8)
optimizer.save_results(filetype='csv')

# After running the experiments, load the recorded results and request the next batch
# import pandas as pd
# optimizer.load_prev_rxn(pd.read_csv('results.csv'))
# optimizer.run(batch_size=5)
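To make the OneHot fallback concrete, here is a generic one-hot encoding of a single condition from the reaction space above. It is a sketch of the general technique, not SynBO's internal code; in particular, treating temperature as a categorical option is an assumption, and `one_hot_encode` is a hypothetical helper.

```python
rxn_space = {
    "catalyst": ["Pd(OAc)2", "Pd(PPh3)4", "Pd2(dba)3"],
    "solvent": ["THF", "Dioxane", "Toluene", "DMF", "MeCN"],
    "base": ["Cs2CO3", "K2CO3", "NaOEt", "DBU"],
    "temperature": [25, 50, 80, 100],
}

def one_hot_encode(condition, space):
    """Concatenate one indicator vector per reagent type."""
    vector = []
    for reagent_type, options in space.items():
        chosen = condition[reagent_type]
        if chosen not in options:
            raise ValueError(f"{chosen!r} is not a valid {reagent_type}")
        vector.extend(1 if option == chosen else 0 for option in options)
    return vector

condition = {"catalyst": "Pd(PPh3)4", "solvent": "DMF", "base": "DBU", "temperature": 80}
encoded = one_hot_encode(condition, rxn_space)
print(len(encoded))  # 16 features: 3 + 5 + 4 + 4
print(encoded)       # [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0]
```

One-hot features carry no chemical similarity between options, which is why supplying molecular descriptors, as in the step-by-step workflow, can give the model more to work with.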

🧪 Python API Reference

ReactionOptimizer

optimizer = ReactionOptimizer(
    opt_metrics=["yield", "ee"],
    opt_metric_settings=[
        {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0},
        {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0},
    ],
    opt_type="auto",    # "init" | "opt" | "auto"
    random_seed=42,
    save_dir="./results",
)

Key Methods

| Method | Description |
| --- | --- |
| `load_rxn_space(condition_dict)` | Load reaction space (all possible reagent combinations) |
| `load_desc(desc_dict=None)` | Load molecular descriptors (OneHot encoding used if None) |
| `load_prev_rxn(df)` | Load previous experimental results for optimization |
| `initialize(batch_size, sampling_method)` | Generate initial batch (LHS/Sobol/K-Means/Random) |
| `optimize(batch_size, constraints)` | Run Bayesian optimization to recommend next batch |
| `save_results(filetype)` | Save recommendations to CSV/Excel/JSON |
| `calculate_current_hv()` | Calculate current Hypervolume (multi-objective progress) |
| `calculate_hv_by_batch()` | Track Hypervolume across optimization rounds |

📈 Understanding Optimization Results

Predictions with Uncertainty

Output files include predicted values with uncertainties:

| batch | alkali | cobalt_catalyst | ... | pred yield | pred ee | yield | ee |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | DBU | [Co]-5 | ... | 62.35±3.12 | 85.20±2.87 | [exp_data] | [exp_data] |

- pred yield / pred ee: Model prediction ± uncertainty
- [exp_data]: Placeholder for your experimental results
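If you want to work with the predictions numerically, the "value±uncertainty" cells can be split into two floats. A minimal sketch, assuming the cell format shown in the example row above; `parse_prediction` is a hypothetical helper, not a SynBO function.

```python
def parse_prediction(cell):
    """Split a 'value±uncertainty' string into two floats."""
    value_text, separator, uncertainty_text = cell.partition("±")
    if not separator:
        raise ValueError(f"expected 'value±uncertainty', got {cell!r}")
    return float(value_text), float(uncertainty_text)

value, uncertainty = parse_prediction("62.35±3.12")
print(value, uncertainty)  # 62.35 3.12
```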

Explore vs Exploit

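- EXPLORE: The recommendation tests a region of the reaction space where there is little or no data yet
- EXPLOIT: The recommendation refines conditions close to the best results found so far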
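The trade-off can be illustrated with an upper-confidence-bound score, a common acquisition rule in Bayesian optimization. The source does not state which acquisition function SynBO uses, so treat this as a generic illustration with made-up predictions.

```python
def upper_confidence_bound(mean, std, beta):
    """Score a candidate: predicted value plus a bonus for uncertainty."""
    return mean + beta * std

# Illustrative (predicted yield, uncertainty) pairs, not model output.
candidates = {
    "near known optimum": (80.0, 2.0),
    "untested region": (65.0, 12.0),
}

def best_candidate(beta):
    """Pick the candidate with the highest score for a given beta."""
    return max(
        candidates,
        key=lambda name: upper_confidence_bound(*candidates[name], beta),
    )

print(best_candidate(beta=0.0))  # near known optimum (pure exploitation)
print(best_candidate(beta=2.0))  # untested region (uncertainty bonus wins)
```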

Hypervolume Tracking

hv = optimizer.calculate_current_hv()
print(f"Progress: {hv['hv_normalized']*100:.1f}%")
history = optimizer.calculate_hv_by_batch()
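To see what a hypervolume measures, the two-objective case can be computed by hand: it is the area dominated by the results, measured from a reference point. The sketch below is independent of SynBO; normalizing by the full opt_range box (100 × 100) is an assumption about how a normalized value could be defined, and the input points are illustrative.

```python
def hypervolume_2d(points, reference=(0.0, 0.0)):
    """Area dominated by the points (both objectives maximized).

    Assumes every point is at least as good as the reference point.
    """
    area = 0.0
    best_second = reference[1]
    # Sweep from the largest first objective downwards, adding the strip
    # that each point contributes above what is already covered.
    for first, second in sorted(points, reverse=True):
        if second > best_second:
            area += (first - reference[0]) * (second - best_second)
            best_second = second
    return area

# Illustrative (yield, ee) pairs, not experimental data.
points = [(62.0, 85.0), (70.0, 60.0), (55.0, 90.0), (50.0, 80.0)]
area = hypervolume_2d(points)
print(area)  # 6025.0
print(f"Share of the 100 x 100 box: {area / (100 * 100):.2%}")
```

A growing hypervolume across batches means the set of best trade-offs between yield and ee is still improving; a plateau suggests further rounds are adding little.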

⚙️ Advanced Features

Reaction Constraints

constraints = {"alkali": ["DBU"], "solvent": ["DMSO"]}
optimizer.optimize(batch_size=5, constraints=constraints)

Alternatively, put the constraints in a prohibited_reagent.json file, which SynBO loads automatically.
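The source does not spell out the semantics of the constraints dictionary. If it is read as a list of reagents to exclude per reagent type (an assumption suggested by the prohibited_reagent.json file name), the effect on the search space can be sketched as follows, using a toy space and a hypothetical helper `filter_space`:

```python
from itertools import product

def filter_space(space, prohibited):
    """Enumerate combinations after dropping prohibited options per type."""
    allowed = {
        reagent_type: [o for o in options if o not in prohibited.get(reagent_type, [])]
        for reagent_type, options in space.items()
    }
    return [dict(zip(allowed, combo)) for combo in product(*allowed.values())]

# Toy reaction space for illustration; not the example project's full space.
toy_space = {
    "alkali": ["DBU", "K2CO3", "Cs2CO3"],
    "solvent": ["DMSO", "THF"],
}
constraints = {"alkali": ["DBU"], "solvent": ["DMSO"]}

remaining = filter_space(toy_space, constraints)
print(len(remaining))  # 2 of the original 6 combinations remain
print(remaining)
```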

GPU Acceleration

SynBO automatically detects an available GPU and uses it. To force CPU execution, pass device="cpu": optimizer.optimize(batch_size=5, device="cpu")

Excel Output with Molecular Structures

optimizer.save_results(
    filetype="excel",
    figure_output=["cobalt_catalyst", "organo_catalyst"],
    figure_path="examples/figures",
)

🔧 CLI Quick Reference

synbo --version
synbo create-config -o my_config.json
synbo validate my_config.json
synbo init my_config.json -b 8 -m lhs -o results/
synbo optimize my_config.json results/batch-0.csv -b 5 -o results/

📦 Dependencies

Core: numpy, pandas, scikit-learn, torch, botorch | Chemistry: rdkit, epam.indigo | CLI: typer, rich | Viz: matplotlib, seaborn

See pyproject.toml for the complete list.

📧 Contact

Author: Zhenzhi Tan
Email: zhenzhi-tan@outlook.com
