SynBO (Synthetic Bayesian Optimization) is an intelligent reaction optimization framework that uses Bayesian Optimization to find optimal reaction conditions with minimal experimental effort.
- Download AutoClaw: Download and set up AutoClaw from the official release page: https://autoglm.z.ai/autoclaw/
- Install SynBO Skills: Install the SynBO skill package via SkillHub (https://skillhub.cn/skills/synbo). When initiating the installation, provide the following input context to ensure proper setup:

  Please check if the SkillHub store is already installed. If not, follow the guide at https://skillhub.cn/install/skillhub.md to install only the SkillHub CLI, then install the SynBO skill. If it is already installed, proceed directly to install the SynBO skill.
- Optimize Reaction Conditions: Launch AutoClaw and prompt it to optimize your reaction conditions using the integrated SynBO skills. Example usage:

  Please optimize the reaction condition for [insert your reaction name].

  AutoClaw will then guide you through the SynBO optimization process, providing recommendations and insights to improve your reaction outcomes.

The examples/ directory contains a complete, runnable example of a cobalt-catalyzed reaction optimization with 5 reagent types and 2 objectives (yield + ee):
examples/
├── optimization_settings.json # Optimization goals & settings
├── rxn_space/ # Reaction space definitions
│ ├── alkali.csv # 9 alkali/additive options
│ ├── cobalt_catalyst.csv # 8 Co-catalyst candidates
│ ├── organo_catalyst.csv # 9 organocatalyst candidates
│ ├── oxidant.csv # 9 oxidant options
│ └── solvent.csv # 10 solvent options
├── descriptors/ # RDKit molecular descriptors
│ ├── alkali_RDKit.csv
│ ├── cobalt_catalyst_RDKit.csv
│ ├── organo_catalyst_RDKit.csv
│ ├── oxidant_RDKit.csv
│ └── solvent_RDKit.csv
└── results/ # Example optimization outputs
  ├── batch-0_20260420.csv # Initial sampling results
  ├── batch-0_20260420.xlsx
  ├── batch-1_20260420.csv # 1st optimization round results
  └── batch-1_20260420.xlsx
Reaction space size: 9 × 8 × 9 × 9 × 10 = 58,320 possible combinations
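
The size follows from multiplying the number of options per reagent type. A quick check, using the option counts from the directory listing above (the dictionary name is only illustrative):

```python
from math import prod

# Number of options per reagent type, taken from the rxn_space/ listing above.
options_per_reagent = {
    "alkali": 9,
    "cobalt_catalyst": 8,
    "organo_catalyst": 9,
    "oxidant": 9,
    "solvent": 10,
}

# A full-factorial space has one combination per choice of each reagent.
space_size = prod(options_per_reagent.values())
print(f"{space_size:,} possible combinations")  # 58,320 possible combinations
```
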
Create CSV files for each reagent/condition type under rxn_space/. Each file must contain SMILES and name columns:
# rxn_space/solvent.csv
SMILES,name
ClCCl,DCM
CC#N,CH3CN
C1CCOC1,THF
...

python scripts/get_desc.py --input rxn_space/solvent.csv --smiles-col 'SMILES' --name-col 'name'

Repeat for each reagent type. Outputs go to descriptors/{reagent}_RDKit.csv.
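
Repeating the command by hand for five files is error-prone, so it may help to generate the invocations in a loop. The sketch below only builds and prints the commands, using the script path and flags shown above. The helper name `build_desc_command` is hypothetical, and it assumes every file uses the same `SMILES` and `name` column headers.

```python
import shlex

# Reagent types from the example project; adjust to match your rxn_space/ files.
reagent_types = ["alkali", "cobalt_catalyst", "organo_catalyst", "oxidant", "solvent"]


def build_desc_command(reagent: str) -> list[str]:
    """Return the get_desc.py invocation for one reagent CSV (hypothetical helper)."""
    return [
        "python", "scripts/get_desc.py",
        "--input", f"rxn_space/{reagent}.csv",
        "--smiles-col", "SMILES",
        "--name-col", "name",
    ]


commands = [build_desc_command(r) for r in reagent_types]
for cmd in commands:
    # Printed rather than executed; use subprocess.run(cmd, check=True) to run each one.
    print(shlex.join(cmd))
```
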
Create optimization_settings.json:
{
  "reagent_types": ["alkali", "cobalt_catalyst", "organo_catalyst", "oxidant", "solvent"],
  "opt_metrics": ["yield", "ee"],
  "opt_direct_info": [
    {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0},
    {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0}
  ]
}

CLI:
python scripts/initialize.py --project-dir examples --batch-size 8 --sampling-method lhs

Python API:
from synbo import ReactionOptimizer
from synbo.utils import load_desc_dict
desc_dict, condition_dict = load_desc_dict(
    reagent_types=["alkali", "cobalt_catalyst", "organo_catalyst", "oxidant", "solvent"],
    desc_dir="examples/descriptors",
    name_suffix="_RDKit",
    index_col="name",
    return_condition_dict=True,
)
optimizer = ReactionOptimizer(
    opt_metrics=["yield", "ee"],
    opt_type="init",
    random_seed=42,
    save_dir="examples/results",
)
optimizer.load_rxn_space(condition_dict)
optimizer.load_desc(desc_dict)
optimizer.initialize(batch_size=8, sampling_method="lhs")
optimizer.save_results(filetype="excel")

Run the recommended experiments in the lab. Fill in the yield and ee columns in the output file (replace [exp_data] with actual measurements).
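
Before starting the next round, it can be worth checking that no placeholder was left behind. The following is a minimal sketch using only the standard library. The function name `find_unfilled_rows` and the sample rows are hypothetical, and the column layout is assumed to include `yield` and `ee` headers as described above.

```python
import csv
import io


def find_unfilled_rows(csv_text: str, metrics: list[str]) -> list[int]:
    """Return 1-based data-row numbers where any metric is blank or non-numeric."""
    unfilled = []
    for row_number, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        for metric in metrics:
            value = (row.get(metric) or "").strip()
            try:
                float(value)  # "[exp_data]" and "" both raise ValueError
            except ValueError:
                unfilled.append(row_number)
                break
    return unfilled


# Hypothetical excerpt of a batch file; the real files have more columns.
sample = """batch,solvent,yield,ee
0,THF,45.2,80.1
0,DCM,[exp_data],[exp_data]
0,CH3CN,12.0,
"""
print(find_unfilled_rows(sample, ["yield", "ee"]))  # [2, 3]
```

To check a real file, read it with `open(path, newline="").read()` and pass the text to the function.
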
CLI:
python scripts/optimize.py --project-dir examples --batch-size 5

Python API:
from synbo.utils import get_prev_rxn
prev_data = get_prev_rxn("examples/results", "batch-*.csv")
optimizer = ReactionOptimizer(
    opt_metrics=["yield", "ee"],
    opt_type="auto",
    random_seed=42,
    save_dir="examples/results",
)
optimizer.load_rxn_space(condition_dict)
optimizer.load_desc(desc_dict)
optimizer.load_prev_rxn(prev_data)
optimizer.optimize(batch_size=5)
optimizer.save_results(filetype="excel")

An interactive Jupyter notebook demonstrating the full optimization workflow with visualizations is available at examples/demo_optimization.ipynb. It covers:
- Loading the example reaction space and descriptors
- Running initialization and optimization rounds
- Visualizing the Pareto front (yield vs ee trade-off)
- Tracking optimization progress with Hypervolume metrics
- Interpreting explore vs exploit recommendations
Run it:
jupyter notebook examples/demo_optimization.ipynb
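
Two of the notebook topics listed above, the Pareto front and the Hypervolume metric, can be illustrated without SynBO. The sketch below is a generic two-objective version, not SynBO's implementation. The reference point (0, 0) and the normalization by the 100 × 100 objective box are assumptions based on the `opt_range` of [0, 100], and the data points are made up.

```python
def pareto_front(points):
    """Return the non-dominated points when both objectives are maximized."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


def hypervolume_2d(front, ref=(0.0, 0.0)):
    """Area dominated by the front relative to a reference point (maximization)."""
    area = 0.0
    prev_y = ref[1]
    # Sweep from the largest first objective downward; each point adds a new strip.
    for x, y in sorted(front, reverse=True):
        if y > prev_y:
            area += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return area


# Made-up (yield, ee) measurements in percent, for illustration only.
results = [(62.0, 85.0), (70.0, 60.0), (40.0, 90.0), (55.0, 70.0), (30.0, 50.0)]
front = pareto_front(results)
hv = hypervolume_2d(front)
hv_normalized = hv / (100.0 * 100.0)

print(front)  # [(40.0, 90.0), (62.0, 85.0), (70.0, 60.0)]
print(f"Hypervolume: {hv_normalized * 100:.1f}% of the objective box")
```

A growing hypervolume across batches means the set of best trade-offs is moving toward higher yield, higher ee, or both.
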
conda create -n synbo python=3.13 # optional, if conda is installed
conda activate synbo
pip install synbo

from synbo import ReactionOptimizer
optimizer = ReactionOptimizer(
    opt_metrics=['yield', 'ee'],
    opt_type='auto',
    random_seed=42
)
# Load reaction space
optimizer.load_rxn_space({
    'catalyst': ['Pd(OAc)2', 'Pd(PPh3)4', 'Pd2(dba)3'],
    'solvent': ['THF', 'Dioxane', 'Toluene', 'DMF', 'MeCN'],
    'base': ['Cs2CO3', 'K2CO3', 'NaOEt', 'DBU'],
    'temperature': [25, 50, 80, 100]
})
# Use OneHot encoding (auto-generated when no descriptors provided)
optimizer.load_desc()
# Initial sampling
optimizer.run(batch_size=8)
optimizer.save_results(filetype='csv')
# After experiments, load results and optimize (requires: import pandas as pd)
# optimizer.load_prev_rxn(pd.read_csv('results.csv'))
# optimizer.run(batch_size=5)

optimizer = ReactionOptimizer(
    opt_metrics=["yield", "ee"],
    opt_metric_settings=[
        {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0},
        {"opt_direct": "max", "opt_range": [0, 100], "metric_weight": 1.0},
    ],
    opt_type="auto", # "init" | "opt" | "auto"
    random_seed=42,
    save_dir="./results",
)

| Method | Description |
|---|---|
| load_rxn_space(condition_dict) | Load reaction space (all possible reagent combinations) |
| load_desc(desc_dict=None) | Load molecular descriptors (OneHot encoding used if None) |
| load_prev_rxn(df) | Load previous experimental results for optimization |
| initialize(batch_size, sampling_method) | Generate initial batch (LHS/Sobol/K-Means/Random) |
| optimize(batch_size, constraints) | Run Bayesian optimization to recommend next batch |
| save_results(filetype) | Save recommendations to CSV/Excel/JSON |
| calculate_current_hv() | Calculate current Hypervolume (multi-objective progress) |
| calculate_hv_by_batch() | Track Hypervolume across optimization rounds |

Output files include predicted values with uncertainties:
| batch | alkali | cobalt_catalyst | ... | pred yield | pred ee | yield | ee |
|---|---|---|---|---|---|---|---|
| 1 | DBU | [Co]-5 | ... | 62.35±3.12 | 85.20±2.87 | [exp_data] | [exp_data] |

- pred yield / pred ee: Model prediction ± uncertainty
- [exp_data]: Placeholder for your experimental results

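
For downstream analysis, the prediction cells need to be split into a value and an uncertainty. This is a minimal sketch that assumes the cells always follow the `mean±uncertainty` pattern shown in the table. The function name `parse_prediction` is hypothetical, and what the uncertainty represents statistically (for example one standard deviation) is not stated here.

```python
def parse_prediction(cell: str) -> tuple[float, float]:
    """Split a 'mean±uncertainty' string into two floats."""
    mean_text, _, unc_text = cell.partition("±")
    return float(mean_text), float(unc_text)


mean, unc = parse_prediction("62.35±3.12")
print(mean, unc)  # 62.35 3.12

# A rough interval around the prediction; its statistical meaning is an assumption.
low, high = mean - unc, mean + unc
```
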
- EXPLORE: Testing new areas of the reaction space
- EXPLOIT: Refining near known good results
hv = optimizer.calculate_current_hv()
print(f"Progress: {hv['hv_normalized']*100:.1f}%")
history = optimizer.calculate_hv_by_batch()

constraints = {"alkali": ["DBU"], "solvent": ["DMSO"]}
optimizer.optimize(batch_size=5, constraints=constraints)

Or use prohibited_reagent.json for automatic loading.
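
To see what such a constraint does to the candidate pool, here is a sketch that filters a toy reaction space. It assumes that a constraint entry marks a reagent as prohibited and that a combination is dropped if it contains any prohibited reagent. SynBO's exact semantics are not spelled out here, and `filter_space` is a hypothetical helper, not part of the library.

```python
from itertools import product


def filter_space(rxn_space, prohibited):
    """Yield combinations that contain none of the prohibited reagents."""
    reagent_types = list(rxn_space)
    for combo in product(*(rxn_space[t] for t in reagent_types)):
        if any(value in prohibited.get(t, []) for t, value in zip(reagent_types, combo)):
            continue
        yield dict(zip(reagent_types, combo))


# Toy space for illustration only.
rxn_space = {
    "alkali": ["DBU", "K2CO3", "Cs2CO3"],
    "solvent": ["DMSO", "THF"],
}
constraints = {"alkali": ["DBU"], "solvent": ["DMSO"]}

allowed = list(filter_space(rxn_space, constraints))
print(len(allowed))  # 2
```
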
SynBO auto-detects GPU. Force CPU: optimizer.optimize(batch_size=5, device="cpu")
optimizer.save_results(
    filetype="excel",
    figure_output=["cobalt_catalyst", "organo_catalyst"],
    figure_path="examples/figures",
)

synbo --version
synbo create-config -o my_config.json
synbo validate my_config.json
synbo init my_config.json -b 8 -m lhs -o results/
synbo optimize my_config.json results/batch-0.csv -b 5 -o results/

Core: numpy, pandas, scikit-learn, torch, botorch | Chemistry: rdkit, epam.indigo | CLI: typer, rich | Viz: matplotlib, seaborn
See pyproject.toml for the complete list.
- Author: Zhenzhi Tan
- Email: zhenzhi-tan@outlook.com