PromisE

Minimal standalone PromisE inference package (no dependency on upper-level project code).

1. Repository Layout

  • promisE/
    • Minimal model code (pl_trigo + mpnn + multitask head + wrapper)
    • Sequence embedder (ESM)
    • Inference entry (promisE/inference.py)
  • sample_input.json: example input
  • requirements.txt: minimal runtime dependencies
  • params/: checkpoint location (params/params.pt)

2. Model Checkpoint (GitHub Release)

The model checkpoint (~375 MB) is too large to store in git history, so it is distributed as a GitHub Release asset instead.

Place the file at:

  • params/params.pt

Recommended release asset name:

  • params.pt

Example (replace v0.1.0 with your release tag):

curl -L -o params/params.pt \
  https://github.com/ChemBioHTP/PromisE/releases/download/v0.1.0/params.pt
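After downloading, it can help to confirm that the file landed where the package expects it. The sketch below is not part of PromisE; `checkpoint_status` and the 300 MiB lower bound are hypothetical, chosen only because the checkpoint is stated to be roughly 375MB. A much smaller file usually means the release tag or URL was wrong and curl saved an error response instead.

```python
from pathlib import Path

# Assumption: anything far below the stated ~375MB is not a real checkpoint.
EXPECTED_MIN_BYTES = 300 * 1024 * 1024


def checkpoint_status(path="params/params.pt", min_bytes=EXPECTED_MIN_BYTES):
    """Return 'missing', 'too small', or 'ok' for the checkpoint file at `path`."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size < min_bytes:
        return "too small"
    return "ok"


print(checkpoint_status())
```

If the checkpoint lives elsewhere, pass that path here and hand the same path to `--checkpoint` when running inference.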

3. Input Format

Input JSON fields:

  • enzyme_sequence: protein sequence string
  • substrate_panels: list of substrate isomeric SMILES strings (one entry per substrate; a panel holds multiple substrates)
  • protein_structure: optional path string (accepted but not required by this minimal model)

Example (sample_input.json):

{
  "enzyme_sequence": "MNNRIT...",
  "substrate_panels": ["CC(=O)O", "CCO", "CC(C)O"],
  "protein_structure": "/path/to/optional_structure.pdb"
}
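A quick pre-flight check of the input file can catch malformed fields before the model loads. This is a minimal sketch based only on the field descriptions above; `validate_input` is a hypothetical helper, not a PromisE function, and the sequence in the example is a made-up placeholder. It checks types and emptiness only and does not parse the SMILES strings.

```python
import json


def validate_input(record):
    """Return a list of problems; an empty list means the record looks usable."""
    problems = []

    seq = record.get("enzyme_sequence")
    if not isinstance(seq, str) or not seq:
        problems.append("enzyme_sequence must be a non-empty string")

    panel = record.get("substrate_panels")
    if not isinstance(panel, list) or not panel:
        problems.append("substrate_panels must be a non-empty list")
    elif not all(isinstance(s, str) and s for s in panel):
        problems.append("every substrate_panels entry must be a non-empty string")

    # protein_structure is optional; only its type is checked when present.
    structure = record.get("protein_structure")
    if structure is not None and not isinstance(structure, str):
        problems.append("protein_structure, if given, must be a path string")

    return problems


# Placeholder record for illustration; the sequence is not a real enzyme.
example = {
    "enzyme_sequence": "ACDEFGHIK",
    "substrate_panels": ["CC(=O)O", "CCO", "CC(C)O"],
}
print(json.dumps(validate_input(example)))
```

To check a file on disk, load it with `json.load` and pass the resulting dict to the same function.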

4. Install

pip install -r requirements.txt

5. Run Inference

python -m promisE.inference \
  --input sample_input.json \
  --output predictions.json

Optional flags:

  • --device cuda or --device cpu
  • --cache-dir ./esm_cache_msi_local
  • --esm-model esm2_t30_150M_UR50D
  • --checkpoint /custom/path/params.pt
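When inference is launched from another script, the command line above can be assembled programmatically. The sketch below uses only the flags listed in this section; `build_command` and `run_inference` are hypothetical wrappers around the documented CLI, not functions provided by the package.

```python
import subprocess
import sys


def build_command(input_path, output_path, device=None, cache_dir=None,
                  esm_model=None, checkpoint=None):
    """Assemble the argument list for `python -m promisE.inference`."""
    cmd = [sys.executable, "-m", "promisE.inference",
           "--input", str(input_path), "--output", str(output_path)]
    optional = {
        "--device": device,
        "--cache-dir": cache_dir,
        "--esm-model": esm_model,
        "--checkpoint": checkpoint,
    }
    # Only pass the optional flags that were actually set.
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def run_inference(*args, **kwargs):
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(*args, **kwargs), check=True)


print(build_command("sample_input.json", "predictions.json", device="cpu"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.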

6. Output

predictions.json includes:

  • Panel-level summary (msi_mean/std, log_kcat_mean/std)
  • Per-substrate predictions:
    • pred_msi
    • pred_log_kcat
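The exact JSON layout of predictions.json is not spelled out here, so the following is only a sketch of how the panel-level summary relates to the per-substrate values. It assumes the per-substrate predictions are available as a list of objects carrying `pred_msi` and `pred_log_kcat`; the `summarize_panel` helper, the `msi_std` and `log_kcat_std` key names, the use of a population standard deviation, and the numbers in the example are all assumptions rather than documented behavior.

```python
from statistics import mean, pstdev


def summarize_panel(per_substrate):
    """Compute mean and population std of pred_msi and pred_log_kcat."""
    msi = [entry["pred_msi"] for entry in per_substrate]
    log_kcat = [entry["pred_log_kcat"] for entry in per_substrate]
    return {
        "msi_mean": mean(msi),
        "msi_std": pstdev(msi),  # assumption: population std, not sample std
        "log_kcat_mean": mean(log_kcat),
        "log_kcat_std": pstdev(log_kcat),
    }


# Made-up values for illustration only; not real model output.
toy_predictions = [
    {"pred_msi": 1.0, "pred_log_kcat": 0.0},
    {"pred_msi": 3.0, "pred_log_kcat": 2.0},
]
print(summarize_panel(toy_predictions))
```

If the recomputed values differ from those in the file, the package most likely uses a different convention (for example a sample standard deviation), and the file's own summary should be treated as authoritative.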

About

A deep learning model for enzyme multispecificity prediction
