Design, rank, explain, and validate CRISPR guides in one lightweight platform.
Transparent guide prioritization: interpretable on-target scoring, both-strand off-target analysis, and prime-editing support, without GPUs, cloud dependencies, or black-box predictions.
CI runs the test suite and renders a live UI screenshot (downloadable as a build artifact) on every push.
Rendered automatically by CI on every push: one Score per guide, with a per-feature Details breakdown.
| Feature | What it means |
|---|---|
| One Score | A single 0–100 prioritization number ranks each guide, with no column soup. |
| Explainable | POST /api/explain shows the per-feature breakdown behind every score. |
| Both-strand off-targets | Vectorised NumPy scan, per-site CFD and MIT/Hsu scores, and aggregate specificity. |
| Prime Editing Studio | PRIDICT2.0-informed pegRNA design (Spacer + RTT + PBS). |
| Peer-reviewed scoring | CRISPRscan weights reproduced verbatim and unit-validated, with zero downloads. |
| Pluggable models | onnx → trained-linear → heuristic, auto-selected and reported. |
| Lightweight | No GPU, no LLM keys, no DB: everything is computed per request. |
cd crispr_app
pip install -r requirements.txt
uvicorn main:app --reload

Open http://127.0.0.1:8000 in a browser.
DNA sequence (paste or FASTA)
│
▼
Guide discovery ── both strands, multi-PAM (NGG/NAG/NG/TTTV)
│
▼
On-target scoring ── built-in model + CRISPRscan
│
▼
Off-target analysis ── CFD + MIT/Hsu + aggregate specificity
│
▼
Ranking ── one 0–100 Score per guide
│
▼
Explanation ── per-feature breakdown (/api/explain)
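The guide-discovery step at the top of the pipeline can be illustrated with a minimal sketch. It assumes 20-nt protospacers and handles only the NGG PAM; the function names and the returned dictionary layout are hypothetical and are not taken from the project's code.

```python
import re

# Translation table used to complement DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_ngg_guides(dna: str, guide_len: int = 20) -> list:
    """Find protospacers followed by an NGG PAM on both strands.

    Assumption: 'start' is reported in the coordinates of the strand
    that was scanned (the reverse complement for '-' hits).
    """
    dna = dna.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    guides = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            guides.append({
                "guide": match.group(1),
                "pam": match.group(2),
                "strand": strand,
                "start": match.start(),
            })
    return guides
```

Scanning the reverse complement with the same pattern keeps the strand logic in one place; supporting NAG, NG, or TTTV would need additional patterns and, for TTTV, a PAM on the 5′ side of the protospacer.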
curl -s -X POST http://127.0.0.1:8000/api/design \
  -H 'Content-Type: application/json' \
  -d '{"dna_sequence": "ATGGCCGAGTACAAGCCCACGGTGCGCCTCGCC...", "pam": "NGG"}'

Real output (288 bp input → 21 guides found; model: linear, the shipped default):
| # | Guide (5′→3′) | PAM | Strand | GC% | Score |
|---|---|---|---|---|---|
| 1 | GATGTGGCGGTCCGGATCGA | CGG | - | 65 | 74 |
| 2 | AAGGTGTGGGTCGCGGACGA | CGG | + | 65 | 74 |
| 3 | ATCGACGGTGTGGCGCGTGG | CGG | - | 70 | 69 |
POST /api/explain then returns the per-feature breakdown (GC, Tm, position-specific contributions) behind any guide's Score.
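The same POST /api/design call can be made from Python. The sketch below uses only the standard library and the two request fields from the curl example (`dna_sequence`, `pam`). The helper names are hypothetical, and because the response schema is not documented here, the JSON is returned unparsed beyond decoding.

```python
import json
import urllib.request

# Local development server from the quick start.
BASE_URL = "http://127.0.0.1:8000"


def build_design_request(dna_sequence: str, pam: str = "NGG",
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /api/design request."""
    payload = json.dumps({"dna_sequence": dna_sequence, "pam": pam})
    return urllib.request.Request(
        f"{base_url}/api/design",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def design_guides(dna_sequence: str, pam: str = "NGG"):
    """Send the request and decode the JSON body.

    Requires the server to be running; the structure of the returned
    object is an unknown here and is passed through unchanged.
    """
    request = build_design_request(dna_sequence, pam)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With the server running, `design_guides(my_sequence)` returns whatever JSON the endpoint produces; separating request construction from sending makes the payload easy to inspect without a live server.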
Each guide gets one Score, 0–100 (higher = better). It is a relative prioritization score that combines the on-target predictors, not a literal % editing rate. Color-coded:

| 🟢 High | 🟡 Moderate | 🔴 Low |
|---|---|---|
| ≥ 60 | 40–59 | < 40 |

Click Details on any guide to see why it scored that way (GC, Tm, position-specific features…). Component sub-scores stay in the API/CSV for power users and never appear on screen.
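For anyone consuming Scores from the API or CSV, the color bands above reduce to two thresholds. This is a small sketch with a hypothetical function name; it assumes that a fractional score below 60 counts as Moderate and one below 40 as Low.

```python
def score_band(score: float) -> str:
    """Map a 0-100 Score to the band names used in the table above."""
    if not 0 <= score <= 100:
        raise ValueError("Score must be between 0 and 100")
    if score >= 60:
        return "High"
    if score >= 40:
        return "Moderate"
    return "Low"
```

Applied to the sample output earlier in this README, scores of 74 and 69 both fall in the High band.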
Held-out Spearman ρ on real public datasets (full table + method in BENCHMARKS.md). The platform offers two tiers: a transparent built-in ranker (default) and an optional external deep-learning backend for maximum raw accuracy.

🪶 Built-in: lightweight, interpretable, zero setup

| Model | ρ | Notes |
|---|---|---|
| Shipped trained (default) | 0.22–0.41 | pooled human SpCas9, leave-one-dataset-out |
| Trained on your own data | 0.40–0.52 | one command: train.py |
| Heuristic (always available) | ~0.25 | fully interpretable fallback |
| CRISPRscan (peer-reviewed, validated) | 0.58 | on its home dataset |

🧠 Optional: external deep-learning backend

| Model | ρ | Notes |
|---|---|---|
| ONNX (DeepSpCas9 / CRISPRon) | ~0.85 | bring your own export; auto-detected |

The built-in tier optimises for transparency and speed. Its job is to rank candidates well enough to prioritise, with every score explainable. For maximum raw correlation, drop in a deep model via ONNX.

⚠️ Honesty note. No predictor can exceed the ~0.71–0.77 reproducibility ceiling of the wet-lab data itself; published state-of-the-art tops out around 0.85–0.88. Our scores are deterministic surrogates for ranking; wet-lab validation remains essential.
cd crispr_app
python train.py dataset.csv   # columns: guide,measured[,ngg_context]
# → writes models/linear.json; the API auto-loads it and reports model="linear"
python benchmark.py data.context.tab   # measure Spearman on a CRISPOR-format set

Position-specific dinucleotide features roughly double Spearman ρ on datasets with signal (chari2015 0.20 → 0.40, morenoMateos 0.17 → 0.43); gradient boosting matched ridge to ±0.02, so we stay dependency-free.
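Spearman ρ, the metric quoted throughout, is the Pearson correlation of the ranks of predicted and measured values. The following dependency-free sketch shows the calculation with average ranks for ties; it is an illustration of the metric, not the code in benchmark.py, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(predicted, measured):
    """Spearman rank correlation between two equal-length sequences."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(predicted), average_ranks(measured)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5
```

Because only ranks matter, a model can score well on this metric without predicting absolute editing rates, which matches the ranking-first goal stated above.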
Honest positioning, including where we're weaker. CRISPOR/CHOPCHOP are mature, genome-aware tools; our edge is transparent prioritization in a lightweight, API-first package.
| Capability | CRISPR Precision Studio | CRISPOR | CHOPCHOP | Benchling |
|---|---|---|---|---|
| Single explainable prioritization score | ✅ | partial¹ | partial¹ | β |
| Per-feature score breakdown (API) | ✅ | β | β | β |
| Both-strand off-target (CFD + MIT) | ✅ | ✅ (reference) | β | β |
| Genome-wide off-target search | ❌ (background seq only) | ✅ | ✅ | β |
| Prime-editing pegRNA design | ✅ | ❌² | partial | β |
| JSON API-first | ✅ | partial | β | β |
| Runs locally, no GPU / no keys | ✅ | ✅³ | ✅³ | ❌ (SaaS) |
¹ Report several separate scores rather than one explained number. ² CRISPOR targets Cas9/Cas12a guide design; pegRNA design is usually a separate tool (PrimeDesign / pegFinder). ³ Open-source but heavier to self-host. Marks reflect typical usage and may change as those tools evolve.
Honest gap: genome-wide off-target scanning is the main capability CRISPOR/CHOPCHOP have that we don't. It's on the roadmap.
Browser (templates/index.html + static/app.js)
│ JSON over fetch()
▼
FastAPI (main.py) ──► Pydantic validation + utils.validate_sequence
│
▼
Science layer
├── scoring.py       on-target efficiency (Doench RS2 / Azimuth-informed)
├── crisprscan.py    CRISPRscan (Moreno-Mateos 2015, verbatim)
├── offtarget.py     CFD + MIT/Hsu + aggregate specificity
├── prime.py         pegRNA design (PRIDICT2.0-informed)
├── features.py / models.py / train.py    pluggable + trainable models
└── analysis.py      pipeline + vectorised both-strand off-target search
│ pandas DataFrame → JSON
▼
Browser renders one ranked table
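The "vectorised both-strand off-target search" is not spelled out in this README. As a rough illustration of the general idea, the sketch below counts mismatches between a guide and every window of a background sequence with NumPy, on both strands. It is an assumption-laden example rather than the code in analysis.py: the function names and the `max_mismatches` parameter are hypothetical, and it ignores PAM filtering and CFD/MIT scoring.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def _encode(seq: str) -> np.ndarray:
    """View an ASCII DNA string as an array of byte values."""
    return np.frombuffer(seq.upper().encode("ascii"), dtype=np.uint8)


def mismatch_counts(background: str, guide: str) -> np.ndarray:
    """Mismatches between the guide and each same-length window."""
    if len(background) < len(guide):
        return np.empty(0, dtype=np.int64)
    windows = sliding_window_view(_encode(background), len(guide))
    # Broadcasting compares every window against the guide at once.
    return (windows != _encode(guide)).sum(axis=1)


def scan_both_strands(background: str, guide: str, max_mismatches: int = 4):
    """Return (strand, start, mismatches) for windows within the limit.

    'start' is in the coordinates of the scanned strand.
    """
    hits = []
    for strand, seq in (("+", background), ("-", reverse_complement(background))):
        counts = mismatch_counts(seq, guide)
        for pos in np.flatnonzero(counts <= max_mismatches):
            hits.append((strand, int(pos), int(counts[pos])))
    return hits
```

The sliding-window view avoids copying the sequence, so the comparison runs as a single array operation per strand instead of a Python loop over positions.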
| Method & route | Purpose |
|---|---|
| GET /health | liveness check |
| POST /api/design | ranked gRNAs with the ConsensusScore (the 0–100 Score) |
| POST /api/offtargets | per-site CFD/MIT hits + per-guide specificity summary |
| POST /api/simulate | protein / indel outcome of an edit |
| POST /api/prime-design | ranked pegRNAs (Spacer + RTT + PBS) |
| POST /api/explain | interpretable per-feature score breakdown |
| POST /api/upload-fasta | parse pasted FASTA / plain DNA |
| GET /api/models | active & available on-target backends |
For a target base substitution, prime.py enumerates and ranks candidate pegRNAs using determinants from PRIDICT2.0 (Mathis 2024) and Anzalone 2019:
- Spacer / nick. Scan NGG PAMs within ~30 nt of the target; place the Cas9 nick 3 bp 5′ of each PAM. Require the edit to fall 0–15 nt downstream of the nick.
- PBS (primer-binding site). Enumerate lengths 8–17 nt; the PBS is the reverse complement of the sequence immediately 5′ of the nick. Its nearest-neighbour Tm is optimised toward ~37 °C (Gaussian reward), with mild length penalties favouring ~13 nt.
- RTT (reverse-transcriptase template). Enumerate lengths 10–20 nt; the RTT encodes the edit and must retain ≥3 nt of 3′ homology past the edit for flap resolution. Penalties: RTT that begins with C (destabilises the edited flap) and RTT GC far from ~55%. Length term favours ~12 nt.
- Ranking. A calibrated logistic score blends the PBS Tm, PBS/RTT length terms, 3′-homology constraint, RTT-starts-with-C penalty, and GC term into one 0–1 Score.
The pegRNA score is PRIDICT2.0-informed, not the trained PRIDICT2.0 network. It reproduces the published determinants for ranking; it has not yet been numerically benchmarked against a PRIDICT test set (on the roadmap). No secondary-structure (e.g. RNAfold) penalty is applied yet.
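The enumeration rules above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the code in prime.py: it handles only a + strand NGG PAM, treats `edit_offset` as a 0-based distance downstream of the nick, uses a hypothetical Gaussian width (`sigma_c`), and takes the Tm as an input rather than computing a nearest-neighbour value.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def nick_index(pam_start: int) -> int:
    """Slice boundary of the nick, 3 bp 5' of a + strand PAM."""
    return pam_start - 3


def pbs_candidates(seq: str, pam_start: int, min_len: int = 8, max_len: int = 17):
    """PBS options: reverse complement of the bases just 5' of the nick."""
    nick = nick_index(pam_start)
    candidates = []
    for length in range(min_len, max_len + 1):
        if nick - length < 0:
            break  # not enough sequence upstream of the nick
        candidates.append(reverse_complement(seq[nick - length:nick]))
    return candidates


def rtt_lengths(edit_offset: int, min_len: int = 10, max_len: int = 20,
                min_homology: int = 3):
    """RTT lengths that leave enough homology past the edit.

    Assumption: an RTT of length L templates offsets 0..L-1 from the
    nick, so L - 1 - edit_offset bases lie beyond the edited position.
    """
    if not 0 <= edit_offset <= 15:
        return []
    return [length for length in range(min_len, max_len + 1)
            if length - 1 - edit_offset >= min_homology]


def gaussian_reward(tm_c: float, target_c: float = 37.0, sigma_c: float = 5.0) -> float:
    """Reward in (0, 1] that peaks when the PBS Tm equals the target."""
    return math.exp(-((tm_c - target_c) ** 2) / (2 * sigma_c ** 2))
```

A full ranker would combine these pieces with the length, GC, and RTT-starts-with-C terms described in the list; the weights of that logistic blend are not given here, so they are left out of the sketch.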
| Component | Model / source |
|---|---|
| On-target | Doench 2014/2016 Rule Set 2/Azimuth (Nat. Biotechnol. 34:184); CRISPRscan (Moreno-Mateos, Nat. Methods 2015) |
| Off-target (site) | CFD (Doench 2016) · MIT/Hsu (Hsu 2013, Nat. Biotechnol. 31:827) |
| Off-target (guide) | aggregate specificity 10000 / (100 + Σ scores) (CRISPOR convention) |
| Prime editing | PRIDICT2.0 (Mathis 2024, doi:10.1038/s41587-024-02268-2); Anzalone 2019 (Nature 576:149) |
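The aggregate specificity formula in the table is simple enough to show directly. This sketch assumes the per-site scores are on a 0–100 scale, as in the CRISPOR convention the table cites; the function name is hypothetical.

```python
def aggregate_specificity(site_scores) -> float:
    """Guide-level specificity: 10000 / (100 + sum of per-site scores)."""
    scores = list(site_scores)
    if any(score < 0 for score in scores):
        raise ValueError("per-site scores must be non-negative")
    return 10000.0 / (100.0 + sum(scores))
```

A guide with no off-target hits scores 100, and the value falls as hits accumulate: a single hit with a per-site score of 100 halves it to 50.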
pip install pytest
python -m pytest tests/ -q   # 35 passing

Covers on-target scoring, CFD/MIT scoring, aggregate specificity, both-strand off-target detection, pegRNA design, the model registry & trainer, CRISPRscan reference-vector validation, performance, and dependency hygiene.
MIT licensed · No API keys required · Wet-lab validation always essential
