Closed
16 commits
4914e3b
feat(internal)!: remove nnparams.go and update thermo scoring model
ericksamera May 14, 2026
6a542a1
feat(internal)!: add NN structure model with breaking changes
ericksamera May 15, 2026
06001b1
feat(core)!: introduce imperfect duplex model for primer-template bin…
ericksamera May 15, 2026
7f5dc1f
feat(core)!: enhance thermodynamic model with terminal and dangling ends
ericksamera May 15, 2026
deddcb2
feat(core)!: introduce V2 structure models and remove legacy functions
ericksamera May 15, 2026
b412d30
feat(internal)!: add IUPAC thermodynamics support for primer scoring
ericksamera May 15, 2026
a97920a
feat(internal)!: add probe thermodynamics support in scoring model
ericksamera May 15, 2026
8b0c927
feat(docs)!: add thermodynamic models and release checklist documenta…
ericksamera May 15, 2026
ecc4270
feat(core): add golden tests for thermodynamic models
ericksamera May 16, 2026
7f9b78f
feat(core)!: introduce curated pair-family mismatch parameters
ericksamera May 16, 2026
e1b8dc3
feat(core): add partitioning support for secondary structure evaluation
ericksamera May 16, 2026
4d196a1
feat(core)!: introduce triplet-level mismatch scoring and deprecate o…
ericksamera May 16, 2026
a183b5f
feat(core)!: enhance mismatch handling with provenance metadata
ericksamera May 16, 2026
c8db15d
feat(core): add SantaLucia-Hicks terminal dangling-end parameters
ericksamera May 16, 2026
171451c
feat(core)!: add terminal mismatch parameters to thermodynamics
ericksamera May 16, 2026
37e0b60
fix(core): simplify terminal penalty calculation logic
ericksamera May 16, 2026
26 changes: 24 additions & 2 deletions README.md
@@ -9,6 +9,8 @@
**ipcr** is a fast, streaming, IUPAC-aware in-silico PCR toolkit for large (including .gzipped) references.
It finds amplicons from primer pairs under a mismatch model with a **3′ terminal window**, supports **internal probes**, **nested PCR**, **multiplex panels**, circular templates, and emits **TSV**, **FASTA**, **JSON**, or **JSONL**.

`ipcr-thermo` provides nearest-neighbor-informed ranking with explicit approximation metadata. It is a thermodynamically informed in-silico PCR ranker, not a complete PCR kinetics simulator. See [Thermodynamic models, score profiles, and release claims](./docs/THERMO_MODELS.md).

---

- **Fast & parallel**: multi-threaded seeded scanner with per-hit verification.
@@ -17,6 +19,7 @@ It finds amplicons from primer pairs under a mismatch model with a **3′ termin
- **Pretty mode**: readable ASCII alignment blocks.
- **Deterministic**: `--sort` gives stable order; JSON/JSONL use versioned, stable wire schemas.
- **Good UX**: cancelable I/O (Ctrl-C → exit 130), consistent warnings (gated by `--quiet`), clear validation errors.
- **Transparent thermo modes**: NN models, score profiles, salt/dNTP conditions, IUPAC expansion, structure, and probe terms are exposed as output metadata when enabled.

---

@@ -28,7 +31,7 @@ It finds amplicons from primer pairs under a mismatch model with a **3′ termin
| `ipcr-probe` | ipcr + **internal probe** annotation & filtering | qPCR/TaqMan-style assays |
| `ipcr-nested` | **Nested PCR**: outer amplicon + inner scan | Two-round/nested assays |
| `ipcr-multiplex` | Panels from TSV or **pooled inline** primers | Screens / large panels |
-| `ipcr-thermo` | Thermodynamically-informed scoring & ranking | Ranking / assay robustness |
+| `ipcr-thermo` | Thermodynamically informed scoring & ranking | Ranking / assay robustness |

---

@@ -137,7 +140,7 @@ ipcr-multiplex \
}
```

-### Thermodynamically-informed :
+### Thermodynamically informed ranking:

```bash
# With multiplex primer pool described by Xiong (2017); DOI: 10.3389/fmicb.2017.00420
@@ -158,6 +161,25 @@ Salmonella-Enteritidis NZ_CP025559.1 O1+O2 1853303 1854185 882 revcomp 0 0 -137.

---

## Thermodynamic scoring scope

`ipcr-thermo` has multiple thermodynamic implementation modes and empirical score profiles. Use the mode/profile labels in output metadata rather than treating all scores as one universal scale.

Common modes and profiles:

| Setting | Meaning |
| ------------------ | -------------------------------------------------------- |
| `legacy-heuristic` | Historical compatibility path. |
| `nn-duplex-v1` | Nearest-neighbor primer-template duplex scoring. |
| `nn-structure-v1` | NN duplex scoring plus primer hairpin/dimer competition. |
| `binding` | Primer-template binding rank. |
| `pcr` | Binding plus extension and length proxy. |
| `gel` | PCR proxy plus band-mass proxy. |

The `pcr` and `gel` profiles are useful empirical rankers, but they are not full polymerase kinetics or quantitative gel-intensity models. Modified probes such as MGB probes are not fully calibrated; use `--probe-score-mode annotate` or `--probe-thermo=false` unless a calibrated modifier model is available.

See [docs/THERMO_MODELS.md](./docs/THERMO_MODELS.md) for the model matrix and fallback labels, and [docs/THERMO_RELEASE_CHECKLIST.md](./docs/THERMO_RELEASE_CHECKLIST.md) for release/smoke-test guidance.
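
Because scores from different modes and profiles are not on one scale, a consumer of the JSONL output may want to bucket records by their labels before ranking. The following is a minimal sketch of that idea, not part of `ipcr-thermo`: the `product` and `thermoLabel` types are trimmed, hypothetical mirrors of the `score`, `thermo.model`, and `thermo.score_profile` fields, and the input records are made-up values.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// thermoLabel and product are hypothetical, trimmed mirrors of the wire
// schema; they carry only the fields this sketch reads.
type thermoLabel struct {
	Model        string `json:"model"`
	ScoreProfile string `json:"score_profile"`
}

type product struct {
	Score  float64      `json:"score"`
	Thermo *thermoLabel `json:"thermo"`
}

// scaleKey names the scale a score lives on. Records without a thermo
// block are kept in their own bucket instead of being mixed in.
func scaleKey(p product) string {
	if p.Thermo == nil {
		return "unlabelled"
	}
	return p.Thermo.Model + "/" + p.Thermo.ScoreProfile
}

// groupByScale buckets products by scale and sorts each bucket best-first
// (higher score is better).
func groupByScale(ps []product) map[string][]product {
	out := make(map[string][]product)
	for _, p := range ps {
		k := scaleKey(p)
		out[k] = append(out[k], p)
	}
	for _, g := range out {
		sort.SliceStable(g, func(i, j int) bool { return g[i].Score > g[j].Score })
	}
	return out
}

func main() {
	// Made-up JSONL records for illustration only.
	lines := []string{
		`{"score":-3.5,"thermo":{"model":"nn-duplex-v1","score_profile":"binding"}}`,
		`{"score":-1.0,"thermo":{"model":"nn-duplex-v1","score_profile":"binding"}}`,
		`{"score":-137.0}`,
	}
	var ps []product
	for _, l := range lines {
		var p product
		if err := json.Unmarshal([]byte(l), &p); err != nil {
			panic(err)
		}
		ps = append(ps, p)
	}
	for k, g := range groupByScale(ps) {
		fmt.Println(k, "count:", len(g), "best:", g[0].Score)
	}
}
```

Ranking inside each bucket, rather than across all records, avoids comparing a legacy heuristic score with an NN-derived one.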

## Inputs

- **Inline primers**: `-f/--forward`, `-r/--reverse` (5′→3′; IUPAC allowed).
190 changes: 189 additions & 1 deletion core/engine/product.go
@@ -24,8 +24,196 @@ type Product struct {
// Optional amplicon sequence
Seq string `json:"seq,omitempty"`

-// NEW: optional score (thermo / realistic mode)
+// Optional score (thermo / realistic mode). Higher is better. The numeric
+// meaning depends on the selected thermo model; see Thermo for components.
Score float64 `json:"score,omitempty"`

// Optional thermodynamic score components. Populated by ipcr-thermo NN modes.
Thermo *ThermoDetails `json:"thermo,omitempty"`

SourceFile string `json:"source_file"`
}

// ThermoDetails contains interpretable thermodynamic score components for a
// product. It is intentionally model-labelled because legacy heuristic scores
// and NN-derived scores are not numerically comparable.
type ThermoDetails struct {
Model string `json:"model"`
SaltModel string `json:"salt_model"`
NaM float64 `json:"na_m,omitempty"`
MgM float64 `json:"mg_m,omitempty"`
DntpM float64 `json:"dntp_m,omitempty"`
EffectiveNaM float64 `json:"effective_na_m,omitempty"`
FreeMgM float64 `json:"free_mg_m,omitempty"`
AnnealTempC float64 `json:"anneal_temp_c"`
IUPACPolicy string `json:"iupac_policy"`
IUPACThermoPolicy string `json:"iupac_thermo_policy,omitempty"`
IUPACExpansionCount int `json:"iupac_expansion_count,omitempty"`
IUPACExpansionCapped bool `json:"iupac_expansion_capped,omitempty"`
IUPACEffectiveVariant string `json:"iupac_effective_variant,omitempty"`
IUPACVariants []ThermoVariant `json:"iupac_variants,omitempty"`
MismatchPolicy string `json:"mismatch_policy"`
StructurePolicy string `json:"structure_policy,omitempty"`
ScoreProfile string `json:"score_profile,omitempty"`
ScoreC float64 `json:"score_c"`
BaseScoreC float64 `json:"base_score_c,omitempty"`
AmpliconAdjustmentC float64 `json:"amplicon_adjustment_c,omitempty"`
ExtensionLogit float64 `json:"extension_logit,omitempty"`
ExtensionBonusC float64 `json:"extension_bonus_c,omitempty"`
LengthPenaltyC float64 `json:"length_penalty_c,omitempty"`
BandMassBonusC float64 `json:"band_mass_bonus_c,omitempty"`
StructurePenaltyC float64 `json:"structure_penalty_c,omitempty"`
LimitingSide string `json:"limiting_side"`
Fwd ThermoEndpoint `json:"fwd"`
Rev ThermoEndpoint `json:"rev"`
Probe *ProbeThermoDetails `json:"probe,omitempty"`
WorstHairpin *ThermoStructure `json:"worst_hairpin,omitempty"`
WorstSelfDimer *ThermoStructure `json:"worst_self_dimer,omitempty"`
CrossDimer *ThermoStructure `json:"cross_dimer,omitempty"`
PanelCrossDimer *ThermoStructure `json:"panel_cross_dimer,omitempty"`
PanelCrossDimerPenaltyC float64 `json:"panel_cross_dimer_penalty_c,omitempty"`
PanelCrossDimerBurdenC float64 `json:"panel_cross_dimer_burden_c,omitempty"`
PanelCrossDimerCount int `json:"panel_cross_dimer_count,omitempty"`
}

// ThermoVariant summarizes one concrete A/C/G/T expansion of a degenerate
// primer pair under an IUPAC thermodynamics policy. It is populated only for
// enumerate mode to keep ordinary JSON output compact.
type ThermoVariant struct {
FwdPrimer string `json:"fwd_primer"`
RevPrimer string `json:"rev_primer"`
ScoreC float64 `json:"score_c"`
BaseScoreC float64 `json:"base_score_c,omitempty"`
StructurePenaltyC float64 `json:"structure_penalty_c,omitempty"`
LimitingSide string `json:"limiting_side,omitempty"`
FwdTmC float64 `json:"fwd_tm_c,omitempty"`
RevTmC float64 `json:"rev_tm_c,omitempty"`
FwdMarginC float64 `json:"fwd_margin_c,omitempty"`
RevMarginC float64 `json:"rev_margin_c,omitempty"`
}

// ProbeThermoDetails contains internal-probe annotation plus NN probe-target
// thermodynamics. It is populated by ipcr-thermo when --probe is supplied and
// probe thermodynamics are enabled.
type ProbeThermoDetails struct {
Name string `json:"name"`
Seq string `json:"seq"`
Found bool `json:"found"`
Strand string `json:"strand,omitempty"`
Pos int `json:"pos,omitempty"`
MM int `json:"mm,omitempty"`
Site string `json:"site,omitempty"`
ScoreMode string `json:"score_mode"`
MinMarginC float64 `json:"min_margin_c,omitempty"`
ScoreContributionC float64 `json:"score_contribution_c,omitempty"`
GatePenaltyC float64 `json:"gate_penalty_c,omitempty"`
IUPACThermoPolicy string `json:"iupac_thermo_policy,omitempty"`
IUPACExpansionCount int `json:"iupac_expansion_count,omitempty"`
IUPACExpansionCapped bool `json:"iupac_expansion_capped,omitempty"`
IUPACEffectiveVariant string `json:"iupac_effective_variant,omitempty"`
TmC float64 `json:"tm_c,omitempty"`
AnnealMarginC float64 `json:"anneal_margin_c,omitempty"`
DeltaGAtAnnealKcal float64 `json:"delta_g_at_anneal_kcal,omitempty"`
MismatchPenaltyC float64 `json:"mismatch_penalty_c,omitempty"`
MismatchDeltaGKcal float64 `json:"mismatch_delta_g_kcal,omitempty"`
MismatchCount int `json:"mismatch_count,omitempty"`
MismatchFallbackCount int `json:"mismatch_fallback_count,omitempty"`
MismatchTripletCount int `json:"mismatch_triplet_count,omitempty"`
MismatchCuratedPairCount int `json:"mismatch_curated_pair_count,omitempty"`
MismatchSources []string `json:"mismatch_sources,omitempty"`
MismatchParameterSets []string `json:"mismatch_parameter_sets,omitempty"`
MismatchCitations []string `json:"mismatch_citations,omitempty"`
MismatchParameterNotes []string `json:"mismatch_parameter_notes,omitempty"`
TerminalMismatchPenaltyC float64 `json:"terminal_mismatch_penalty_c,omitempty"`
TerminalMismatchDeltaGKcal float64 `json:"terminal_mismatch_delta_g_kcal,omitempty"`
TerminalMismatchCount int `json:"terminal_mismatch_count,omitempty"`
FivePrimeTerminalMismatchCount int `json:"five_prime_terminal_mismatch_count,omitempty"`
ThreePrimeTerminalMismatchCount int `json:"three_prime_terminal_mismatch_count,omitempty"`
TerminalMismatchSources []string `json:"terminal_mismatch_sources,omitempty"`
TerminalMismatchParameterSets []string `json:"terminal_mismatch_parameter_sets,omitempty"`
TerminalMismatchCitations []string `json:"terminal_mismatch_citations,omitempty"`
TerminalMismatchParameterNotes []string `json:"terminal_mismatch_parameter_notes,omitempty"`
MismatchPolicy string `json:"mismatch_policy,omitempty"`
HasNonWatsonCrick bool `json:"has_non_watson_crick,omitempty"`
UsedHeuristicAdjust bool `json:"used_heuristic_adjust,omitempty"`
}

// ThermoEndpoint describes one primer-template endpoint in 5'→3' primer
// coordinates. DeltaGAtAnnealKcal is an effective two-state binding term at the
// configured annealing temperature; negative values are favorable.
type ThermoEndpoint struct {
Side string `json:"side"`
TmC float64 `json:"tm_c"`
AnnealMarginC float64 `json:"anneal_margin_c"`
DeltaGAtAnnealKcal float64 `json:"delta_g_at_anneal_kcal"`
MismatchPenaltyC float64 `json:"mismatch_penalty_c"`
MismatchDeltaGKcal float64 `json:"mismatch_delta_g_kcal,omitempty"`
TerminalMismatchPenaltyC float64 `json:"terminal_mismatch_penalty_c,omitempty"`
TerminalMismatchDeltaGKcal float64 `json:"terminal_mismatch_delta_g_kcal,omitempty"`
DanglingEndAdjustmentC float64 `json:"dangling_end_adjustment_c,omitempty"`
DanglingEndDeltaGKcal float64 `json:"dangling_end_delta_g_kcal,omitempty"`
DanglingEndCount int `json:"dangling_end_count,omitempty"`
MismatchCount int `json:"mismatch_count,omitempty"`
FivePrimeMismatchCount int `json:"five_prime_mismatch_count,omitempty"`
ThreePrimeMismatchCount int `json:"three_prime_mismatch_count,omitempty"`
FivePrimeTerminalMismatchCount int `json:"five_prime_terminal_mismatch_count,omitempty"`
ThreePrimeTerminalMismatchCount int `json:"three_prime_terminal_mismatch_count,omitempty"`
TerminalMismatchCount int `json:"terminal_mismatch_count,omitempty"`
FivePrimeTerminalMismatchPenaltyC float64 `json:"five_prime_terminal_mismatch_penalty_c,omitempty"`
ThreePrimeTerminalMismatchPenaltyC float64 `json:"three_prime_terminal_mismatch_penalty_c,omitempty"`
MismatchFallbackCount int `json:"mismatch_fallback_count,omitempty"`
MismatchTripletCount int `json:"mismatch_triplet_count,omitempty"`
MismatchCuratedPairCount int `json:"mismatch_curated_pair_count,omitempty"`
MismatchSources []string `json:"mismatch_sources,omitempty"`
MismatchParameterSets []string `json:"mismatch_parameter_sets,omitempty"`
MismatchCitations []string `json:"mismatch_citations,omitempty"`
MismatchParameterNotes []string `json:"mismatch_parameter_notes,omitempty"`
TerminalMismatchSources []string `json:"terminal_mismatch_sources,omitempty"`
TerminalMismatchParameterSets []string `json:"terminal_mismatch_parameter_sets,omitempty"`
TerminalMismatchCitations []string `json:"terminal_mismatch_citations,omitempty"`
TerminalMismatchParameterNotes []string `json:"terminal_mismatch_parameter_notes,omitempty"`
EffectiveDenomCalK float64 `json:"effective_denom_cal_per_k_mol"`
MismatchPolicy string `json:"mismatch_policy"`
EndEffectPolicy string `json:"end_effect_policy,omitempty"`
HasNonWatsonCrick bool `json:"has_non_watson_crick"`
UsedHeuristicAdjust bool `json:"used_heuristic_adjust"`
}

// ThermoStructure describes a primer secondary-structure candidate used by
// nn-structure-v1. PenaltyC is the °C-equivalent competition penalty applied to
// the final score.
type ThermoStructure struct {
Kind string `json:"kind"`
Model string `json:"model,omitempty"`
QueryA string `json:"query_a,omitempty"`
QueryB string `json:"query_b,omitempty"`
DeltaGAtAnnealKcal float64 `json:"delta_g_at_anneal_kcal"`
TmC float64 `json:"tm_c"`
AnnealMarginC float64 `json:"anneal_margin_c"`
StemLen int `json:"stem_len"`
LoopLen int `json:"loop_len,omitempty"`
AStart int `json:"a_start"`
AEnd int `json:"a_end"`
BStart int `json:"b_start"`
BEnd int `json:"b_end"`
ThreePrimeAnchored bool `json:"three_prime_anchored"`
BothThreePrimeAnchor bool `json:"both_three_prime_anchor,omitempty"`
SegmentCount int `json:"segment_count,omitempty"`
BulgeCount int `json:"bulge_count,omitempty"`
InternalLoopCount int `json:"internal_loop_count,omitempty"`
DanglingEndCount int `json:"dangling_end_count,omitempty"`
LoopPenaltyKcal float64 `json:"loop_penalty_kcal,omitempty"`
BulgePenaltyKcal float64 `json:"bulge_penalty_kcal,omitempty"`
InternalLoopPenaltyKcal float64 `json:"internal_loop_penalty_kcal,omitempty"`
StructureDanglingDeltaGKcal float64 `json:"structure_dangling_delta_g_kcal,omitempty"`
EnsembleDeltaGAtAnnealKcal float64 `json:"ensemble_delta_g_at_anneal_kcal,omitempty"`
PartitionFunction float64 `json:"partition_function,omitempty"`
EnsembleWeight float64 `json:"ensemble_weight,omitempty"`
EnsembleCandidateCount int `json:"ensemble_candidate_count,omitempty"`
DPCellCount int `json:"dp_cell_count,omitempty"`
DPStateCount int `json:"dp_state_count,omitempty"`
DPExpectedPairs float64 `json:"dp_expected_pairs,omitempty"`
DPMFEDeltaGAtAnnealKcal float64 `json:"dp_mfe_delta_g_at_anneal_kcal,omitempty"`
DPEnsembleDeltaGAtAnnealKcal float64 `json:"dp_ensemble_delta_g_at_anneal_kcal,omitempty"`
PenaltyC float64 `json:"penalty_c,omitempty"`
}
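
Many of the fields above carry `omitempty`, so a consumer cannot distinguish "zero" from "not computed" for those keys, while fields without it (such as `score_c`) are always emitted. The sketch below illustrates that standard `encoding/json` behavior on a trimmed, hypothetical mirror of `ThermoDetails`; the type names, the `"fwd"` side label, and the numeric values are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// endpoint, probe, and details are hypothetical, trimmed mirrors of
// ThermoEndpoint, ProbeThermoDetails, and ThermoDetails.
type endpoint struct {
	Side          string  `json:"side"`
	TmC           float64 `json:"tm_c"`
	MismatchCount int     `json:"mismatch_count,omitempty"`
}

type probe struct {
	Name string `json:"name"`
}

type details struct {
	Model             string   `json:"model"`
	ScoreC            float64  `json:"score_c"`
	StructurePenaltyC float64  `json:"structure_penalty_c,omitempty"`
	Fwd               endpoint `json:"fwd"`
	Probe             *probe   `json:"probe,omitempty"`
}

// encode marshals d and returns the JSON text.
func encode(d details) string {
	b, err := json.Marshal(d)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	d := details{
		Model: "nn-duplex-v1",
		Fwd:   endpoint{Side: "fwd", TmC: 61.5}, // illustrative values
	}
	// score_c is kept even at zero; structure_penalty_c, mismatch_count,
	// and the nil probe pointer are dropped by omitempty.
	fmt.Println(encode(d))
}
```

Readers of the JSON should therefore treat a missing `omitempty` key as the zero value, and rely on the policy and model labels to tell whether a term was evaluated at all.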
19 changes: 17 additions & 2 deletions core/oligo/oligo.go
@@ -22,8 +22,9 @@ func BestHit(amplicon, probe string, maxMM int) Hit {
prbB := []byte(prb)
rcB := primer.RevComp(prbB)

-// Exact match fast-path
-if maxMM == 0 {
+// Exact match fast-path. Keep it only for strict A/C/G/T probes; degenerate
+// probes must go through the IUPAC-aware matcher even when maxMM is zero.
+if maxMM == 0 && isStrictACGT(prb) {
if i := strings.Index(amp, prb); i >= 0 {
return Hit{Found: true, Strand: "+", Pos: i, MM: 0, Site: amp[i : i+len(prb)]}
}
@@ -68,3 +69,17 @@ func BestHit(amplicon, probe string, maxMM int) Hit {
}
return best
}

func isStrictACGT(s string) bool {
if s == "" {
return false
}
for i := 0; i < len(s); i++ {
switch s[i] {
case 'A', 'C', 'G', 'T':
default:
return false
}
}
return true
}
7 changes: 7 additions & 0 deletions core/oligo/oligo_test.go
@@ -12,3 +12,10 @@ func TestBestHit(t *testing.T) {
t.Fatalf("expected a hit on RC with mismatches")
}
}

func TestBestHitDegenerateProbeExactUsesIUPACMatcher(t *testing.T) {
h := BestHit("AAAGACCC", "GAY", 0)
if !h.Found || h.Strand != "+" || h.Pos != 3 || h.Site != "GAC" {
t.Fatalf("unexpected degenerate hit: %+v", h)
}
}