imspy-predictors: NCE calibration + native fine-tuning by theGreatHerrLebert · Pull Request #395 · theGreatHerrLebert/rustims

theGreatHerrLebert · 2026-05-18T15:27:43Z

Merges the fix/imspy-predictors-im-refit work plus the NCE calibration fix.

NCE calibration (3302d070)

New calibrate_nce(): sweeps an absolute NCE over high-confidence target PSMs and returns the value maximizing mean spectral angle — one NCE per run. The model conditions on a per-run NCE scalar (fine-tuned on collision_energy_aligned_normed), so calibration is an absolute sweep, not an offset on the observed collision energy.
predict_intensities_prosit uses calibrate_nce; sets collision_energy_calibrated to the absolute best NCE.
get_collision_energy_calibration_factor kept as a deprecated compat wrapper; fixes the bug where it calibrated on the unmodified sequence while the real prediction used sequence_modified.
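
For illustration only, the absolute sweep described above could look roughly like the following. This is a minimal sketch, not the actual calibrate_nce implementation. The names spectral_angle, sweep_nce and toy_predict are hypothetical. The toy predictor stands in for the real intensity model, and the grid simply spans the ~7-43 domain mentioned in the commit message. Selecting the high-confidence target PSMs is assumed to have happened already.

```python
import numpy as np

def spectral_angle(observed, predicted, eps=1e-12):
    """Normalized spectral angle per row, in [0, 1]; 1 means identical direction."""
    o = observed / (np.linalg.norm(observed, axis=-1, keepdims=True) + eps)
    p = predicted / (np.linalg.norm(predicted, axis=-1, keepdims=True) + eps)
    cos = np.clip(np.sum(o * p, axis=-1), -1.0, 1.0)
    return 1.0 - 2.0 * np.arccos(cos) / np.pi

def sweep_nce(observed, predict_fn, nce_grid):
    """Try each absolute NCE and return (best_nce, mean angle per grid point)."""
    mean_angles = np.array(
        [spectral_angle(observed, predict_fn(nce)).mean() for nce in nce_grid]
    )
    return float(nce_grid[int(np.argmax(mean_angles))]), mean_angles

def toy_predict(nce, n_psms=5, n_frag=20):
    """Hypothetical stand-in model: an intensity bump whose position moves with NCE."""
    pos = np.linspace(0.0, 1.0, n_frag)[None, :]
    shift = np.linspace(-0.05, 0.05, n_psms)[:, None]
    return np.exp(-((pos - nce / 50.0 - shift) ** 2) / 0.02)

# Pretend the run was acquired at an effective NCE of 27.
observed = toy_predict(27.0)
nce_grid = np.arange(7.0, 44.0, 1.0)  # assumed grid over the ~7-43 domain
best_nce, mean_angles = sweep_nce(observed, toy_predict, nce_grid)
```

The point of the design is that best_nce is used directly as the model input for the whole run, rather than being added to each PSM's observed collision energy.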

Fine-tuning fixes (9981465c, 86857fba, f7ac8805, e9c6c18e)

Native intensity fine-tuning; per-epoch fine-tune history; out-of-range charge filtering in IM simulate/fine-tune; fine-tuning output fixes.

simulate_ion_mobilities builds a charge one-hot with num_classes=4 (charges 1..4). Inputs outside that range trigger F.one_hot's index assertion — silent on CPU, hard CUDA crash on GPU (ScatterGatherKernel idx_dim < index_size). Filter such rows before the one_hot, predict on the valid subset, and NaN-pad invalid positions in the output array. Emit a RuntimeWarning so callers see how many were skipped. Charges of 5+ leak through sage matching at ~0.15% on HeLa data even with precursor_charge=[2,4], which is enough to take down a GPU run.
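
The filter-then-pad pattern from this commit message can be sketched as follows. It is an assumption-laden illustration rather than the code in simulate_ion_mobilities: NumPy stands in for torch's F.one_hot so the example runs without a GPU stack, the charge-minus-one indexing is assumed, and predict_with_charge_filter and toy_predict are hypothetical names.

```python
import warnings
import numpy as np

NUM_CHARGE_CLASSES = 4  # charges 1..4, as stated in the commit message

def predict_with_charge_filter(charges, predict_fn):
    """Predict only for in-range charges; NaN-pad the rest and warn once."""
    charges = np.asarray(charges, dtype=np.int64)
    valid = (charges >= 1) & (charges <= NUM_CHARGE_CLASSES)
    n_skipped = int((~valid).sum())
    if n_skipped:
        warnings.warn(
            f"skipped {n_skipped} of {charges.size} rows with charge outside "
            f"[1, {NUM_CHARGE_CLASSES}]",
            RuntimeWarning,
            stacklevel=2,
        )
    out = np.full(charges.shape[0], np.nan, dtype=np.float64)
    if valid.any():
        # Build the one-hot only from rows that are known to be in range.
        one_hot = np.eye(NUM_CHARGE_CLASSES, dtype=np.float32)[charges[valid] - 1]
        out[valid] = predict_fn(one_hot)
    return out

def toy_predict(one_hot):
    """Hypothetical model: just echoes the charge back as a float."""
    return one_hot.argmax(axis=1) + 1.0

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = predict_with_charge_filter([2, 5, 3, 0, 4], toy_predict)
```

Because the output keeps the input length, callers can align it with the original PSM table and decide for themselves how to treat the NaN rows.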

Three small additions to the fine-tune training loops in imspy_predictors:
- rt/predictors.py: also accumulate train_loss across train batches, store {epochs, train_loss, val_loss} on self._finetune_history.
- ccs/predictors.py: same for CCS/IM fine_tune_model. Also drop training rows whose charge is outside the model's [1, 4] one-hot domain before constructing the charge tensor; same root cause as the earlier simulate_ion_mobilities filter (CUDA assertion on charge=5+ PSMs that leak through sage matching).
- intensity/predictors.py: same train_loss accumulation and history capture for the native intensity fine-tune loop.

The history dict is the shape sagepy-rescore's report.py expects so it can render per-head loss curves and improvement-vs-epoch-0 panels for the sagepy-rescore HTML report.
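
A per-epoch history of this kind might be collected along the following lines. This is a toy sketch: only the key names epochs, train_loss and val_loss come from the commit message. The list-of-floats layout, the scalar least-squares model, and the names fine_tune and improvement are assumptions made for the example.

```python
def fine_tune(train_batches, val_batches, epochs=5, lr=0.05):
    """Fit y = w * x by gradient descent, recording mean losses per epoch."""
    w = 0.0
    history = {"epochs": [], "train_loss": [], "val_loss": []}
    for epoch in range(epochs):
        train_total, n_train = 0.0, 0
        for xs, ys in train_batches:
            # Accumulate the batch loss (measured before the update step).
            train_total += sum((w * x - y) ** 2 for x, y in zip(xs, ys))
            n_train += len(xs)
            grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
            w -= lr * grad
        val_total, n_val = 0.0, 0
        for xs, ys in val_batches:
            val_total += sum((w * x - y) ** 2 for x, y in zip(xs, ys))
            n_val += len(xs)
        history["epochs"].append(epoch)
        history["train_loss"].append(train_total / n_train)
        history["val_loss"].append(val_total / n_val)
    return w, history

# Toy data generated from y = 2x.
train_batches = [([1.0, 2.0], [2.0, 4.0]), ([3.0], [6.0])]
val_batches = [([1.5], [3.0])]
w, history = fine_tune(train_batches, val_batches)

# One way a report could derive an improvement-vs-epoch-0 series.
improvement = [history["val_loss"][0] - v for v in history["val_loss"]]
```

Keeping the three lists the same length and indexed by epoch makes it straightforward for a downstream report to plot loss curves per prediction head.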

Add calibrate_nce(): sweeps absolute NCE over high-confidence target PSMs and returns the value maximizing mean spectral angle -- one NCE per run. The model conditions on a per-run NCE scalar (fine-tuned on collision_energy_aligned_normed, domain ~7-43), so calibration must be an absolute sweep, not an offset added to the observed collision energy.
- predict_intensities_prosit: use calibrate_nce; set collision_energy_calibrated to the absolute best NCE instead of collision_energy + offset.
- get_collision_energy_calibration_factor: kept as a deprecated compat wrapper over calibrate_nce. Fixes the bug where it calibrated on the unmodified sequence while the real prediction used sequence_modified.

theGreatHerrLebert added 5 commits May 6, 2026 17:30

Fix imspy predictor fine-tuning outputs

e9c6c18

Add native intensity fine-tuning

f7ac880

theGreatHerrLebert merged commit f4c06ea into main May 18, 2026
2 checks passed

theGreatHerrLebert mentioned this pull request May 19, 2026

Merge feature/predictor-finetune: native Chronologer RT adapter + per-task fine-tuning #397

Merged

theGreatHerrLebert merged 5 commits into main from fix/nce-calibration
