15 changes: 12 additions & 3 deletions README.md
@@ -257,9 +257,18 @@ pip install -e ./imspy-vis

### Docker

Pre-built images available for reproducible environments:
- [AMD64](https://github.com/MatteoLacki/rustims_docker/raw/refs/heads/main/release.zip)
- [ARM64](https://github.com/MatteoLacki/rustims_docker/raw/refs/heads/main/release_arm64.zip)
Build from the included Dockerfile (CUDA 12.4, Python 3.12):

```bash
# Build the image
docker build -t rustims .

# Verify GPU support
docker run --rm --gpus all rustims python -c "import torch; print(torch.cuda.is_available())"

# Run a simulation
docker run --rm --gpus all -v /data:/workspace rustims timsim /workspace/config.toml
```

## Documentation

22 changes: 19 additions & 3 deletions packages/imspy-predictors/README.md
@@ -20,7 +20,24 @@ pip install imspy-predictors[koina]
- **Retention Time Prediction**: GRU-based retention time predictors
- **Fragment Intensity Prediction**: Prosit 2023 timsTOF intensity predictor
- **Charge State Prediction**: Binomial and deep learning charge state distribution models
- **Koina Integration**: Access remote prediction models via Koina servers (optional)
- **Koina Integration**: Access remote prediction models via [Koina](https://koina.wilhelmlab.org) servers (optional)

### Available Koina Remote Models

After installing with `pip install imspy-predictors[koina]`, remote models can be selected in the `[models]` section of a TimSim TOML configuration:

| Task | Model Name |
|------|-----------|
| **RT** | `Deeplc_hela_hf`, `Chronologer_RT`, `AlphaPeptDeep_rt_generic`, `Prosit_2019_irt` |
| **CCS** | `AlphaPeptDeep_ccs_generic`, `IM2Deep` |
| **Intensity** | `prosit`, `alphapeptdeep`, `ms2pip` |

```toml
[models]
rt_model = "AlphaPeptDeep_rt_generic"
ccs_model = "" # "" = local PyTorch model
intensity_model = "prosit"
```

## Quick Start

@@ -53,8 +70,7 @@ intensity_model = Prosit2023TimsTofWrapper()
## Dependencies

- **imspy-core**: Core data structures (required)
- **TensorFlow**: Deep learning framework (required)
- **dlomix**: Deep learning for omics (required)
- **PyTorch**: Deep learning framework (required)
- **koinapy**: Koina API client (optional, for remote models)

## Optional Dependencies
36 changes: 35 additions & 1 deletion packages/imspy-simulation/README.md
@@ -14,11 +14,19 @@ For search integration (validation workflows):
pip install imspy-simulation[search]
```

For KOINA remote model support (optional):

```bash
pip install imspy-predictors[koina]
```

## Features

- **Frame Builders**: DIA and DDA frame simulation with annotation support
- **TimSim**: Complete simulation pipeline for synthetic timsTOF data
- **Prediction Models**: Local PyTorch models with optional KOINA remote model support (see [Prediction Models](#prediction-models))
- **Validation**: Tools for validating simulated data against search results
- **Integration Testing (EVAL)**: Automated validation against DiaNN, FragPipe, and Sage (see [Integration Testing](#integration-testing))
- **Isotope Simulation**: Accurate isotope distribution generation
- **TDF Writing**: Write simulated data to Bruker TDF format

@@ -49,9 +57,35 @@ frames = frame_builder.build_frames([1, 2, 3])
### timsim
Full simulation pipeline:
```bash
timsim --config config.toml --output /path/to/output
timsim config.toml
timsim config.toml --save-path output.d --reference-path reference.d --fasta-path proteome.fasta
```

## Prediction Models

TimSim uses deep learning models for retention time, ion mobility (CCS), and fragment intensity prediction. By default, local PyTorch models are used. Optionally, remote models can be accessed via [KOINA](https://koina.wilhelmlab.org) servers:

```toml
[models]
rt_model = "" # "" = local (default), or e.g. "Deeplc_hela_hf"
ccs_model = "" # "" = local (default), or e.g. "AlphaPeptDeep_ccs_generic"
intensity_model = "" # "" = local (default), or e.g. "prosit", "alphapeptdeep"
```

Remote models require `pip install imspy-predictors[koina]`. If a KOINA server is unreachable, TimSim falls back to the local models. See [SIMULATOR_README.md](SIMULATOR_README.md) for the full list of available models.

## Integration Testing

The EVAL pipeline validates simulated datasets against production proteomics search engines:

```bash
python -m imspy_simulation.timsim.integration.sim --env env.toml --list
python -m imspy_simulation.timsim.integration.sim --env env.toml --test IT-DIA-HELA
python -m imspy_simulation.timsim.integration.eval --env env.toml --test IT-DIA-HELA
```

See the [Validation README](src/imspy_simulation/timsim/integration/VALIDATION_README.md) for setup, available tests, and configuration details.

## Submodules

- **builders/**: Frame builder implementations (DIA, DDA)
79 changes: 72 additions & 7 deletions packages/imspy-simulation/SIMULATOR_README.md
@@ -7,7 +7,23 @@ A high-fidelity proteomics simulation engine for Bruker timsTOF instruments. Gen
### 1. Installation

```bash
# Activate your Python environment
# From PyPI (recommended)
pip install imspy-simulation

# With KOINA remote model support (optional)
pip install imspy-predictors[koina]
```

**Docker** (includes all dependencies + GPU support):
```bash
docker build -t rustims .
docker run --rm --gpus all -v /data:/workspace rustims timsim /workspace/config.toml
```

<details>
<summary>From source</summary>

```bash
source /path/to/your/env/bin/activate

# Install the Rust backend (requires maturin)
@@ -16,9 +32,10 @@ maturin develop --release

# Install Python packages
pip install -e /path/to/rustims/packages/imspy-core
pip install -e /path/to/rustims/packages/imspy-simulation
pip install -e /path/to/rustims/packages/imspy-predictors
pip install -e /path/to/rustims/packages/imspy-simulation
```
</details>

### 2. Create a Configuration File

@@ -71,15 +88,17 @@ You need a real timsTOF `.d` file as a template. The simulator will populate it
### Simulation Pipeline

```
FASTA → Digestion → RT Prediction → IM Prediction →
FASTA → Digestion → Model Selection → RT Prediction → IM Prediction →
                    (local/KOINA)
Fragment Intensity Prediction → Frame Assembly → .d File
```

1. **Digestion**: In-silico tryptic digest of proteins
2. **RT Prediction**: Deep learning model predicts retention times
3. **IM Prediction**: CCS/mobility prediction for each peptide ion
4. **Intensity Prediction**: Fragment ion intensities (PROSPECT model)
5. **Frame Assembly**: Signals placed into timsTOF frame structure
2. **Model Selection**: Choose local PyTorch or KOINA remote models (see [Prediction Model Selection](#prediction-model-selection-koina))
3. **RT Prediction**: Deep learning model predicts retention times
4. **IM Prediction**: CCS/mobility prediction for each peptide ion
5. **Intensity Prediction**: Fragment ion intensities (local or KOINA models)
6. **Frame Assembly**: Signals placed into timsTOF frame structure
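
To make the first step concrete, here is a minimal sketch of an in-silico tryptic digest. It assumes the common trypsin rule (cleave after K or R, but not before P) and borrows the documented defaults `missed_cleavages = 2`, `min_len = 7`, and `max_len = 30`. The function `tryptic_digest` is illustrative only and is not TimSim's digestion code, which may apply different rules.

```python
import re


def tryptic_digest(protein: str, missed_cleavages: int = 2,
                   min_len: int = 7, max_len: int = 30) -> list[str]:
    """Illustrative digest: cleave after K/R unless the next residue is P."""
    # Zero-width split at every position preceded by K or R and not followed by P.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    peptides = []
    for start in range(len(pieces)):
        # Join 1 .. missed_cleavages + 1 consecutive fully cleaved pieces.
        last = min(start + missed_cleavages, len(pieces) - 1)
        for end in range(start, last + 1):
            peptide = "".join(pieces[start:end + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


# Toy sequence, used only to exercise the function.
print(tryptic_digest("MKWVTFISLLLLFSSAYSRGVFRRDTHK", missed_cleavages=1))
```

Each peptide produced this way would then pass through the RT, IM, and intensity prediction steps listed above.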

### Output Files

@@ -121,6 +140,35 @@ output_dir/
| `missed_cleavages` | `2` | Allowed missed cleavages |
| `min_len` / `max_len` | `7` / `30` | Peptide length range |

### Prediction Model Selection (KOINA)

TimSim supports both local PyTorch models and remote [KOINA](https://koina.wilhelmlab.org) models for predictions. Configure via the `[models]` section:

```toml
[models]
rt_model = "" # "" or "local" = local PyTorch (default)
ccs_model = "" # "" or "local" = local PyTorch (default)
intensity_model = "" # "" or "local" = local PyTorch (default)
```

**Available remote models:**

| Task | Model Name | Notes |
|------|-----------|-------|
| **RT** | `"Deeplc_hela_hf"` | DeepLC HeLa model |
| | `"Chronologer_RT"` | Chronologer RT predictor |
| | `"AlphaPeptDeep_rt_generic"` | AlphaPeptDeep generic RT |
| | `"Prosit_2019_irt"` | Prosit indexed RT |
| **CCS** | `"AlphaPeptDeep_ccs_generic"` | AlphaPeptDeep generic CCS |
| | `"IM2Deep"` | IM2Deep predictor |
| **Intensity** | `"prosit"` | Prosit 2023 timsTOF (max 30 AA, limited mods) |
| | `"alphapeptdeep"` | AlphaPeptDeep generic (supports phospho) |
| | `"ms2pip"` | ms2pip timsTOF 2024 |

**Prerequisites**: `pip install imspy-predictors[koina]`

If a KOINA server is unreachable, the simulator automatically falls back to local models.
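
The fallback behaviour described above can be pictured with a small sketch. The names `predict_with_fallback`, `unreachable_remote`, and `toy_local` are hypothetical stand-ins, and the choice of which exceptions count as "unreachable" is an assumption; the simulator's real error handling is not shown in this document.

```python
from typing import Callable, Sequence

Predictor = Callable[[Sequence[str]], list[float]]


def predict_with_fallback(peptides: Sequence[str],
                          remote: Predictor,
                          local: Predictor) -> tuple[list[float], str]:
    """Try the remote predictor first; use the local one on connection failure."""
    try:
        return remote(peptides), "remote"
    except (ConnectionError, TimeoutError) as exc:
        # Assumption: network-level failures are what "unreachable" means here.
        print(f"Remote model unavailable ({exc}); falling back to local model")
        return local(peptides), "local"


def unreachable_remote(peptides: Sequence[str]) -> list[float]:
    """Stand-in for a remote call that cannot reach the server."""
    raise ConnectionError("KOINA server unreachable")


def toy_local(peptides: Sequence[str]) -> list[float]:
    """Stand-in for a local model; returns the peptide length as a fake value."""
    return [float(len(p)) for p in peptides]


values, source = predict_with_fallback(["PEPTIDEK", "ELVISK"],
                                       unreachable_remote, toy_local)
print(source, values)
```

Returning the source alongside the values makes it easy to log which backend actually produced a given prediction.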

### Advanced Features

#### Partial Fragmentation (Unfragmented Precursors)
@@ -259,6 +307,23 @@ binomial_charge_model = true
charge_state_one_probability = 0.15
```

## Integration Testing (EVAL Pipeline)

Validate simulated datasets against production proteomics search engines (DiaNN, FragPipe, Sage):

```bash
# List available integration tests
python -m imspy_simulation.timsim.integration.sim --env env.toml --list

# Run a simulation
python -m imspy_simulation.timsim.integration.sim --env env.toml --test IT-DIA-HELA

# Analyze and validate against ground truth
python -m imspy_simulation.timsim.integration.eval --env env.toml --test IT-DIA-HELA
```

Third-party analysis tools (DiaNN, FragPipe, Sage) must be installed separately — they are not bundled due to licensing. See the full [Validation README](src/imspy_simulation/timsim/integration/VALIDATION_README.md) for setup instructions and available test scenarios.

## Troubleshooting

### "Bruker SDK not found"
Original file line number Diff line number Diff line change
@@ -81,10 +81,11 @@ output/
8. [Property Variation Settings](#property-variation-settings)
9. [DDA Settings](#dda-settings)
10. [Charge State Probabilities](#charge-state-probabilities)
11. [Quad Transmission Settings](#quad-transmission-settings)
12. [Video Settings](#video-settings)
13. [Performance Settings](#performance-settings)
14. [Console and Execution](#console-and-execution)
11. [Prediction Model Settings](#prediction-model-settings)
12. [Quad Transmission Settings](#quad-transmission-settings)
13. [Video Settings](#video-settings)
14. [Performance Settings](#performance-settings)
15. [Console and Execution](#console-and-execution)

---

@@ -300,6 +301,67 @@ charge_state_one_probability = 0.0

---

## Prediction Model Settings

TimSim uses deep learning models for retention time (RT), collision cross section (CCS), and fragment intensity prediction. By default, it uses the local PyTorch models that ship with the package. Optionally, remote models can be used via [KOINA](https://koina.wilhelmlab.org) servers.

**Prerequisite for remote models**: `pip install imspy-predictors[koina]`

### `[models]` Section

| Parameter | Default | Description |
|-----------|---------|-------------|
| `rt_model` | `""` | Retention time prediction model |
| `ccs_model` | `""` | CCS / ion mobility prediction model |
| `intensity_model` | `""` | Fragment intensity prediction model |

### Available Models

**Retention Time (`rt_model`)**:

| Value | Description |
|-------|-------------|
| `""` or `"local"` | Local PyTorch model (default) |
| `"Deeplc_hela_hf"` | DeepLC HeLa model (KOINA) |
| `"Chronologer_RT"` | Chronologer RT predictor (KOINA) |
| `"AlphaPeptDeep_rt_generic"` | AlphaPeptDeep generic RT (KOINA) |
| `"Prosit_2019_irt"` | Prosit indexed RT (KOINA) |

**CCS / Ion Mobility (`ccs_model`)**:

| Value | Description |
|-------|-------------|
| `""` or `"local"` | Local PyTorch model (default) |
| `"AlphaPeptDeep_ccs_generic"` | AlphaPeptDeep generic CCS (KOINA) |
| `"IM2Deep"` | IM2Deep predictor (KOINA) |

**Fragment Intensity (`intensity_model`)**:

| Value | Description |
|-------|-------------|
| `""` or `"local"` | Local PyTorch PROSPECT fine-tuned model (default) |
| `"prosit"` | Prosit 2023 timsTOF (KOINA) — max 30 AA, limited modifications |
| `"alphapeptdeep"` | AlphaPeptDeep generic (KOINA) — supports all modifications including phospho |
| `"ms2pip"` | ms2pip timsTOF 2024 (KOINA) |

### Notes

- If a KOINA server is unreachable, the simulator automatically falls back to local models.
- For phosphorylated peptides, use `"alphapeptdeep"` as the intensity model — Prosit does not support phosphorylation modifications.
- Prosit intensity models are limited to peptides ≤ 30 amino acids with standard modifications.
- AlphaPeptDeep supports all UNIMOD modifications and has no peptide length restriction.
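
The notes above suggest a simple pre-flight check before choosing `intensity_model`. The sketch below is one way to express it. It assumes peptides are written with bracketed UNIMOD tags such as `S[UNIMOD:21]` for phosphorylation, which may not match the sequence format TimSim uses internally, and `pick_intensity_model` is a hypothetical helper.

```python
import re

PHOSPHO_TAG = "[UNIMOD:21]"   # UNIMOD accession 21 is phosphorylation
PROSIT_MAX_LEN = 30           # Prosit limit stated in the notes above


def bare_length(sequence: str) -> int:
    """Length of the peptide with bracketed modification tags removed."""
    return len(re.sub(r"\[[^\]]*\]", "", sequence))


def pick_intensity_model(sequences: list[str], preferred: str = "prosit") -> str:
    """Switch from Prosit to AlphaPeptDeep if any peptide violates Prosit's limits."""
    if preferred != "prosit":
        return preferred  # the documented limits concern Prosit only
    for seq in sequences:
        if PHOSPHO_TAG in seq or bare_length(seq) > PROSIT_MAX_LEN:
            return "alphapeptdeep"
    return "prosit"


print(pick_intensity_model(["PEPTIDEK", "PEPS[UNIMOD:21]TIDEK"]))
```

A per-peptide split, sending only the problematic peptides to AlphaPeptDeep, would be another option; the all-or-nothing choice here mirrors the single `intensity_model` setting.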

### Configuration Example

```toml
[models]
rt_model = "AlphaPeptDeep_rt_generic"
ccs_model = "AlphaPeptDeep_ccs_generic"
intensity_model = "prosit"
```

---

## Quad Transmission Settings

Advanced settings for quadrupole-dependent isotope transmission and **partial fragmentation** (precursor survival).
@@ -431,6 +493,11 @@ isotope_k = 8
isotope_min_intensity = 1
isotope_centroid = true

[models]
rt_model = ""
ccs_model = ""
intensity_model = ""

[noise]
mz_noise_precursor = true
precursor_noise_ppm = 6.5