A centralized metadata database and web application for mass spectrometry proteomics experiments.
Helmsdeep captures, validates, and standardizes experimental metadata at every step of the proteomics workflow — from cell culture through mass spectrometry acquisition.
This repository is published as a static public snapshot. It is not actively maintained, and no pull requests, issues, feature requests, roadmap items, or release updates should be expected. Use it as a reference implementation, starting point, or foundation for your own adapted version.
The included Docker Compose setup is intended for local evaluation and development. It uses local demo credentials and has no built-in authentication; do not expose it to real users without adding your own authentication and deployment controls.
The platform is built around three ideas:
- Workflow-centric data model — database tables mirror actual laboratory procedures (cell culture → fractionation → peptide digest → MS run)
- Validation at entry — Pydantic schemas and relational constraints prevent errors from propagating downstream
- Adherence to FAIR principles — sample metadata in the database is intended to make proteomic mass spectrometry data Findable, Accessible, Interoperable, and Reusable.
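As a rough illustration of the "validation at entry" idea, the sketch below rejects malformed records before they could reach a database. It uses only the standard library as a stand-in for Pydantic, and the `MsRunRecord` class and its fields are hypothetical, not Helmsdeep's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MsRunRecord:
    """Hypothetical MS run metadata record (field names are illustrative)."""

    sample_id: str
    instrument: str
    injection_volume_ul: float

    def __post_init__(self) -> None:
        # Reject bad values at construction time so they never propagate downstream.
        if not self.sample_id.strip():
            raise ValueError("sample_id must not be empty")
        if not self.instrument.strip():
            raise ValueError("instrument must not be empty")
        if self.injection_volume_ul <= 0:
            raise ValueError("injection_volume_ul must be positive")


def validate_rows(rows: list[dict]) -> tuple[list[MsRunRecord], list[str]]:
    """Split raw rows into valid records and human-readable error messages."""
    valid: list[MsRunRecord] = []
    errors: list[str] = []
    for index, row in enumerate(rows):
        try:
            valid.append(MsRunRecord(**row))
        except (TypeError, ValueError) as exc:
            # TypeError covers missing or unexpected fields; ValueError covers bad values.
            errors.append(f"row {index}: {exc}")
    return valid, errors
```

Collecting every error instead of stopping at the first one lets a user fix a whole upload in one pass.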
┌──────────────┐    HTTP/JSON    ┌──────────────────┐   SQLAlchemy    ┌─────────┐
│  Streamlit   │────────────────▶│  FastAPI (API)   │────────────────▶│  MySQL  │
│      UI      │◀────────────────│  helmsdeep-api   │                 │   DB    │
└──────────────┘                 └──────────────────┘                 └─────────┘
  helmsdeep-ui                   helmsdeep-api-schemas
  helmsdeep-client               helmsdeep-db-models (SQLAlchemy ORM)
Packages in this repo:
| Package | Description |
|---|---|
| `helmsdeep-db-models` | SQLAlchemy ORM models + Alembic migrations |
| `helmsdeep-api-schemas` | Pydantic validation schemas (request/response) |
| `helmsdeep-api` | FastAPI backend |
| `helmsdeep-client` | Python client library used by the UI |
| `helmsdeep-ui` | Streamlit web application |
git clone https://github.com/TalusBio/helmsdeep-open.git
cd helmsdeep-open
cp .env.example .env
# Edit .env if you want to change the default passwords

The Docker path does not require a local Python environment. For host-based development, `uv sync` uses the committed `uv.lock` dependency snapshot.
If you have Docker installed, use the command below; otherwise, skip to the "Manual Start (without Docker)" section.

sh scripts/docker-start.sh

This builds and starts three containers in the background:
- mysql — MySQL 8.0 database on port 3306
- api — FastAPI server on port 8000
- ui — Streamlit app on port 8501
The startup flow runs automatically in this order:
- MySQL starts and passes its health check
- API container waits for MySQL, then:
  - Applies Alembic database migrations (`alembic upgrade head`)
  - Seeds prerequisite example data (`scripts/seed_example_data.py`); idempotent, safe to re-run
  - Starts the FastAPI server and passes its health check
- UI container waits for the API, then starts Streamlit
- The script waits for Streamlit and opens http://localhost:8501
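The "wait until the previous service is healthy" pattern in this flow can be sketched in a few lines. This is not the logic in `scripts/docker-start.sh` or the Compose health checks; `http_ok` and `wait_until_ready` are hypothetical helpers, and polling over HTTP is an assumption.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, and socket errors are all OSError subclasses.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # no need to sleep after the final failed attempt
    return False
```

For example, `wait_until_ready(lambda: http_ok("http://localhost:8000/docs"))` would block until the API docs page responds, using the port given in this README.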
# Follow logs
docker compose logs -f
# Stop the stack
docker compose down

To start without opening a browser:

HELMSDEEP_OPEN_BROWSER=0 sh scripts/docker-start.sh

An example Excel file is included at:
tests/data/input/Public helmsdeep metadata example.xlsx
All prerequisite database records (operators, cell lines, instruments, compounds, etc.) needed to validate this file are inserted automatically during startup. To try it:
- Open the Metadata page in the UI
- Upload `tests/data/input/Public helmsdeep metadata example.xlsx`
- The file should pass validation and register successfully
- Web UI: http://localhost:8501
- API docs: http://localhost:8000/docs
- Python 3.12+
- uv package manager
- MySQL 8.0 (or use SQLite for quick testing)
make install

On first run this creates `.env` from `.env.example` and exits; edit `DATABASE_URL` if needed, then re-run. On subsequent runs it installs packages, applies migrations, seeds example data, and starts both the API (:8000) and the UI (:8501). Press Ctrl+C to stop both.

Warning: `make install` resets the database schema on every run. It runs `alembic downgrade base` before `alembic upgrade head`, which drops and recreates all tables. Any existing data will be permanently deleted. To apply new migrations without destroying data, use `uv run alembic upgrade head` directly from `packages/helmsdeep-db-models/`.
export DATABASE_URL="sqlite:///$PWD/helmsdeep.db"
cd packages/helmsdeep-db-models
uv run alembic upgrade head
cd ../helmsdeep-api
uv run uvicorn main:app --port 8000

The schema reflects the proteomics experimental workflow:
Operator ─────────────────────────────────────────────────────┐
Project / Grant / Program                                     │
                                                              │
Cell Type                                                     │
  └─ Cell Culture Registry ───────────────────────────────┐   │
      └─ Cell Fraction                                    │   │
          └─ Peptide Digest                               │   │
              └─ MS Run ◀── Instrument, Protocol ─────────┘   ┘
                  └─ Experiment (groups MS runs)
Key tables:
| Table | Description |
|---|---|
| `operators` | Lab personnel |
| `cell_culture_registry` | Frozen and active cell cultures |
| `cell_fraction` | Sub-cellular fractions |
| `peptide_digest` | Digestion step metadata |
| `ms_run` | Mass spectrometry acquisition |
| `wellplate` | Microplate tracking across workflow steps |
| `experiment` | Groups of related MS runs |
| `protocols` | Standard operating procedures |
| `instruments` | Mass spectrometers and LC systems |
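To make the workflow chain concrete, here is a minimal sketch of how foreign keys can tie each step to the one before it, using the standard-library `sqlite3` module as a stand-in for the real SQLAlchemy models. The table names come from this README; every column is a hypothetical placeholder, and the actual schema is certainly richer.

```python
import sqlite3

# Table names come from the README; the columns are hypothetical placeholders.
SCHEMA = """
CREATE TABLE cell_culture_registry (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE cell_fraction (
    id INTEGER PRIMARY KEY,
    cell_culture_id INTEGER NOT NULL REFERENCES cell_culture_registry(id)
);
CREATE TABLE peptide_digest (
    id INTEGER PRIMARY KEY,
    cell_fraction_id INTEGER NOT NULL REFERENCES cell_fraction(id)
);
CREATE TABLE ms_run (
    id INTEGER PRIMARY KEY,
    peptide_digest_id INTEGER NOT NULL REFERENCES peptide_digest(id)
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with foreign key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def register_chain(conn: sqlite3.Connection, culture_name: str) -> int:
    """Insert one culture -> fraction -> digest -> MS run chain; return the MS run id."""
    culture_id = conn.execute(
        "INSERT INTO cell_culture_registry (name) VALUES (?)", (culture_name,)
    ).lastrowid
    fraction_id = conn.execute(
        "INSERT INTO cell_fraction (cell_culture_id) VALUES (?)", (culture_id,)
    ).lastrowid
    digest_id = conn.execute(
        "INSERT INTO peptide_digest (cell_fraction_id) VALUES (?)", (fraction_id,)
    ).lastrowid
    return conn.execute(
        "INSERT INTO ms_run (peptide_digest_id) VALUES (?)", (digest_id,)
    ).lastrowid


def can_insert_orphan_ms_run(conn: sqlite3.Connection) -> bool:
    """Try to insert an MS run whose digest does not exist."""
    try:
        conn.execute("INSERT INTO ms_run (peptide_digest_id) VALUES (?)", (999,))
    except sqlite3.IntegrityError:
        return False  # the relational constraint rejected the orphan row
    return True
```

With enforcement on, an MS run cannot be recorded unless its upstream digest, fraction, and culture already exist, which is the kind of relational guard the data model relies on.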
Migrations are managed with Alembic.
# From packages/helmsdeep-db-models/
# Apply all migrations
DATABASE_URL=... uv run alembic upgrade head
# Create a new migration
DATABASE_URL=... uv run alembic revision --autogenerate -m "description"
# Check current version
DATABASE_URL=... uv run alembic current

The full OpenAPI spec is available at http://localhost:8000/docs when the server is running.
Key endpoint groups:
| Prefix | Description |
|---|---|
| `/operators/` | Lab personnel CRUD |
| `/cell_cultures/` | Cell culture registry |
| `/cell_fractions/` | Fractionation metadata |
| `/peptide_digests/` | Digest metadata |
| `/mass_spectrometry/` | MS run metadata |
| `/experiments/` | Experiment groups |
| `/instruments/` | Instruments |
| `/protocols/` | Protocols |
| `/wellplates/` | Microplate registry |
| `/sdrf/` | SDRF export |
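A small sketch of addressing these endpoint groups from Python without the `helmsdeep-client` package. The prefixes and the local base URL come from this README; `endpoint_url` and `fetch_json` are hypothetical helpers, and it is an assumption that a plain GET on a prefix returns JSON. Check the OpenAPI docs for the real routes and response shapes.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # local API address from this README

# Endpoint prefixes as listed in this README.
PREFIXES = {
    "operators": "/operators/",
    "cell_cultures": "/cell_cultures/",
    "cell_fractions": "/cell_fractions/",
    "peptide_digests": "/peptide_digests/",
    "mass_spectrometry": "/mass_spectrometry/",
    "experiments": "/experiments/",
    "instruments": "/instruments/",
    "protocols": "/protocols/",
    "wellplates": "/wellplates/",
    "sdrf": "/sdrf/",
}


def endpoint_url(group: str, base: str = API_BASE) -> str:
    """Build the full URL for an endpoint group; raises KeyError if unknown."""
    return base.rstrip("/") + PREFIXES[group]


def fetch_json(group: str, base: str = API_BASE, timeout: float = 5.0):
    """GET the group's prefix and decode the body as JSON.

    Assumption: the prefix accepts GET and returns JSON; the shape is not
    documented in this README.
    """
    with urllib.request.urlopen(endpoint_url(group, base), timeout=timeout) as response:
        return json.load(response)
```

With the stack running, `fetch_json("operators")` would request `http://localhost:8000/operators/`.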
| Variable | Required | Description |
|---|---|---|
| `DATABASE_URL` | Yes | SQLAlchemy connection string |
| `ENV` | No | `development` / `production` / `testing` (default: `development`) |
| `HELMSDEEP_DEV_USER` | No | Username displayed in development mode (bypasses auth; local only) |
| `DATABASE_RO_URL` | No | Optional read-only database replica URL |
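One way these variables could be read and checked at startup is sketched below. The variable names, the allowed `ENV` values, and the default come from this README; the `Settings` class and `load_settings` function are hypothetical and do not reflect how the Helmsdeep packages load configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

ALLOWED_ENVS = ("development", "production", "testing")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""

    database_url: str
    env: str = "development"
    dev_user: Optional[str] = None
    database_ro_url: Optional[str] = None


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, failing fast on missing or invalid values."""
    database_url = environ.get("DATABASE_URL", "").strip()
    if not database_url:
        raise RuntimeError("DATABASE_URL is required")
    env = environ.get("ENV", "development")
    if env not in ALLOWED_ENVS:
        raise RuntimeError(f"ENV must be one of {ALLOWED_ENVS}, got {env!r}")
    return Settings(
        database_url=database_url,
        env=env,
        # Treat empty strings the same as unset for the optional variables.
        dev_user=environ.get("HELMSDEEP_DEV_USER") or None,
        database_ro_url=environ.get("DATABASE_RO_URL") or None,
    )
```

Passing the environment in as a mapping keeps the function easy to exercise in tests without touching `os.environ`.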
The Docker Compose setup has no authentication. For local development, HELMSDEEP_DEV_USER sets the active user without any login flow. In production you must place an authenticating reverse proxy in front of the UI that injects an X-Auth-Request-Email header with the logged-in user's email address. The original deployment used AWS Cognito via an Application Load Balancer, but any identity provider that can inject that header (oauth2-proxy, Nginx with auth_request, Cloudflare Access, etc.) will work. The injected email must match an operator name registered in the database.
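The user-resolution rule described above could look roughly like the following. The header name and the role of `HELMSDEEP_DEV_USER` come from this README; the function names, the precedence (header first, dev user only in development), and the in-memory set of operator names are assumptions made for illustration.

```python
from typing import Mapping, Optional

AUTH_HEADER = "X-Auth-Request-Email"  # header name taken from this README


def resolve_user(
    headers: Mapping[str, str],
    env: str = "development",
    dev_user: Optional[str] = None,
) -> Optional[str]:
    """Return the active user, or None if nobody is authenticated."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    email = normalized.get(AUTH_HEADER.lower(), "").strip()
    if email:
        return email
    # Assumed rule: the dev user fallback is honored only in development.
    if env == "development" and dev_user:
        return dev_user
    return None


def is_registered(user: Optional[str], operator_names: set[str]) -> bool:
    """Check the resolved user against the registered operator names."""
    return user is not None and user in operator_names
```

The header can only be trusted when the reverse proxy is the sole route to the UI, because a client that reaches the app directly could set the header itself.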
See CONTRIBUTING.md for the public snapshot policy and local development notes.
See LICENSE.