Instructions for AI coding agents (OpenAI Codex, etc.) working in this repository.
mdpp — a Python 3.12+ library for molecular dynamics simulation pre- and post-processing, plus small-molecule cheminformatics. Supports GROMACS, AMBER, OpenFE, and BrownDye workflows.
```bash
conda create -n mdpp python=3.12 -y && conda activate mdpp
bash setup.sh
```

- Use the `mdpp` conda environment for any Python command that relies on project dependencies.
- When running non-interactively, prefer `conda run -n mdpp ...` instead of creating a separate virtualenv or using `uv run`.
- Treat the workspace `.venv/` and `uv.lock` as agent-created artifacts to avoid for this repository unless the user explicitly asks for `uv`.
```bash
pytest                          # run all tests (auto-parallel)
ruff check src/ tests/ --fix    # lint
ruff format src/ tests/         # format
pre-commit run --all-files      # full check suite
```

Three custom markers gate optional test subsets (`--strict-markers` enforced):
| Marker | Purpose |
|---|---|
| `benchmark` | Performance timing tests with printed reports |
| `slow` | Resource-intensive tests (>10s runtime) |
| `gpu` | Tests that exercise GPU backends cupy/torch/jax |
Combine markers with boolean expressions:
```bash
pytest -m "not benchmark"              # skip all benchmarks
pytest -m "benchmark and not slow"     # fast benchmarks only
pytest -m "not gpu"                    # CPU-only run
pytest -m "benchmark and gpu"          # all GPU-exercising benchmarks
pytest -m "not (slow or benchmark)"    # minimum fast CI subset
pytest -m "gpu and not slow" -n 0      # GPU agreement (serial -- see note)
```

Always pass `-n 0` to GPU-marked runs. The default 24-worker pytest-xdist parallelism opens 24 simultaneous CUDA contexts on the single visible GPU and triggers spurious `CUBLAS_STATUS_ALLOC_FAILED` / `OutOfMemoryError` failures even on 96 GB cards. The same suite passes cleanly under `-n 0` (or `-n 1`).
Register any new markers in `pyproject.toml` under `[tool.pytest.ini_options].markers`.
After modifying any production code under src/mdpp/, you MUST complete the following loop before considering the task done. Repeat until no CRITICAL issues remain:
1. Write tests -- add or update tests for every changed function. Tests live under `tests/` mirroring the `src/mdpp/` layout.
2. Run tests -- `conda run -n mdpp pytest <relevant scope>` (use full `pytest` when multiple areas are affected).
3. Run pre-commit -- `conda run -n mdpp pre-commit run --all-files` (covers ruff lint, ruff format, mypy type checking, shellcheck).
4. Run AI review -- request an independent AI code review of the changes (e.g. Codex review, Claude code-reviewer, or equivalent).
5. Fix issues -- address any CRITICAL or HIGH issues from tests, pre-commit, or review.
6. Repeat from step 2 until all checks pass and no CRITICAL issues are found.
Prefer conda run -n mdpp ... for all non-interactive checks.
Source is under src/mdpp/ using the src-layout convention:
| Subpackage | Purpose | Key patterns |
|---|---|---|
| `core/` | Trajectory I/O, file parsers | `load_trajectory`, `load_trajectories`, `read_xvg`, `read_edr` |
| `constants.py` | Physical constants | `GAS_CONSTANT_KJ_MOL_K`, `DEFAULT_TEMPERATURE_K` |
| `analysis/` | Compute functions | `compute_*(traj, *, ...) -> FrozenDataclass` |
| `analysis/_backends/` | Private backend subpackage | `BackendRegistry[F]`, `require_torch/jax/cupy`, `DistanceBackend`/`RMSDBackend`/`DCCMBackend` Literals, `clean_torch_cache`/`clean_cupy_cache` decorators |
| `chem/` | Small-molecule cheminformatics | `MolSupplier`, `calc_descs`, `gen_fp`, `calc_sim`, `is_pains` |
| `plots/` | Visualization (2D, 3D, molecules) | `plot_*(result, *, ax=None) -> Axes`, `draw_mol`, `view_mol_3d` |
| `prep/` | System preparation | `fix_pdb`, `strip_solvent`, `run_propka`, ligand tools |
Shell scripts (analysis wrappers, runtime helpers, build scripts, etc.) live in the top-level scripts/ directory (not packaged).
Examples (notebooks and data) live in examples/ (GROMACS, OpenFE RBFE, BrownDye).
SLURM submission scripts for running OpenFE RBFE transformations on Sherlock.
Requires OpenFE >= 1.10.0 for --resume checkpoint support.
| Script | Purpose |
|---|---|
| `quickrun/quickrun.sh` | Submit all `transformations/*.json` as SLURM array jobs (`-r N` for repeats) |
| `quickrun/quickrun.sbatch` | Batch script: starts CUDA MPS, runs `openfe quickrun --resume` via Apptainer |
| `runtime/check_status.sh` | Check transformation replica status and optionally restart failed replicas |
| `runtime/monitor.sbatch` | Periodic monitor: runs `check_status`, emails report, self-resubmits via SLURM |
- CUDA MPS is required for Sherlock's `Exclusive_Process` GPU mode (openmmtools needs multiple CUDA contexts).
- `--resume` enables checkpoint-based resumption after preemption on the `owners` partition.
- Output goes to `results/<name>/replica_<id>/`.
Tests live in tests/analysis/, tests/plots/, and tests/chem/, mirroring the source tree.
- Absolute imports only -- `from mdpp.core.trajectory import load_trajectory`.
- Google docstrings -- all public functions must have Args/Returns/Raises sections.
- Frozen dataclasses -- analysis results use `@dataclass(frozen=True, slots=True)`.
- Keyword-only args -- after the first positional arg in compute/plot functions.
- No builtin shadowing -- do not create modules named `io`, `pdb`, `types`, etc.
- Type aliases -- shared aliases live in `mdpp._types` (`StrPath`, `PathLike`).
- Exports -- every `__init__.py` has an `__all__` list. New public functions must be added.
- Units -- internal arrays use nm/ps (MDTraj convention); display properties convert to Å/ns.
- Chem functions -- take `Chem.rdchem.Mol` or SMILES strings; fingerprint generators are in the `FP_GENERATORS` dict.
- 3D visualization -- `plots/three_d.py` uses py3Dmol and nglview for notebook-based interactive views.
- Analysis modules: `src/mdpp/analysis/<topic>.py`
- Plot modules: `src/mdpp/plots/<topic>.py`
- Helper utilities within a subpackage: `utils.py`
- MDP config templates: `scripts/gromacs/mdps/<ff>/<step>.mdp`
- Shell scripts (not packaged): `scripts/<engine>/<category>/<script>.sh`
- SLURM scripts: `scripts/<engine>/<category>/<script>.sbatch`
mdpp.analysis.clustering exposes seven sklearn-style callable classes:
| Class | Input | Backend / notes |
|---|---|---|
| `Gromos` | RMSD matrix | Numba JIT (greedy largest-first, Daura 1999) |
| `Hierarchical` | RMSD matrix | scipy `linkage` + `fcluster` |
| `DBSCAN` | RMSD matrix | Numba JIT (default) or sklearn `metric="precomputed"` |
| `HDBSCAN` | RMSD matrix | sklearn `metric="precomputed"` |
| `KMeans` | Feature matrix | scikit-learn |
| `MiniBatchKMeans` | Feature matrix | scikit-learn |
| `RegularSpace` | Feature matrix | deeptime |
Each class is a `@dataclass(frozen=True, slots=True)` with parameters at construction and a `__call__(data) -> ClusteringResult | FeatureClusteringResult` invocation.
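```python
from mdpp.analysis.clustering import Gromos, KMeans

# rmsd_matrix and pca are results from earlier compute_* calls.
result = Gromos(cutoff_nm=0.15)(rmsd_matrix.rmsd_matrix_nm)
result = KMeans(n_clusters=10)(pca.projections)
```

Do not add the old function-form wrappers (`compute_gromos_clusters`, etc.) -- they were removed and there is no backward-compat shim.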
- Create/extend a file in `src/mdpp/analysis/`.
- Define result dataclass(es) with `frozen=True, slots=True`.
- Write a `compute_*` function following the existing signature pattern.
- Add exports to `src/mdpp/analysis/__init__.py`.
- If visual output makes sense, add `plot_*` in `src/mdpp/plots/` and export it.
- Write tests in `tests/analysis/`.
Default backend rule: every public compute function that accepts a backend= argument MUST default to "mdtraj" when mdtraj provides a native kernel for that computation. Other backends (numba, torch, jax, cupy) are performance options that callers must opt into explicitly. The only current exception is compute_dccm, which defaults to "numpy" because mdtraj has no native covariance kernel -- numpy's BLAS GEMM is multi-threaded and works without any optional dependency.
Reasons:
- Only `mdtraj` supports periodic boundary conditions -- defaulting to anything else would silently drop PBC for users who don't read the backend parameter.
- All PBC-relevant analysis functions sharing the same default keeps API behavior consistent across `compute_distances`, `compute_rmsd_matrix`, `featurize_ca_distances`, etc.
- The optional GPU backends (`[gpu]` extra) must never be required for the common path.
When reviewing or writing code, never silently change a public function's default backend away from its current value ("mdtraj", or "numpy" for DCCM).
Uniform signature rule: every backend registered in a given BackendRegistry MUST accept the exact same call signature as the Protocol type parameter on that registry. If one backend needs an extra keyword argument (e.g. periodic on mdtraj), every other backend in the same registry MUST also accept that keyword, silently ignoring it if unused (mark # noqa: ARG001 and document as "accepted for Protocol uniformity, ignored"). This keeps the dispatcher free of per-backend branching and preserves type inference for callers.
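A minimal sketch of the uniform signature rule, assuming a toy one-dimensional "distance" computation and a plain dict in place of the real registry. All names here (`DistanceFn`, `distances_wrapping`, `distances_plain`, `BACKENDS`, `compute`) are hypothetical; the point is only that both backends accept `periodic` even though one of them ignores it, so the dispatcher needs no per-backend branching.

```python
from typing import Protocol


class DistanceFn(Protocol):
    """Hypothetical shared call signature for one registry."""

    def __call__(self, coords: list[float], *, periodic: bool) -> list[float]: ...


def distances_wrapping(coords: list[float], *, periodic: bool) -> list[float]:
    # This backend uses the flag: wrap coordinates into a unit box when periodic.
    return [c % 1.0 if periodic else c for c in coords]


def distances_plain(coords: list[float], *, periodic: bool) -> list[float]:  # noqa: ARG001
    # `periodic` is accepted for Protocol uniformity, ignored.
    return list(coords)


# Plain dict used as a stand-in for a typed registry.
BACKENDS: dict[str, DistanceFn] = {
    "wrapping": distances_wrapping,
    "plain": distances_plain,
}


def compute(
    coords: list[float],
    *,
    backend: str = "wrapping",
    periodic: bool = True,
) -> list[float]:
    # One call path for every backend: the keyword is always forwarded.
    return BACKENDS[backend](coords, periodic=periodic)
```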
Registry typing rule: every BackendRegistry[F] instance MUST be parameterised with an explicit Protocol type F:
```python
from typing import Protocol

import mdtraj as md
import numpy as np
from numpy.typing import NDArray

# Import path assumed from the shared _backends/_registry.py module.
from mdpp.analysis._backends._registry import BackendRegistry


class RMSDMatrixBackendFn(Protocol):
    def __call__(
        self,
        traj: md.Trajectory,
        atom_indices: NDArray[np.int_],
    ) -> NDArray[np.floating]: ...


rmsd_matrix_backends: BackendRegistry[RMSDMatrixBackendFn] = BackendRegistry(default="mdtraj")
```

Never declare a bare `BackendRegistry` without a type parameter -- `registry.get(backend)` would return an unbound `F` and the dispatcher would lose the signature of `compute_fn` at the call site. The `Protocol` lives in the same `_backends/_<kind>.py` file as the backends it describes (not in the shared `_registry.py`) so the registry module stays decoupled from any particular backend signature.
Backend dtype rule: Protocols return NDArray[np.floating] (not NDArray[np.float64]) and every backend returns its native dtype -- float32 for mdtraj and the GPU backends (torch/jax/cupy), float64 for numba. Public compute_* wrappers then cast with astype(resolved, copy=False) / np.asarray(result, dtype=resolved) so when the backend's native dtype already matches the user's resolved dtype (the float32 default), no redundant copy is made. This is essential at large N: forcing float64 on an N^2 matrix would cost 115 GB at n=120k purely for a type contract. Never add an unconditional .astype(np.float64) at a backend boundary.
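The memory figure quoted above can be checked with a short back-of-the-envelope calculation. The helper name `square_matrix_gb` is hypothetical, and the sizes use decimal gigabytes (1 GB = 10^9 bytes).

```python
def square_matrix_gb(n: int, bytes_per_element: int) -> float:
    """Return the size of a dense n x n matrix in decimal gigabytes."""
    return n * n * bytes_per_element / 1e9


n_frames = 120_000
float64_gb = square_matrix_gb(n_frames, 8)  # float64: 8 bytes per element
float32_gb = square_matrix_gb(n_frames, 4)  # float32: 4 bytes per element
```

Under these assumptions the float64 matrix comes to about 115 GB and the float32 matrix to about half that, which is why an unconditional upcast at a backend boundary is so costly.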
GPU cache cleanup rule: torch and cupy GPU-backed compute kernels MUST be decorated with the matching framework-specific cleanup decorator from _backends/_imports.py:
| Backend | Decorator |
|---|---|
| torch | `@clean_torch_cache` |
| cupy | `@clean_cupy_cache` |
| jax | (none) |
The decorators call the framework's cache-clear API (torch.cuda.empty_cache(), cp.get_default_memory_pool().free_all_blocks()) in a finally block so pooled device memory is returned to the driver on both normal return and exceptions. Apply decorators to inner kernel functions (e.g. rmsd_torch, distances_cupy), never the outer CPU-side compute_* wrappers. The decorators use PEP 695 generic syntax ([**P, T]) so mypy preserves the Protocol signature at registry call sites.
JAX kernels are deliberately not decorated. jax.clear_caches() clears JIT compilation caches, not device memory -- trashing the compilation cache after every call forces a multi-second recompile on the next invocation. JAX has no public API for returning pooled device memory to the driver anyway (XLA manages it directly).
For existing multi-backend functions (e.g. compute_rmsd_matrix, pairwise distances):
- Add the implementation in the matching `src/mdpp/analysis/_backends/_<kind>.py` file, matching the `Protocol` type defined at the top of that file exactly.
- Use `require_torch()` / `require_jax()` / `require_cupy()` from `_backends/_imports.py` for optional GPU libraries -- never import them at module top-level.
- Decorate torch/cupy GPU kernels with `@clean_torch_cache` / `@clean_cupy_cache` from `_backends/_imports.py` so pooled memory is released in a `finally` block after the kernel runs. Do not apply any cleanup decorator to JAX kernels -- `jax.clear_caches()` trashes JIT compilation caches and forces slow recompiles.
- If you introduce a new keyword argument, also retrofit every existing backend in the same registry to accept it (silently ignoring when unused, marked `# noqa: ARG001`).
- Register in the module's `BackendRegistry` at the bottom of the file.
- Add the backend name to the corresponding `Literal` alias (`DistanceBackend` / `RMSDBackend` / `DCCMBackend`) in `_backends/_registry.py`.
- Add agreement tests in `tests/analysis/test_<kind>.py` guarded by the relevant `requires_*` skip marker and `@pytest.mark.gpu` (if GPU-only).
- Do not change the public function's default backend -- keep `compute_distances` / `compute_rmsd_matrix` / `featurize_ca_distances` defaulting to `"mdtraj"` and `compute_dccm` defaulting to `"numpy"`.
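The "never import optional libraries at module top-level" rule in the list above can be illustrated with a small lazy-import helper. `require_optional` and `kernel_sum` are hypothetical names, not the actual `require_torch()` family, and the standard-library `math` module stands in for an optional GPU package so the snippet runs anywhere.

```python
import importlib
from types import ModuleType


def require_optional(name: str, *, extra: str) -> ModuleType:
    """Import an optional dependency on first use, with an actionable error."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        msg = f"backend needs '{name}'; install the [{extra}] extra"
        raise ImportError(msg) from exc


def kernel_sum(values: list[float]) -> float:
    # The import happens inside the kernel, so importing this module never
    # fails when the optional dependency is missing.
    math_mod = require_optional("math", extra="gpu")
    return math_mod.fsum(values)
```

With this shape, a missing optional package only surfaces when a caller explicitly selects the backend that needs it.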
To introduce a registry for a new multi-backend compute function:
- Create `src/mdpp/analysis/_backends/_<kind>.py` with a `Protocol` class defining the shared call signature.
- Declare the registry as `<kind>_backends: BackendRegistry[<Kind>BackendFn] = BackendRegistry(default="mdtraj")` (or another sensible no-optional-dep default if mdtraj has no kernel for that computation -- e.g. `_dccm.py` uses `default="numpy"`). Always parameterise with the Protocol so callers get typed `compute_fn` from `registry.get()`.
- Add a `Literal` alias to `_backends/_registry.py` (`type <Kind>Backend = Literal["mdtraj", "numba", ...]`) and re-export it from `_backends/__init__.py`.
- The public wrapper in `src/mdpp/analysis/<kind>.py` imports the registry and delegates via `compute_fn = <kind>_backends.get(backend)`, letting mypy infer the Protocol type at the call site.
- Create/extend a file in `src/mdpp/chem/`.
- Functions take `Chem.rdchem.Mol` or SMILES strings as input.
- Add exports to `src/mdpp/chem/__init__.py`.
- Write tests in `tests/chem/`.
- Do not remove dependencies from `pyproject.toml` `[project.dependencies]`.
- Do not use relative imports.
- Do not put `__init__.py` files in test directories.
- Do not modify files in `results/` (untracked temporary directory).
- Do not write custom parsers when a library exists (use panedr, MDAnalysis, etc.).