Omkarrb/Geno-Shield

Geno-Shield — Genomic AMR Decision Support

Geno-Shield is a semi-production machine learning framework for genomic antimicrobial resistance (AMR) decision support.
It provides a full pipeline, from phenotype-linked isolate metadata to packaged models and API inference.

Project Highlights

  • End-to-end AMR pipeline: manifest building, genome download, feature extraction, training, evaluation, and model packaging.
  • Two implemented clinical tasks:
    • ecoli_ciprofloxacin for Escherichia coli
    • staph_oxacillin for Staphylococcus aureus
  • FastAPI inference service with OpenAPI/Swagger docs at /docs.
  • Containerized runtime via Docker for reproducibility and deployment parity.
  • Structured quality controls and test suite (pytest, mypy, ruff).

Architecture

The system is organized in modular layers:

  • amr_pipeline/core: data ingestion, manifest normalization, FASTA QC, AMRFinder integration.
  • amr_pipeline/modeling: model training and selection, inference utilities, artifact packaging.
  • apps/api: production-style HTTP API for inference and model readiness checks.
  • configs: per-task reproducible configs.
  • docs: architecture, data provenance, model cards, threat model, reproducibility notes.
  • tests: API and pipeline tests.
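
The README does not say which checks the FASTA QC step in amr_pipeline/core performs. As a rough illustration of what an assembly QC summary could compute, the sketch below parses FASTA text and reports contig count, total length, N50, and GC fraction. The function names and the choice of metrics are assumptions made for this example, not the project's actual implementation.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {contig_name: sequence} (hypothetical helper)."""
    parts: dict[str, list[str]] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header without a name")
            name = fields[0]  # keep the ID, drop the free-text description
            parts[name] = []
        elif name is None:
            raise ValueError("sequence data before the first FASTA header")
        else:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def n50(lengths: list[int]) -> int:
    """Length of the contig at which the running total reaches half the assembly."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


def assembly_qc(text: str) -> dict[str, float]:
    """Summarize an assembly; the metric set is an assumption, not the repo's."""
    contigs = parse_fasta(text)
    lengths = [len(seq) for seq in contigs.values()]
    total = sum(lengths)
    gc = sum(seq.count("G") + seq.count("C") for seq in contigs.values())
    return {
        "n_contigs": len(contigs),
        "total_length_bp": total,
        "n50_bp": n50(lengths),
        "gc_fraction": gc / total if total else 0.0,
    }
```

A real QC gate would compare such values against organism-specific thresholds, which the per-task configs would be a natural place to hold.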

API Outputs

POST /predict returns:

  • organism and antibiotic context
  • binary prediction (Resistant or Susceptible)
  • calibrated probability_resistant
  • detected AMR gene/mutation signals
  • model/data version metadata
  • inference mode (full or fallback)
  • assembly QC block and clinical caution message
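
To make the response contents above concrete, here is a minimal sketch of how they could be modeled as typed Python objects. Only probability_resistant and the Resistant/Susceptible and full/fallback values come from this README; every other field name, the AssemblyQC fields, and the example values are hypothetical placeholders. See docs/usage_api.md or /docs for the actual schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Literal


@dataclass
class AssemblyQC:
    # Hypothetical QC fields; the real block may report different metrics.
    n_contigs: int
    total_length_bp: int
    passed: bool


@dataclass
class PredictionResponse:
    organism: str
    antibiotic: str
    prediction: Literal["Resistant", "Susceptible"]
    probability_resistant: float  # field name taken from the README
    detected_signals: list[str]   # AMR gene/mutation hits (assumed name)
    model_version: str            # assumed name
    data_version: str             # assumed name
    inference_mode: Literal["full", "fallback"]
    assembly_qc: AssemblyQC
    caution: str                  # clinical caution message (assumed name)

    def __post_init__(self) -> None:
        # Literal is only a type hint, so enforce the allowed values at runtime.
        if self.prediction not in ("Resistant", "Susceptible"):
            raise ValueError(f"unexpected prediction: {self.prediction!r}")
        if self.inference_mode not in ("full", "fallback"):
            raise ValueError(f"unexpected inference mode: {self.inference_mode!r}")
        if not 0.0 <= self.probability_resistant <= 1.0:
            raise ValueError("probability_resistant must be within [0, 1]")


# Illustrative values only; they are not output from a real model run.
example = PredictionResponse(
    organism="Escherichia coli",
    antibiotic="ciprofloxacin",
    prediction="Resistant",
    probability_resistant=0.91,
    detected_signals=["gyrA_S83L"],
    model_version="0.0.0-example",
    data_version="0.0.0-example",
    inference_mode="full",
    assembly_qc=AssemblyQC(n_contigs=120, total_length_bp=5_000_000, passed=True),
    caution="Confirm with laboratory phenotypic AST.",
)

print(json.dumps(asdict(example), indent=2))
```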

Quick Start

1) Local setup

make setup
make demo

2) Run API

make api

3) Open docs and test

  • Open http://localhost:8000/docs
  • Or send a request with curl:
curl -X POST "http://localhost:8000/predict?task=ecoli_ciprofloxacin" \
  -F "fasta_file=@data/demo/sample_ecoli.fna"
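
The same request can be sent from Python. The sketch below uses only the standard library and builds the multipart body by hand, mirroring the curl call above: the task query parameter and the fasta_file form field come from that command, while the helper names build_predict_request and predict are made up for this example. It assumes the API returns JSON.

```python
import json
import uuid
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_predict_request(base_url: str, task: str, fasta_path: Path) -> Request:
    """Build a multipart/form-data POST equivalent to the curl example."""
    boundary = uuid.uuid4().hex
    disposition = (
        f'Content-Disposition: form-data; name="fasta_file"; '
        f'filename="{fasta_path.name}"\r\n'
    )
    body = b"".join(
        [
            f"--{boundary}\r\n".encode(),
            disposition.encode(),
            b"Content-Type: application/octet-stream\r\n\r\n",
            fasta_path.read_bytes(),
            f"\r\n--{boundary}--\r\n".encode(),
        ]
    )
    url = f"{base_url}/predict?{urlencode({'task': task})}"
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return Request(url, data=body, headers=headers, method="POST")


def predict(base_url: str, task: str, fasta_path: Path, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response (needs a running API)."""
    request = build_predict_request(base_url, task, fasta_path)
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the API running via make api, calling predict("http://localhost:8000", "ecoli_ciprofloxacin", Path("data/demo/sample_ecoli.fna")) should return the prediction payload as a dictionary.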

Docker Run

docker compose up --build

Reproducible Pipeline Commands

amr_pipeline build-manifest --organism ecoli --antibiotic ciprofloxacin --out data/processed/ecoli_manifest.csv --source-csv <ncbi_ast_export.csv>
amr_pipeline download --manifest data/processed/ecoli_manifest.csv --out data/raw
amr_pipeline featurize --manifest data/processed/ecoli_manifest.csv --fasta-dir data/raw --out data/processed/ecoli_features.parquet --config configs/ecoli_ciprofloxacin.yaml
amr_pipeline train --task ecoli_ciprofloxacin --features data/processed/ecoli_features.parquet --out models/ecoli_ciprofloxacin --config configs/ecoli_ciprofloxacin.yaml
amr_pipeline evaluate --task ecoli_ciprofloxacin --features data/processed/ecoli_features.parquet --model models/ecoli_ciprofloxacin/model.joblib --out models/ecoli_ciprofloxacin/evaluation.json
amr_pipeline package-model --task ecoli_ciprofloxacin --config configs/ecoli_ciprofloxacin.yaml --model-dir models/ecoli_ciprofloxacin --manifest data/processed/ecoli_manifest.csv
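
For repeated runs, the six commands above can be generated from a few parameters instead of being retyped. The following is a small sketch of that idea: the subcommands and flags are copied from the commands above, whereas the helper names, the dry-run default, and the assumption that other tasks follow the same file-naming pattern are this example's own.

```python
import shlex
import subprocess


def pipeline_commands(
    task: str, organism: str, antibiotic: str, source_csv: str
) -> list[list[str]]:
    """Return the six pipeline steps as argv lists, in execution order."""
    # Path layout mirrors the E. coli example; other tasks may differ.
    manifest = f"data/processed/{organism}_manifest.csv"
    features = f"data/processed/{organism}_features.parquet"
    config = f"configs/{task}.yaml"
    model_dir = f"models/{task}"
    cli = "amr_pipeline"
    return [
        [cli, "build-manifest", "--organism", organism, "--antibiotic", antibiotic,
         "--out", manifest, "--source-csv", source_csv],
        [cli, "download", "--manifest", manifest, "--out", "data/raw"],
        [cli, "featurize", "--manifest", manifest, "--fasta-dir", "data/raw",
         "--out", features, "--config", config],
        [cli, "train", "--task", task, "--features", features,
         "--out", model_dir, "--config", config],
        [cli, "evaluate", "--task", task, "--features", features,
         "--model", f"{model_dir}/model.joblib",
         "--out", f"{model_dir}/evaluation.json"],
        [cli, "package-model", "--task", task, "--config", config,
         "--model-dir", model_dir, "--manifest", manifest],
    ]


def run_pipeline(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each step; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(argv, check=True)
```

Calling run_pipeline(pipeline_commands("ecoli_ciprofloxacin", "ecoli", "ciprofloxacin", "ast_export.csv")) prints the commands without running them; pass dry_run=False to execute. The file name ast_export.csv is a placeholder for your own NCBI AST export.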

Engineering Quality

  • Type and style checks: make lint
  • Unit tests with coverage: make test
  • CLI/API smoke checks: make smoke

Safety and Limits

  • This repository is for clinical decision support research/prototyping.
  • It is not a regulated diagnostic device.
  • Predictions must be confirmed with laboratory phenotypic AST and local policy.

Documentation

  • Architecture: docs/architecture.md
  • Data provenance: docs/data_provenance.md
  • API usage: docs/usage_api.md
  • Threat model: docs/threat_model.md
  • Model cards: docs/model_card_ecoli_cipro.md, docs/model_card_staph_oxacillin.md

About

Genomic antimicrobial-resistance (AMR) decision support — a FastAPI + Docker pipeline turning bacterial genome assemblies into calibrated resistance predictions (E. coli, S. aureus) with AMR gene/mutation evidence.
