Shared development toolbox for engineers. It provides reusable data and modeling pipelines plus a unified packaging/deployment client for `ml-deployment-ecosystem`. It is not a store for models or data, and not a high-frequency production extraction service.
```
ml_packaging_toolbox/
├── README.md
├── requirements.txt
├── requirements-test.txt
├── examples/                      # Usage examples (start here)
│   └── end_to_end.py              # Full pipeline → package (deploy optional)
├── ml_packaging_toolbox/
│   ├── data_pipeline/
│   │   ├── extraction/            # LT query builders + RT normalisation
│   │   ├── preprocess/            # sklearn-compatible preprocessors
│   │   └── assembly/              # DataAssemblyPipeline (LT + RT paths)
│   ├── model_pipeline/
│   │   ├── filters/               # Pre-modeling data filtering
│   │   ├── model_selection/       # CV / HPO utilities
│   │   └── modeling/              # Model wrappers (sklearn estimators)
│   ├── model_deployment/
│   │   └── packaging/             # Integration seam with ml-deployment-ecosystem
│   └── utils/
└── tests/                         # Unit tests guarding the packaging contract
```
`model_deployment/packaging/packager.py` produces a zip containing exactly:

| File | Consumed by deployment server |
|---|---|
| `model.pkl` | POST /deploy/ → stored in MongoDB; loaded for scoring |
| `preprocessors.pkl` | reconcile step (optional, depends on server wiring) |
| `context.yml` | POST /deploy/ registration + context store |
| `features.json` | ordered feature alignment during inference |
| `test_data.json` | POST /deploy/validate/<model_id> smoke test |
```yaml
name: my_model
version: 1.0.0
device_id: deviceA            # primary device tag (informational)
required_contexts:
  - device_id: deviceA
    event_code: EVT_001
    features: [feat1, feat2, feat3]
  - device_id: deviceB        # optional: multi-device models list multiple entries
    event_code: EVT_002
    features: [feat4, feat5]
```

`required_contexts` drives the server's dispatch and inference logic:

- Each entry names a `device_id`, an `event_code`, and the feature names that device contributes.
- The deployment server waits until all listed entries have arrived for the same `material_id` before scoring.
- Feature vectors are assembled in the order `required_contexts` is listed.
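The wait-then-assemble behaviour described above can be sketched like this. The sketch lives server-side (not in this toolbox), and the function name, input shape, and context values are all hypothetical; only the ordering rule is taken from the contract.

```python
from typing import Dict, List, Optional, Tuple

# Mirrors the required_contexts list from context.yml (example values).
REQUIRED_CONTEXTS = [
    {"device_id": "deviceA", "event_code": "EVT_001", "features": ["feat1", "feat2", "feat3"]},
    {"device_id": "deviceB", "event_code": "EVT_002", "features": ["feat4", "feat5"]},
]

def try_assemble(arrived: Dict[Tuple[str, str], Dict[str, float]]) -> Optional[List[float]]:
    """Return the ordered feature vector once every required context has
    arrived for a material_id; return None while entries are still missing.

    `arrived` maps (device_id, event_code) to that device's feature payload.
    """
    vector: List[float] = []
    for ctx in REQUIRED_CONTEXTS:
        key = (ctx["device_id"], ctx["event_code"])
        if key not in arrived:
            return None  # not all contexts present yet; keep waiting
        payload = arrived[key]
        # Features are appended in required_contexts order, then per-entry order.
        vector.extend(payload[name] for name in ctx["features"])
    return vector
```

Note that the vector layout is fully determined by the YAML ordering, which is why reordering `required_contexts` is a breaking change for deployed models.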
Any changes to `data_pipeline/` or `packaging.py` require matching changes to `ml_deployment_server/webservice/mod/deploy/` (per the README warning).
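The unit tests under `tests/` guard this contract; a check of that kind might look like the sketch below. `check_packaging_contract` and the specific assertions are illustrative assumptions, not the actual test suite.

```python
import io
import json
import zipfile

# The exact file set the deployment server consumes (see the table above).
EXPECTED_FILES = {"model.pkl", "preprocessors.pkl", "context.yml",
                  "features.json", "test_data.json"}

def check_packaging_contract(zip_bytes: bytes) -> None:
    """Illustrative contract check: the artifact must contain exactly the
    expected files, and the structured files must parse."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        assert set(zf.namelist()) == EXPECTED_FILES, "file set changed → server break"
        # context.yml must still declare required_contexts for dispatch.
        assert "required_contexts" in zf.read("context.yml").decode()
        # features.json must stay an ordered list for feature alignment.
        assert isinstance(json.loads(zf.read("features.json")), list)
```

Running such a check in CI makes an accidental contract break fail the toolbox build before it reaches the deployment server team.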
```shell
py -m venv .venv
.\.venv\Scripts\python -m pip install -U pip
.\.venv\Scripts\python -m pip install -r requirements.txt -r requirements-test.txt
.\.venv\Scripts\python -m pip install -e .
.\.venv\Scripts\python -m pytest -q
.\.venv\Scripts\python examples\end_to_end.py
```

The demo always builds the artifact under `./packages/`.
The deploy step is optional and requires a running `ml-deployment-ecosystem` service.
- Keep changes small (one bug / one feature per branch)
- One new feature = one branch → PR to main
- Changes to `data_pipeline` or `packaging.py` → notify the deployment server team