rhoai-mlops/lab-runner

Lab Runner — AI500 MLOps Enablement

Automated exercise runner for the AI500 MLOps workshop. Mirrors the architecture of the AI501 lab-runner, adapted for the Jukebox (music hit prediction) MLOps exercises.

Modules

ID  Name                     Dependencies  Description
1   When the Music Starts    none          Data exploration, model training, model serving (OpenVINO)
2   In the Rhythm of Data    1             KFP training pipelines
3   From Studio to Stage     2             GitOps, ArgoCD, CT pipeline, webhooks
4   The Sound Check          3             TrustyAI, Grafana, alerting
5   The Data Tracks          3             DVC data versioning
6   The Headliner            3             Autoscaling, canary, music-transformer
7   The Feature Playlist     5             Feast feature store
8   The Supporting Acts      3             Unit tests, linting, SonarQube, model scanning, image signing, SBOMs

Module 1 (Inner Loop)
  └─ Module 2 (Pipelines)
      └─ Module 3 (MLOps)
          ├─ Module 4 (Monitoring)
          ├─ Module 5 (Data Versioning)
          │   └─ Module 7 (Feature Store)
          ├─ Module 6 (Advanced Deployments)
          └─ Module 8 (Security)

Dependencies are resolved automatically: requesting Module 7 first runs Modules 1 → 2 → 3 → 5, in that order.
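The resolution described above can be illustrated with a small depth-first walk over the dependency table. This is only a sketch: the `DEPENDENCIES` mapping is copied from the Modules table, while the `resolve` function and its name are assumptions made for illustration and are not taken from `runner.py`.

```python
# Direct dependencies, copied from the Modules table.
DEPENDENCIES = {
    1: [],
    2: [1],
    3: [2],
    4: [3],
    5: [3],
    6: [3],
    7: [5],
    8: [3],
}


def resolve(requested):
    """Return the requested modules plus all transitive dependencies, in run order.

    Hypothetical helper for illustration; the real orchestrator may differ.
    """
    ordered = []
    seen = set()

    def visit(module_id):
        if module_id in seen:
            return
        seen.add(module_id)
        # Schedule every prerequisite before the module itself (depth-first).
        for dep in DEPENDENCIES[module_id]:
            visit(dep)
        ordered.append(module_id)

    for module_id in sorted(requested):
        visit(module_id)
    return ordered


print(resolve([7]))     # [1, 2, 3, 5, 7]
print(resolve([1, 3]))  # [1, 2, 3]
```

Under this sketch, asking for modules 1 and 3 also pulls in module 2, since 3 depends on it.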

Setup

python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Requires oc, helm, and git on the PATH.
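To confirm the prerequisites before running anything, a quick check along these lines may help. It is a standalone sketch using the standard library's `shutil.which`; the `missing_tools` helper is hypothetical and is not part of lab-runner.

```python
import shutil

# Tools the README lists as required on the PATH.
REQUIRED_TOOLS = ("oc", "helm", "git")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools found.")
```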

CLI Usage

# List available modules
lab-runner list

# Dry run (preview steps without executing)
lab-runner run -u user1 -p <password> -c apps.cluster.example.com --up-to 3 --dry-run

# Run specific modules
lab-runner run -u user1 -p <password> -c apps.cluster.example.com -m 1,3

# Run all modules up to a given ID
lab-runner run -u user1 -p <password> -c apps.cluster.example.com --up-to 8 --verbose

# Verify module state without making changes
lab-runner verify -u user1 -p <password> -c apps.cluster.example.com -m 1,2,3

# Show cluster status for all modules
lab-runner status -u user1 -p <password> -c apps.cluster.example.com

Web UI

lab-runner-web
# Opens at http://localhost:8080

The web UI provides:

  • Module selection with dependency visualization
  • Real-time progress streaming via SSE
  • Cluster status checks
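For readers unfamiliar with SSE, the following sketch shows how progress updates could be encoded in the Server-Sent Events wire format. It is an assumption-laden illustration: the `format_sse` and `progress_events` helpers, the event names `progress` and `complete`, and the JSON payload shape are all hypothetical and are not taken from `web.py`.

```python
import json


def format_sse(data, event=None):
    """Encode one Server-Sent Events message as text."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload needs its own "data:" field.
    for line in str(data).splitlines() or [""]:
        lines.append(f"data: {line}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


def progress_events(steps):
    """Yield one SSE message per step name, then a final completion message."""
    for index, name in enumerate(steps, start=1):
        payload = json.dumps({"step": name, "index": index, "total": len(steps)})
        yield format_sse(payload, event="progress")
    yield format_sse("done", event="complete")


for message in progress_events(["clone repo", "install chart"]):
    print(message, end="")
```

A generator like this could be handed to a streaming HTTP response with the `text/event-stream` content type, which is what browsers expect from an SSE endpoint.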

Container Build

podman build -t lab-runner -f Containerfile .
podman run -p 8080:8080 lab-runner

Helm Deployment

helm install lab-runner chart/ -n <namespace>

Architecture

lab_runner/
├── cli.py              # Click CLI (run, verify, status, list)
├── config.py           # Config dataclass (namespaces, URLs, labels)
├── defaults.py         # YAML templates, Helm chart paths
├── runner.py           # Orchestrator (dependency resolution, execution)
├── web.py              # FastAPI + SSE web interface
├── clients/            # oc, helm, git, gitea wrappers
├── modules/            # 8 exercise modules (m01–m08)
├── steps/              # Reusable step types (git, helm, kube, notebook, webhook, verify)
└── templates/          # Web UI HTML

Each module defines an ordered list of steps. Steps are idempotent: each one checks whether its outcome already exists before running, so re-running a module safely skips completed work.
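The check-before-run pattern can be sketched as follows. This is a minimal illustration, not the project's actual step classes: the `Step` dataclass, its `is_done` and `apply` fields, and `run_module` are hypothetical names, and an in-memory set stands in for real cluster state.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    """Hypothetical idempotent step: checks for its outcome before doing work."""
    name: str
    is_done: Callable[[], bool]   # returns True if the outcome already exists
    apply: Callable[[], None]     # performs the work

    def run(self) -> str:
        if self.is_done():
            return "skipped"
        self.apply()
        return "applied"


def run_module(steps):
    """Run steps in order and report what happened to each."""
    return [(step.name, step.run()) for step in steps]


# Demo: a set stands in for cluster state (an assumption for illustration).
state = set()
steps = [
    Step("create-namespace", lambda: "namespace" in state, lambda: state.add("namespace")),
    Step("install-chart", lambda: "chart" in state, lambda: state.add("chart")),
]
print(run_module(steps))  # first run applies both steps
print(run_module(steps))  # second run skips both steps
```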
