MLRun is an open-source AI orchestration platform for rapidly building and managing continuous (gen) AI and ML applications across their lifecycle. MLRun automates the delivery of production data pipelines, ML workflows, and online applications, significantly reducing engineering efforts, time to production, and computational resources. It integrates into development and CI/CD environments, breaks silos between data, ML, software, and DevOps/MLOps teams, and supports both community edition (CE) deployments and enterprise features when running in Iguazio systems.
This repository contains two major Python codebases plus Go services:
- mlrun/ – SDK and client library (what end-users import); includes projects, runtimes, feature store, model monitoring, data store, serving, launchers, and framework integrations.
- server/py/ – Python server components (FastAPI-based API service + alerts service); includes services/api/ (main MLRun API), services/alerts/ (events processing), and framework/ (DB sessions, auth, utilities, rundb implementation).
- server/go/ – Go services; includes services/logcollector/ (gRPC microservice for streaming logs from Kubernetes pods).
- tests/ – SDK unit tests, integration tests, and system tests (CE and enterprise-marked); uses pytest with shared fixtures in tests/common_fixtures.py.
- server/py/services/api/tests/ – Server-side unit tests (note: pyproject.toml configures pytest pythonpath=./server/py to enable running server tests from repo root).
- docs/ – Sphinx-based documentation (API reference, tutorials, guides, architecture); builds to ReadTheDocs.
- pipeline-adapters/ – Pipeline integration packages (mlrun-pipelines-kfp-common, kfp-v1-8, kfp-v2) with independent pyproject.toml files.
- automation/ – CI/CD automation scripts (deployment, system tests, release notes generation, version management).
- dockerfiles/ – Dockerfile definitions for mlrun, mlrun-api, mlrun-gpu, mlrun-kfp, jupyter, test, test-system images.
- hack/ – Local development environment configurations (.env files for various setups, local k8s manifests, benchmarks).
- examples/ – Jupyter notebooks and example scripts demonstrating MLRun features.
- .github/ – GitHub workflows (build, CI, system tests, security scans, release pipelines), issue templates, PR template, CODEOWNERS.
Set up Python environment:
# Create virtual environment (using venv or uv)
uv venv venv --python 3.11 --seed
source venv/bin/activate
# Install all dependencies
export MLRUN_PYTHON_PACKAGE_INSTALLER=uv
make install-requirements
uv pip install -e '.[complete]'

Configure PYTHONPATH:
# Required for server-side development
export PYTHONPATH="$(pwd):$(pwd)/server/py"

# Format code (Ruff formatter)
make fmt
# Lint code (Ruff linter)
make lint

# Create a new database migration (MySQL)
MLRUN_MIGRATION_MESSAGE="Add new column" make create-migration

# Install docs dependencies
make install-docs-requirements
# Build docs locally
cd docs
make html
# Output in docs/_build/html/index.html

# Update Python lock files for mlrun-api image
make upgrade-mlrun-api-deps-lock
# Update specific package only
MLRUN_UV_UPGRADE_FLAG="--upgrade-package <package-name>" make upgrade-mlrun-api-deps-lock

- Ruff (v0.8.0+) for both formatting and linting (configured in pyproject.toml).
- Run make fmt before every commit.
- CI enforces linting via make lint.
- snake_case: functions, variables, modules, parameters.
- CamelCase: classes.
- Prefer explicit, readable names; avoid unclear abbreviations.
- Internal (repo) imports: SHOULD prefer module imports (import pkg.mod) or aliases (import pkg.mod as mod) to reduce circular-import risk and make boundaries explicit.
  - Example (preferred): import mlrun.utils
  - Example (avoid): from mlrun.utils import logger
- External packages: from X import Y is acceptable when it improves readability.
- Import boundaries: mlrun.common must NOT import higher-level mlrun.* modules (enforced by import-linter in pyproject.toml).
- Forbidden imports:
  - In mlrun/, do NOT import kfp directly; use mlrun_pipelines adapters instead.
  - In mlrun/, do NOT import server/py (server-side code).
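As a rough illustration of the forbidden-import rule, the sketch below scans a source string for direct imports of a disallowed top-level package. It is not how the repository enforces the rule (the text above names import-linter for boundary checks); the function name find_forbidden_imports and the FORBIDDEN_TOP_LEVEL set are hypothetical and exist only for this example.

```python
import ast

# Hypothetical rule table for this sketch: top-level packages that code under
# mlrun/ should not import directly, per the guideline above.
FORBIDDEN_TOP_LEVEL = {"kfp"}


def find_forbidden_imports(source: str) -> list[str]:
    """Return the forbidden module names imported by the given source text."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            # Compare only the top-level package, so "kfp.dsl" counts as "kfp".
            if name.split(".")[0] in FORBIDDEN_TOP_LEVEL:
                found.append(name)
    return found


sample = "import kfp.dsl\nimport mlrun.utils\nfrom kfp import compiler\n"
print(sorted(find_forbidden_imports(sample)))
```

A check like this only sees static import statements; dynamic imports (for example via importlib) would need a different approach.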
- Use type hints for complex data structures and public APIs.
- Public API functions/classes require docstrings (triple-quotes """ """).
- Docstring format: brief description, :param tags for parameters, :return tag for return value.
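A minimal sketch of a function that follows the type-hint and docstring conventions above. The helper build_function_uri is hypothetical and not part of the MLRun API; only the docstring layout (brief description, :param tags, :return tag) is taken from the guidelines.

```python
from typing import Optional


def build_function_uri(project: str, name: str, tag: Optional[str] = None) -> str:
    """
    Build a display URI for a function (hypothetical helper, for illustration only).

    :param project: name of the project that owns the function
    :param name:    function name
    :param tag:     optional tag; left out of the URI when not set
    :return: a string of the form ``project/name`` or ``project/name:tag``
    """
    uri = f"{project}/{name}"
    if tag:
        uri = f"{uri}:{tag}"
    return uri


print(build_function_uri("demo", "trainer", tag="v1"))
```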
- Use structured logging (variables as fields, not f-strings):
from mlrun.utils import logger
# GOOD
logger.debug("Storing function", project=project, name=name, tag=tag)
# BAD: f-string in logs
logger.debug(f"Storing function {project}/{name}:{tag}")

- NEVER log credentials: passwords, tokens, API keys, secret values, auth headers, session cookies.
- Stringify exceptions using mlrun.errors.err_to_str(exc) instead of str(exc).
- Use mlrun.errors.raise_for_status(...) for HTTP response validation.
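To show why variables are passed as fields rather than interpolated into the message, here is a standard-library sketch. The format_event function is a hypothetical stand-in and does not reproduce the behavior of mlrun.utils.logger; it only demonstrates that separate fields stay machine-readable.

```python
import json


def format_event(message: str, **fields) -> str:
    """Render a log event as JSON, keeping variables as separate fields."""
    return json.dumps({"message": message, **fields}, sort_keys=True)


project, name, tag = "demo", "trainer", "v1"

# Structured: each variable is its own key in the emitted record.
structured = format_event("Storing function", project=project, name=name, tag=tag)

# Interpolated: the same information is buried inside the message text.
interpolated = format_event(f"Storing function {project}/{name}:{tag}")

print(json.loads(structured)["project"])
print("project" in json.loads(interpolated))
```

With the structured form, a log pipeline can filter on project or tag directly; with the interpolated form it would have to parse the message string.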
In async def contexts (server endpoints, async utilities):
- NEVER block the event loop with synchronous I/O.
- Run blocking I/O in a threadpool:
import mlrun.utils

async def handler():
    # GOOD: run blocking work in threadpool
    # (sync_db_call stands for any blocking, synchronous function)
    result = await mlrun.utils.run_in_threadpool(sync_db_call, "arg")
    return result

- Follow conventional commit format: [<Scope>] Verb changes made (e.g., [API] Add endpoint to list runs).
- Use imperative verbs (Add, Fix, Update, Refactor).
- Include fix or bug keywords for bugfix PRs (auto-categorized in release notes).
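The commit title rules above can be approximated with a small checker. This is a sketch under assumptions: the regular expressions and the check_title function are hypothetical, and the way release automation actually detects bugfix keywords is not described in this document.

```python
import re

# Assumed reading of "[<Scope>] Verb changes made": a bracketed scope, one space,
# a capitalized first word, then more text.
TITLE_PATTERN = re.compile(r"^\[(?P<scope>[^\[\]]+)\] (?P<verb>[A-Z][a-z]+) \S.*$")

# Assumed keyword match for bugfix PRs: words starting with "fix" or "bug".
BUGFIX_KEYWORDS = re.compile(r"\b(fix|bug)", re.IGNORECASE)


def check_title(title: str) -> dict:
    """Report whether a title matches the assumed format and looks like a bugfix."""
    match = TITLE_PATTERN.match(title)
    return {
        "valid": match is not None,
        "scope": match.group("scope") if match else None,
        "looks_like_bugfix": bool(BUGFIX_KEYWORDS.search(title)),
    }


print(check_title("[API] Add endpoint to list runs"))
print(check_title("[API] Fix pagination bug in list runs"))
print(check_title("add endpoint to list runs"))
```

The pattern cannot tell whether the first word is really an imperative verb; it only checks the shape of the title.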
┌─────────────────────────────────────────────────────────────────┐
│                       User / IDE / CI/CD                        │
└───────────────────────────┬─────────────────────────────────────┘
                            │
                            ▼
           ┌───────────────────────────────────────────┐
           │            MLRun SDK (mlrun/)             │
           │  - Projects, Runtimes, Feature Store      │
           │  - Data Store, Model Monitoring           │
           │  - Launchers (Local/Remote/Server)        │
           └───────────────┬───────────────────────────┘
                           │ HTTP (via mlrun.db.httpdb.HTTPRunDB)
                           ▼
           ┌───────────────────────────────────────────┐
           │  MLRun API Server (server/py/services/    │
           │                    api/)                  │
           │  - FastAPI endpoints (api/endpoints/)     │
           │  - CRUD operations (crud/)                │
           │  - ServerSideLauncher (launcher.py)       │
           └───────────┬───────────────────────────────┘
                       │\
                       │ \ gRPC
                       │  ▼
                       │  ┌──────────────────┐
                       │  │  Log Collector   │
                       │  │   (Go service)   │
                       │  │ (gRPC + K8s API) │
                       │  └──────────────────┘
                       │
           ┌───────────┴────────────┬──────────────────┐
           ▼                        ▼                  ▼
   ┌──────────────┐       ┌──────────────────┐ ┌──────────────┐
   │   Database   │       │  Kubernetes API  │ │    Alerts    │
   │   (MySQL/    │       │   (Job/Pod/CRD   │ │   Service    │
   │ PostgreSQL)  │       │  orchestration)  │ │  (alerts/)   │
   └──────────────┘       └──────────────────┘ └──────────────┘
1. SDK ↔ Server Boundary
- SDK communicates via mlrun.db.httpdb.HTTPRunDB (implements mlrun.db.base.RunDBInterface).
- Project-scoped interface: mlrun.projects.project.MlrunProject (high-level wrapper with project context and enrichment).
2. Launcher Pattern
- BaseLauncher (mlrun/launcher/base.py): abstract interface for running functions.
- ClientLocalLauncher (mlrun/launcher/local.py): runs locally (user machine or remote with local semantics).
- ClientRemoteLauncher (mlrun/launcher/remote.py): submits jobs to API server/Kubernetes.
- ServerSideLauncher (server/py/services/api/launcher.py): server-side run submission with auth context.
- Launcher selection: automatic based on mlrun.config.is_running_as_api(), runtime._is_remote, and local= flag.
3. Shared Layer (mlrun.common)
- Foundation layer with minimal dependencies.
- Must NOT import higher-level mlrun.* modules (enforced by import-linter).
4. Server Framework (server/py/framework/)
- DB sessions, middlewares, auth utilities, base services.
- Should NOT import specific services (some exceptions exist, tracked in pyproject.toml).
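The launcher selection described under the Launcher Pattern can be pictured as a small decision function. This is a sketch only: select_launcher is a hypothetical name, the three booleans stand in for mlrun.config.is_running_as_api(), runtime._is_remote, and the local= flag, and the precedence shown is an assumption rather than a copy of the factory code in mlrun/launcher/.

```python
def select_launcher(is_running_as_api: bool, is_remote: bool, local: bool = False) -> str:
    """Return the name of the launcher this sketch would pick."""
    # Assumption: code running inside the API service always uses the server-side launcher.
    if is_running_as_api:
        return "ServerSideLauncher"
    # Assumption: a remote runtime is submitted to the API unless local=True was requested.
    if is_remote and not local:
        return "ClientRemoteLauncher"
    # Everything else runs with local semantics.
    return "ClientLocalLauncher"


for args in [(True, True, False), (False, True, False), (False, True, True), (False, False, False)]:
    print(args, "->", select_launcher(*args))
```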
Function Execution (Remote):
- User calls function.run() in SDK.
- ClientRemoteLauncher (or ServerSideLauncher) submits run via HTTPRunDB.submit_run(...).
- Server endpoint (/projects/{project}/functions/{name}) receives request.
- Server launcher creates Kubernetes job/pod with runtime spec.
- Log collector gRPC service streams logs from pod to persistent storage.
- Run results/artifacts stored in DB and object storage.
- Unit Tests: pytest, mocks (pytest-mock), fixtures.
- Integration Tests: pytest with Docker containers (MySQL, Postgres via pytest-mock-resources), K8s interactions (via test clusters).
- System Tests: end-to-end tests against live MLRun CE or Iguazio system (marked with @pytest.mark.enterprise for enterprise-only features).
- Coverage: tracked via coverage.py (configured in pyproject.toml).
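A minimal sketch of the unit-test style described above: a function is exercised against a mocked dependency instead of a live server. It uses unittest.mock from the standard library (which pytest-mock wraps) so that it runs without extra packages. The helper count_completed_runs and the shape of the run dictionaries are hypothetical.

```python
from unittest import mock


def count_completed_runs(db, project: str) -> int:
    """Hypothetical helper: count runs whose state is 'completed'."""
    runs = db.list_runs(project=project)
    return sum(1 for run in runs if run["status"]["state"] == "completed")


def test_count_completed_runs():
    # The DB client is replaced by a mock, so no API server is needed.
    db = mock.Mock()
    db.list_runs.return_value = [
        {"status": {"state": "completed"}},
        {"status": {"state": "error"}},
        {"status": {"state": "completed"}},
    ]
    assert count_completed_runs(db, "demo") == 2
    # Verify the helper queried the mock with the expected project.
    db.list_runs.assert_called_once_with(project="demo")


test_count_completed_runs()
print("ok")
```

Under pytest the explicit call at the bottom is unnecessary; the test function would be collected by name.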
Every new FastAPI endpoint MUST:
- Authenticate requests via framework.api.deps.authenticate_request:
from fastapi import APIRouter, Depends
import mlrun.common.schemas
import framework.api.deps

router = APIRouter()

@router.get("/projects/{project}/resource")
async def my_endpoint(
    project: str,
    auth_info: mlrun.common.schemas.AuthInfo = Depends(
        framework.api.deps.authenticate_request
    ),
):
    # endpoint logic
    pass

- Authorize access via framework.utils.auth.verifier.AuthVerifier:
import framework.utils.auth.verifier
import mlrun.common.schemas

async def verify_permissions(
    project: str,
    resource_name: str,
    auth_info: mlrun.common.schemas.AuthInfo,
):
    # Project-level permission
    await framework.utils.auth.verifier.AuthVerifier().query_project_permissions(
        project_name=project,
        action=mlrun.common.schemas.AuthorizationAction.read,
        auth_info=auth_info,
    )
    # Resource-level permission
    await framework.utils.auth.verifier.AuthVerifier().query_project_resource_permissions(
        resource_type=mlrun.common.schemas.AuthorizationResourceTypes.function,
        project_name=project,
        resource_name=resource_name,
        action=mlrun.common.schemas.AuthorizationAction.update,
        auth_info=auth_info,
    )

- NEVER log credentials: passwords, tokens, API keys, secret values, auth headers, session cookies.
- Use mlrun.secrets module for secret management (integrates with Kubernetes secrets, Vault).
- Sanitize request/response payloads before logging.
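One way to sanitize a payload before logging is to mask values under sensitive-looking keys. The sketch below is an assumption-laden illustration: sanitize, SENSITIVE_FRAGMENTS, and the mask string are hypothetical and are not MLRun helpers, and the fragment list would need to be adapted to real payloads.

```python
from typing import Any

# Assumed list of sensitive key fragments, derived from the credential types named above.
SENSITIVE_FRAGMENTS = ("password", "token", "secret", "api_key", "authorization", "cookie")
REDACTED = "***"


def sanitize(payload: Any) -> Any:
    """Return a copy of payload with values under sensitive keys masked."""
    if isinstance(payload, dict):
        return {
            key: REDACTED
            if any(fragment in str(key).lower() for fragment in SENSITIVE_FRAGMENTS)
            else sanitize(value)
            for key, value in payload.items()
        }
    if isinstance(payload, (list, tuple)):
        # Sequences are walked so nested dictionaries are masked too.
        return [sanitize(item) for item in payload]
    return payload


request = {
    "user": "dana",
    "headers": {"Authorization": "Bearer abc123"},
    "items": [{"access_token": "xyz", "count": 1}],
}
print(sanitize(request))
```

Key-based masking does not catch secrets embedded in free-text values, so it complements rather than replaces the rule of never passing credentials to the logger.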
- mlrun/utils/version/version.json – Auto-generated by release automation; manual edits will be overwritten.
- Database migration files (server/py/services/api/migrations/versions/*.py) – Never edit existing migrations; create new ones via make create-migration.
- **/locked-requirements.txt – Auto-generated by uv; use make upgrade-mlrun-deps-lock to update.
- .github/workflows/*.yaml – CI/CD pipelines; changes require maintainer review.
- API schema changes (FastAPI endpoint modifications, new fields in schemas) – Require backward compatibility review.
- Database schema migrations – Require DB team review.
- Import boundary changes (pyproject.toml import-linter rules) – Require architecture review.
- Deprecations/removals – Follow DEPRECATION.md process; update Jira ticket.
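The protected-file patterns listed above could be checked against a set of changed paths with a helper like the following. This is a hypothetical sketch, not an existing repository tool: flag_protected and the PROTECTED table are made up for illustration, the sample paths are invented, and the helper cannot distinguish a newly created migration from an edited one.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# Patterns copied from the list above, each paired with a short reason.
PROTECTED = [
    ("mlrun/utils/version/version.json", "auto-generated by release automation"),
    ("server/py/services/api/migrations/versions/*.py", "existing migrations must not be edited"),
    ("**/locked-requirements.txt", "auto-generated by uv"),
    (".github/workflows/*.yaml", "changes require maintainer review"),
]


def matches(path: str, pattern: str) -> bool:
    if pattern.startswith("**/"):
        # Treat "**/name" as "name at any depth", including the repo root.
        return fnmatchcase(basename(path), pattern[3:])
    return fnmatchcase(path, pattern)


def flag_protected(paths):
    """Map each protected path in `paths` to the reason it needs care."""
    flagged = {}
    for path in paths:
        for pattern, reason in PROTECTED:
            if matches(path, pattern):
                flagged[path] = reason
                break
    return flagged


changed = [
    "mlrun/utils/helpers.py",
    "dockerfiles/example/locked-requirements.txt",  # invented path for the demo
    ".github/workflows/ci.yaml",
]
for path, reason in flag_protected(changed).items():
    print(f"{path}: {reason}")
```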
- DRY: Extract helpers instead of copy/pasting logic across modules.
- KISS: Prefer straightforward solutions; minimize abstractions.
- YAGNI: Don't add "future-proof" frameworks without concrete need.
Key configuration variables (see mlrun/config.py and hack/*.env for examples):
- MLRUN_PYTHON_PACKAGE_INSTALLER – Either pip or uv; prefer uv.
- MLRUN_DBPATH – MLRun API server URL (e.g., http://localhost:8080).
- MLRUN_VERSION – Version override for builds.
- MLRUN_DOCKER_REGISTRY – Docker registry prefix (default: DockerHub).
- MLRUN_NO_CACHE – Disable build/pip caching (set to any value).
- MLRUN_SKIP_COMPILE_SCHEMAS – Skip protobuf schema compilation during build.
- MLRUN_SYSTEM_TESTS_COMPONENT – Filter system tests by component (or prefix with no_ to exclude).
- COVERAGE_FILE – Coverage data file path (for RUN_COVERAGE=true).
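As an illustration of the no_ prefix convention for MLRUN_SYSTEM_TESTS_COMPONENT, the sketch below interprets a filter value as either an include or an exclude rule. The functions parse_component_filter and should_run are hypothetical, the component names are invented, and the real automation may apply the filter differently.

```python
import os


def parse_component_filter(value: str) -> tuple[str, bool]:
    """Split a filter value into (component, exclude)."""
    if value.startswith("no_"):
        return value[len("no_"):], True
    return value, False


def should_run(test_component: str, filter_value: str) -> bool:
    """Decide whether a test tagged with test_component passes the filter."""
    if not filter_value:
        # No filter set: run everything.
        return True
    component, exclude = parse_component_filter(filter_value)
    if exclude:
        return test_component != component
    return test_component == component


# Read the filter from the environment; empty when the variable is not set.
active_filter = os.environ.get("MLRUN_SYSTEM_TESTS_COMPONENT", "")
print(should_run("model_monitoring", active_filter))
```

For example, under these assumptions a value of no_model_monitoring would skip tests tagged model_monitoring and run all others.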
- Custom runtimes: extend mlrun.runtimes.base.BaseRuntime and register in mlrun/runtimes/__init__.py.
- Data sources: implement mlrun.datastore.base.DataStore interface for custom storage backends.
- Serving steps: extend mlrun.serving.server.GraphStep for custom serving graph nodes.
- Model monitoring apps: implement custom monitoring apps in mlrun/model_monitoring/applications/.
- Import-time feature flags: check mlrun.mlconf.<feature> (e.g., mlrun.mlconf.igz_version for Iguazio integration).
- Runtime feature flags: controlled via mlrun.common.schemas.FeatureFlags (server-side).
- Architecture: docs/architecture.md – high-level system design.
- Contributing: CONTRIBUTING.md – dev environment setup, coding conventions, PR guidelines.
- Deprecation process: DEPRECATION.md – how to deprecate APIs/parameters/endpoints.
- Cheat sheet: docs/cheat-sheet.md – quick reference for common SDK operations.
- SDK ↔ server boundary: mlrun/db/httpdb.py, mlrun/db/base.py.
- Project layer: mlrun/projects/project.py, mlrun/projects/pipelines.py.
- Launcher selection: mlrun/launcher/factory.py, mlrun/launcher/, server/py/services/api/launcher.py.
- Server endpoints: server/py/services/api/api/endpoints/.
- Server DB/session infra: server/py/framework/db/, server/py/framework/rundb/.
- Go log collector: server/go/services/logcollector/.
- MLRun Docs: docs.mlrun.org (stable version on ReadTheDocs).
- Tutorials: Tutorials (Jupyter notebooks).
- API Reference: API Reference (auto-generated from docstrings).
- Pipeline adapters: pipeline-adapters/README.md (KFP integration details).
- Automation scripts: automation/ (release notes, version management, deployment).
- Local dev setup: hack/local/README.md (running MLRun locally on Kubernetes).