AIR CLI Integration: Adding support for air run configuration by riddhibhagwat-db · Pull Request #5657 · databricks/cli

riddhibhagwat-db · 2026-06-18T18:42:42Z

Changes

Ports the air run YAML config schema and its structural validation from the Python CLI (cli/sdk/config.py) to Go, under experimental/air/cmd/.

Schema (runconfig.go): the top-level runConfig plus the nested environment (with docker_image), code_source/snapshot/git, and permission blocks. Reuses the compute model from the parent branch. Includes custom YAML unmarshalers for the three polymorphic fields that don't map to a single Go type: environment.dependencies (string path or inline list), environment.version (string or int), and git.remote (bool or remote-name string).
Loader (runconfig_load.go): loadRunConfig decodes a YAML file with KnownFields(true) — mirroring pydantic's extra="forbid" so unknown keys are rejected — then runs the validation pass.
Validation: every structural rule from the Python schema — required fields, the experiment_name/mlflow_run_name task-key regex and length caps, secret-ref scope/key format, the environment docker-image/dependencies/version exclusivity rules, git branch-xor-commit and remote-requires-branch rules, code_source snapshot requirements, and include_paths relative/no-traversal checks.
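
As an illustration of the last rule in the list above (include_paths must be relative and must not traverse out of the root), here is a minimal sketch. The helper name `validateIncludePath`, the error wording, and the choice to treat config paths as slash-separated are all assumptions made for this example; the checks in runconfig.go may differ.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// validateIncludePath is a hypothetical helper, not the PR's code. It rejects
// empty entries, absolute paths, and paths that climb above the snapshot root.
func validateIncludePath(p string) error {
	if strings.TrimSpace(p) == "" {
		return fmt.Errorf("include_paths entry cannot be empty")
	}
	// Assumption: config paths are slash-separated regardless of host OS.
	if strings.HasPrefix(p, "/") {
		return fmt.Errorf("include_paths entry %q must be relative", p)
	}
	// path.Clean collapses "a/../b" to "b", so only a leading ".." survives
	// when the path really escapes the root.
	cleaned := path.Clean(p)
	if cleaned == ".." || strings.HasPrefix(cleaned, "../") {
		return fmt.Errorf("include_paths entry %q must not escape the root", p)
	}
	return nil
}

func main() {
	for _, p := range []string{"src/train", "a/../b", "/etc/passwd", "src/../../x"} {
		fmt.Printf("%-14s -> %v\n", p, validateIncludePath(p))
	}
}
```

Cleaning before the prefix check means an entry such as `a/../b` is accepted, while `src/../../x` is rejected even though it does not start with `..`.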

Two deliberate divergences from the Python schema, both following from the training-service-only port:

The compute.node_pool_id / compute.pool_name fields were already dropped on the parent branch.
The top-level priority field is dropped here: it's a node-pool queue-ordering knob (it requires a pool in Python) with no meaning for serverless workloads.

Why

"Structural" validation (types, required fields, format/cross-field rules) needs no workspace access, so it's a self-contained, fully unit-testable unit that's worth landing on its own ahead of the launch logic. Splitting it out keeps the upcoming handle_run PR focused on orchestration rather than mixing in ~900 lines of schema.

The extra="forbid" / KnownFields behavior is load-bearing: it's what turns a typo'd or stale config key into an actionable error instead of a silently-ignored field, so it's preserved faithfully. This is stacked on air-integration-m2-1 (the compute model).
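
To show why strict decoding matters, here is a small sketch of the same idea using only the standard library. It uses encoding/json's `DisallowUnknownFields` as a stand-in for the YAML decoder's `KnownFields(true)` named in the description; the `miniConfig` type and `decodeStrict` helper are hypothetical and far smaller than the real runConfig.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// miniConfig is a hypothetical two-field stand-in for the PR's runConfig.
type miniConfig struct {
	ExperimentName string `json:"experiment_name"`
	Command        string `json:"command"`
}

// decodeStrict rejects any key that miniConfig does not declare, which is the
// JSON analogue of decoding YAML with KnownFields(true).
func decodeStrict(doc string) (*miniConfig, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.DisallowUnknownFields()
	var cfg miniConfig
	if err := dec.Decode(&cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	// "comand" is a typo. Without strict decoding it would be silently
	// ignored and Command would stay empty.
	_, err := decodeStrict(`{"experiment_name": "demo", "comand": "python train.py"}`)
	fmt.Println("strict decode error:", err)
}
```

The same mechanism is what makes the dropped priority key surface as an error instead of being ignored.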

Tests

New unit tests in runconfig_test.go (62 subtests, table-driven), covering:

Loading a minimal config and a full-featured config (all blocks populated).
Each polymorphic union decoding both of its forms (dependencies string vs list, git.remote bool vs string, default-unset).
Unknown-field rejection at top level and nested — including explicit cases asserting the dropped priority field and the not-yet-ported bases key surface as errors.
Every validation rule's failure mode, plus file-level errors (missing file, empty file).

go test ./experimental/air/... passes; ./task lint-q reports 0 issues.

eng-dev-ecosystem-bot · 2026-06-18T19:13:09Z

Integration test report

Commit: f764fa9

Run: 28468312832

	Env	🟨KNOWN	💚RECOVERED	🙈SKIP	✅pass	🙈skip	Time
🟨	aws linux	7	1	13	263	1015	7:49
🟨	aws windows	7	1	13	265	1013	12:17
💚	aws-ucws linux		8	13	359	929	8:12
💚	aws-ucws windows		8	13	361	927	10:50
💚	azure linux		2	15	266	1013	7:09
💚	azure windows		2	15	268	1011	10:50
💚	azure-ucws linux		2	15	364	925	7:59
💚	azure-ucws windows		2	15	366	923	11:54
💚	gcp linux		2	15	262	1016	7:41
💚	gcp windows		2	15	264	1014	10:41

21 interesting tests: 13 SKIP, 7 KNOWN, 1 RECOVERED

	Test Name	aws linux	aws windows	aws-ucws linux	aws-ucws windows	azure linux	azure windows	azure-ucws linux	azure-ucws windows	gcp linux	gcp windows
🟨	TestAccept	🟨K	🟨K	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R
🙈	TestAccept/bundle/invariant/no_drift	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/permissions	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions	🟨K	🟨K	💚R	💚R	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct	🟨K	🟨K	💚R	💚R
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform	🟨K	🟨K	💚R	💚R
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions	🟨K	🟨K	💚R	💚R	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct	🟨K	🟨K	💚R	💚R
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform	🟨K	🟨K	💚R	💚R
🙈	TestAccept/bundle/resources/postgres_branches/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/recreate	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/replace_existing	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/update_protected	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/without_branch_id	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_endpoints/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_projects/update_display_name	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/synced_database_tables/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_indexes/recreate/embedding_dimension	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/ssh/connection	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
💚	TestFetchRepositoryInfoAPI_FromRepo	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R

Top 28 slowest tests (at least 2 minutes):

duration	env	testname
5:09	gcp windows	TestAccept
4:56	azure-ucws windows	TestAccept
4:55	azure windows	TestAccept
4:55	aws-ucws windows	TestAccept
4:35	gcp windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:17	gcp windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:16	gcp linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:11	gcp linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:08	azure-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:04	azure windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:38	aws-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:26	aws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:18	aws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:15	aws-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:57	azure linux	TestAccept
2:56	aws-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:55	azure-ucws linux	TestAccept
2:52	azure linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:48	gcp linux	TestAccept
2:48	aws-ucws linux	TestAccept
2:47	azure windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:38	azure-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:37	aws-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:35	azure-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:35	azure linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:34	aws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:32	azure-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:22	aws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct

riddhibhagwat-db · 2026-06-25T00:21:25Z

+
+// validate runs structural validation over the whole config, returning the first
+// failure. Fields are checked in declaration order to keep error output stable.
+func (c *runConfig) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:21:44Z

+	DockerImage  *dockerImageConfig `yaml:"docker_image"`
+}
+
+func (e *environmentConfig) validate() error {


@ben-hansen-db @vinchenzo-db

This will change slightly as Maggie figures out the name for the image URL, but this is good for now.

Just FYI: in the future DockerImage will not be mutually exclusive with Dependencies, and we might give it a new name.

riddhibhagwat-db · 2026-06-25T00:21:55Z

+	URL string `yaml:"url"`
+}
+
+func (d *dockerImageConfig) validate() error {


@ben-hansen-db @vinchenzo-db

this is fine for now

riddhibhagwat-db · 2026-06-25T00:22:05Z

+	Snapshot *snapshotSourceConfig `yaml:"snapshot"`
+}
+
+func (c *codeSourceConfig) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:22:16Z

+	IncludePaths []string `yaml:"include_paths"`
+}
+
+func (s *snapshotSourceConfig) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:22:25Z

+	Remote gitRemote `yaml:"remote"`
+}
+
+func (g *gitRef) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:22:35Z

+	Level string `yaml:"level"`
+}
+
+func (p *permission) validate() error {


@ben-hansen-db @vinchenzo-db

I wonder if there's a struct here for permissions already? This mirrors the permissions field for DABs.

riddhibhagwat-db · 2026-06-25T00:23:57Z

+	assert.Equal(t, 1, cfg.Compute.NumAccelerators)
+}
+
+func TestLoadRunConfig_FullFeatured(t *testing.T) {


@vinchenzo-db @ben-hansen-db

vinchenzo-db

maybe a question for @maggiewang-db or @ben-hansen-db, is there a way to make sure this run config doesn't diverge from the run config in the python cli?

vinchenzo-db · 2026-06-25T01:39:35Z

+type gitRef struct {
+	Branch *string   `yaml:"branch"`
+	Commit *string   `yaml:"commit"`
+	Remote gitRemote `yaml:"remote"`


I'm not very good at Go but is the gitRemote flag used anywhere outside of tests?

gitRemote is used in production code, but only inside runconfig.go; no other file references it yet. The feature it validates (code_source) is itself deferred, so gitRemote (and the rest of gitRef/snapshotSourceConfig) currently exists only for config parsing and validation, not for the run mechanics.
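
For context, the two git rules named in the description (branch xor commit, and remote requires branch) could look roughly like the sketch below. `gitRefSketch` is a simplified, hypothetical type: its `Remote` is a plain string, whereas the PR's gitRemote also accepts a bool, and the error messages are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// gitRefSketch is a simplified stand-in for the PR's gitRef.
// Assumption: an empty Remote means "no remote requested".
type gitRefSketch struct {
	Branch *string
	Commit *string
	Remote string
}

// validate enforces exactly one of branch/commit, and allows a remote only
// together with a branch.
func (g gitRefSketch) validate() error {
	hasBranch := g.Branch != nil
	hasCommit := g.Commit != nil
	if hasBranch == hasCommit {
		return errors.New("git: exactly one of branch or commit must be set")
	}
	if g.Remote != "" && !hasBranch {
		return errors.New("git: remote requires branch")
	}
	return nil
}

func main() {
	branch, commit := "main", "abc123"
	fmt.Println(gitRefSketch{Branch: &branch, Remote: "origin"}.validate())
	fmt.Println(gitRefSketch{Commit: &commit, Remote: "origin"}.validate())
	fmt.Println(gitRefSketch{Branch: &branch, Commit: &commit}.validate())
}
```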

vinchenzo-db · 2026-06-25T01:41:52Z

+	EnvVariables   map[string]string  `yaml:"env_variables"`
+	Secrets        map[string]string  `yaml:"secrets"`


Make sure you check that these aren't pointing to the same env variable. If they are, I imagine EnvVariables should take precedence, but ask @maggiewang-db.

The reasoning: if a user accidentally sets both an env var and a secret under the same name and doesn't know which one wins, then if the secret takes precedence and they try to read it, they may accidentally leak their secret.

This makes sense, I can implement this change if @maggiewang-db can sign off on the precedence order

If there's a collision we should just reject the request at validation with a clear error message.

I'm fine with that
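
A rough sketch of the outcome agreed in this thread: reject the config at validation time when a name appears in both maps. The function name `checkEnvSecretCollisions` and the error wording are assumptions, not the code that landed.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// checkEnvSecretCollisions is a hypothetical helper. It returns an error that
// lists every name defined in both env_variables and secrets.
func checkEnvSecretCollisions(envVars, secrets map[string]string) error {
	var dup []string
	for name := range envVars {
		if _, ok := secrets[name]; ok {
			dup = append(dup, name)
		}
	}
	if len(dup) == 0 {
		return nil
	}
	// Map iteration order is random, so sort to keep the message stable.
	sort.Strings(dup)
	return fmt.Errorf("names defined in both env_variables and secrets: %s",
		strings.Join(dup, ", "))
}

func main() {
	env := map[string]string{"WANDB_PROJECT": "demo", "HF_TOKEN": "plain-text"}
	sec := map[string]string{"HF_TOKEN": "my_scope/hf_token"}
	fmt.Println(checkEnvSecretCollisions(env, sec))
}
```

Rejecting outright avoids having to pick a precedence order at all, which sidesteps the leak scenario described above.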

vinchenzo-db · 2026-06-25T01:42:38Z

+	MaxRetries                *int           `yaml:"max_retries"`
+	TimeoutMinutes            *int           `yaml:"timeout_minutes"`
+	IdempotencyToken          *string        `yaml:"idempotency_token"`
+	Parameters                map[string]any `yaml:"parameters"`


Are you sure map[string]any is best here? Maybe some validation to make sure people aren't injecting bad inputs.

This is a free-form map without content validation right now. It mirrors the Python CLI, which declares parameters: Optional[Dict[str, Any]] = Field(default=None) and has no validator (sdk/config.py:443).

I don't think we need to add explicit validation here, since there is no real privilege boundary: the parameters are the user's own hyperparameters, and they are not interpolated into another user's context or into a shell command. The YAML file they land in is read by the user's own training script. Also, yaml.Marshal escapes/quotes keys and values, so a value like "; rm -rf /" would just become a quoted YAML string, not anything executable.

vinchenzo-db · 2026-06-25T01:43:52Z

+// validateExperimentName enforces the Databricks Jobs API task_key constraints:
+// the experiment_name becomes a task key, which caps at 100 characters and allows
+// only alphanumerics, hyphens, and underscores.


do we need to do something similar with mlflow_run_name?

I have this right now for mlflow_run_name (see line 112)

	if c.MLflowRunName != nil {
		v := strings.TrimSpace(*c.MLflowRunName)
		if v == "" {
			return errors.New("mlflow_run_name cannot be empty")
		}
		if !taskKeyRe.MatchString(v) {
			return fmt.Errorf("invalid mlflow_run_name %q: only alphanumeric characters, hyphens, and underscores are allowed", v)
		}
	}

If we want to also validate for 100 chars or less, I can add that in.

That's a @ben-hansen-db question :P
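
If the 100-character cap were also applied to mlflow_run_name, one option would be a shared helper along these lines. This is only a sketch: `validateTaskKey` and `maxTaskKeyLen` are hypothetical names, and the pattern assigned to `taskKeyRe` is an assumption inferred from the comment about alphanumerics, hyphens, and underscores (the real pattern is not shown in this thread).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Assumed pattern; the PR's actual taskKeyRe is not visible in the thread.
var taskKeyRe = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

// maxTaskKeyLen reflects the 100-character task_key cap quoted in the diff.
const maxTaskKeyLen = 100

// validateTaskKey applies the same three checks to any field that ends up as
// a task key: non-empty, allowed characters, and length.
func validateTaskKey(field, value string) error {
	v := strings.TrimSpace(value)
	if v == "" {
		return fmt.Errorf("%s cannot be empty", field)
	}
	if !taskKeyRe.MatchString(v) {
		return fmt.Errorf("invalid %s %q: only alphanumeric characters, hyphens, and underscores are allowed", field, v)
	}
	// The regex guarantees ASCII, so byte length equals character count here.
	if len(v) > maxTaskKeyLen {
		return fmt.Errorf("%s must be at most %d characters, got %d", field, maxTaskKeyLen, len(v))
	}
	return nil
}

func main() {
	fmt.Println(validateTaskKey("mlflow_run_name", "full-run-v2"))
	fmt.Println(validateTaskKey("mlflow_run_name", "bad.name"))
	fmt.Println(validateTaskKey("experiment_name", strings.Repeat("a", 101)))
}
```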

ben-hansen-db

Nice PR! Very clean

ben-hansen-db · 2026-06-30T04:20:05Z

+	MLflowRunName             *string        `yaml:"mlflow_run_name"`
+	MLflowExperimentDirectory *string        `yaml:"mlflow_experiment_directory"`
+	Permissions               []permission   `yaml:"permissions"`
+	UsagePolicyName           *string        `yaml:"usage_policy_name"`


Small update: we take either usage_policy_name or a usage policy ID; they are mutually exclusive. Under the hood we resolve to the ID.
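
A possible shape for that mutual-exclusion rule, as a sketch only. The `usage_policy_id` key, the `usagePolicySketch` type, and the error text are hypothetical; only `usage_policy_name` appears in the diff above.

```go
package main

import (
	"errors"
	"fmt"
)

// usagePolicySketch holds the two alternative ways to name a usage policy.
// Assumption: nil means the key was absent from the config.
type usagePolicySketch struct {
	Name *string // usage_policy_name
	ID   *string // usage_policy_id (hypothetical key)
}

// validate rejects configs that set both fields; setting neither is allowed.
func (u usagePolicySketch) validate() error {
	if u.Name != nil && u.ID != nil {
		return errors.New("usage_policy_name and usage_policy_id are mutually exclusive")
	}
	return nil
}

func main() {
	name, id := "my-policy", "1234"
	fmt.Println(usagePolicySketch{Name: &name}.validate())
	fmt.Println(usagePolicySketch{Name: &name, ID: &id}.validate())
}
```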

ben-hansen-db · 2026-06-30T04:20:53Z

+		return err
+	}
+
+	if err := validateSecretRefs(c.Secrets); err != nil {


I see, you are validating everything here already
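
For readers who have not opened the diff, a secret-ref format check might look like the sketch below. It assumes the `<scope>/<key>` shape seen in the sample config (`HF_TOKEN: my_scope/hf_token`) and checks structure only; the helper names and messages are invented here, and the real validateSecretRefs may enforce more.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// validateSecretRef checks only the "<scope>/<key>" shape (an assumption).
func validateSecretRef(envName, ref string) error {
	scope, key, found := strings.Cut(ref, "/")
	if !found || scope == "" || key == "" || strings.Contains(key, "/") {
		return fmt.Errorf("secrets.%s: expected <scope>/<key>, got %q", envName, ref)
	}
	return nil
}

// validateSecretRefsSketch visits names in sorted order so that the first
// reported error is deterministic across runs.
func validateSecretRefsSketch(secrets map[string]string) error {
	names := make([]string, 0, len(secrets))
	for name := range secrets {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if err := validateSecretRef(name, secrets[name]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fmt.Println(validateSecretRefsSketch(map[string]string{"HF_TOKEN": "my_scope/hf_token"}))
	fmt.Println(validateSecretRefsSketch(map[string]string{"HF_TOKEN": "hf_token"}))
}
```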

ben-hansen-db · 2026-06-30T04:23:49Z

+	DockerImage  *dockerImageConfig `yaml:"docker_image"`
+}
+
+func (e *environmentConfig) validate() error {


This will change slightly as Maggie figures out the name for the image URL, but this is good for now.

ben-hansen-db · 2026-06-30T04:24:39Z

+	URL string `yaml:"url"`
+}
+
+func (d *dockerImageConfig) validate() error {


this is fine for now

ben-hansen-db · 2026-06-30T04:26:39Z

+	Level string `yaml:"level"`
+}
+
+func (p *permission) validate() error {


I wonder if there's a struct here for permissions already? This mirrors the permissions field for DABs.

ben-hansen-db · 2026-06-30T04:27:18Z

+    git:
+      branch: main
+      remote: origin
+    include_paths:


nice example!

Port the run YAML schema and its structural validation from the Python CLI's sdk/config.py: the top-level runConfig plus the environment, docker_image, code_source/snapshot/git, and permission blocks. loadRunConfig decodes a YAML file with KnownFields (mirroring pydantic extra="forbid") and runs the validation pass.

"Structural" covers types, required fields, and format/cross-field rules that need no workspace access. Online checks (compute pool resolution, GPU availability), git/filesystem checks, _bases_ composition, and CLI --override handling are deferred to later milestones.

Two deliberate divergences from the Python schema, both following from the training-service-only port: the compute pool fields were already dropped, and the top-level priority field is dropped here since it is a node-pool queue-ordering knob with no meaning for serverless workloads.

Co-authored-by: Isaac

## Changes

Implements the `air run` happy path on top of the config schema (#5657), submitting a one-time training run through the Jobs API. Five commits, one per phase:

1. run config launch accessors: flatten the validated config into launch values (timeout seconds, retry default, requirements file-vs-inline, runtime version).
2. wire run command (load, validate, dry-run): `air run -f <config>` loads and structurally validates the YAML; `--dry-run` validates offline (no workspace/auth) and returns; `--override/--watch` are rejected for now with clear errors (ported in a future PR).
3. pre-submit resolution: resolve current user / workspace home / a unique cli_launch dir, and ensure a custom `experiment_directory` exists.
4. upload launch artifacts: write training_config.yaml (1 MB cap), command.sh, requirements.yaml (file or synthesized from inline deps), `env_vars.json` / `secret_env_vars.json`, and hyperparameters.yaml into the launch dir via a workspace filer.
5. assemble + submit: build the native `ai_runtime_task` payload and `POST /api/2.2/jobs/runs/submit` directly, then print the run id + dashboard URL (or a JSON envelope).

Submission uses the **native `ai_runtime_task`** task (BYOT task type). It talks only to the Jobs API (which internally routes to the training service endpoint) and has no genai-mapi forwarding (the MAPI path is deprecated). The task isn't modeled by the typed Go SDK, so the payload is a custom struct posted to the raw endpoint. The proto is lean: env vars and secrets ship as co-located `env_vars.json` / `secret_env_vars.json` files rather than inline, and `requirements.yaml` / `hyperparameters.yaml` are derived server-side from the command directory.

**Deferred, with explicit "not yet supported" errors (no silent drops):** `code_source` snapshot packaging, `--watch` log streaming, and `usage_policy_name`. `environment.docker_image` is accepted by the schema as scaffolding but not conveyed in the payload (the native path has no docker field). `node_pool_id` / `pool_name` / `priority` remain dropped (the new AIR CLI does not support pool placement).

## Why

`air run` is the core of the migration for AIR CLI. Splitting it into per-phase commits keeps each reviewable in isolation, and stacking on the schema PR keeps that PR focused. Regarding some specific decisions:

- We maintain the native ai_runtime_task (and not the gen_ai_compute_task interfacing with MAPI) as a hand-built struct posted to the raw endpoint. This lets us interface with Jobs directly and stay off the deprecated genai-mapi forwarding path. The typed jobs.SubmitTask only knows gen_ai_compute_task, and that typed struct also omits the env-vars/secrets/requirements fields that are needed for the run.
- `--dry-run` is decoupled from auth. It validates the config locally and returns before any workspace call, so config validation works fully offline (matching the Python CLI). Only actual submission requires an authenticated workspace client.

## Tests

- Unit tests for every phase: launch accessors, pre-submit resolution (incl. ensureExperimentDirectory create/exists/not-a-directory), artifact assembly + upload, payload assembly, and submitWorkload end-to-end against a fake workspace.
- New acceptance/experimental/air/run test covering --dry-run (text + JSON), the --override/--watch guards, an invalid config, and missing --file.
- Updated the unimplemented acceptance test (removed run, now implemented).

`go test ./experimental/air/...`, `go test ./acceptance -run TestAccept/experimental/air`, and `./task lint-q` all pass.

**Manual verification tests (all pass):**

- Dry run (offline, no auth)
  - command only
  - full run config
  - JSON output
- Actual run submission
  - throws an error when the profile is not set
  - submission loop: submitted, can see the run in `air list` and `air get`, and the MLflow environment was created
  - the same run id is output when the run is submitted with the SAME idempotency key
  - a new run is created when the run is submitted with the SAME config but a DIFFERENT idempotency key
- `--watch` and `--override` return an informative error message (they are not supported yet, but are valid flags)
- usage_policy_name set in config throws error: usage_policy_name is not yet supported
- code_source set in config throws error: code_source is not yet supported
- missing --file throws informative error: required flag(s) "file" not set
- invalid config (e.g. experiment_name: bad.name, or num_accelerators not a multiple of the per-node count) throws a field-specific validation error

**How to test locally for manual verification:**

Checkout & build:

```bash
git fetch origin
git checkout air-integration-m2-3   # this PR (stacked on air-integration-m2-2)
./task build
```

Sample configs:

```bash
cat > /tmp/min.yaml <<'YAML'
experiment_name: air-cuj
command: python train.py
compute: {accelerator_type: GPU_1xH100, num_accelerators: 1}
YAML
```

```bash
cat > /tmp/full.yaml <<'YAML'
experiment_name: full-run
command: |
  pip install -r requirements.txt
  python train.py
compute: {accelerator_type: GPU_8xH100, num_accelerators: 16}
environment: {dependencies: [torch==2.3.0], version: 5}
env_variables: {WANDB_PROJECT: demo}
secrets: {HF_TOKEN: my_scope/hf_token}
parameters: {lr: 0.001, epochs: 3}
mlflow_run_name: full-run-v2
max_retries: 2
timeout_minutes: 120
YAML
```

Automated tests:

```bash
go test ./experimental/air/...                          # unit (incl. submitWorkload vs a fake workspace)
go test ./acceptance -run TestAccept/experimental/air   # acceptance (run + unimplemented)
./task lint-q                                           # lint changed files
```

Dry run:

```bash
./cli experimental air run -f /tmp/min.yaml --dry-run   # note that this command will, in the final version, be databricks experimental air run
./cli experimental air run -f /tmp/full.yaml --dry-run
./cli experimental air run -f /tmp/min.yaml --dry-run -o json
```

Actual run submission:

```bash
PROFILE=<your-dev-profile>

# no auth configured → fails fast (exit 1)
env -u DATABRICKS_HOST -u DATABRICKS_TOKEN ./cli experimental air run -f /tmp/min.yaml
#> Error: ... (cannot configure default credentials / auth)

# submit → prints run_id + dashboard URL
./cli experimental air run -f /tmp/min.yaml -p $PROFILE -o json
#> { "data": { "status":"SUBMITTED", "run_id":"<id>", "dashboard_url":"<host>/jobs/runs/<id>" } }

# verify in the workspace: open dashboard_url (run exists), and the MLflow experiment was created.
./cli experimental air get <run_id> -p $PROFILE   # run state
./cli experimental air list -p $PROFILE           # run appears in the list

# idempotency: SAME key returns the SAME run_id (no new run)
./cli experimental air run -f /tmp/min.yaml -p $PROFILE --idempotency-key demo-key-1 -o json   # run_id = X
./cli experimental air run -f /tmp/min.yaml -p $PROFILE --idempotency-key demo-key-1 -o json   # run_id = X (same)

# idempotency: DIFFERENT key creates a NEW run
./cli experimental air run -f /tmp/min.yaml -p $PROFILE --idempotency-key demo-key-2 -o json   # run_id = Y (new)
```

Unsupported flags (asserting that an error is thrown):

```bash
./cli experimental air run -f /tmp/min.yaml --dry-run --watch
#> Error: --watch is not yet supported

./cli experimental air run -f /tmp/min.yaml --dry-run --override compute.num_accelerators=8
#> Error: --override is not yet supported

# usage_policy_name (needs a workspace to reach the submit guard)
printf 'experiment_name: t\ncommand: x\ncompute: {accelerator_type: GPU_1xH100, num_accelerators: 1}\nusage_policy_name: my-policy\n' > /tmp/policy.yaml
./cli experimental air run -f /tmp/policy.yaml -p $PROFILE
#> Error: usage_policy_name is not yet supported

# code_source
printf 'experiment_name: t\ncommand: x\ncompute: {accelerator_type: GPU_1xH100, num_accelerators: 1}\ncode_source: {type: snapshot, snapshot: {root_path: .}}\n' > /tmp/code.yaml
air run -f /tmp/code.yaml -p $PROFILE
#> Error: code_source is not yet supported
```

Validation errors with field-specific messages (exit 1, offline):

```bash
# missing --file
air run --dry-run
#> Error: required flag(s) "file" not set

# invalid experiment_name + num_accelerators not a multiple of the per-node count
printf 'experiment_name: bad.name\ncommand: x\ncompute: {accelerator_type: GPU_8xH100, num_accelerators: 3}\n' > /tmp/bad.yaml
air run -f /tmp/bad.yaml --dry-run
#> Error: invalid experiment_name "bad.name": only alphanumeric characters, hyphens (-), and underscores (_) are allowed
# (and, once the name is fixed: compute.num_accelerators for GPU_8xH100 must be a multiple of 8, got 3)
```
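
The per-node multiple rule exercised by the last command can be pictured with the sketch below. The `acceleratorsPerNode` table and the `validateNumAccelerators` name are assumptions made for illustration (only the two accelerator types used in the sample configs are listed); the error text follows the message quoted above.

```go
package main

import "fmt"

// acceleratorsPerNode is a hypothetical lookup table; the real compute model
// may derive the per-node count differently.
var acceleratorsPerNode = map[string]int{
	"GPU_1xH100": 1,
	"GPU_8xH100": 8,
}

// validateNumAccelerators requires a positive count that is a whole multiple
// of the per-node accelerator count for the given type.
func validateNumAccelerators(acceleratorType string, n int) error {
	perNode, ok := acceleratorsPerNode[acceleratorType]
	if !ok {
		return fmt.Errorf("compute.accelerator_type %q is not recognized", acceleratorType)
	}
	if n <= 0 || n%perNode != 0 {
		return fmt.Errorf("compute.num_accelerators for %s must be a multiple of %d, got %d",
			acceleratorType, perNode, n)
	}
	return nil
}

func main() {
	fmt.Println(validateNumAccelerators("GPU_8xH100", 16)) // <nil>
	fmt.Println(validateNumAccelerators("GPU_8xH100", 3))
	// compute.num_accelerators for GPU_8xH100 must be a multiple of 8, got 3
}
```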

riddhibhagwat-db temporarily deployed to test-trigger-is June 18, 2026 18:48 — with GitHub Actions Inactive

riddhibhagwat-db requested review from ben-hansen-db and maggiewang-db June 18, 2026 21:10

riddhibhagwat-db self-assigned this Jun 18, 2026

riddhibhagwat-db force-pushed the air-integration-m2-2 branch from 73088e7 to 373988d Compare June 23, 2026 17:10

riddhibhagwat-db mentioned this pull request Jun 24, 2026

AIR CLI Integration: air run end to end command #5710

Merged

riddhibhagwat-db force-pushed the air-integration-m2-2 branch from 373988d to 226d41a Compare June 25, 2026 00:01

riddhibhagwat-db temporarily deployed to test-trigger-is June 25, 2026 00:02 — with GitHub Actions Inactive

riddhibhagwat-db commented Jun 25, 2026

Comment thread experimental/air/cmd/runconfig_load.go Outdated

riddhibhagwat-db commented Jun 25, 2026

vinchenzo-db reviewed Jun 25, 2026

riddhibhagwat-db force-pushed the air-integration-m2-2 branch from 226d41a to c992624 Compare June 25, 2026 21:37

riddhibhagwat-db temporarily deployed to test-trigger-is June 25, 2026 21:39 — with GitHub Actions Inactive

ben-hansen-db approved these changes Jun 30, 2026

maggiewang-db approved these changes Jun 30, 2026

riddhibhagwat-db force-pushed the air-integration-m2-2 branch from c992624 to f764fa9 Compare June 30, 2026 18:50

riddhibhagwat-db temporarily deployed to test-trigger-is June 30, 2026 18:50 — with GitHub Actions Inactive

riddhibhagwat-db merged commit 60adcaa into air-cli Jun 30, 2026
26 checks passed

riddhibhagwat-db deleted the air-integration-m2-2 branch June 30, 2026 21:35
