
aiter test workflow enhance #2905

Draft

kiran-thumma wants to merge 29 commits into main from kithumma/aiter-test-workflow-enhance

Conversation

@kiran-thumma
Collaborator

Motivation

  • Add a hard wheel smoke gate so downstream GPU suites fail fast when the published AITER wheel is broken.
  • Normalize GPU labels and expose skip toggles so each suite only consumes the GPUs it needs.

Technical Details

  • .github/workflows/nightly.yaml: introduced skip_wheel_smoke, skip_sglang, skip_vllm, skip_atom, routed the smoke gate through test-whl.yaml, and short-circuited matrices when their suite is disabled while keeping the dependency chain intact.
  • .github/workflows/test-whl.yaml: added published-wheel fallback download, MI300X/MI35X × Python 3.10/3.12 coverage, and a no-op path when callers skip smoke.
  • .github/configs/vllm_models.json, .github/configs/vllm_tests.json, .github/scripts/run_vllm.sh, .github/scripts/run_vllm_test.sh, .github/configs/vllm_pins.json: normalized runner labels and enforced vLLM pin usage.
  • index.html: refreshed the dashboard to reflect the wheel gate, job counts, and new skip knobs.
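The skip toggles and gated dependency chain described above might be wired roughly as follows. This is a sketch, not the PR's actual YAML: only the four skip_* input names and the reuse of test-whl.yaml come from the description; the job names, runner label, and step body are illustrative.

```yaml
# Hypothetical sketch of the nightly.yaml wiring; job ids and the
# sglang step body are illustrative, not copied from the PR.
on:
  workflow_dispatch:
    inputs:
      skip_wheel_smoke:
        type: boolean
        default: false
      skip_sglang:
        type: boolean
        default: false
      skip_vllm:
        type: boolean
        default: false
      skip_atom:
        type: boolean
        default: false

jobs:
  wheel-smoke:
    if: ${{ inputs.skip_wheel_smoke != true }}
    uses: ./.github/workflows/test-whl.yaml
  sglang:
    # "needs" keeps the dependency chain intact; the !cancelled() guard
    # lets this job still run when wheel-smoke was skipped upstream.
    needs: wheel-smoke
    if: ${{ !cancelled() && needs.wheel-smoke.result != 'failure' && inputs.skip_sglang != true }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "run SGLang suite here"
```

The `!cancelled()` guard matters because a plain `needs:` edge would also skip the downstream job whenever the upstream gate is skipped, which is what "short-circuited matrices while keeping the dependency chain intact" has to work around.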

Test Plan

  • Static workflow inspection.

Test Result

  • Not run (workflow-only change).

Submission Checklist

  - Add run_sglang, run_vllm, run_atom workflow_dispatch toggles
  - Create modular scripts: run_sglang.sh, run_vllm.sh, run_atom.sh
  - Wheels go to devreleases from non-schedule triggers
  - Promote only when all selected integration tests pass
  - Create modular scripts for Docker setup and test execution
  - Model configs in JSON files for easy maintenance
  - ATOM: all 15 accuracy models loaded from atom_models.json
  - vLLM: 7 latency benchmarks loaded from vllm_models.json
  - SGLang: dispatches full scout to sgl-project/sglang
  - Add skip_build toggle to bypass build for faster testing
  - Add wheel_url input to use pre-built wheel directly
  - Add JSON model configs (atom_models.json, vllm_models.json)
  - Fix integration tests not depending on build when skipped
  - Add sglang_job_filter dropdown
  - SGLang: run on aiter-1gpu-runner instead of dispatching to external repo
  - ATOM: fix accuracy results path, log file path, workspace mount
  - Fix artifact names with illegal characters
  - Add cleanup traps to all scripts
  - Add atom_models.json and vllm_models.json configs
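The "cleanup traps" item above can be sketched as a small bash pattern; the `cleanup`/`work_dir` names are hypothetical, not taken from the PR's scripts. Here the pattern runs inside a subshell so the EXIT trap fires when the subshell finishes and the effect is observable:

```shell
#!/usr/bin/env bash
# Illustrative sketch of the cleanup-trap pattern added to the CI
# scripts; function and variable names are hypothetical.
set -u

tmp_record="$(
  work_dir="$(mktemp -d)"
  cleanup() {
    # Remove scratch state even if a test step fails mid-run.
    rm -rf "$work_dir"
  }
  trap cleanup EXIT
  # ... test commands would use "$work_dir" here ...
  echo "$work_dir"
)"

# The subshell has exited, so its EXIT trap has already run.
if [ -d "$tmp_record" ]; then
  echo "cleanup failed"
else
  echo "cleanup ok"
fi
```

In the real scripts the trap would guard things like Docker containers and mounted workspaces rather than a temp directory, but the shape is the same: register the trap immediately after acquiring the resource.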
@github-actions
Contributor

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests on MI35X (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
| --- | --- |
| ci:triton-300x | Run an additional Triton test job on MI300X in PRs; the main branch always runs both MI35X and MI300X |
| ci:sglang | SGLang integration tests |
| ci:atom | ATOM benchmark (DeepSeek-R1 + GPT-OSS) |
| ci:vllm | vLLM benchmark |
| ci:all | All of the above |

Add labels via the sidebar or gh pr edit 2905 --add-label <label>

@gyohuangxin
Member

@kiran-thumma Can we reuse current test workflows instead of adding so many new tests?
cc @valarLip

@kiran-thumma kiran-thumma reopened this Apr 24, 2026

@kiran-thumma
Collaborator Author

@kiran-thumma Can we reuse current test workflows instead of adding so many new tests? cc @valarLip

I'm reusing the existing test-workflow.yml and nightly.yml workflows and adding more tests; this isn't ready for review yet.
