Fix cudagraph tests blocked by feature gap in xpugraph #3613

Closed
Copilot wants to merge 2 commits into main from copilot/fix-cudagraph-tests-blocked-feature-gap

Copilot AI commented May 11, 2026

Multiple test cases across test_backends.py, test_cudagraphs.py, test_cudagraphs_expandable_segments.py, test_structured_trace.py, test_foreach.py, and test_cuda_repro.py are failing due to the XPUGraph feature gap in torch-xpu-ops. This PR skips the affected tests and adds XPU dynamo test wrappers with proper skip annotations.

Changes Made

  • test/xpu/skip_list_common.py: Updated test_foreach_xpu.py skip pattern from use_cuda_graph_True to use_xpu_graph_True (the parametrize name was renamed in the XPU test wrapper). Also added entries for 4 new dynamo XPU test files with cudagraph-related test skips.

  • test/xpu/dynamo/test_backends_xpu.py (new): XPU wrapper for test/dynamo/test_backends.py that skips test_aot_cudagraphs since XPUGraph is not fully supported.

  • test/xpu/dynamo/test_cudagraphs_xpu.py (new): XPU wrapper that skips all tests from test_cudagraphs.py (all depend on XPUGraph).

  • test/xpu/dynamo/test_cudagraphs_expandable_segments_xpu.py (new): XPU wrapper that skips all tests for the same reason.

  • test/xpu/dynamo/test_structured_trace_xpu.py (new): Full XPU-adapted version of test_structured_trace.py (based on upstream PR "[XPU] Migrate 11 dynamo test cases for XPU", pytorch/pytorch#169241) with test_gpugraphs (the XPU equivalent of test_cudagraphs) explicitly skipped due to the XPUGraph feature gap.
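
As a rough illustration of the wrapper pattern described in the list above, the sketch below uses only the standard `unittest` module. `UpstreamCudaGraphTests` and `UpstreamBackendTests` are hypothetical stand-ins for the upstream test classes, and `XPU_GRAPH_SUPPORTED` is an assumed flag. The actual wrapper files in this PR are not shown in the conversation and may be structured differently.

```python
import unittest

# Assumption: a flag saying whether XPUGraph is usable; hard-coded for the sketch.
XPU_GRAPH_SUPPORTED = False
SKIP_MSG = "XPUGraph is not fully supported"


class UpstreamCudaGraphTests(unittest.TestCase):
    """Hypothetical stand-in: every test here needs graph capture."""

    def test_basic(self):
        self.assertTrue(True)


class UpstreamBackendTests(unittest.TestCase):
    """Hypothetical stand-in: only one test here needs graph capture."""

    def test_eager(self):
        self.assertEqual(1 + 1, 2)

    def test_aot_cudagraphs(self):
        self.assertTrue(True)


# Class-level skip: all inherited tests are reported as skipped.
@unittest.skipUnless(XPU_GRAPH_SUPPORTED, SKIP_MSG)
class CudaGraphTestsXPU(UpstreamCudaGraphTests):
    pass


# Method-level skip: only the graph-dependent test is skipped.
class BackendTestsXPU(UpstreamBackendTests):
    @unittest.skipUnless(XPU_GRAPH_SUPPORTED, SKIP_MSG)
    def test_aot_cudagraphs(self):
        super().test_aot_cudagraphs()


def run(case):
    """Run one TestCase class and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling `run(BackendTestsXPU)` in this sketch executes `test_eager` and records `test_aot_cudagraphs` as skipped, while `run(CudaGraphTestsXPU)` records every inherited test as skipped. The upstream stand-in classes themselves are unaffected.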

Notes

  • test/inductor/test_cuda_repro.py::test_cpu_index already has @skipIfXpu(msg="cudagraph is not supported on xpu") in the upstream PyTorch, so no additional change is needed for that test.

Test

In PyTorch repo (run via op_ut):

  • test/xpu/dynamo/test_backends_xpu.py
  • test/xpu/dynamo/test_cudagraphs_xpu.py
  • test/xpu/dynamo/test_cudagraphs_expandable_segments_xpu.py
  • test/xpu/dynamo/test_structured_trace_xpu.py
  • test/xpu/test_foreach_xpu.py

@laifenxiawucha

@copilot continue fix

- Update test_foreach_xpu.py skip pattern from use_cuda_graph_True to
  use_xpu_graph_True (parameter was renamed in test_foreach_xpu.py)
- Add dynamo/test_backends_xpu.py: skip test_aot_cudagraphs for XPU
- Add dynamo/test_cudagraphs_xpu.py: skip all tests (XPUGraph not supported)
- Add dynamo/test_cudagraphs_expandable_segments_xpu.py: skip all tests
- Add dynamo/test_structured_trace_xpu.py: from upstream with
  test_gpugraphs skipped due to XPUGraph feature gap
- Update skip_list_common.py with new test files and skip patterns

Agent-Logs-Url: https://github.com/intel/torch-xpu-ops/sessions/c5ce0178-8d74-410c-92c7-bd0ca1d861f1

Co-authored-by: laifenxiawucha <184592536+laifenxiawucha@users.noreply.github.com>

Copilot AI commented May 11, 2026

@copilot continue fix

Implemented all the fixes in commit ceae918:

  1. skip_list_common.py: Updated test_foreach_xpu.py skip pattern from use_cuda_graph_True to use_xpu_graph_True (parameter was renamed in the test file)

  2. test/xpu/dynamo/test_backends_xpu.py: XPU wrapper that skips test_aot_cudagraphs (requires XPUGraph, which is not fully supported)

  3. test/xpu/dynamo/test_cudagraphs_xpu.py: XPU wrapper with all tests skipped (all require XPUGraph)

  4. test/xpu/dynamo/test_cudagraphs_expandable_segments_xpu.py: XPU wrapper with all tests skipped

  5. test/xpu/dynamo/test_structured_trace_xpu.py: Full XPU-adapted version of the upstream structured trace tests (from PR "Migrate XPU UT files from Upstream PR", #3593), with test_gpugraphs explicitly skipped due to the XPUGraph feature gap

  6. skip_list_common.py: Added all 4 new dynamo test files with appropriate skip patterns for cudagraph tests
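
The skip-list changes in items 1 and 6 can be pictured with the sketch below. It assumes a dictionary keyed by test file whose values are name patterns, and it assumes a test is skipped when any pattern occurs as a substring of its name. The dictionary contents, the helper `is_skipped`, and the example test name are hypothetical, since the real `skip_list_common.py` is not shown in this conversation. The two wrappers that skip all tests are left out because the conversation does not say which patterns were used for them.

```python
# Hypothetical skip list: test file -> tuple of substrings to skip.
skip_dict = {
    "test_foreach_xpu.py": ("use_xpu_graph_True",),
    "dynamo/test_backends_xpu.py": ("test_aot_cudagraphs",),
    "dynamo/test_structured_trace_xpu.py": ("test_gpugraphs",),
}


def is_skipped(test_file: str, test_name: str) -> bool:
    """Return True if any pattern registered for test_file occurs in test_name."""
    patterns = skip_dict.get(test_file, ())
    return any(pattern in test_name for pattern in patterns)


# Hypothetical parametrized test name after the rename in the XPU wrapper.
renamed = "test_example_use_xpu_graph_True"

# The old pattern no longer matches the renamed parameter, so it would skip nothing.
old_pattern_matches = "use_cuda_graph_True" in renamed
new_pattern_matches = is_skipped("test_foreach_xpu.py", renamed)
```

Under these assumptions `old_pattern_matches` is `False` and `new_pattern_matches` is `True`, which is why a stale pattern would silently stop skipping the affected tests.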

Note: test/inductor/test_cuda_repro.py::test_cpu_index already has @skipIfXpu(msg="cudagraph is not supported on xpu") in the upstream PyTorch, so no additional change is needed.

Copilot AI changed the title from "[WIP] Fix cudagraph tests blocked by feature gap in xpugraph" to "Fix cudagraph tests blocked by feature gap in xpugraph" May 11, 2026
@github-actions

Performance outliers, please check!

  • 🔴 [-1, 80%), should be regression

| Category | Model | Target vs. Baseline [Eager] | Target vs. Baseline [Inductor] |
| --- | --- | --- | --- |
| timm_models_bfloat16_training | mobilenetv3_large_100 | 0.783104 | 0.738373 |
| timm_models_bfloat16_training | vit_base_patch16_siglip_256 | 0.693589 | 0.741715 |
| timm_models_bfloat16_training | deit_base_distilled_patch16_224 | 0.739197 | 0.752590 |
| timm_models_bfloat16_training | nfnet_l0 | 0.730817 | 0.758981 |
| timm_models_bfloat16_training | visformer_small | 0.721568 | 0.759062 |
| timm_models_bfloat16_training | dm_nfnet_f0 | 0.637236 | 0.776720 |
| timm_models_bfloat16_training | tf_efficientnet_b0 | 0.808093 | 0.781045 |
| timm_models_bfloat16_training | adv_inception_v3 | 0.787141 | 0.784371 |
| timm_models_bfloat16_training | mobilenetv2_100 | 0.832888 | 0.786561 |
| timm_models_bfloat16_training | ghostnet_100 | 0.861574 | 0.792291 |
| timm_models_bfloat16_training | mobilevit_s | 0.715427 | 0.799490 |
| timm_models_bfloat16_training | beit_base_patch16_224 | 0.727340 | 0.804644 |

  • 🟡 [80%, 90%), may be fluctuations

| Category | Model | Target vs. Baseline [Eager] | Target vs. Baseline [Inductor] |
| --- | --- | --- | --- |
| timm_models_bfloat16_training | convnextv2_nano.fcmae_ft_in22k_in1k | 0.805841 | 0.806135 |
| timm_models_bfloat16_training | swin_base_patch4_window7_224 | 0.852855 | 0.819549 |
| timm_models_bfloat16_training | inception_v3 | 0.841121 | 0.825677 |
| torchbench_bfloat16_training | mnasnet1_0 | 1.046836 | 0.847807 |
| timm_models_bfloat16_training | repvgg_a2 | 0.844693 | 0.867251 |
| timm_models_bfloat16_training | deit_tiny_patch16_224.fb_in1k | 0.878619 | 0.883622 |
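
The bucketing in the report above can be approximated with the sketch below. It assumes each row is classified by the lower of its Eager and Inductor ratios, which is consistent with the rows shown but is not stated by the bot. The function name `classify` and the label strings are hypothetical.

```python
def classify(eager_ratio: float, inductor_ratio: float) -> str:
    """Bucket a model by its worst target-vs-baseline ratio (assumed rule)."""
    worst = min(eager_ratio, inductor_ratio)
    if worst < 0.80:
        return "regression"   # red bucket: below 80%
    if worst < 0.90:
        return "fluctuation"  # yellow bucket: [80%, 90%)
    return "ok"               # not listed in the report


# Two rows taken from the report above.
beit = classify(0.727340, 0.804644)     # Eager ratio is below 0.80
mnasnet = classify(1.046836, 0.847807)  # Inductor ratio is in [0.80, 0.90)
```

With this rule `beit_base_patch16_224` lands in the regression bucket even though its Inductor ratio is above 80%, and `mnasnet1_0` lands in the fluctuation bucket even though its Eager ratio is above 100%, matching where the report places them.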

@liangan1

The related issue is only used to track task progress; there is no need to fix it.
@ZhaoqiongZ

  1. Add a task-tracking label to the issue.
  2. Add skills to skip the related issue.

ZhaoqiongZ added the disable_all (Disable all ci test jobs for the PR, just keep basic lint check) label May 13, 2026
ZhaoqiongZ closed this May 13, 2026
Labels

ai_generated, disable_all (Disable all ci test jobs for the PR, just keep basic lint check)

Development

Successfully merging this pull request may close these issues.

cudagraph tests blocked by feature gap

5 participants