Skip to content

[ci] fix: Add more test cases in e2e_ppo_trainer_megatron_sglang_ascend.yml#6877

Merged
wucong25 merged 3 commits into
verl-project:mainfrom
xiazhahe:ci
Jun 30, 2026
Merged

[ci] fix: Add more test cases in e2e_ppo_trainer_megatron_sglang_ascend.yml#6877
wucong25 merged 3 commits into
verl-project:mainfrom
xiazhahe:ci

Conversation

@xiazhahe

@xiazhahe xiazhahe commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Add concise overview of what this PR aims to achieve or accomplish. Reference related GitHub issues and PRs that help with the review.

Fixed some problems of e2e_ppo_trainer_megatron_sglang.yml on Ascend and added more CI cases for maintenance.

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, veomni, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward, fully_async, one_step_off
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

@xiazhahe xiazhahe changed the title [ci] fix: Add more test cases in e2e_ppo_trainer_megatron_sglang_asce… [ci] fix: Add more test cases in e2e_ppo_trainer_megatron_sglang_ascend.yml Jun 29, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request removes the configuration enabling flash attention for the reference model on NPU in the PPO trainer Megatron end-to-end test. The reviewer notes that this removal could lead to severe performance degradation or Out-Of-Memory (OOM) errors, and suggests keeping the configuration by correcting the double plus (++) prefix to a single plus (+) prefix to avoid Hydra parsing issues.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

"${common_params[@]}" \
+actor_rollout_ref.actor.megatron.override_transformer_config.context_parallel_size=${ACTOR_CP} \
+actor_rollout_ref.actor.megatron.override_transformer_config.use_flash_attn=True \
++actor_rollout_ref.ref.megatron.override_transformer_config.use_flash_attn=True \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Removing use_flash_attn=True for the reference model (ref) on NPU can lead to severe performance degradation or Out-Of-Memory (OOM) errors during PPO training, as the reference model will run without flash attention. If this was removed due to a Hydra parsing error caused by the double plus (++), please consider keeping it with a single plus (+) to match the actor's configuration style.

Suggested change
++actor_rollout_ref.ref.megatron.override_transformer_config.use_flash_attn=True \
+actor_rollout_ref.ref.megatron.override_transformer_config.use_flash_attn=True \
References
  1. Use + instead of ++ as the prefix for overriding configuration values in Hydra.

@wucong25 wucong25 merged commit d15a9d3 into verl-project:main Jun 30, 2026
35 of 43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants