[rollout] fix: support SGLang FP8 ignored layers for Qwen3.x GatedDeltaNet in rollout by gem-mint · Pull Request #6905 · verl-project/verl

gem-mint · 2026-07-01T03:35:36Z

Summary

Share SGLang rollout FP8 quantization config between server launch and weight sync.
Preserve SGLang's ignored_layers / modules_to_not_convert rules and SGLANG_FP8_IGNORED_LAYERS in verl-side FP8 weight conversion.
Add unit coverage for default config behavior and ignored-layer matching.

Motivation

SGLang 0.5.12 supports FP8 ignored_layers, but verl's SGLang rollout path built its FP8 config inline and the verl-side SGLangFP8QuantizerHelper did not honor the same ignore rules before update_weights.

This can break rollout FP8 for models with small projection layers, such as Qwen GatedDeltaNet linear_attn projections, where block-wise FP8 expects dimensions divisible by the 128x128 weight block size.

In our Qwen3.5/3.6-35B-A3B GRPO rollout, the failure was triggered by GatedDeltaNet linear_attn.in_proj_ba: after tensor-parallel sharding, the local projection dimension is not divisible by SGLang block-wise FP8's weight_block_size=[128, 128], so verl-side FP8 conversion must apply the same ignored-layer rules before update_weights.
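The divisibility problem described above can be illustrated with a minimal sketch. It assumes the projection is sharded along its output dimension, and the layer sizes are hypothetical values chosen only to show the failure mode; they are not the real in_proj_ba dimensions.

```python
def local_shape(out_features, in_features, tp_size):
    """Shape of one rank's shard when the output dimension is split across TP ranks."""
    assert out_features % tp_size == 0, "output dimension must split evenly"
    return (out_features // tp_size, in_features)


def fits_block_fp8(shape, block=(128, 128)):
    """True if both dimensions are multiples of the block-wise FP8 weight block size."""
    return shape[0] % block[0] == 0 and shape[1] % block[1] == 0


# Hypothetical sizes: a small projection next to a typical large one.
small = local_shape(out_features=64, in_features=2048, tp_size=4)
large = local_shape(out_features=4096, in_features=2048, tp_size=4)

for name, shape in (("small projection", small), ("large projection", large)):
    status = "ok for block-wise FP8" if fits_block_fp8(shape) else "must be ignored"
    print(f"{name}: local shape {shape} -> {status}")
```

A layer that fails this check on any rank has to stay in its original dtype on both the SGLang side and the verl side, otherwise the synced weights no longer match what the server expects.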

With this change, users can pass ignored layers through the existing SGLang mechanisms, for example:

SGLANG_FP8_IGNORED_LAYERS=linear_attn
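As a rough sketch of how such a setting could be applied on the verl side before update_weights: the helper names below are hypothetical, and the comma-separated format and substring matching are assumptions made for illustration, not the confirmed matching rule of this PR or of SGLang. The parameter names are also made up.

```python
import os


def parse_ignored_layers(raw):
    """Split a comma-separated string into unique, non-empty patterns, keeping order."""
    patterns = []
    for item in (raw or "").split(","):
        item = item.strip()
        if item and item not in patterns:
            patterns.append(item)
    return patterns


def is_ignored(param_name, patterns):
    """Assumed rule: a parameter is ignored if any pattern occurs in its name."""
    return any(pattern in param_name for pattern in patterns)


def split_for_fp8(param_names, patterns):
    """Return (names to quantize to FP8, names to keep in the original dtype)."""
    quantize = [n for n in param_names if not is_ignored(n, patterns)]
    keep = [n for n in param_names if is_ignored(n, patterns)]
    return quantize, keep


# In a real run the value would come from the environment, for example:
#   patterns = parse_ignored_layers(os.environ.get("SGLANG_FP8_IGNORED_LAYERS"))
patterns = parse_ignored_layers("linear_attn, linear_attn")

names = [
    "model.layers.0.linear_attn.in_proj_ba.weight",  # hypothetical name
    "model.layers.0.mlp.down_proj.weight",           # hypothetical name
]
quantize, keep = split_for_fp8(names, patterns)
print("quantize:", quantize)
print("keep:", keep)
```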

Duplicate Check

Before opening this PR, I checked for existing open PRs with:

gh pr list --repo verl-project/verl --state open --search "SGLang FP8 ignored_layers"
gh pr list --repo verl-project/verl --state open --search "rollout fp8 ignored_layers"
gh pr list --repo verl-project/verl --state open --search "SGLang quantization_config ignored_layers"

No duplicate open PRs were found.

Tests

python -m py_compile \
  verl/utils/sglang/sglang_fp8_utils.py \
  verl/workers/rollout/sglang_rollout/async_sglang_server.py \
  verl/workers/rollout/sglang_rollout/sglang_rollout.py \
  tests/utils/test_sglang_fp8_utils.py

git diff --check

PYTHONIOENCODING=utf-8 python tests/special_sanity/validate_structure.py \
  --allow-files tests/test_protocol_on_cpu.py tests/test_base_config_on_cpu.py tests/test_protocol_v2_on_cpu.py

PYTHONIOENCODING=utf-8 python tests/special_sanity/check_license.py \
  --directories verl/utils/sglang verl/workers/rollout/sglang_rollout

PYTHONIOENCODING=utf-8 python tests/special_sanity/check_device_api_usage.py --directory ./verl/utils/sglang
PYTHONIOENCODING=utf-8 python tests/special_sanity/check_device_api_usage.py --directory ./verl/workers/rollout/sglang_rollout

Also smoke-tested on a verl v0.7.1-based Qwen3.5/3.6-35B-A3B GRPO run with SGLang 0.5.12, rollout quantization=fp8, and SGLANG_FP8_IGNORED_LAYERS=linear_attn.

AI Assistance

This PR was prepared with AI assistance. I reviewed the generated changes and validated them on a Qwen3.5/3.6-35B-A3B rollout FP8 training run.

Co-authored-by: OpenAI Codex <codex@openai.com>

gemini-code-assist

Code Review

This pull request refactors the SGLang FP8 quantization configuration by extracting hardcoded settings into a centralized utility module (sglang_fp8_utils.py). It introduces helper functions to normalize, deduplicate, and match ignored layers from Hugging Face configurations and environment variables, and adds corresponding unit tests. Feedback on the changes highlights that _get_config_value should check for the presence of a .get() method (e.g., hasattr(config, "get")) rather than strictly checking isinstance(config, dict) to ensure compatibility with OmegaConf DictConfig objects commonly used in the codebase.
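The review suggestion can be sketched as follows. This is not the code from the PR: the function name, signature, and fallback to attribute access are assumptions, and MappingLike is a stand-in for a non-dict mapping such as an OmegaConf DictConfig so that the example has no third-party dependency.

```python
from collections.abc import Mapping


def get_config_value(config, key, default=None):
    """Read a key from a dict, a mapping-like object, or a plain attribute object."""
    if config is None:
        return default
    # Duck-typed check: accepts dicts and any object exposing .get(key, default).
    if hasattr(config, "get"):
        return config.get(key, default)
    return getattr(config, key, default)


class MappingLike(Mapping):
    """Minimal mapping that is not a dict subclass (stand-in for DictConfig)."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


class AttrConfig:
    """Plain object that exposes settings as attributes only."""

    ignored_layers = ["linear_attn"]


cfg = MappingLike({"ignored_layers": ["linear_attn"]})
print(isinstance(cfg, dict))                    # an isinstance(dict) check would skip this object
print(get_config_value(cfg, "ignored_layers"))
print(get_config_value(AttrConfig(), "ignored_layers"))
print(get_config_value({}, "ignored_layers", default=[]))
```

The duck-typed check trades strictness for compatibility: any object with a .get method is treated as a mapping, which is what lets mapping-like configs pass through.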


gem-mint · 2026-07-01T05:52:11Z

This PR has been replaced by #6906 after renaming the source branch. The code diff is unchanged.

fix: honor SGLang FP8 ignored layers in rollout

279699b

Co-authored-by: OpenAI Codex <codex@openai.com>

gemini-code-assist Bot reviewed Jul 1, 2026


Comment thread verl/utils/sglang/sglang_fp8_utils.py Outdated

gem-mint changed the title ~~fix: honor SGLang FP8 ignored layers in rollout~~ [rollout] fix: support SGLang FP8 ignored layers in rollout Jul 1, 2026

gem-mint changed the title ~~[rollout] fix: support SGLang FP8 ignored layers in rollout~~ [rollout] fix: support SGLang FP8 ignored layers for Qwen3.x GatedDeltaNet in rollout Jul 1, 2026

gem-mint marked this pull request as ready for review July 1, 2026 03:43

gem-mint requested review from ArronHZG and chenhaiq as code owners July 1, 2026 03:43

fix: support mapping-like FP8 quant configs

dae71c9

Co-authored-by: OpenAI Codex <codex@openai.com>

gem-mint closed this Jul 1, 2026

gem-mint deleted the codex/sglang-rollout-fp8-ignored-layers branch July 1, 2026 05:44

gem-mint wants to merge 2 commits into verl-project:main from gem-mint:codex/sglang-rollout-fp8-ignored-layers
