
Update excluded modules for Qwen3.5 dense PTQ #1284

Open
amukkara wants to merge 1 commit into NVIDIA:main from amukkara:qwen3.5-fix

Conversation


@amukkara amukkara commented Apr 17, 2026

What does this PR do?

Type of change: Bug fix

For Qwen3.5 dense models, the in_proj modules in linear attention must be left unquantized.
Example in Qwen3.5-27B-FP8: https://huggingface.co/Qwen/Qwen3.5-27B-FP8/blob/main/config.json#L148
This PR updates _default_disabled_quantizer_cfg so that all Qwen3.5 dense models are quantized with the same exclusion pattern.
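For illustration, the exclusion mechanism can be sketched as follows. This is a minimal sketch, assuming a wildcard-keyed dict config with fnmatch-style matching; the exact structure of _default_disabled_quantizer_cfg and the is_quantizer_enabled helper are assumptions for this example, not modelopt's actual API.

```python
import fnmatch

# Illustrative sketch only: keys are wildcard patterns over quantizer
# names; {"enable": False} disables every quantizer the pattern matches.
_default_disabled_quantizer_cfg = {
    # ... existing exclusions ...
    "*linear_attn.in_proj_a*": {"enable": False},
    "*linear_attn.in_proj_b*": {"enable": False},
}


def is_quantizer_enabled(name: str) -> bool:
    """Return False if any disabled pattern matches the quantizer name."""
    for pattern, cfg in _default_disabled_quantizer_cfg.items():
        if fnmatch.fnmatch(name, pattern) and not cfg.get("enable", True):
            return False
    return True
```

With these entries in place, a quantizer named e.g. model.layers.0.linear_attn.in_proj_a.weight_quantizer would be skipped during PTQ, while ordinary attention projections remain quantized.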

Usage

bash examples/llm_ptq/scripts/huggingface_example.sh --model Qwen/Qwen3.5-4B  --quant fp8 --tasks quant

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅
  • Did you write any new necessary tests?: ❌
  • Did you update Changelog?: ❌

Additional Information

Summary by CodeRabbit

  • Bug Fixes
    • Updated default quantization settings to explicitly disable quantization for certain linear attention projection components, preventing unintended quantization and improving model accuracy and stability.


copy-pr-bot Bot commented Apr 17, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Contributor

coderabbitai Bot commented Apr 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1e876f6e-9ea0-4f64-bfb7-92c0bc293cc9

📥 Commits

Reviewing files that changed from the base of the PR and between 92c0472 and b652987.

📒 Files selected for processing (1)
  • modelopt/torch/quantization/config.py
✅ Files skipped from review due to trivial changes (1)
  • modelopt/torch/quantization/config.py

📝 Walkthrough

Two deny-list entries were appended to the default disabled quantizer configuration to explicitly disable quantizers matching *linear_attn.in_proj_a* and *linear_attn.in_proj_b*.

Changes

  • Quantization Configuration (modelopt/torch/quantization/config.py): Appended two entries to _default_disabled_quantizer_cfg that set enable: False for quantizer name patterns *linear_attn.in_proj_a* and *linear_attn.in_proj_b*.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks: ✅ 6 passed
  • Description Check: ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately describes the main change: updating excluded modules for Qwen3.5 dense PTQ by disabling quantizers for linear_attn.in_proj_a/b modules.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files; docstring coverage check skipped.
  • Linked Issues Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Security Anti-Patterns: ✅ Passed. The PR adds minimal quantizer configuration exclusions with no security-sensitive patterns, unsafe deserialization, eval/exec, or new dependencies.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@amukkara amukkara marked this pull request as ready for review April 17, 2026 00:26
@amukkara amukkara requested a review from a team as a code owner April 17, 2026 00:26
@amukkara amukkara requested a review from meenchen April 17, 2026 00:26
Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
modelopt/torch/quantization/config.py (1)

231-232: Add a focused regression test for these new exclusion patterns.

Lines 231-232 update the global default exclusions; please add a test verifying that quantizers matching *linear_attn.in_proj_a* and *linear_attn.in_proj_b* are disabled after the config is applied. This locks in the intended Qwen3.5 PTQ behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/quantization/config.py` around lines 231 - 232, Add a unit
test that verifies the new exclusion patterns "*linear_attn.in_proj_a*" and
"*linear_attn.in_proj_b*" actually disable matching quantizers: import the
exclusion patterns from modelopt.torch.quantization.config (e.g.,
DEFAULT_EXCLUSIONS or the global exclusions variable), create mock quantizer
names like "encoder.linear_attn.in_proj_a.weight" and
"decoder.linear_attn.in_proj_b.bias" and then use the module's
exclusion-matching helper (e.g., matches_exclusion, is_excluded, or the function
that decides quantizer enablement) to assert those names are considered
excluded/disabled after applying the config; fail the test if any of those
quantizers remain enabled.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 3344bc77-0606-4684-a620-e45bc3886169

📥 Commits

Reviewing files that changed from the base of the PR and between 04fcf24 and f532a9b.

📒 Files selected for processing (1)
  • modelopt/torch/quantization/config.py

Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
