[NVBug 6102977] Disable use_cache during export dummy forward for custom configs#1402
Conversation
The export step's `collect_shared_input_modules()` runs a dummy forward pass to trace shared input modules for resmoothing/requantization. This forward call hits the same `AttributeError` as PR #1324 on configs that don't assign `use_cache` (e.g., stepfun-ai/Step-3.5-Flash's `Step3p5Config`):

```
AttributeError: 'Step3p5Config' object has no attribute 'use_cache'
```

Wrap the dummy forward with the `_disable_use_cache` context manager so it sets/restores `config.use_cache` around the call, mirroring the calibration fix from #1324. This covers all callsites, since both `unified_export_hf` and `plugins/vllm_fakequant_hf` funnel through `collect_shared_input_modules`.

Signed-off-by: weimingc <17592131+meenchen@users.noreply.github.com>
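The restore/delete semantics described above can be sketched as follows. This is a hypothetical, self-contained reimplementation for illustration only — the real `_disable_use_cache` lives in `modelopt.torch.utils.dataset_utils`, and `BareConfig` is a made-up stand-in for a config class (like `Step3p5Config`) that never assigns `use_cache`:

```python
from contextlib import contextmanager


@contextmanager
def disable_use_cache(config):
    """Temporarily force config.use_cache = False, then restore the original
    state on exit -- deleting the attribute entirely if the config never
    defined it, so the config object is left exactly as it was found."""
    had_attr = hasattr(config, "use_cache")
    original = getattr(config, "use_cache", None)
    config.use_cache = False
    try:
        yield
    finally:
        if had_attr:
            config.use_cache = original
        else:
            del config.use_cache


class BareConfig:
    """Stand-in for a config class that lacks a use_cache attribute."""


cfg = BareConfig()
with disable_use_cache(cfg):
    # Inside the block, the dummy forward can safely read config.use_cache.
    assert cfg.use_cache is False
# On exit, the attribute is removed again, leaving the config untouched.
assert not hasattr(cfg, "use_cache")
```

The delete-on-exit branch matters: simply restoring `use_cache = None` would leave a new attribute behind on configs that never defined one.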
📝 Walkthrough: A single import is added for a cache-disabling utility, and the dummy forward probe in the input-module collection step is wrapped with it (change: "Cache Disabling During Input Module Collection Probe").
Codecov Report: ✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main    #1402      +/-  ##
==========================================
+ Coverage   76.73%   76.82%   +0.09%
==========================================
  Files         476      476
  Lines       51306    51307       +1
==========================================
+ Hits        39368    39419      +51
+ Misses      11938    11888      -50
```
cjluo-nv
left a comment
Tiny, well-targeted bug fix (+5 -1). Reuses the existing _disable_use_cache context manager from dataset_utils to wrap the dummy forward pass in collect_shared_input_modules, mirroring the calibration fix from PR #1324. The context manager correctly handles configs that lack use_cache (sets it to False, restores/deletes on exit). No new code to speak of — just adding one more context manager to an existing with block.
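The probe pattern the review describes can be illustrated with a toy example. Everything here is a hypothetical stand-in (`disable_use_cache`, `ToyModel`, `Step35LikeConfig` are invented names, not modelopt or transformers code); it only shows why an unwrapped dummy forward raises `AttributeError` on such configs and why the wrapped one succeeds:

```python
from contextlib import contextmanager


@contextmanager
def disable_use_cache(config):
    # Hypothetical stand-in for modelopt's _disable_use_cache: force
    # use_cache off for the duration of the probe, then clean up.
    had_attr = hasattr(config, "use_cache")
    original = getattr(config, "use_cache", None)
    config.use_cache = False
    try:
        yield
    finally:
        if had_attr:
            config.use_cache = original
        else:
            del config.use_cache


class Step35LikeConfig:
    """Mimics a config class that never assigns use_cache."""


class ToyModel:
    def __init__(self):
        self.config = Step35LikeConfig()

    def forward(self, x):
        # Mirrors the failing access: the forward path reads
        # config.use_cache unconditionally, so a bare call raises
        # AttributeError on configs like this one.
        if self.config.use_cache:
            raise RuntimeError("cache path should be disabled in the probe")
        return x


model = ToyModel()
# Without the wrapper, model.forward(...) raises AttributeError here.
with disable_use_cache(model.config):
    out = model.forward([0.0])
assert out == [0.0]
```

Because both export entry points funnel through the same probe, wrapping it once in `collect_shared_input_modules` fixes every callsite.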
What does this PR do?
Type of change: ?

Usage
# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).
- CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information