[OMNIML-3252][ONNX] Add real Q/DQ scales in Autotune #951
Merged
gcunhase merged 43 commits into NVIDIA:main from gcunhase:dev/gcunhasergio/autotune_real_qdq_scales on Mar 11, 2026
Commits
94c1347  gcunhase  Initial autotune codebase
7c00172  gcunhase  Add more tests
bd85051  gcunhase  Refactor: PR #702
ce35655  gcunhase  Remove python path in tests
ab3ea21  gcunhase  Recover docstrings and simplify code (->, , )
0d4166c  gcunhase  Added unittest for workflows.py (failing)
5201fae  gcunhase  Fix: 'Autotuning failed: 'PatternSchemes' object has no attribute 'no…
33ece8c  gcunhase  Updated workflow test to test TRT and PythonTRT benchmarking
0e2ed76  gcunhase  Fix test: use_trtexec flag
6a97bea  gcunhase  Add real scales to Q/DQ nodes
7f93106  gcunhase  fix precommit failures
b6c7ac3  gcunhase  Fix: Add->Q/DQ->Activation(Relu)
29c209d  gcunhase  Fix: correctly dequantize Add input with shared Q/DQ
ecf6148  gcunhase  [5916893] Fix weighted ops quantization logic: both input and weights…
12b4d6b  gcunhase  Changed keep_output_dir to True as default
30f8df8  gcunhase  test_workflow was moved to 'tests/gpu/onnx'
c72862d  gcunhase  Removed cli.py, moved into __main__.py
6ca4878  gcunhase  Removed PatternSchemes import from region_pattern.py: no longer needed.
97aed99  gcunhase  Added intermediate Autotune model to be removed at the end of the qua…
e422b85  gcunhase  Removed _MUTATION_SPECS from autotuner.py: moved to autotuner_base.py
0dfffe0  gcunhase  Removed test_config and test_pattern_cache. Should be added in the or…
2eea491  gcunhase  Fixed minor coderabbit suggestions
4e788c5  gcunhase  Moved autotune imports to the top of the file
8e5430c  gcunhase  Eliminate intermediate ONNX export in _find_nodes_to_quantize_autotun…
be54609  gcunhase  Add support for Add->Q/DQ->Relu patterns by including those 'Add' nod…
9d95481  gcunhase  Add integration test
7d29c91  gcunhase  Remove 'keep_output_dir' arg (no longer needed due to tmp path)
10d816e  gcunhase  Remove 'get_quantized_nodes' and other comments that are no longer ne…
24029b2  gcunhase  Added docstring for 'default_dq_dtype' in workflows.py
11480df  gcunhase  Added mode presets and additional autotune configurations
d4f19c2  gcunhase  Fixed tmp_path in test
5e413f5  gcunhase  Fixed copilot comments
c573ccd  gcunhase  Fix: skip rewiring in graph_utils if no index is found. This prevents…
b5eaf17  gcunhase  Match args for preset mode default
631eaa0  gcunhase  Exposed _StoreWithExplicitFlag
c2f1d05  gcunhase  Renamed new_ips to new_insertion_points
e953ec5  gcunhase  Address coderabbit and copilot issues + other minor issues
c23eb70  gcunhase  Address additional coderabbit and copilot issues
b03a170  gcunhase  Added real scales test in the integration workflow
45d375c  gcunhase  Address additional copilot issues: includes fix for op_types_to_quant…
b5550f3  gcunhase  nit: added docstring and comment
9ca4e96  gcunhase  Created autotune utils
de2ef53  gcunhase  Added 'bf16' as an option in workflows