Pull requests: NVIDIA/Model-Optimizer

Pull requests list

fixes for fused moe (qwen3.6, GLM5.1 + MSE calibration
#1382 opened May 2, 2026 by Fridah-nv Contributor
AutoQuant for VLM
#1381 opened May 1, 2026 by meenchen Contributor Draft
[DeepSeek] Default to top-k calibration with peer-max input amax sync
#1380 opened May 1, 2026 by cjluo-nv Collaborator
3 tasks done
feat(launcher): add DFlash support for DeepSeek-V4-Flash target model
#1379 opened Apr 30, 2026 by ChenhanYu Collaborator
Use trtexec_safe on safety platforms when using remoteAutoTuning
#1378 opened Apr 30, 2026 by dthienan-nv Contributor
Enable active-param and memory based Minitron pruning constraint
#1377 opened Apr 30, 2026 by kevalmorabia97 Collaborator
Fix sparsity-only export emitting empty hf_quant_config.json [cherry-pick-0.44.0: After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc]
#1375 opened Apr 29, 2026 by kaix-nv Contributor
fix: guard against None chat_template in _post_process_chat_template [cherry-pick-0.44.0]
#1371 opened Apr 29, 2026 by yeyu-nvidia Contributor
fix: include medusa in data_module assignment in main.py [cherry-pick-0.44.0]
#1370 opened Apr 29, 2026 by yeyu-nvidia Contributor
Added fallback to load extra cudnn dlls in the site packages [cherry-pick-0.44.0]
#1369 opened Apr 29, 2026 by hthadicherla Contributor
k25 dflash hardcode support
#1367 opened Apr 29, 2026 by h-guo18 Contributor Draft
Experiment: MXFP4 -> NVFP4 conversion MSE study (scratch)
#1364 opened Apr 28, 2026 by cjluo-nv Collaborator Draft
3 tasks
Enable runtime optimization
#1358 opened Apr 28, 2026 by grzegorz-k-karch Contributor Draft
Add pre-built evaluation recipes for common benchmarks
#1357 opened Apr 27, 2026 by kaix-nv Contributor
[6106576] Restore llm_export_utils as deprecated shim for edgellm 0.6.1 compat [cherry-pick-0.44.0]
#1356 opened Apr 27, 2026 by ajrasane Contributor
2 tasks done
[6110209] Patch zero FP16 scales in INT4_AWQ ONNX export [cherry-pick-0.44.0]
#1353 opened Apr 27, 2026 by ajrasane Contributor
[minor] fixes for layerwise calib + MSE
#1344 opened Apr 24, 2026 by Fridah-nv Contributor
DSV4 dequant on the fly
#1341 opened Apr 24, 2026 by mxinO Contributor Draft
Update
#1338 opened Apr 23, 2026 by jingyu-ml Contributor Draft
[Refactor] speculative decoding: use mto config subsystem
#1328 opened Apr 23, 2026 by h-guo18 Contributor