
Add QAT (Quantization-Aware Training) Support Callback #8042

Open
y2logic wants to merge 4 commits into modelscope:main from y2logic:qat_support

Conversation

@y2logic (Contributor) commented Feb 12, 2026

PR type

  • New Feature

PR information

This PR introduces a new QatCallback implementation to support Quantization-Aware Training (QAT) using TorchAO.

TorchAO is a PyTorch-native library with support for custom high-performance data types, quantization, and sparsity.
By simulating quantization effects during training, QAT typically yields a significantly more accurate quantized model than post-training quantization (PTQ).
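For intuition, the "simulated quantization" at the heart of QAT is a quantize-dequantize round-trip in the forward pass, so the network learns to tolerate rounding error. Below is a minimal illustrative sketch (not this PR's code; the function and parameter names are invented for illustration, and real QAT also uses a straight-through estimator so gradients flow through the rounding op):

```python
import torch

def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Symmetric per-tensor quantize-dequantize: values snap to an int grid
    but stay in floating point, so training can proceed normally."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for int4
    scale = w.abs().max().clamp(min=1e-8) / qmax    # avoid division by zero
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
```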

This callback integrates seamlessly with the existing Trainer framework and enables the following (a usage sketch follows the list):

  • Automatic fake-quant insertion at training start
  • Post-training quantized model export
  • Support for several quantization configurations
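As a hypothetical usage sketch: the changelog below shows the callback is registered under the 'qat' key in callbacks_map, so wiring it into a transformers-style Trainer might look like this (the exact constructor arguments in swift may differ):

```python
# Hypothetical sketch; actual swift wiring may differ.
from swift.callbacks.mapping import callbacks_map

QatCallback = callbacks_map['qat']   # entry added by this PR
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    callbacks=[QatCallback()],       # fake-quant on train begin,
)                                    # quantized export on train end
trainer.train()
```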

Experiment results

@gemini-code-assist (bot) commented
Summary of Changes

Hello @y2logic, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the training framework by introducing a new callback for Quantization-Aware Training (QAT). This feature leverages the TorchAO library to seamlessly integrate quantization effects during the training process, leading to more efficient and performant models. It automates the setup for QAT and handles the export of the final quantized model.

Highlights

  • New QAT Callback: Introduced a new QatCallback class to enable Quantization-Aware Training (QAT) using the TorchAO library.
  • Callback Integration: Integrated the QatCallback into the system's callbacks_map for seamless activation within the training framework.
  • Automatic QAT Setup: Implemented automatic fake-quantizer insertion at the beginning of training to simulate quantization effects.
  • Quantized Model Export: Added functionality to export the post-training quantized model at the end of the training process.


Changelog
  • swift/callbacks/mapping.py
    • Added 'qat' entry to the callbacks_map dictionary, linking to the new QatCallback.
  • swift/callbacks/qat.py
    • Created a new file implementing the QatCallback class, which manages QAT initialization and quantized model saving.
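For readers who want the shape of the implementation without opening the diff, here is a hedged sketch of what such a callback could look like, assuming transformers' TrainerCallback hooks and torchao's QATConfig prepare/convert flow (available in recent torchao versions); the actual qat.py in this PR may differ:

```python
from transformers import TrainerCallback
from torchao.quantization import Int4WeightOnlyConfig, quantize_
from torchao.quantization.qat import QATConfig


class QatCallback(TrainerCallback):
    """Sketch only: insert fake-quant at train start, convert at train end."""

    def __init__(self):
        # The PR hardcodes Int4WeightOnlyConfig; see the review below for a
        # suggestion to make this configurable.
        self.base_config = Int4WeightOnlyConfig()

    def on_train_begin(self, args, state, control, model=None, **kwargs):
        # Swap supported layers for fake-quantized versions in place.
        quantize_(model, QATConfig(self.base_config, step='prepare'))

    def on_train_end(self, args, state, control, model=None, **kwargs):
        # Replace fake-quant modules with real quantized weights for export.
        quantize_(model, QATConfig(self.base_config, step='convert'))
```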

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a QatCallback for Quantization-Aware Training, which is a valuable addition. The implementation provides a solid foundation. My review focuses on improving flexibility and correctness. Specifically, the quantization configuration is currently hardcoded, which limits the callback's utility. I've suggested making this configurable. Additionally, there's a minor issue with a log message. Addressing these points will make the new callback more robust and user-friendly.

Comment on lines +20 to +21
```python
from torchao.quantization import Int4WeightOnlyConfig
self.quant_config = Int4WeightOnlyConfig()
```
Severity: high

This section can be improved in two ways:

  1. Flexibility: The quantization configuration is hardcoded to Int4WeightOnlyConfig. This limits the callback's utility and contradicts the PR description's goal of supporting 'several quantization configurations'. It should be made configurable via training arguments to allow users to select different torchao quantization schemes.
  2. Code Style: The import statement is inside a method. For better readability and consistency with PEP 8, all imports (including those in on_train_begin and on_train_end) should be at the top of the file.

Here's a suggestion for the logic inside __init__, assuming import torchao.quantization as ao_quant is moved to the top:

Suggested change

```diff
-from torchao.quantization import Int4WeightOnlyConfig
-self.quant_config = Int4WeightOnlyConfig()
+import torchao.quantization as ao_quant
+# This assumes a new training argument `qat_config` is added,
+# e.g. with a value like 'Int4WeightOnlyConfig'.
+qat_config_name = getattr(self.args, 'qat_config', 'Int4WeightOnlyConfig')
+quant_config_cls = getattr(ao_quant, qat_config_name, None)
+if not quant_config_cls:
+    raise ValueError(f"Unknown QAT config: '{qat_config_name}'")
+self.quant_config = quant_config_cls()
```
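If this suggestion were adopted, a user could (hypothetically) select another scheme by setting the new argument, e.g. qat_config='Int8WeightOnlyConfig', provided a config class with that name exists in the installed torchao version; the getattr lookup resolves the name against torchao.quantization at runtime and fails loudly for unknown names.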

@y2logic marked this pull request as ready for review on February 12, 2026, 08:40
@Jintao-Huang (Collaborator) commented
Please merge the main branch, then run the following code:

```shell
pip install pre-commit
pre-commit run --all-files
```

@y2logic (Contributor, Author) commented Feb 13, 2026

Code style improved.
