
[feat] Support ProFit: Extend DFT with Probability Threshold-based Token Filtering#7921

Open
maybefunctionname wants to merge 5 commits into modelscope:main from maybefunctionname:feature_20260128

Conversation

@maybefunctionname

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

Overview

This PR extends the existing DFT (Dynamic Fine-Tuning) implementation to support the hard gating mechanism proposed in ProFit, which directly masks low-value tokens via probability thresholds to improve model generalization on reasoning tasks.

Key Improvements

DFT vs ProFit - Core Differences:

  • DFT (Soft Gating): Continuously reweights loss by token probability p (loss *= p), all tokens still contribute to gradients
  • ProFit (Hard Gating): Sets threshold τ to directly mask tokens with p < τ (loss *= mask), retaining only high-probability core tokens
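Concretely, the two modes differ only in the per-token weight applied to the cross-entropy loss. A minimal pure-Python sketch of the two weighting rules (function names are illustrative, not the repository's API):

```python
def soft_gate(p: float) -> float:
    """DFT soft gating: weight each token's loss by its probability p."""
    return p

def hard_gate(p: float, tau: float) -> float:
    """ProFit hard gating: keep the token only if p exceeds the threshold tau."""
    return 1.0 if p > tau else 0.0

probs = [0.05, 0.2, 0.9]
dft_weights = [soft_gate(p) for p in probs]           # every token contributes, scaled by p
profit_weights = [hard_gate(p, 0.1) for p in probs]   # tokens with p <= 0.1 are dropped
```

With `probs = [0.05, 0.2, 0.9]` and threshold `0.1`, DFT yields weights `[0.05, 0.2, 0.9]` while ProFit yields `[0.0, 1.0, 1.0]`.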

Advantages of ProFit:

  1. Prevents Surface-level Overfitting: Low-probability tokens often represent non-essential expressions (e.g., filler words, redundant modifiers). Forcing alignment with them distracts the model from core logic
  2. Stronger Generalization: Experiments show ProFit achieves 3.0%-10.9% average improvement over standard SFT on reasoning benchmarks like GPQA-Diamond and MATH-500
  3. Theoretical Guarantee: The paper proves low-probability tokens induce larger gradients that can overshadow key semantic signals
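One way to make the gradient claim concrete (standard cross-entropy algebra, not taken verbatim from the paper): for a target token with softmax probability $p$, the gradient of the cross-entropy loss with respect to the target logit is

$$\frac{\partial L}{\partial z} = p - 1,$$

so its magnitude approaches 1 as $p \to 0$, letting low-probability tokens dominate the update. DFT's soft gate rescales this to $p(p-1)$, which vanishes for low-probability tokens; ProFit's hard gate zeroes it outright whenever $p < \tau$.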

Implementation Details

To maintain code simplicity, this implementation reuses the enable_dft_loss parameter and switches modes via the environment variable HARD_GATING_PROBABILITY_THRESHOLD:

  • Variable not set: Uses original DFT soft gating
  • Set to float (e.g., 0.1): Enables ProFit hard gating, masking tokens below threshold
# Core modification (inside the DFT loss computation; requires `import os`)
if hard_gating_probability_threshold := os.getenv("HARD_GATING_PROBABILITY_THRESHOLD"):
    mask = (target_probs > float(hard_gating_probability_threshold)).float()
    target_probs = mask * (labels != -100).float()  # hard gating: drop tokens below the threshold
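
Since the PR mentions threshold validation and error handling in the trainer, the parsing step might look like the following sketch (the helper name `read_hard_gating_threshold` is hypothetical and may not match the merged code):

```python
import os
from typing import Optional

def read_hard_gating_threshold() -> Optional[float]:
    """Parse HARD_GATING_PROBABILITY_THRESHOLD from the environment.

    Returns None when the variable is unset, meaning DFT soft gating.
    Hypothetical helper; the validation mirrors what the PR describes.
    """
    raw = os.getenv("HARD_GATING_PROBABILITY_THRESHOLD")
    if raw is None:
        return None
    try:
        tau = float(raw)
    except ValueError as e:
        raise ValueError(
            f"HARD_GATING_PROBABILITY_THRESHOLD must be a float, got {raw!r}") from e
    if not 0.0 <= tau < 1.0:
        raise ValueError(f"threshold must lie in [0, 1), got {tau}")
    return tau
```

For example, with the variable unset the helper returns `None` (soft gating), and with `HARD_GATING_PROBABILITY_THRESHOLD=0.1` it returns `0.1`.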

Usage Example

# Enable ProFit (threshold 0.1)
HARD_GATING_PROBABILITY_THRESHOLD=0.1 swift sft --enable_dft_loss true ...

# Use original DFT
swift sft --enable_dft_loss true ...

References

- Implement hard gating for the [DFT](https://arxiv.org/abs/2508.05629) loss in SFT training
- Add the environment variable HARD_GATING_PROBABILITY_THRESHOLD for setting the probability threshold
- Implement a token-probability-based hard gating mechanism; low-probability tokens are excluded from the loss computation
- Update the documentation to explain the difference between DFT soft gating and ProFit hard gating
- Add probability threshold validation and error handling in the trainer
- Provide an example script demonstrating the hard gating configuration
@maybefunctionname maybefunctionname marked this pull request as draft January 28, 2026 06:19
@gemini-code-assist
Contributor

Summary of Changes

Hello @maybefunctionname, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new feature that extends the Dynamic Fine-Tuning (DFT) framework with support for ProFit's hard gating mechanism. This enhancement allows for more targeted supervised fine-tuning by masking out low-probability tokens, which can lead to improved model generalization and prevent overfitting to less significant linguistic elements. The implementation is designed to be flexible, enabling users to easily activate ProFit's hard gating via an environment variable while retaining the option for DFT's original soft gating.

Highlights

  • ProFit Integration: Implemented support for ProFit's hard gating mechanism, which filters tokens based on a probability threshold, as an extension to the existing DFT (Dynamic Fine-Tuning) loss.
  • Configurable Gating: Users can now switch between DFT's soft gating and ProFit's hard gating by setting the HARD_GATING_PROBABILITY_THRESHOLD environment variable when enable_dft_loss is active.
  • Documentation Updates: Updated command-line parameter documentation (both Chinese and English versions) to reflect the new ProFit feature and its configuration.
  • Core Logic Modification: Modified loss calculation functions in swift/megatron/trainers/trainer.py and swift/trainers/utils.py to apply the hard gating mask based on the specified probability threshold.
  • Example Script Update: Added a comment to the dft.sh example script demonstrating how to enable ProFit.


@maybefunctionname maybefunctionname marked this pull request as ready for review January 28, 2026 06:20

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for ProFit, a hard-gating mechanism for token filtering, as an extension to the existing DFT loss. The implementation correctly uses an environment variable to switch between soft and hard gating. The documentation and examples have also been updated accordingly.

My main feedback is regarding code duplication. The logic for applying the gating mechanism is repeated in three different places. I've left comments with suggestions to refactor this into a single helper function to improve code quality and maintainability.

- Extract the gating-factor computation in the DFT loss into a get_dft_gating_factor function
- Remove the duplicated gating-factor computation from trainer.py
- Unify the handling of the soft gating (DFT) and hard gating (ProFit) modes
- Keep the existing HARD_GATING_PROBABILITY_THRESHOLD environment variable support
- Reuse the same gating-factor computation across the loss functions
- Simplify the code structure and improve maintainability
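
The deduplication described above could be sketched as a single helper shared by the loss functions. This is a torch-free, list-based illustration; the actual get_dft_gating_factor in the repository operates on tensors and may differ in detail:

```python
import os

def get_dft_gating_factor(target_probs, valid_mask):
    """Return per-token gating factors for the DFT loss.

    Soft gating (default): the factor is the token probability itself.
    Hard gating (ProFit):  the factor is 1.0 for tokens whose probability
    exceeds HARD_GATING_PROBABILITY_THRESHOLD, else 0.0.
    List-based sketch of the tensor helper, for illustration only.
    """
    threshold = os.getenv("HARD_GATING_PROBABILITY_THRESHOLD")
    if threshold is None:
        # DFT: scale each valid token's loss by its probability
        return [p * v for p, v in zip(target_probs, valid_mask)]
    tau = float(threshold)
    # ProFit: binary mask, restricted to valid (non-ignored) positions
    return [float(p > tau) * v for p, v in zip(target_probs, valid_mask)]
```

For example, with probabilities `[0.05, 0.5]` and both positions valid, the helper returns `[0.05, 0.5]` when the variable is unset and `[0.0, 1.0]` with a threshold of `0.1`.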
- Fix the import path of get_dft_gating_factor
- Switch from a relative import to an absolute import
- Ensure consistent module references
- Remove the unused os import from trainer.py
- Reorder the module imports in trainer.py for readability
- Use a single-quoted string for the hard gating probability threshold environment variable in utils.py
- Reorganize the import statements in trainer.py to match the code style
- Fix the string quoting used when reading the environment variable