
[bugfix] fix qwen3.5 gpt_bridge lora #28

Merged
Jintao-Huang merged 5 commits into modelscope:main from Jintao-Huang:fix_qwen3_5_gpt_bridge on Apr 14, 2026

Conversation

@Jintao-Huang (Collaborator)

No description provided.


@gemini-code-assist (bot) left a comment


Code Review

This pull request updates the weight loading and exporting logic in gpt_bridge.py to support interleaved layouts for linear attention components, specifically affecting QKV, Z, B, and A projections. The changes include refactoring dimension calculations and updating tensor reshaping and splitting logic. Feedback identifies a critical issue where FP8 scale factors (in_scale_inv) are not interleaved, which will lead to incorrect scaling in Megatron-Core. Additionally, the reviewer suggests adding safeguards against ZeroDivisionError when num_key_heads is zero and handling edge cases in tensor slicing when a_dim is zero to prevent incorrect weight exports.
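The interleaved layout described above can be sketched as follows. This is an illustrative example only, not the actual bridge code: it shows how a flat HF-style projection weight laid out component-major ([q | k | v | z | b | a]) can be regrouped head-major, so each key head's slices become contiguous. All names, the component set, and per-head dimensions are assumptions for illustration.

```python
import torch

def interleave_linear_attn(weight: torch.Tensor, num_key_heads: int,
                           dims: list[int]) -> torch.Tensor:
    """Regroup row-blocks of `weight` from component-major to head-major order.

    `dims` gives the per-head row count of each component (e.g. q, k, v, z, b, a).
    """
    total = sum(dims) * num_key_heads
    assert weight.shape[0] == total, "row count must match dims * heads"
    # Split into component blocks, then split each block into per-head slices.
    blocks = torch.split(weight, [d * num_key_heads for d in dims], dim=0)
    per_head = [torch.chunk(b, num_key_heads, dim=0) for b in blocks]
    # Interleave: head 0's components first, then head 1's, and so on.
    rows = [per_head[c][h]
            for h in range(num_key_heads)
            for c in range(len(dims))]
    return torch.cat(rows, dim=0)
```

Note that any per-tensor metadata sliced the same way (such as FP8 scale factors, per the review's critical finding) would need to pass through the same regrouping to stay aligned with its weight rows.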

Comment thread src/mcore_bridge/bridge/gpt_bridge.py Outdated
Comment thread src/mcore_bridge/bridge/gpt_bridge.py
Comment thread src/mcore_bridge/bridge/gpt_bridge.py
@Jintao-Huang (Collaborator, Author)

/gemini review


@gemini-code-assist (bot) left a comment


Code Review

This pull request refactors the weight conversion logic in gpt_bridge.py to support an interleaved layout for linear attention, specifically adjusting how key and value dimensions are handled across multiple heads for both standard and LoRA weights. The review identified a critical NameError where in_scale_inv is referenced at line 1271 despite its definition being removed in the same patch. Additionally, the reviewer noted that scale_inv has become an unused variable following the removal of FP8 scale export logic and suggested using an underscore to ignore the return value.
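The two findings in this round are both dangling-reference issues. A hypothetical sketch (names and the `(weight, scale_inv)` return shape are assumptions, not the actual gpt_bridge.py code): if FP8 scale export is removed, every later reference to `in_scale_inv` must go with it, and a return value that is unpacked but no longer used is conventionally bound to `_`.

```python
import torch

def load_weight(state_dict: dict, key: str):
    # Hypothetical helper: FP8 checkpoints pair each weight with a
    # "<key>_scale_inv" tensor; non-FP8 checkpoints yield None.
    weight = state_dict[key]
    scale_inv = state_dict.get(key + "_scale_inv")
    return weight, scale_inv

def export_without_fp8(state_dict: dict, key: str) -> torch.Tensor:
    # scale_inv is intentionally unused after FP8 export was dropped,
    # so bind it to `_` rather than leave an unused (or undefined) name.
    weight, _ = load_weight(state_dict, key)
    return weight
```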

Comment thread src/mcore_bridge/bridge/gpt_bridge.py Outdated
Comment thread src/mcore_bridge/bridge/gpt_bridge.py Outdated
@Jintao-Huang (Collaborator, Author)

/gemini review


@gemini-code-assist (bot) left a comment


Code Review

This pull request refactors the weight loading and export logic in _set_linear_attn_state by adjusting the calculation of key_dim and value_dim and updating tensor splitting and reshaping operations to account for the number of key heads. However, the changes remove the handling of scale inversion factors (hf_scale_inv and scale_inv), which are essential for maintaining FP8 quantization support. It is recommended to restore this logic to ensure compatibility with quantized models during both weight loading and export.
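To see why the removed scale-inversion handling matters: FP8 checkpoints store block-wise inverse scale factors, and the weight must be dequantized with them before any reshaping or interleaving is applied. A minimal sketch, assuming a 1-D block layout along the row dimension (block size, names, and shapes are illustrative, not the actual bridge code):

```python
import torch

def dequantize_fp8(weight: torch.Tensor, scale_inv: torch.Tensor,
                   block: int) -> torch.Tensor:
    """Expand per-block scale_inv to per-row granularity and apply it."""
    # Repeat each block's scale to cover its rows, trimming any padding.
    scales = scale_inv.repeat_interleave(block, dim=0)[: weight.shape[0]]
    return weight.float() * scales.unsqueeze(-1)
```

If this step is skipped (or if the scale tensor is not regrouped alongside an interleaved weight), the exported weights carry the wrong magnitudes, which matches the review's recommendation to restore the logic for both load and export paths.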

Comment thread src/mcore_bridge/bridge/gpt_bridge.py
Comment thread src/mcore_bridge/bridge/gpt_bridge.py
@Jintao-Huang merged commit b95b5df into modelscope:main on Apr 14, 2026
1 check passed
