
[template] Support HunyuanMT1.5-1.8B and HunyuanMT1.5-7B templates #7351

Open
rinne1998 wants to merge 1 commit into modelscope:main from rinne1998:main

Conversation

@rinne1998

  • Add hunyuan_mt1_5_1_8b and hunyuan_mt1_5_7b template types to LLMTemplateType
  • Move hunyuan_moe and hunyuan template definitions from llm.py to tencent.py

Written with reference to the corresponding models' chat templates:
https://huggingface.co/tencent/HY-MT1.5-1.8B/blob/main/chat_template.jinja
https://huggingface.co/tencent/HY-MT1.5-7B/blob/main/chat_template.jinja

If HY-MT1.5-7B uses the default hunyuan template, it generates malformed end-of-sequence tokens and repeated fragments. For training, specify --template hunyuan_mt1_5_7b for HY-MT1.5-7B, and --template hunyuan_mt1_5_1_8b for HY-MT1.5-1.8B.

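The new template types added to LLMTemplateType might look roughly like the following. This is a hypothetical sketch based on the identifiers named in this PR; the actual class layout and surrounding constants in ms-swift may differ.

```python
# Hypothetical sketch of the constants this PR introduces; only the two
# new names (hunyuan_mt1_5_1_8b, hunyuan_mt1_5_7b) come from the PR itself.
class LLMTemplateType:
    hunyuan = 'hunyuan'
    hunyuan_moe = 'hunyuan_moe'
    # New template types for the HY-MT1.5 models:
    hunyuan_mt1_5_1_8b = 'hunyuan_mt1_5_1_8b'
    hunyuan_mt1_5_7b = 'hunyuan_mt1_5_7b'
```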
@gemini-code-assist
Contributor

Summary of Changes

Hello @rinne1998, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers get up to speed quickly!

This pull request enhances the system's compatibility with new large language models by integrating specific chat templates for HunyuanMT1.5-1.8B and HunyuanMT1.5-7B. This ensures that these models can process and generate text accurately according to their unique formatting requirements. Concurrently, the change improves the codebase's structure by relocating existing Hunyuan-related template definitions to a more appropriate module, streamlining maintenance and future expansions.

Highlights

  • New Model Template Support: Added support for HunyuanMT1.5-1.8B and HunyuanMT1.5-7B models by introducing dedicated chat templates, ensuring correct prompt formatting and preventing issues like abnormal terminators or repeated segments.
  • Template Refactoring: Refactored existing hunyuan_moe and hunyuan template definitions by moving them from llm.py to tencent.py to centralize Tencent-related templates and improve code organization.
  • Custom Prompt Formatting: Implemented a new HunyuanMT15_18B_Template class with a custom format_prompt method to handle the specific requirements for the HunyuanMT1.5-1.8B model's conversational structure.




@gemini-code-assist bot left a comment


Code Review

This PR adds support for the HunyuanMT1.5-1.8B and HunyuanMT1.5-7B templates and refactors the existing Hunyuan templates into the tencent.py file. The refactoring is well done. However, the template definitions for both new models have problems with multi-turn conversations: chat_sep is defined incorrectly, which causes issues in multi-turn scenarios. In addition, the hunyuan_mt1_5_1_8b template contains an unused class and method that should be removed. I have provided suggestions to fix these issues.

Comment on lines +133 to +157
class HunYuanMT15_18B_Template(Template):
    def format_prompt(self, messages, add_generation_prompt: bool = False, **kwargs):
        s = "<|hy_begin▁of▁sentence|>"
        for m in messages:
            if m["role"] == "user":
                s += "<|hy_User|>" + m["content"]
            elif m["role"] == "assistant":
                s += "<|hy_Assistant|>" + m["content"] + "<|hy_place▁holder▁no▁2|>"
        if add_generation_prompt:
            s += "<|hy_Assistant|>"
        else:
            s += "<|hy_place▁holder▁no▁8|>"
        return s


register_template(
    TemplateMeta(
        LLMTemplateType.hunyuan_mt1_5_1_8b,
        template_cls=HunYuanMT15_18B_Template,
        prefix=["<|hy_begin▁of▁sentence|>"],
        prompt=["<|hy_User|>{{QUERY}}<|hy_Assistant|>"],
        chat_sep=[""],
        suffix=["<|hy_place▁holder▁no▁2|><|hy_place▁holder▁no▁8|>"],
    )
)

Severity: high

The HunYuanMT15_18B_Template class and its format_prompt method do not appear to be used by the Swift template framework, which relies on TemplateMeta for encoding. Moreover, the current TemplateMeta for hunyuan_mt1_5_1_8b is also incorrect for multi-turn conversations, because it omits the separator between conversation turns.

To fix this, the unused class should be removed and the TemplateMeta corrected to match the model's official jinja template. Specifically, chat_sep needs to be set correctly, and suffix also needs to be adjusted so the conversation is terminated properly.

register_template(
    TemplateMeta(
        LLMTemplateType.hunyuan_mt1_5_1_8b,
        prefix=["<|hy_begin▁of▁sentence|>"],
        prompt=["<|hy_User|>{{QUERY}}<|hy_Assistant|>"],
        chat_sep=["<|hy_place▁holder▁no▁2|>"],
        suffix=["<|hy_place▁holder▁no▁2|>", "<|hy_place▁holder▁no▁8|>"],
    )
)
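To illustrate why an empty chat_sep is a problem, here is a minimal standalone sketch of how prefix / prompt / chat_sep / suffix could compose a multi-turn prompt. This is not ms-swift's actual encoder; the composition order is an assumption based on the fields shown in the suggestion above.

```python
# Minimal sketch (assumed composition order, not ms-swift's real encoder):
# each turn is rendered from `prompt`, turns are joined with `chat_sep`,
# and `suffix` closes a fully-answered conversation. With chat_sep == ''
# the separator between turns would silently disappear.
def render(turns, prefix, prompt, chat_sep, suffix):
    # turns: list of (query, response); a trailing response of None
    # leaves the prompt open for generation.
    pieces = []
    for query, response in turns:
        piece = prompt.replace('{{QUERY}}', query)
        if response is not None:
            piece += response
        pieces.append(piece)
    s = prefix + chat_sep.join(pieces)
    if turns[-1][1] is not None:
        s += suffix
    return s

text = render(
    [('Hello', 'Hi!'), ('Translate this', None)],
    prefix='<|hy_begin▁of▁sentence|>',
    prompt='<|hy_User|>{{QUERY}}<|hy_Assistant|>',
    chat_sep='<|hy_place▁holder▁no▁2|>',
    suffix='<|hy_place▁holder▁no▁2|><|hy_place▁holder▁no▁8|>',
)
print(text)
# -> <|hy_begin▁of▁sentence|><|hy_User|>Hello<|hy_Assistant|>Hi!<|hy_place▁holder▁no▁2|><|hy_User|>Translate this<|hy_Assistant|>
```

With the corrected chat_sep, the placeholder separator appears between the first and second turn, matching the jinja template's per-turn terminator.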

Comment on lines +159 to +167
register_template(
TemplateMeta(
LLMTemplateType.hunyuan_mt1_5_7b,
prefix=["<|startoftext|>"],
prompt=["{{QUERY}}<|extra_0|>"],
chat_sep=[""],
suffix=["<|eos|>"],
)
)

Severity: high

The chat_sep of the hunyuan_mt1_5_7b template is currently an empty string, which is incorrect for multi-turn conversations. According to the model's official jinja template, each assistant reply should be followed by an <|eos|> token as a separator. This should be reflected in both chat_sep and suffix.

register_template(
    TemplateMeta(
        LLMTemplateType.hunyuan_mt1_5_7b,
        prefix=["<|startoftext|>"],
        prompt=["{{QUERY}}<|extra_0|>"],
        chat_sep=["<|eos|>"],
        suffix=["<|eos|>"],
    )
)
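The same composition idea can be checked for the 7B template. The short sketch below (assumed composition order, not ms-swift's actual encoder) shows where <|eos|> lands between turns once chat_sep is set as suggested.

```python
# Standalone sketch of the 7B multi-turn prompt with <|eos|> as chat_sep.
# The join logic is an assumption for illustration, not ms-swift's encoder.
prefix, prompt, chat_sep = '<|startoftext|>', '{{QUERY}}<|extra_0|>', '<|eos|>'
turns = [('How are you?', 'Fine.'), ('Translate: hello', None)]
pieces = []
for query, response in turns:
    # A response of None leaves the last turn open for generation.
    pieces.append(prompt.replace('{{QUERY}}', query) + (response or ''))
text = prefix + chat_sep.join(pieces)
print(text)
# -> <|startoftext|>How are you?<|extra_0|>Fine.<|eos|>Translate: hello<|extra_0|>
```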

