Open
Labels
question: Further information is requested
Description
Checklist
- I have searched existing issues, and this is a new question or discussion topic.
Question Description
swift rlhf \
    --rlhf_type dpo \
    --train_type lora \
    --model /cpfs_fundata/baolujia.blj/models/Qwen3-8B \
    --resume_from_checkpoint v0-20260212-195919/checkpoint-4578 \
    --resume_only_model \
    --ignore_data_skip \
and training LoRA directly on the merged model:
swift rlhf \
    --rlhf_type dpo \
    --train_type lora \
    --model v0-20260212-195919/checkpoint-4578-merged \
    --resume_only_model \
    --ignore_data_skip \
The loss is completely different between the two runs.
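One factor that could contribute to a gap like this is the DPO reference model. The sketch below is only an illustration under stated assumptions, not a description of what ms-swift actually does: it assumes that with LoRA the reference log-probabilities come from the model with the adapter disabled, so in the first setup the reference would be the original Qwen3-8B, while in the second it would be the merged checkpoint itself. The log-probability values and `beta=0.1` are hypothetical.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair of sequence log-probabilities."""
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)) == log(1 + exp(-beta * margin))
    return math.log1p(math.exp(-beta * margin))

# Hypothetical sequence log-probabilities, for illustration only.
base = {"chosen": -42.0, "rejected": -40.0}   # original base model
tuned = {"chosen": -35.0, "rejected": -48.0}  # base + trained adapter (same function as merged weights)

# Setup A (assumed): policy = base + resumed adapter, reference = base with adapter disabled.
loss_a = dpo_loss(tuned["chosen"], tuned["rejected"], base["chosen"], base["rejected"])

# Setup B (assumed): policy = merged model + fresh zero-initialised LoRA, reference = merged model.
# At step 0 the policy equals the reference, so the margin is 0 and the loss is ln 2.
loss_b = dpo_loss(tuned["chosen"], tuned["rejected"], tuned["chosen"], tuned["rejected"])

print(f"setup A: {loss_a:.4f}  setup B: {loss_b:.4f}")
```

If this assumption holds, the two runs optimise against different references and their loss values are not directly comparable, even though the policy weights are equivalent at the start. Whether it applies here would need to be confirmed against the ms-swift implementation.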