RFPO Discussion #3

@catqaq

Description

https://github.com/catqaq/OpenLLMAI-Research/blob/main/docs/RFPO.md
Know Your Mistakes and Fix Them, RFPO: Explicitly Learning from Errors to Unlock the Upper Limit of LLMs in RL - an article by OpenLLMAI - Zhihu
https://zhuanlan.zhihu.com/p/1926393705562083577
