feat: Support Flux2 in other common components #1463
Code Review
This pull request implements a Mistral prompt-to-message conversion flow and extends the Tokenizer interface to support a max_sequence_length parameter with padding logic. Feedback highlights several critical issues: a potential out-of-bounds crash and debug code in EmbedWorkerImpl, a violation of the repository style guide regarding global flag registration in help_formatter.h, and multiple safety concerns in LLMMaster including missing error callbacks, exception-unsafe JSON parsing, and the failure to clear stale prompt tokens after a prompt update.
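As a rough illustration of the `max_sequence_length` padding behavior the summary mentions, a minimal sketch follows. The helper name `pad_or_truncate`, the right-padding direction, the `pad_id` parameter, and the convention that `0` means "no limit" are all assumptions made for this example; they are not taken from the PR's actual Tokenizer interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the xllm Tokenizer API): force a token id
// sequence to exactly max_sequence_length entries.
// Assumptions: padding is appended on the right, and a
// max_sequence_length of 0 means "leave the sequence unchanged".
std::vector<int32_t> pad_or_truncate(std::vector<int32_t> ids,
                                     std::size_t max_sequence_length,
                                     int32_t pad_id) {
  if (max_sequence_length == 0) {
    return ids;  // no limit requested
  }
  // resize() drops trailing ids when the sequence is too long and
  // appends pad_id when it is too short.
  ids.resize(max_sequence_length, pad_id);
  return ids;
}
```

Under these assumptions, calling `pad_or_truncate({1, 2, 3}, 5, 0)` pads the sequence to five ids, and calling it with a limit of `2` keeps only the first two ids.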
wang-shuibin
left a comment
This PR adapts common components for Flux2 and completes accuracy verification for the Flux2 model. Additionally, it fixes a bug in xllm/core/layers/common/add_matmul.cpp.
5a578fa to a9e211e
c1e2d2f to af0649b
This PR adapts common components for Flux2. Accuracy verification for the Flux2 model has also been completed.