refactor: remove xattention one-stage decode path. by LMX-xin · Pull Request #1504 · jd-opensource/xllm

LMX-xin · 2026-05-21T07:39:00Z

No description provided.

gemini-code-assist

Code Review

This pull request removes the one-stage decode execution path for XAttention, standardizing the implementation on the two-stage approach. The changes include the removal of the enable_xattention_one_stage configuration flag from the C API and core framework, the deletion of the run_single_stage_decode method, and the simplification of the REC worker's asynchronous input preparation logic. Feedback for this PR highlights a critical issue in the updated unit tests where the max_tokens_per_batch configuration is inconsistent with the sequence lengths used in the test input, potentially leading to memory corruption or incorrect attention results during validation.

gemini-code-assist · 2026-05-21T07:46:12Z

-  torch::Tensor run_decode_once(DecodeTestInput& input, bool enable_two_stage) {
-    RecConfig::get_instance().enable_xattention_one_stage(!enable_two_stage);
+  torch::Tensor run_two_stage_decode_once(DecodeTestInput& input) {
    SchedulerConfig::get_instance().max_tokens_per_batch(kSharedSeqLen);


The max_tokens_per_batch configuration is set to kSharedSeqLen (300), but the test setup in create_decode_test_input (lines 237-241) defines kv_cu_seq_lens reaching up to kBatchSize * kSharedSeqLen (1200). In the two-stage decode implementation (xattention.cpp), max_tokens_per_batch is used as the boundary between shared and unshared KV cache. This mismatch causes the shared prompts to overlap with the unshared beam-specific cache area, which will lead to incorrect attention results or memory corruption during testing.
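The arithmetic behind this comment can be checked with a minimal, standalone sketch. It does not use any xllm code: make_kv_cu_seq_lens and shared_region_fits are hypothetical helpers, the batch size of 4 is inferred from 1200 / 300, and the invariant "all shared KV tokens must fit below max_tokens_per_batch" is an assumption based on the review's description of the boundary, not a confirmed reading of xattention.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of xllm): builds cumulative KV sequence
// lengths for `batch_size` requests that each hold `seq_len` prompt tokens,
// e.g. {0, 300, 600, 900, 1200} for batch_size = 4 and seq_len = 300.
std::vector<int32_t> make_kv_cu_seq_lens(int32_t batch_size, int32_t seq_len) {
  assert(batch_size >= 0 && seq_len >= 0);
  std::vector<int32_t> cu(static_cast<std::size_t>(batch_size) + 1, 0);
  for (int32_t i = 0; i < batch_size; ++i) {
    cu[i + 1] = cu[i] + seq_len;
  }
  return cu;
}

// Assumed invariant (taken from the review text, not from xattention.cpp):
// the shared KV region ends at `max_tokens_per_batch`, so the total number
// of shared tokens must not exceed that boundary.
bool shared_region_fits(const std::vector<int32_t>& kv_cu_seq_lens,
                        int32_t max_tokens_per_batch) {
  return !kv_cu_seq_lens.empty() &&
         kv_cu_seq_lens.back() <= max_tokens_per_batch;
}
```

Under these assumptions, shared_region_fits(make_kv_cu_seq_lens(4, 300), 300) is false, which is the mismatch the review describes, while a boundary of 4 * 300 = 1200 satisfies the check. Whether the test should raise max_tokens_per_batch or shrink the input is a decision for the PR author.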

refactor: remove xattention one-stage decode path.

300dd28

gemini-code-assist Bot reviewed May 21, 2026

LMX-xin wants to merge 1 commit into jd-opensource:main from LMX-xin:refactor/xattention
