bugfix: reduce acl graph memory overhead. by RobbieLeung · Pull Request #1457 · jd-opensource/xllm

RobbieLeung · 2026-05-15T03:37:24Z

Size the ACL graph `persistent_mask_` by the decode/spec-verify graph capacity instead of `max_tokens_per_batch`.

gemini-code-assist

Code Review

This pull request optimizes memory allocation for persistent buffers in the ACL graph executor by sizing the attention mask based on decode graph capacity rather than the prefill budget. It also refines the capacity calculation for speculative decoding. Feedback indicates that the updated capacity function is incorrectly used for sequence-indexed metadata, causing significant memory over-allocation for tensors like block tables, which undermines the PR's memory reduction goals.
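The sizing distinction raised in the review can be illustrated with a minimal sketch. It is not the xllm implementation: the struct, its field names, the helper functions, and the assumption that a spec-verify step holds `num_speculative_tokens + 1` tokens per sequence are all hypothetical and chosen only for illustration. The idea is that token-indexed persistent buffers (such as the attention mask) follow the decode graph's token capacity, while sequence-indexed metadata (such as block tables) follows the sequence count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sizing options; the field names are illustrative, not xllm's.
struct GraphSizingOptions {
  int64_t max_tokens_per_batch;    // prefill token budget
  int64_t max_seqs_per_batch;      // most sequences a decode graph serves
  int64_t num_speculative_tokens;  // 0 when speculative decoding is off
};

// Token capacity of a captured decode graph. Assumption: plain decode holds
// one token per sequence, and spec-verify holds (k + 1) tokens per sequence.
int64_t decode_graph_token_capacity(const GraphSizingOptions& o) {
  assert(o.max_seqs_per_batch > 0 && o.num_speculative_tokens >= 0);
  return o.max_seqs_per_batch * (o.num_speculative_tokens + 1);
}

// Row count for sequence-indexed metadata such as block tables: one row per
// sequence, independent of how many tokens each sequence verifies.
int64_t sequence_capacity(const GraphSizingOptions& o) {
  assert(o.max_seqs_per_batch > 0);
  return o.max_seqs_per_batch;
}

// Bytes needed by a dense [rows x cols] persistent buffer.
int64_t buffer_bytes(int64_t rows, int64_t cols, int64_t elem_size) {
  assert(rows >= 0 && cols >= 0 && elem_size > 0);
  return rows * cols * elem_size;
}
```

Under these assumptions, a mask sized with `decode_graph_token_capacity` is smaller than one sized with `max_tokens_per_batch` whenever the prefill budget exceeds the decode capacity, and a block table sized with `sequence_capacity` avoids the extra `(k + 1)` factor that the review flags as over-allocation.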

RobbieLeung requested review from DongheJin, JimHsiung, XuZhang99, liutongxuan, walsonyang and yq33victor as code owners May 15, 2026 03:37

RobbieLeung changed the title ~~bugfix: reduce qwen3.5 acl graph memory overhead.~~ bugfix: reduce acl graph memory overhead. May 15, 2026

gemini-code-assist Bot reviewed May 15, 2026

Comment thread xllm/core/runtime/acl_graph_executor_impl.cpp

bugfix: reduce acl graph memory overhead.

3e05202

RobbieLeung force-pushed the bugfix/qwen35-acl-graph-memory-overhead branch from 5e744c2 to 3e05202 Compare May 21, 2026 09:48

RobbieLeung requested review from Clement-Wang26, DragonFive, Kang-Meng, liujinguang0125, xiao-yu-chen, yingxudeng and zhang-minchao as code owners May 21, 2026 09:48

XuZhang99 approved these changes May 21, 2026

zhang-minchao approved these changes May 21, 2026

RobbieLeung wants to merge 1 commit into jd-opensource:main from RobbieLeung:bugfix/qwen35-acl-graph-memory-overhead

3 participants
