feat:remove unused func and support deepseek_v4_mtp graph on npu. #1517

Open
panxua wants to merge 10 commits into jd-opensource:preview/deepseek-v4-npu from panxua:preview/deepseek-v4-npu-0522-graph
@panxua commented May 22, 2026

No description provided.

@gemini-code-assist (bot) left a comment

Code Review

This pull request refactors DeepSeek V4 and MTP model implementations to support ACL graph forward metadata, introducing actual_metadata_rows to handle MTP validation scenarios. Key changes include replacing the is_deepseek_v4_model_type utility with explicit string comparisons and updating GraphPersistentParam logic. Review feedback highlights several critical issues: a missing tokens argument in mtp_block_ calls that will cause compilation errors, the removal of layer_index capping which risks out-of-bounds access, and a potential device mismatch when initializing runtime_device. Additionally, the reviewer recommended using emplace_back instead of push_back for torch::Tensor elements to align with the style guide.
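The out-of-bounds concern around layer_index capping can be illustrated with a minimal sketch. The names cap_layer_index and select_layer are hypothetical and do not come from this PR; the PR's actual capping logic is not shown in this conversation, so this only demonstrates the general technique of clamping an index before it is used.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from xllm): clamp a requested layer index to the
// last valid slot. Precondition: num_layers > 0.
std::size_t cap_layer_index(std::size_t layer_index, std::size_t num_layers) {
  return std::min(layer_index, num_layers - 1);
}

// Hypothetical accessor: without the cap, an index >= layers.size() would
// read past the end of the vector, which is undefined behavior.
int select_layer(const std::vector<int>& layers, std::size_t layer_index) {
  return layers[cap_layer_index(layer_index, layers.size())];
}
```

With three layers, select_layer(layers, 7) reads the last layer instead of reading out of range. Whether clamping or a hard failure is the right policy depends on what callers expect.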

Comment thread xllm/models/llm/mtp_model_base.h
Comment thread xllm/models/llm/mtp_model_base.h
Comment thread xllm/models/llm/deepseek_v4_mtp.h
Comment thread xllm/models/llm/deepseek_v4_mtp.h Outdated
Comment on lines +547 to +548
params.multi_block_tables.push_back(
torch::zeros({1, 1}, cpu_int_options));
high

Prefer emplace_back over push_back to avoid unnecessary copies of torch::Tensor objects.

References
  1. Prefer emplace_back over push_back to construct elements in-place and avoid unnecessary copies.
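To make the trade-off in this suggestion concrete, the sketch below counts copy and move constructions for each insertion style. FakeTensor and the count_* helpers are hypothetical stand-ins written for this illustration; they are not part of xllm or libtorch, and the numbers describe only this stand-in type.

```cpp
#include <cassert>
#include <vector>

// Global counters for the hypothetical stand-in type below.
int g_copies = 0;
int g_moves = 0;

// Hypothetical stand-in for a tensor handle; NOT torch::Tensor.
struct FakeTensor {
  int rows;
  int cols;
  FakeTensor(int r, int c) : rows(r), cols(c) {}
  FakeTensor(const FakeTensor& o) : rows(o.rows), cols(o.cols) { ++g_copies; }
  FakeTensor(FakeTensor&& o) noexcept : rows(o.rows), cols(o.cols) { ++g_moves; }
};

struct Counts {
  int copies;
  int moves;
};

// push_back with a temporary: binds to the rvalue overload, so one move.
Counts count_push_back_temporary() {
  g_copies = g_moves = 0;
  std::vector<FakeTensor> v;
  v.reserve(1);  // avoid reallocation so only the insertion is counted
  v.push_back(FakeTensor(1, 1));
  return {g_copies, g_moves};
}

// emplace_back with a temporary: forwards the rvalue, so also one move.
Counts count_emplace_back_temporary() {
  g_copies = g_moves = 0;
  std::vector<FakeTensor> v;
  v.reserve(1);
  v.emplace_back(FakeTensor(1, 1));
  return {g_copies, g_moves};
}

// emplace_back with constructor arguments: built in place, no copy or move.
Counts count_emplace_back_args() {
  g_copies = g_moves = 0;
  std::vector<FakeTensor> v;
  v.reserve(1);
  v.emplace_back(1, 1);
  return {g_copies, g_moves};
}

// push_back with a named object: this is the case that really copies.
Counts count_push_back_lvalue() {
  FakeTensor named(1, 1);
  g_copies = g_moves = 0;
  std::vector<FakeTensor> v;
  v.reserve(1);
  v.push_back(named);
  return {g_copies, g_moves};
}
```

Under these assumptions, passing a temporary to push_back already moves rather than copies, and emplace_back avoids the move only when it receives constructor arguments directly. For a call shaped like the one flagged above, where the argument is the result of a factory function, switching to emplace_back is mainly a style-guide matter and should not change the number of copies.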

@panxua changed the title from "feat:remove unused func and support deepseek_v4 mtp on npu." to "feat:remove unused func and support deepseek_v4_mtp graph on npu." on May 22, 2026
@panxua force-pushed the preview/deepseek-v4-npu-0522-graph branch from 1bbd1b8 to a8120fb on May 23, 2026 05:39