feat:remove unused func and support deepseek_v4_mtp graph on npu.#1517
panxua wants to merge 10 commits into
Code Review
This pull request refactors DeepSeek V4 and MTP model implementations to support ACL graph forward metadata, introducing actual_metadata_rows to handle MTP validation scenarios. Key changes include replacing the is_deepseek_v4_model_type utility with explicit string comparisons and updating GraphPersistentParam logic. Review feedback highlights several critical issues: a missing tokens argument in mtp_block_ calls that will cause compilation errors, the removal of layer_index capping which risks out-of-bounds access, and a potential device mismatch when initializing runtime_device. Additionally, the reviewer recommended using emplace_back instead of push_back for torch::Tensor elements to align with the style guide.
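To illustrate the out-of-bounds concern raised about layer_index, here is a minimal sketch of index capping. The helper name cap_layer_index and its parameters are hypothetical and do not come from the PR; the capping logic that was actually removed may differ.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not from the PR): cap a requested layer index so it
// never exceeds the last valid slot of a container holding num_layers entries.
// Precondition (assumption): num_layers > 0.
std::size_t cap_layer_index(std::size_t layer_index, std::size_t num_layers) {
  return std::min(layer_index, num_layers - 1);
}
```

Without a cap of this kind, an index equal to or larger than the number of layers would read past the end of whatever per-layer container it is used with.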
params.multi_block_tables.push_back(
    torch::zeros({1, 1}, cpu_int_options));
Prefer emplace_back over push_back to avoid unnecessary copies of torch::Tensor objects.
References
- Prefer emplace_back over push_back to construct elements in-place and avoid unnecessary copies. (link)
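The difference between the two calls can be sketched with a stand-in type that counts copies and moves. FakeTensor and the three count_* functions are hypothetical and only model the behaviour of std::vector; they are not torch::Tensor or code from this PR. Note that push_back with a temporary, as in the snippet under review, already moves rather than copies, so the saving from emplace_back shows up mainly when constructor arguments are forwarded directly.

```cpp
#include <utility>
#include <vector>

// Global counters used by the hypothetical stand-in type below.
int g_copies = 0;
int g_moves = 0;

// Hypothetical stand-in for a tensor-like object; counts copy and move
// constructions so the effect of each insertion style is visible.
struct FakeTensor {
  int rows;
  int cols;
  FakeTensor(int r, int c) : rows(r), cols(c) {}
  FakeTensor(const FakeTensor& o) : rows(o.rows), cols(o.cols) { ++g_copies; }
  FakeTensor(FakeTensor&& o) noexcept : rows(o.rows), cols(o.cols) { ++g_moves; }
};

struct Counts {
  int copies;
  int moves;
};

// push_back with a temporary: the temporary is moved into the vector.
Counts count_push_back_temporary() {
  g_copies = g_moves = 0;
  std::vector<FakeTensor> v;
  v.reserve(4);  // avoid reallocation so only the insertion is counted
  v.push_back(FakeTensor(1, 1));
  return {g_copies, g_moves};
}

// push_back with a named object: the object is copied into the vector.
Counts count_push_back_lvalue() {
  g_copies = g_moves = 0;
  std::vector<FakeTensor> v;
  v.reserve(4);
  FakeTensor t(1, 1);
  v.push_back(t);
  return {g_copies, g_moves};
}

// emplace_back with constructor arguments: the element is built in place.
Counts count_emplace_back_args() {
  g_copies = g_moves = 0;
  std::vector<FakeTensor> v;
  v.reserve(4);
  v.emplace_back(1, 1);
  return {g_copies, g_moves};
}
```

In this model, only the lvalue push_back performs a copy. Calling emplace_back with an already constructed temporary would behave like the first case, so for a factory call such as torch::zeros the recommendation is largely a matter of style consistency.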
…tadata_rows to num_sequeces.
1bbd1b8 to a8120fb
No description provided.