Fuse RoPE across MLite attention implementations #71

Open
ISEEKYAN wants to merge 2 commits into main from mlite-attention-rope-fusion

Conversation

@ISEEKYAN

Owner

Summary

  • add a single MLite kernel boundary for TE-backed generic and MLA fused RoPE providers
  • route dense/MRoPE, MLA, DSA, and CSA RoPE sites through the fused path while preserving the unfused reference path
  • enable RoPE fusion by default in the relevant typed implementation configs
  • add adapter, routing, runtime-config, and AST layering contract coverage
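To make the routing idea in the summary concrete, here is a minimal pure-Python sketch of one dispatch point that sends RoPE calls to a fused provider when one is registered and otherwise uses the unfused reference path. It is not the code in this PR: `RopeConfig`, `RopeKernelBoundary`, `rope_reference`, the `fuse_rope` flag name, and the fall-back-when-no-provider behavior are all hypothetical, and the rotate-half layout is assumed for illustration only.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional

Vector = List[float]
RopeFn = Callable[[Vector, int, float], Vector]


def rope_reference(x: Vector, position: int, base: float = 10000.0) -> Vector:
    """Unfused reference path: rotate-half RoPE on one head vector."""
    dim = len(x)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even head dimension")
    half = dim // 2
    out = [0.0] * dim
    for i in range(half):
        # Each (i, i + half) pair is rotated by an angle that grows with position.
        theta = position * base ** (-2.0 * i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * cos_t - b * sin_t
        out[i + half] = b * cos_t + a * sin_t
    return out


@dataclass(frozen=True)
class RopeConfig:
    # Hypothetical flag name; the PR only says fusion is enabled by default.
    fuse_rope: bool = True


class RopeKernelBoundary:
    """Hypothetical single entry point that attention code would call."""

    def __init__(self) -> None:
        self._fused: Optional[RopeFn] = None

    def register(self, provider: RopeFn) -> None:
        # In a real build this could wrap a TE-backed kernel (assumption).
        self._fused = provider

    def apply(self, x: Vector, position: int,
              config: RopeConfig = RopeConfig(),
              base: float = 10000.0) -> Vector:
        if config.fuse_rope and self._fused is not None:
            return self._fused(x, position, base)
        # Reference path stays available for parity checks and CPU runs.
        return rope_reference(x, position, base)
```

Keeping the reference function callable through the same boundary means a test can run both paths on identical inputs and compare the results, which is one plausible way to use the preserved unfused path.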

Validation

  • focused CPU suite: 43 passed
  • ruff and diff checks passed
  • eos Slurm 5538097: COMPLETED 0:0, 5 passed, 0 skipped (forward and backward coverage)
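The AST layering contract coverage listed in the summary is the kind of check that can run in a CPU-only suite. As a rough sketch, assuming the contract is "attention modules must not import the backend directly and must go through the kernel boundary", a test could parse module source with the standard `ast` module and report forbidden imports. The helper `forbidden_imports` and the module name `mlite.kernels` are hypothetical; the actual contract tests in this PR may look quite different.

```python
import ast
from typing import Iterable, List


def forbidden_imports(source: str, forbidden_prefixes: Iterable[str]) -> List[str]:
    """Return imported module names that fall under any forbidden package."""
    prefixes = tuple(forbidden_prefixes)
    hits: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as `from . import x` have module=None.
            names = [node.module] if node.module else []
        else:
            continue
        for name in names:
            # Match the package itself or any submodule, not mere name prefixes.
            if any(name == p or name.startswith(p + ".") for p in prefixes):
                hits.append(name)
    return hits


# Usage: an attention module that bypasses the boundary is flagged.
attention_src = "import transformer_engine.pytorch as te\n"
print(forbidden_imports(attention_src, ["transformer_engine"]))
```

Because the check works on source text, it needs neither a GPU nor the backend package to be installed.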

ISEEKYAN added 2 commits June 30, 2026 17:57
Signed-off-by: Yan Bai <bayan@nvidia.com>
Signed-off-by: Yan Bai <bayan@nvidia.com>
ISEEKYAN force-pushed the mlite-attention-rope-fusion branch from fe86d0e to 1a4a1a3 on July 1, 2026 00:57
