Fix CK 2stages MoE (always use BK1 = 16) #2898

Open
ex-rzr wants to merge 1 commit into main from ex-rzr/fix-ck-moe-bk1

@ex-rzr ex-rzr commented Apr 24, 2026

Motivation

Fix for AICK-1084: FP8 2stages CK MoE produces incorrect results for intermediate dim < 256

Technical Details

Passing AK1Value = BK1Value = 8 to work around static asserts triggered in thread transfers for some configs is incorrect.
Preshuffling (shuffle_weight) uses KPack = 16 for FP8, so BK1 must match it; otherwise the kernel loads wrong values.
Instead, other parameters should be decreased in such cases:

  • A/BBlockTransferThreadClusterLengths... so that not all threads participate in loading (this commit)
  • or A/BBlockTransferSrcScalarPerVector... so that each thread loads less data.
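
To illustrate why BK1 has to match the preshuffle KPack, here is a minimal toy model in plain Python. It is only a sketch under assumptions: `pack` and `unpack` are hypothetical helpers, and the slab layout they use is a simplification, not the real layout produced by shuffle_weight or consumed by the CK kernel. It only shows that data packed with one K-group width and read back with another ends up in the wrong positions.

```python
def pack(w, kpack):
    """Toy preshuffle (hypothetical layout): split K into slabs of `kpack`
    elements and store, slab by slab, each row's `kpack` contiguous values."""
    n, k = len(w), len(w[0])
    assert k % kpack == 0
    flat = []
    for k0 in range(k // kpack):
        for row in range(n):
            flat.extend(w[row][k0 * kpack:(k0 + 1) * kpack])
    return flat


def unpack(flat, n, k, k1):
    """Toy kernel-side read that assumes the buffer was packed with width `k1`."""
    assert k % k1 == 0
    w = [[None] * k for _ in range(n)]
    values = iter(flat)
    for k0 in range(k // k1):
        for row in range(n):
            for kk in range(k1):
                w[row][k0 * k1 + kk] = next(values)
    return w


N, K = 4, 32
weights = [[row * K + col for col in range(K)] for row in range(N)]
packed = pack(weights, kpack=16)  # stands in for the FP8 preshuffle (KPack = 16)

# Reading with the same width recovers the original matrix.
assert unpack(packed, N, K, k1=16) == weights
# Reading with width 8 silently places values in the wrong rows and columns.
assert unpack(packed, N, K, k1=8) != weights
```

In this toy model the mismatched read does not fail; it just returns a permuted matrix, which is consistent with the symptom described above (incorrect results rather than a crash).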

Test Plan

Test Result

Submission Checklist

@github-actions
Contributor

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests on MI35X (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
| --- | --- |
| ci:sglang | SGLang integration tests |
| ci:atom | ATOM benchmark (DeepSeek-R1 + GPT-OSS) |
| ci:vllm | vLLM benchmark |
| ci:all | All of the above |

Add labels via the sidebar or gh pr edit 2898 --add-label <label>

@ex-rzr ex-rzr marked this pull request as ready for review April 24, 2026 10:07
@ex-rzr ex-rzr requested a review from a team April 24, 2026 10:07