Enable shuffled KV cache layout for MiniMax vLLM by jiacao-amd · Pull Request #1199 · SemiAnalysisAI/InferenceX

jiacao-amd · 2026-04-27T17:43:05Z

Summary

Export VLLM_ROCM_SHUFFLE_KV_CACHE_LAYOUT=1 in the MiniMax-M2.5 FP8 MI355X vLLM benchmark.

Why

vLLM's ROCm AITER attention backend expects the shuffled KV cache layout for its fast attention path. Without this environment variable, the benchmark can run with the default KV cache layout and miss the intended AITER attention kernels, leaving MI355X MiniMax-M2.5 FP8 throughput below the optimized path.

This script already enables VLLM_ROCM_USE_AITER=1 and launches vLLM with --attention-backend ROCM_AITER_FA; setting VLLM_ROCM_SHUFFLE_KV_CACHE_LAYOUT=1 makes the KV cache layout match that backend.
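
As a rough illustration of how the three settings named above fit together, here is a minimal sketch. It is not the contents of the benchmark script: `MODEL` and its placeholder value are hypothetical, and the launch command is only printed, so the snippet runs without vLLM or a GPU. The two environment variables and the `--attention-backend ROCM_AITER_FA` flag are taken from this description.

```shell
#!/usr/bin/env bash

# Both variables are named in this PR description.
export VLLM_ROCM_USE_AITER=1
export VLLM_ROCM_SHUFFLE_KV_CACHE_LAYOUT=1

# MODEL is a hypothetical placeholder, not taken from the benchmark script.
MODEL="${MODEL:-placeholder/model-id}"

# Assemble the launch command. Only --attention-backend ROCM_AITER_FA
# comes from the PR text; any other flags the real script passes are omitted.
LAUNCH_CMD=(vllm serve "$MODEL" --attention-backend ROCM_AITER_FA)

# Print rather than execute the command.
printf '%s\n' "${LAUNCH_CMD[*]}"
```

The exports come before the launch so that the server process inherits them from the script's environment.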

Testing

bash -n benchmarks/single_node/minimaxm2.5_fp8_mi355x.sh
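
Note that `bash -n` only parses the script and does not execute it, so it confirms the syntax but not that the variable is actually exported. A sketch of a slightly stronger local check follows, assuming a hypothetical stand-in file written to a temp path rather than the real benchmark script:

```shell
# Hypothetical stand-in for the benchmark script, written to a temp file.
SCRIPT="$(mktemp)"
cat > "$SCRIPT" <<'EOF'
export VLLM_ROCM_USE_AITER=1
export VLLM_ROCM_SHUFFLE_KV_CACHE_LAYOUT=1
EOF

# Parse only: bash -n reads the script without executing any command.
bash -n "$SCRIPT" && SYNTAX_OK=1

# Confirm the export line is present (grep -c counts matching lines).
EXPORT_COUNT="$(grep -c '^export VLLM_ROCM_SHUFFLE_KV_CACHE_LAYOUT=1$' "$SCRIPT")"
echo "syntax_ok=$SYNTAX_OK export_lines=$EXPORT_COUNT"

rm -f "$SCRIPT"
```

Neither check measures throughput; that would need a run on MI355X hardware.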

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

Enable shuffled KV cache layout for MiniMax vLLM

e7a81b7

jiacao-amd requested a review from a team April 27, 2026 17:43

github-project-automation Bot added this to InferenceMAX Board Apr 27, 2026

claude Bot reviewed Apr 27, 2026

cquil11 requested a review from chunfangamd April 27, 2026 20:23

cquil11 approved these changes Apr 27, 2026

jiacao-amd wants to merge 1 commit into SemiAnalysisAI:main from jiacao-amd:add-minimax-shuffle-kv-layout
