Torch-based GEMM with fake dequantization for mxfp6 e2m3 by nirmie · Pull Request #762 · iree-org/wave

nirmie · 2026-01-21T05:20:06Z

- Packs f32 into fp6 and unpacks back to f32 for matmul
- Pytests for quantization error, matmul error, and LUT rounding
- Does not use CDNA4 FP6, only fake dequantization

Will soon work on adding CDNA4 native mxfp6 operations
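To make the pack/unpack round trip concrete, here is a minimal, framework-free sketch of E2M3 fake dequantization for a single value. It assumes the usual FP6 E2M3 layout (1 sign bit, 2 exponent bits, 3 mantissa bits, exponent bias 1, no Inf/NaN, largest magnitude 7.5) and round-to-nearest with ties to even. The helper names (`decode_e2m3`, `encode_e2m3`, `fake_dequant`) are hypothetical and are not taken from this PR.

```python
E2M3_BIAS = 1  # assumed exponent bias for FP6 E2M3


def decode_e2m3(code: int) -> float:
    """Decode a 6-bit E2M3 code: bit 5 sign, bits 4-3 exponent, bits 2-0 mantissa."""
    sign = -1.0 if (code >> 5) & 1 else 1.0
    exp = (code >> 3) & 0b11
    man = code & 0b111
    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * (man / 8.0) * 2.0 ** (1 - E2M3_BIAS)
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - E2M3_BIAS)


# Lookup table of the 32 non-negative magnitudes, in increasing order.
LUT = [decode_e2m3(code) for code in range(32)]


def encode_e2m3(x: float) -> int:
    """Round x to the nearest E2M3 value, saturating at the largest magnitude."""
    sign = 1 if x < 0 else 0
    mag = min(abs(x), LUT[-1])
    # Ties go to the code with an even mantissa (low bit 0).
    code = min(range(32), key=lambda c: (abs(LUT[c] - mag), c & 1))
    return (sign << 5) | code


def fake_dequant(x: float) -> float:
    """Quantize to E2M3 and immediately decode back to a Python float."""
    return decode_e2m3(encode_e2m3(x))


print(fake_dequant(-1.3))   # -1.25
print(fake_dequant(100.0))  # 7.5 (saturated)
```

A tensor version would apply the same lookup elementwise after dividing each block by its shared scale, then multiply the scale back in before the matmul.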

Signed-off-by: Nirmal Senthilkumar <nirmalsent@gmail.com>

Giuseppe5 · 2026-01-27T14:52:37Z

tests/kernel/fake_dequant_mxfp6.py

+    # Calculate block-wise scales (Microscaling)
+    x_reshaped = x.view(-1, block_size)
+    amax = x_reshaped.abs().max(dim=1, keepdim=True).values
+    scales = amax / MAX_E2M3


`scales` is supposed to be a power-of-two value
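One common way to get a power-of-two block scale, sketched here under the assumption that the OCP Microscaling convention applies (shared exponent = floor(log2(amax)) minus the element format's largest exponent, stored as E8M0), is shown below in plain Python. `pow2_block_scale` and the constants are hypothetical names for illustration, not code from this PR. In torch the same arithmetic could be written with `torch.floor(torch.log2(amax))`.

```python
import math

E2M3_EMAX = 2        # exponent of the largest E2M3 value: 7.5 = 1.875 * 2**2
E8M0_EXP_MIN = -127  # assumed E8M0 exponent range
E8M0_EXP_MAX = 127


def pow2_block_scale(block):
    """Return a power-of-two scale for one block of floats."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 1.0  # an all-zero block can use any scale
    # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) = e - 1.
    _, e = math.frexp(amax)
    shared_exp = (e - 1) - E2M3_EMAX
    shared_exp = max(E8M0_EXP_MIN, min(E8M0_EXP_MAX, shared_exp))
    return 2.0 ** shared_exp


block = [0.1, -3.0, 1.5, 0.0]
print(pow2_block_scale(block))             # 0.5
print(max(abs(v) for v in block) / 7.5)    # 0.4, the amax / MAX_E2M3 scale, not a power of two
```

With a floor-based exponent, scaled magnitudes can land anywhere below 8, so values between 7.5 and 8 still need to saturate to 7.5 during quantization.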

Torch-based GEMM with fake dequantization for mxfp6 e2m3

4dbf655


raikonenfnu requested review from harsh-nod and raikonenfnu January 21, 2026 17:13

Merge branch 'iree-org:main' into main

beba28f

Giuseppe5 reviewed Jan 27, 2026

nirmie wants to merge 2 commits into iree-org:main from nirmie:main
