feat: upgrade MiniMax default model to M3 #15
Open
octo-patch wants to merge 1 commit into
- Add MiniMax-M3 as the new default target for validate_minimax_m2.py
- Keep MiniMax-M2.7 available via --model MiniMax-M2.7 (legacy fallback)
- Refactor MODEL_NAME constant to SUPPORTED_MODELS / DEFAULT_MODEL pair
- Update README compatibility section to highlight M3 as new flagship while documenting M2.7 as fallback
- Update test docstrings to reflect M3 / M2.7 dual targeting

The architecture constants (head_dim=128, num_kv_heads=8, num_layers=62) remain unchanged because the MiniMax-M3 MoE attention layout matches MiniMax-M2.7, so existing IsoQuant / PlanarQuant block-rotation tests continue to validate both checkpoints with no code changes.
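For a sense of scale, the architecture constants quoted above fix the uncompressed KV-cache footprint per token, which is why both checkpoints can share one memory estimate. The sketch below is only an illustration: `kv_cache_bytes` is a hypothetical helper (not the repository's `print_memory_estimate`), 2 bytes per element assumes fp16/bf16 storage, and the 204,800-token context is one assumed reading of the "204K context" figure in the PR description.

```python
# Architecture constants quoted in the commit message.
HEAD_DIM = 128
NUM_KV_HEADS = 8
NUM_LAYERS = 62


def kv_cache_bytes(num_tokens: int, bytes_per_element: float = 2.0) -> float:
    """Uncompressed KV-cache size in bytes for one sequence.

    Hypothetical helper for illustration only. The leading factor of 2
    accounts for storing both keys and values in every layer.
    """
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * bytes_per_element
    return per_token * num_tokens


if __name__ == "__main__":
    tokens = 204_800  # assumption: "204K context" read as 200 * 1024 tokens
    size = kv_cache_bytes(tokens)  # assumption: fp16/bf16, 2 bytes per element
    print(f"per token: {kv_cache_bytes(1):.0f} bytes")
    print(f"full context: {size / 2**30:.2f} GiB")
```

Under these assumptions the cache costs 253,952 bytes per token, or about 48.44 GiB at the full assumed context, before any compression is applied.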
Summary
Upgrade the MiniMax KV-cache compression validation target to the new flagship MiniMax-M3 while keeping MiniMax-M2.7 available as a fallback. M3 shares the same MoE attention layout (head_dim=128, num_kv_heads=8, num_layers=62, 204K context, GQA 48/8), so the existing IsoQuant / PlanarQuant / LiteratiQuant test surface covers both checkpoints without architecture changes.
Changes
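turboquant/validate_minimax_m2.py
- Replace the MODEL_NAME constant with SUPPORTED_MODELS = ("MiniMax-M3", "MiniMax-M2.7") and DEFAULT_MODEL = "MiniMax-M3".
- Add a --model {MiniMax-M3,MiniMax-M2.7} CLI arg (defaults to M3); thread it through run_model_validation and print_memory_estimate.
tests/test_minimax_m2.py
- Update test docstrings to reflect M3 / M2.7 dual targeting.
README.md
- Update the compatibility section to highlight M3 as the new flagship and document --model MiniMax-M2.7 for legacy validation.
Why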
MiniMax-M3 is the new flagship in the MiniMaxAI MiniMax family and supersedes M2.7 as the default reference architecture. Surfacing it as the default validation target keeps the project's compatibility story aligned with the latest checkpoint while preserving the M2.7 path so existing users can keep validating against the previous flagship.
Testing
- python -m py_compile turboquant/validate_minimax_m2.py tests/test_minimax_m2.py: both files parse cleanly.
- SUPPORTED_MODELS, DEFAULT_MODEL, HEAD_DIM, etc.
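The constant refactor and --model flag described under Changes could be wired up roughly as follows. This is a minimal sketch, not the PR's actual code: `SUPPORTED_MODELS` and `DEFAULT_MODEL` take the values stated in the description, while `build_parser`, `main`, and the body and signature of `run_model_validation` are hypothetical stand-ins chosen for illustration.

```python
import argparse

# Values as stated in the PR description.
SUPPORTED_MODELS = ("MiniMax-M3", "MiniMax-M2.7")
DEFAULT_MODEL = "MiniMax-M3"


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the --model choice."""
    parser = argparse.ArgumentParser(
        description="Validate KV-cache compression against a MiniMax checkpoint."
    )
    parser.add_argument(
        "--model",
        choices=SUPPORTED_MODELS,  # argparse rejects anything outside this tuple
        default=DEFAULT_MODEL,
        help="checkpoint to validate (default: %(default)s)",
    )
    return parser


def run_model_validation(model: str) -> str:
    """Stand-in for the real validation routine; the signature is an assumption."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    return f"validating {model}"


def main(argv=None) -> str:
    # The selected model name is threaded through to the validation call.
    args = build_parser().parse_args(argv)
    return run_model_validation(args.model)
```

With this layout, calling `main([])` selects the M3 default, `main(["--model", "MiniMax-M2.7"])` selects the legacy path, and any other value is rejected by argparse before validation starts.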