Experimental MLX custom Metal kernels for Apple Silicon — fast attention, decode, KV-cache, and future Mac GPU inference primitives.
python macos machine-learning deep-learning metal transformers inference attention mps mlx gpu-kernels kv-cache apple-silicon custom-kernels apple-gpu llm llm-inference flashattention metal-kernels
Updated Jun 21, 2026 - Python