metal-kernels

Here are 2 public repositories matching this topic...

Experimental MLX custom Metal kernels for Apple Silicon — fast attention, decode, KV-cache, and future Mac GPU inference primitives.

python macos machine-learning deep-learning metal transformers inference attention mps mlx gpu-kernels kv-cache apple-silicon custom-kernels apple-gpu llm llm-inference flashattention metal-kernels

A minimal, native Metal inference engine for Qwen3-30B-A3B on Apple Silicon Macs.

c macos metal objective-c transformer moe quantization inference-engine apple-silicon local-llm llm-inference gguf qwen3 chunked-prefill metal-kernels
