Everything you need to know about LLM inference
A high-performance, QUIC-based protocol for streaming raw UTF-8 markdown from LLMs with minimal CPU overhead. Optimized for internal inference infrastructure.
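A minimal consumer-side sketch in TypeScript, assuming a runtime with WebTransport support (which runs over QUIC via HTTP/3) and assuming the server pushes each response on one incoming unidirectional stream. The endpoint URL, the `streamMarkdown` helper, and the one-stream-per-response layout are illustrative assumptions, not part of any published spec. The CPU-saving detail is incremental UTF-8 decoding with `TextDecoder` in streaming mode, so a multi-byte code point split across chunk boundaries is buffered rather than re-scanned.

```typescript
// Sketch: read streamed UTF-8 markdown over a QUIC (WebTransport)
// connection and hand decoded text to a callback as it arrives.
// The URL and stream layout below are assumptions for illustration.
async function streamMarkdown(
  url: string,
  onChunk: (text: string) => void,
): Promise<void> {
  const transport = new WebTransport(url);
  await transport.ready;

  // Assumption: the server sends the response on one unidirectional stream.
  const streams = transport.incomingUnidirectionalStreams.getReader();
  const { value: stream } = await streams.read();
  if (!stream) throw new Error("server closed without sending a stream");

  // A streaming TextDecoder holds back the bytes of a multi-byte code
  // point that straddles a chunk boundary until the next read completes it.
  const decoder = new TextDecoder("utf-8");
  const bytes = stream.getReader();
  for (;;) {
    const { value, done } = await bytes.read();
    if (done) break;
    onChunk(decoder.decode(value, { stream: true }));
  }
  onChunk(decoder.decode()); // flush any buffered trailing bytes

  transport.close();
}

// Usage (hypothetical internal endpoint): print markdown as it streams in.
streamMarkdown("https://inference.internal:4433/generate", (text) => {
  console.log(text);
}).catch(console.error);
```

QUIC is a plausible fit here because its streams avoid TCP head-of-line blocking, so one slow response does not stall others multiplexed on the same connection.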