A retrieval-gated skill architecture for LLM agents that scales to hundreds of tools by exposing only the top-K relevant capabilities per request.
Updated Mar 4, 2026 - Python
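The retrieval-gating idea in the description above can be illustrated with a small sketch. This is not the repository's code: the skill catalog, the `top_k_skills` function, and the bag-of-words cosine scoring are all assumptions chosen to keep the example dependency-free. A real system would more likely score skills with an embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical skill catalog: name -> short natural-language description.
SKILLS = {
    "get_weather": "look up the current weather forecast for a city",
    "send_email": "compose and send an email message to a recipient",
    "query_database": "run a sql query against the sales database",
    "convert_currency": "convert an amount of money between currencies",
}

# Tiny illustrative stopword list so common words do not drive the ranking.
STOPWORDS = {"a", "an", "the", "for", "to", "of", "and", "is", "what"}


def tokenize(text):
    """Lowercase the text and count its non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_skills(request, skills, k=2):
    """Return up to k skill names ranked by similarity to the request.

    Skills with zero similarity are dropped, so fewer than k may be returned.
    """
    query = tokenize(request)
    scored = [(cosine(query, tokenize(desc)), name) for name, desc in skills.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by name
    return [name for score, name in scored[:k] if score > 0]


if __name__ == "__main__":
    # Only the selected skills would be exposed to the model for this request.
    print(top_k_skills("What is the weather forecast for Paris?", SKILLS, k=2))
```

Under this sketch, the agent's prompt would carry tool definitions only for the names returned by `top_k_skills`, so the prompt size depends on K and not on the size of the full catalog.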
Fixes tool-calling failures with Gemma 4, Llama 4, Qwen 3, and Mistral. A regex-based skill router for Ollama and OpenClaw, described by the project as zero-latency, 400,000x faster than LLM inference, and 100% accurate. No GPU needed.
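As a rough illustration of what a regex skill router looks like in general, the sketch below maps a request to a skill name by pattern matching, without calling a model. The `ROUTES` table, the patterns, and the `route` function are hypothetical and are not taken from the project.

```python
import re

# Hypothetical routing table: (compiled pattern, skill name), checked in order.
ROUTES = [
    (re.compile(r"\b(weather|forecast|temperature)\b", re.IGNORECASE), "get_weather"),
    (re.compile(r"\b(e-?mail|inbox)\b", re.IGNORECASE), "send_email"),
    (re.compile(r"\b(convert|exchange rate)\b", re.IGNORECASE), "convert_currency"),
]


def route(request):
    """Return the first skill whose pattern matches the request, else None."""
    for pattern, skill in ROUTES:
        if pattern.search(request):
            return skill
    return None  # no match: the caller can fall back to the model's own tool choice


if __name__ == "__main__":
    print(route("What's the weather in Oslo?"))
    print(route("Tell me a joke"))
```

The first matching pattern wins, so more specific patterns would need to come earlier in the table. Returning `None` on no match leaves room for a fallback path when no rule applies.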