Summary
Add GGUF model entries to BrowserAI's model registry so users can load models via the Flare engine without specifying URLs.
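For orientation, here is a minimal usage sketch of what this would enable. The `BrowserAI` class, `loadModel`, and `generateText` calls below are assumptions about the public API surface, not confirmed signatures.

```ts
// Hypothetical usage sketch -- the BrowserAI class, loadModel, and generateText
// names are assumptions about the public API, not confirmed signatures.
import { BrowserAI } from '@browserai/browserai';

const ai = new BrowserAI();

// The registry ID is enough: the Flare engine, GGUF URL, and quantization
// are resolved from the model registry, so no URL has to be passed in.
await ai.loadModel('smollm2-135m-flare');

const reply = await ai.generateText('Say hello in five words.');
console.log(reply);
```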
Proposed models
Tier 1 — Small (instant load, <200MB)
| Model ID | GGUF File | Size | Notes |
| --- | --- | --- | --- |
| smollm2-135m-flare | SmolLM2-135M-Instruct Q8_0 | 138MB | Fastest load, good for demos |
| smollm2-135m-flare-q4 | SmolLM2-135M-Instruct Q4_K_M | ~75MB | Smaller download |
Tier 2 — Medium (good quality, <500MB)
| Model ID | GGUF File | Size | Notes |
| --- | --- | --- | --- |
| smollm2-360m-flare | SmolLM2-360M-Instruct Q8_0 | ~350MB | Better quality |
| qwen2.5-0.5b-flare | Qwen2.5-0.5B-Instruct Q4_K_M | ~350MB | Multilingual |
Tier 3 — Large (best quality, ~1GB)
| Model ID | GGUF File | Size | Notes |
| --- | --- | --- | --- |
| llama-3.2-1b-flare | Llama-3.2-1B-Instruct Q8_0 | 1.2GB | Best quality |
| llama-3.2-1b-flare-q4 | Llama-3.2-1B-Instruct Q4_K_M | ~600MB | Balanced |
Registry format
```ts
{
  "smollm2-135m-flare": {
    engine: "flare",
    url: "https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct-GGUF/resolve/main/smollm2-135m-instruct-q8_0.gguf",
    architecture: "llama",
    contextLength: 2048,
    quantization: "Q8_0",
    size: "138MB",
    features: ["chat", "instruction-following"]
  }
}
```
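For illustration, a typed sketch of the entry shape implied by the example above. The `FlareModelEntry` name and `getFlareModel` helper are hypothetical, not existing BrowserAI code.

```ts
// Hypothetical type for the registry entries above; the FlareModelEntry name
// and getFlareModel helper are illustrative, not existing BrowserAI code.
interface FlareModelEntry {
  engine: 'flare';
  url: string;            // direct link to the .gguf file
  architecture: string;   // e.g. 'llama'
  contextLength: number;  // tokens
  quantization: string;   // e.g. 'Q8_0', 'Q4_K_M'
  size: string;           // human-readable download size
  features: string[];     // e.g. ['chat', 'instruction-following']
}

type FlareRegistry = Record<string, FlareModelEntry>;

// Look up an entry by its registry ID, failing loudly on unknown IDs.
function getFlareModel(registry: FlareRegistry, id: string): FlareModelEntry {
  const entry = registry[id];
  if (!entry) throw new Error(`Unknown Flare model ID: ${id}`);
  return entry;
}
```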
Advantage over MLC models
GGUF is a standard format, and prebuilt GGUF files are published directly on Hugging Face, so no conversion step is required (MLC models first have to be compiled into MLC's own format). Users can also load custom GGUF files that are not in the registry, as sketched below.
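To make the custom-file case concrete, a hedged sketch under the assumption that `loadModel` also accepts a direct GGUF URL; this overload is an assumption about how the Flare engine might expose it, not a confirmed API.

```ts
// Hypothetical sketch -- a URL-accepting loadModel overload is assumed here,
// not a confirmed BrowserAI API.
import { BrowserAI } from '@browserai/browserai';

const ai = new BrowserAI();

// Registered model: resolved through the registry, no URL needed.
await ai.loadModel('llama-3.2-1b-flare-q4');

// Custom GGUF not in the registry: pass the file's URL directly
// (placeholder URL shown; substitute a real .gguf link).
await ai.loadModel('https://huggingface.co/<org>/<repo>/resolve/main/<file>.gguf');
```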
Tasks
Depends on
Related