# GuideGen

A lightweight guided generation library for small (<1B) language models, optimized for local use.

GuideGen provides typed guided generation for LLMs: it forces model outputs to conform to your schemas using token-level constrained decoding.

```python
import guidegen as gg

gen = gg.GuideGen("HuggingFaceTB/SmolLM2-360M-Instruct")

# Classification
sentiment = gen.generate("Great product!", ["Positive", "Negative", "Neutral"])
print(sentiment)  # => "Positive"

# Typed extraction
age = gen.generate("John is 25 years old", int)
print(age)  # => 25

# Pydantic models
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

person = gen.generate("John Doe, age 30", Person)
print(person)  # => Person(name="John Doe", age=30)
```

## Features

- **Type-safe generation** - `int`, `float`, `bool`, `str`, `Literal`, Pydantic models, `List[T]`
- **Three-stage decoding** - forced emission, accept-reject sampling, full mask fallback
- **Schema-compiled decoding** - deterministic structure emission for small models (~75% fewer model calls)
- **Adaptive strategies** - auto-detects model size and selects the optimal decoding strategy
- **Classification with calibration** - full logprob scoring with label bias correction
- **Uncertainty detection** - the `safe()` API returns `None` when the model is uncertain
- **Streaming** - token-level streaming with commit events
- **Reason-then-render** - unconstrained reasoning followed by constrained output

## Install

Install directly from GitHub (PyPI release coming soon):

```bash
pip install git+https://github.com/tabularis-ai/guidegen.git
```

Or clone the repository and install in editable mode:

```bash
git clone https://github.com/tabularis-ai/guidegen.git
cd guidegen
pip install -e .
```

Note: Pydantic is installed as a dependency by default. If the editable install raises import errors, try a regular `pip install .` from the cloned repository instead.

## Quick Start

```python
import guidegen as gg
from typing import Literal
from pydantic import BaseModel

# Load model
gen = gg.GuideGen("HuggingFaceTB/SmolLM2-360M-Instruct")

# Classification
label = gen.generate(
    "The movie was terrible and boring",
    Literal["Positive", "Negative", "Neutral"]
)

# Or pass a list directly
label = gen.generate("Great product!", ["Positive", "Negative", "Neutral"])

# Extraction with Pydantic
class Contact(BaseModel):
    name: str
    email: str
    age: int

contact = gen.generate(
    "Reach out to Alice Smith at alice@example.com, she's 28",
    Contact
)

# Classification with calibration and confidence scores
result = gen.generate(
    "Ich mag es",
    ["Positiv", "Negativ", "Neutral"],
    return_details=True,       # returns ClassificationResult
    calibrate=True,            # removes label prior bias
    prompt_suffix="Antwort:",  # multilingual prompt suffix
)
# result.label => "Positiv", result.score => 0.87, result.scores => {...}

# Safe mode - abstains if uncertain
result = gen.safe(
    "What color is the sky on Mars?",
    Literal["Red", "Blue", "Green"],
    uncertainty=gg.UncertaintyConfig(min_confidence=0.6)
)
# result is None if model is uncertain, otherwise the label
```
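Conceptually, `calibrate=True` removes label prior bias by subtracting the score the model assigns each label on a content-free input before renormalizing. The sketch below illustrates the idea with made-up numbers; it is not GuideGen's actual implementation, and `calibrate` here is a local toy function:

```python
import math

def softmax(scores):
    m = max(scores.values())
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

def calibrate(label_logprobs, prior_logprobs):
    """Subtract each label's content-free prior logprob, then
    renormalize. This removes bias toward labels the model tends
    to emit regardless of the input."""
    adjusted = {k: label_logprobs[k] - prior_logprobs[k]
                for k in label_logprobs}
    return softmax(adjusted)

# Toy numbers: raw scores favor "Positive", but the model also
# favors "Positive" on an empty input, so calibration flips it.
raw   = {"Positive": -0.8, "Negative": -1.0, "Neutral": -2.5}
prior = {"Positive": -0.4, "Negative": -1.8, "Neutral": -1.6}

scores = calibrate(raw, prior)
best = max(scores, key=scores.get)  # => "Negative"
```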

## Strategies

GuideGen automatically selects the best decoding strategy based on model size:

| Strategy | Best for | How it works |
|---|---|---|
| `"standard"` | Models >= 3B | Three-stage decoding with JSON grammar |
| `"compiled"` | Models < 3B | Schema-driven: engine emits structure, model fills values |
| `"decomposed"` | Very weak models | Generates each field independently |

Override with:

```python
options = gg.GuideGenOptions(strategy="compiled")
result = gen.generate(prompt, Person, options=options)
```
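To see why the compiled strategy saves model calls: all structural JSON tokens (braces, quotes, keys, commas) are fixed by the schema, so the engine can emit them for free and invoke the model only for value slots. A self-contained toy sketch of that idea, where `toy_model` is a stand-in for an LLM call (not GuideGen's internals):

```python
import json

def toy_model(prompt, field):
    # Stand-in for an LLM call that fills one value slot.
    return {"name": "John Doe", "age": 30}[field]

def compiled_generate(fields):
    """Schema-compiled decoding sketch: the engine produces every
    structural token deterministically with zero model calls; the
    model is invoked exactly once per value slot."""
    calls = 0
    obj = {}
    for field, _ftype in fields:
        obj[field] = toy_model("...", field)  # the only model involvement
        calls += 1
    return json.dumps(obj), calls

out, calls = compiled_generate([("name", str), ("age", int)])
# calls == 2: one per field, none for the JSON structure itself
```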

## How It Works

1. **TypeSpec compilation** - your Python type is compiled into a language-independent schema
2. **Constraint engine** - the schema drives a token-level constraint engine (JSON grammar + slot engines)
3. **Three-stage decoding:**
   - Stage 0: Forced emission - only one valid token, emit it directly
   - Stage 1: Accept-reject - sample from the model, validate against constraints
   - Stage 2: Full mask - compute all valid tokens, sample from the masked distribution
4. **Schema-compiled mode** (small models): the engine emits all structural tokens deterministically; the model only generates values
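The three decoding stages can be sketched for a single token over a toy vocabulary. This is an illustrative simplification, not the library's implementation (real engines mask logits inside the model's sampling loop rather than resampling a dict):

```python
import random

def decode_step(allowed, model_dist, rng, max_rejects=4):
    """One token of three-stage constrained decoding.
    `allowed` is the set of tokens the grammar permits here;
    `model_dist` maps each token to its model probability."""
    # Stage 0: forced emission - only one valid continuation,
    # so no model call is needed at all.
    if len(allowed) == 1:
        return next(iter(allowed)), "stage0"
    # Stage 1: accept-reject - sample unconstrained and keep the
    # draw if it happens to satisfy the grammar (the common case).
    tokens, probs = zip(*model_dist.items())
    for _ in range(max_rejects):
        tok = rng.choices(tokens, weights=probs)[0]
        if tok in allowed:
            return tok, "stage1"
    # Stage 2: full mask - zero out invalid tokens, renormalize,
    # and sample from the constrained distribution.
    masked = {t: p for t, p in model_dist.items() if t in allowed}
    tokens, probs = zip(*masked.items())
    return rng.choices(tokens, weights=probs)[0], "stage2"

rng = random.Random(0)
dist = {"25": 0.6, "26": 0.2, "cat": 0.2}
tok, stage = decode_step({"25", "26"}, dist, rng)
# tok is guaranteed to be "25" or "26", whichever stage fired
```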

## Observability

```python
options = gg.GuideGenOptions(return_trace=True)
result, trace = gen.generate("John is 25", Person, options=options)

print(f"Stage 0 forced: {trace.stage0_forced_count}")
print(f"Stage 1 accepted: {trace.stage1_accept_count}")
print(f"Stage 2 fallback: {trace.stage2_fallback_count}")
print(f"Avg logprob: {trace.average_logprob:.2f}")
```

## License

MIT
