labelrag is a Python library for label-driven retrieval-augmented generation
pipelines built on top of paralabelgen.
- PyPI distribution: `labelrag`
- Python import package: `labelrag`
- Core dependency target: `paralabelgen==0.2.3`
- Primary supported extraction path: `paralabelgen` LLM concept extraction
- First semantic-reranking embedding provider: `sentence-transformers`
```bash
pip install labelrag
```

If you want to use the spaCy-backed extraction path, install a compatible English pipeline such as:

```bash
python -m spacy download en_core_web_sm
```

`en_core_web_sm` is a convenient local option, but the semantic-retrieval
release line targets the `paralabelgen==0.2.3` LLM concept-extraction pipeline
as the primary supported extraction path.
The upstream paralabelgen==0.2.3 runtime also supports DeepSeek-backed
extraction through its own configuration surface. labelrag does not introduce
DeepSeek-specific APIs; it continues to pass extraction configuration through to
paralabelgen.
The first shipped semantic-reranking provider uses sentence-transformers.
Its model weights may be downloaded on first use if they are not already cached
locally.
```python
from labelrag import (
    RAGPipeline,
    RAGPipelineConfig,
)

paragraphs = [
    "OpenAI builds language models for developers.",
    "Developers use language models in production systems.",
    "Production systems need monitoring and evaluation tooling.",
]

config = RAGPipelineConfig()
config.labelgen.extractor_mode = "heuristic"
config.labelgen.use_graph_community_detection = False

pipeline = RAGPipeline(config)
pipeline.fit(paragraphs)

retrieval = pipeline.build_context("How do developers use language models?")
print(retrieval.prompt_context)
print(retrieval.metadata)
```

```python
from labelrag import (
    OpenAICompatibleAnswerGenerator,
    OpenAICompatibleConfig,
    RAGPipeline,
    RAGPipelineConfig,
)

paragraphs = [
    "OpenAI builds language models for developers.",
    "Developers use language models in production systems.",
    "Production systems need monitoring and evaluation tooling.",
]

config = RAGPipelineConfig()
config.labelgen.extractor_mode = "heuristic"
config.labelgen.use_graph_community_detection = False

pipeline = RAGPipeline(config)
pipeline.fit(paragraphs)

generator = OpenAICompatibleAnswerGenerator(
    OpenAICompatibleConfig(
        model="mistral-small-latest",
        api_key_env_var="MISTRAL_API_KEY",
        base_url="https://api.mistral.ai/v1",
    )
)

answer = pipeline.answer_with_generator(
    "How do developers use language models?",
    generator,
)
print(answer.answer_text)
print(answer.metadata)
```

The current retrieval layer is deterministic and still label-driven at the candidate-generation stage.
- `fit(...)` delegates paragraph analysis to `labelgen.LabelGenerator`
- `fit(...)` also builds paragraph embeddings
- `build_context(...)` maps the question into the fitted label space
- retrieval uses greedy coverage over query label IDs
- semantic similarity is used as a secondary ranking signal inside greedy selection
- label-free queries can use configurable fallback strategies
- `require_full_label_coverage=True` suppresses partial retrieval results while preserving the attempted coverage trace in metadata
Greedy selection order is:
- larger overlap with remaining query labels
- larger semantic similarity
- larger overlap on query concept IDs
- larger total paragraph label count
- lexicographically smaller `paragraph_id`
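
The ordering above can be read as a lexicographic sort key. The following is a minimal sketch of that idea, not the library's implementation: `Candidate`, `selection_key`, and `pick_next` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for an indexed paragraph (not a labelrag type)."""

    paragraph_id: str
    label_ids: frozenset
    concept_ids: frozenset
    similarity: float  # semantic similarity to the query


def selection_key(candidate, remaining_labels, query_concepts):
    """Build a tuple whose natural ordering matches the listed priorities.

    Negated values make "larger is better" fields sort first under min().
    """
    return (
        -len(candidate.label_ids & remaining_labels),  # overlap with remaining query labels
        -candidate.similarity,                         # semantic similarity
        -len(candidate.concept_ids & query_concepts),  # overlap on query concept IDs
        -len(candidate.label_ids),                     # total paragraph label count
        candidate.paragraph_id,                        # lexicographically smaller ID wins
    )


def pick_next(candidates, remaining_labels, query_concepts):
    """Return the candidate that ranks first under the key above."""
    return min(
        candidates,
        key=lambda c: selection_key(c, remaining_labels, query_concepts),
    )
```

Because every field is compared in order, two candidates that tie on all scores are still separated by `paragraph_id`, which is what keeps the selection deterministic.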
In 0.1.3, the default main-path strategy is slightly more practical when
coverage completes early:
- it first builds the label-overlap candidate universe
- it still runs greedy label coverage first
- if coverage finishes before `max_paragraphs`, it backfills from the remaining label-overlap candidates by semantic similarity
This keeps the default path label-bounded while making top_k retrieval less
likely to collapse to a single paragraph for single-label queries.
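
The three steps above can be sketched in plain Python. This is an illustrative approximation under stated assumptions, not the library's code: `greedy_with_backfill` is a hypothetical helper, the tie-break is reduced to overlap, similarity, and paragraph ID, and coverage is treated as finished once no remaining candidate covers a still-uncovered query label.

```python
def greedy_with_backfill(paragraph_labels, similarity, query_labels, max_paragraphs):
    """Return (selected_ids, backfill_used) for a label-bounded retrieval.

    paragraph_labels: dict of paragraph_id -> set of label IDs
    similarity:       dict of paragraph_id -> semantic similarity to the query
    """
    query_labels = set(query_labels)

    # Step 1: the candidate universe is every paragraph sharing a query label.
    candidates = sorted(
        pid for pid, labels in paragraph_labels.items() if labels & query_labels
    )

    # Step 2: greedy label coverage over the still-uncovered query labels.
    selected = []
    remaining = set(query_labels)
    while remaining and len(selected) < max_paragraphs:
        pool = [
            pid
            for pid in candidates
            if pid not in selected and paragraph_labels[pid] & remaining
        ]
        if not pool:
            break
        best = min(
            pool,
            key=lambda pid: (
                -len(paragraph_labels[pid] & remaining),
                -similarity[pid],
                pid,
            ),
        )
        selected.append(best)
        remaining -= paragraph_labels[best]

    # Step 3: backfill free slots from leftover candidates by similarity.
    leftovers = sorted(
        (pid for pid in candidates if pid not in selected),
        key=lambda pid: (-similarity[pid], pid),
    )
    backfill = leftovers[: max_paragraphs - len(selected)]
    return selected + backfill, bool(backfill)
```

Paragraphs outside the label-overlap universe never enter the result, however similar they are to the query, which is the sense in which the default path stays label-bounded.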
0.1.2 supports two main-path retrieval strategies:
- `greedy_label_coverage_semantic_rerank`
- `label_gate_semantic_rank`
label_gate_semantic_rank keeps label overlap as a candidate gate but lets
semantic similarity become the primary ranking signal inside that gated set.
0.1.2 supports four label-free fallback strategies:
- `concept_overlap_only`
- `concept_overlap_semantic_rerank`
- `concept_gate_semantic_rank`
- `semantic_only`
The default is concept_overlap_semantic_rerank.
concept_gate_semantic_rank mirrors the main gated semantic-first behavior for
label-free queries:
- paragraph concepts must still intersect the query concepts to enter the candidate set
- semantic similarity is then the primary ranking signal inside that gated set
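
The gate-then-rank behavior can be sketched as a filter followed by a sort. This is a simplified illustration rather than the library's implementation; the function name and the paragraph-ID tie-break are assumptions.

```python
def concept_gate_semantic_rank(paragraph_concepts, similarity, query_concepts, max_paragraphs):
    """Gate on concept overlap, then rank the gated set by similarity.

    paragraph_concepts: dict of paragraph_id -> set of concept IDs
    similarity:         dict of paragraph_id -> semantic similarity to the query
    """
    query_concepts = set(query_concepts)

    # Gate: a paragraph must share at least one concept with the query.
    gated = [
        pid
        for pid, concepts in paragraph_concepts.items()
        if concepts & query_concepts
    ]

    # Rank: similarity is the primary signal; the ID breaks ties (assumed).
    gated.sort(key=lambda pid: (-similarity[pid], pid))
    return gated[:max_paragraphs]
```

The size of the concept overlap plays no part in the ordering here; it only decides membership in the candidate set.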
The default strategies remain unchanged in 0.1.3:
- main path: `greedy_label_coverage_semantic_rerank`
- label-free fallback: `concept_overlap_semantic_rerank`
The retrieval trace now also distinguishes the meaning of retrieval_score
per result through retrieval_score_kind, and the default greedy main path
reports whether semantic backfill ran through semantic_backfill_used.
The built-in answer-generation adapter targets a minimal OpenAI-compatible chat-completions API surface.
It supports:
- standard base URLs such as `https://api.openai.com/v1`
- full endpoint URLs such as `https://api.mistral.ai/v1/chat/completions`
- API key injection through explicit config or optional environment-variable lookup
- non-streaming text generation for `answer_with_generator(...)`
This adapter is intended to cover providers such as OpenAI, Mistral, and Qwen when they expose an OpenAI-compatible endpoint shape.
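
To make the two URL shapes and the two API-key sources concrete, here is a small sketch of how such inputs could be normalized. Both helpers are hypothetical and are not part of the `labelrag` API; the precedence of an explicit key over the environment variable is an assumption.

```python
import os


def resolve_chat_completions_url(base_url: str) -> str:
    """Accept either a base URL or a full chat-completions endpoint URL."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/chat/completions"):
        return trimmed  # already a full endpoint URL
    return trimmed + "/chat/completions"


def resolve_api_key(api_key=None, api_key_env_var=None) -> str:
    """Prefer an explicit key, then fall back to an environment variable."""
    if api_key:
        return api_key
    if api_key_env_var:
        value = os.environ.get(api_key_env_var)
        if value:
            return value
    raise ValueError("no API key configured or found in the environment")
```

With this normalization, `https://api.openai.com/v1` and `https://api.mistral.ai/v1/chat/completions` both resolve to a URL ending in `/chat/completions`.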
The main public entrypoints are:
- `RAGPipeline`
- `RAGPipelineConfig`, `RetrievalConfig`, `PromptConfig`
- `IndexedParagraph`, `LabelRecord`, `ConceptRecord`
- `QueryAnalysis`, `RetrievedParagraph`
- `RetrievalResult`, `RAGAnswerResult`
- `GeneratedAnswer`, `AnswerGenerator`
- `OpenAICompatibleAnswerGenerator`, `OpenAICompatibleConfig`
- convenience re-export: `Paragraph`
RAGPipeline also exposes record-oriented inspection helpers for
paragraph/label/concept lookup workflows:
- `get_paragraph(...)`
- `get_label(...)`
- `get_paragraph_labels(...)`
- `get_paragraph_concepts(...)`
- `get_label_paragraphs(...)`
- `get_concept_paragraphs(...)`
Lower-level ID-oriented helpers remain available when you only need stable IDs:
- `get_label_paragraph_ids(...)`
- `get_paragraph_label_ids(...)`
- `get_paragraph_concept_ids(...)`
- `get_concept_paragraph_ids(...)`
Detailed API notes are available in docs/public_api.md.
- `RAGPipeline.fit(...)` now requires an embedding provider
- the default runtime path is `RAGPipeline(config)` and resolves the embedding provider from `config.embedding`
- explicit `embedding_provider=` is still available as an advanced override
- the first shipped provider is `SentenceTransformerEmbeddingProvider`
- the default model is `sentence-transformers/all-MiniLM-L6-v2`
- the model may be downloaded on first use
- offline environments should pre-cache the embedding model before running `fit(...)`
Common runtime failures:
- missing `sentence-transformers` package:
  - reinstall project dependencies, for example `pip install -e .`
- model load/download failure:
  - verify the configured model name
  - ensure the model is already cached locally or that the environment can reach Hugging Face
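
For offline environments, one way to pre-cache the default model is to load it once with `sentence-transformers` on a machine that has network access. The sketch below assumes the package is installed and that its cache is shared with, or later copied to, the offline runtime; `precache_embedding_model` is a hypothetical helper, not a `labelrag` function.

```python
def precache_embedding_model(model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Load the model once so its weights land in the local cache.

    Returns the embedding dimension as a quick sanity check.
    """
    try:
        # Imported lazily so a missing package gives an actionable message.
        from sentence_transformers import SentenceTransformer
    except ImportError as exc:
        raise RuntimeError(
            "sentence-transformers is not installed; reinstall project dependencies"
        ) from exc

    # Instantiating the model downloads the weights if they are not cached yet.
    model = SentenceTransformer(model_name)
    return model.get_sentence_embedding_dimension()
```

Run it once while online, for example during an image build, so that a later `fit(...)` does not need to reach Hugging Face.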
Runnable examples are available in examples/:
- `examples/basic_usage.py`
- `examples/custom_config.py`
- `examples/inspection_api.py`
- `examples/fallback_policies.py`
- `examples/gated_semantic_rank.py`
- `examples/greedy_backfill.py`
- `examples/semantic_rerank.py`
- `examples/save_and_load.py`
- `examples/provider_answer.py`
Example note:
- the runnable example scripts use a tiny local demo embedding provider so they stay runnable offline
- production usage should prefer `SentenceTransformerEmbeddingProvider`
save(path) produces a human-inspectable directory containing:
- `manifest.json`
- `config.json`
- `label_generator.json`
- `corpus_index.json`
- `fit_result.json`
- `paragraph_embeddings.npz`
The persistence layer now supports:
- `json`
- `json.gz`
Compression is applied to the full saved snapshot rather than mixing compressed and uncompressed artifacts in one directory.
Snapshots written by the current release include a lightweight manifest describing the saved version, persistence format, and expected artifacts.
Public guarantee:
- a saved and reloaded pipeline should preserve retrieval behavior for the same fitted state, question, and config
Current update boundary:
- `fit(...)` is batch-only
- adding new paragraphs currently requires a full refit
- save/load restores a static fitted state rather than an incrementally updateable corpus state
Legacy snapshot note:
- loading pre-embedding snapshots remains a best-effort compatibility path
- when older snapshots are missing derived concept inspection tables, `load()` may rebuild them from paragraph-side concept data that is still present
- when older snapshots predate `paragraph_embeddings.npz`, `load()` may rebuild paragraph embeddings from persisted paragraph texts if an embedding provider is available
- persisted manifests include a non-empty `labelrag_version`
- `save()` fails explicitly if the current package version cannot be determined for manifest writing
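
Because the snapshot is human-inspectable, the manifest can be checked with the standard library alone before loading. The sketch below reads `labelrag_version` from a snapshot directory. It is hypothetical: the compressed file name `manifest.json.gz` is an assumption, and no manifest field other than `labelrag_version` is assumed.

```python
import gzip
import json
from pathlib import Path


def read_manifest(snapshot_dir):
    """Read the manifest from a saved snapshot directory."""
    root = Path(snapshot_dir)
    plain = root / "manifest.json"
    if plain.exists():
        return json.loads(plain.read_text(encoding="utf-8"))
    compressed = root / "manifest.json.gz"  # assumed name for the json.gz format
    if compressed.exists():
        with gzip.open(compressed, "rt", encoding="utf-8") as handle:
            return json.load(handle)
    raise FileNotFoundError(f"no manifest found in {root}")


def snapshot_version(snapshot_dir):
    """Return the recorded labelrag version, rejecting empty values."""
    version = read_manifest(snapshot_dir).get("labelrag_version")
    if not version:
        raise ValueError("manifest has no labelrag_version; possibly a legacy snapshot")
    return version
```

A check like this can run in deployment tooling to flag legacy snapshots before they reach the best-effort compatibility path.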
- `RetrievalConfig.max_paragraphs` sets the hard retrieval limit
- `RetrievalConfig.retrieval_strategy` selects one of:
  - `greedy_label_coverage_semantic_rerank`
  - `label_gate_semantic_rank`
- `RetrievalConfig.allow_label_free_fallback` enables deterministic concept fallback behavior for label-free queries
- `RetrievalConfig.label_free_fallback_strategy` selects one of:
  - `concept_overlap_only`
  - `concept_overlap_semantic_rerank`
  - `concept_gate_semantic_rank`
  - `semantic_only`
- `RetrievalConfig.require_full_label_coverage` suppresses partial retrieval output when not all query labels can be covered
- `PromptConfig.include_paragraph_ids` includes stable paragraph IDs in the rendered prompt context
- `PromptConfig.include_label_annotations` includes paragraph label annotations in the rendered prompt context
- `PromptConfig.max_context_characters` applies a hard cap to rendered context length
```bash
.venv/bin/ruff check . --fix
.venv/bin/pyright
.venv/bin/pytest
```

```bash
.venv/bin/python -m build
.venv/bin/python -m twine check dist/*
```