Standard Retrieval-Augmented Generation (RAG) pipelines retrieve information and synthesize answers without analyzing whether the retrieved documents contradict one another. When sources conflict (e.g., one document states a refund window is 30 days and another says 14 days), an LLM might arbitrarily pick one, synthesize an inaccurate middle ground, or output a generic response that hides the disagreement.
VS-RAG is an implementation of a conflict-aware RAG system. It inserts a structured conflict-detection step between retrieval and generation. By identifying and highlighting contradictions before drafting an answer, the system provides users with explicit conflict evidence alongside the generated output.
- Contradiction Transparency: Rather than attempting to silently resolve contradictions, VS-RAG detects them and presents conflicting quotes back to the user or application layer.
- Evidence-Based Auditing: The system identifies exact, non-paraphrased sentence pairs that conflict, linking each back to its originating chunk metadata (e.g., document source, URL, or ID).
- Flexible Action Policies: Developers can choose how the system handles contradictions: surfacing them alongside the answer, throwing an exception to halt processing, or only searching for conflicts without generating an answer.
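The evidence-based auditing described above depends on each reported quote being a verbatim span of a retrieved chunk, with a source label taken from that chunk's metadata. A minimal sketch of such a check follows. `normalize`, `verify_quote`, and `source_label` are hypothetical helpers written for illustration and are not part of the VS-RAG API; the metadata key names are the ones listed on the Chunk model, while their priority order is an assumption.

```python
from typing import Any, Optional

# Metadata keys to try when building a human-readable source label.
# The priority order here is an assumption, not the library's documented behavior.
SOURCE_KEYS = ("source", "document_id", "url", "title")


def normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break matching."""
    return " ".join(text.split())


def verify_quote(quote: str, chunk_content: str) -> bool:
    """Return True only if the quote appears verbatim in the chunk content."""
    needle = normalize(quote)
    return bool(needle) and needle in normalize(chunk_content)


def source_label(metadata: dict[str, Any]) -> Optional[str]:
    """Return the first non-empty source-like metadata value, or None."""
    for key in SOURCE_KEYS:
        value = metadata.get(key)
        if value:
            return str(value)
    return None


chunk_text = "The basic refund policy allows returns within 30 days."
print(verify_quote("returns within 30 days", chunk_text))   # True
print(verify_quote("returns within a month", chunk_text))   # False
print(source_label({"source": "policy_v1.pdf"}))            # policy_v1.pdf
```

A substring check of this kind rejects paraphrased quotes, which is one way to keep model-reported evidence traceable to the indexed text.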
The library uses Haystack 2.x for the underlying processing pipeline, together with:
- LanceDB: Serverless vector storage.
- LiteLLM: Standardized interface for calling LLMs.
- Instructor & Pydantic v2: For validated, structured JSON schema outputs from the LLM.
- SentenceTransformers (BGE-M3): To generate high-quality embeddings.
- Loguru: Structured logging with query-specific tracing.
You can install the dependencies and the project using uv or pip:
# Clone the repository
git clone https://github.com/your-repo/vs-rag.git
cd vs-rag
# Install with development dependencies
uv pip install -e ".[dev]"

from vs_rag import ConflictRAG, Chunk
# Initialize with an LLM of choice (defaults to BGE-M3 for embeddings)
rag = ConflictRAG(
    llm="openai/gpt-3.5-turbo",
    db_path="./my_lancedb",
    on_conflict="answer_with_conflicts",
)
# Index chunks with conflicting information
rag.index([
    Chunk(
        content="The basic refund policy allows returns within 30 days.",
        metadata={"source": "policy_v1.pdf"},
    ),
    Chunk(
        content="The refund policy was updated: returns must be made within 14 days.",
        metadata={"source": "policy_v2.pdf"},
    ),
])
# Query the pipeline
result = rag.query("What is the refund policy?")
# 1. Access the generated answer
print("Answer:", result.answer)
# 2. Access structured conflict pairs
for conflict in result.conflicts:
    print("\nConflict detected between sources!")
    print(f"Source A ({conflict.source_a}): '{conflict.quote_a}'")
    print(f"Source B ({conflict.source_b}): '{conflict.quote_b}'")

When initializing ConflictRAG, you can choose one of three behavioral policies to handle detected contradictions:
- "answer_with_conflicts" (default): Generates a final answer that references both points of view and attaches a list of populated ConflictPair objects to the returned QueryResult.
- "raise_on_conflict": Raises a ConflictDetectedError containing the detected conflict pairs as soon as conflicts are found in the retrieved chunks. Use this if your application should never display conflicting or unverified information.
- "conflicts_only": Runs conflict detection and skips the answer generation phase entirely. result.answer will be None, which reduces token usage and latency if you only need contradiction validation.
from typing import Any

from pydantic import BaseModel


class Chunk(BaseModel):
    content: str
    metadata: dict[str, Any]  # Checked for 'source', 'document_id', 'url', 'title'


class ConflictPair(BaseModel):
    quote_a: str  # Exact sentence from chunk A
    quote_b: str  # Exact contradicting sentence from chunk B
    source_a: str | None  # Human-readable source identifier extracted from metadata
    source_b: str | None
    chunk_index_a: int  # Index in the retrieved chunk array
    chunk_index_b: int


class QueryResult(BaseModel):
    answer: str | None
    conflicts: list[ConflictPair]
    chunks: list[Chunk]  # Full set of retrieved chunks
    metadata: dict[str, Any]  # Latency, token metrics, and timestamps

To use VS-RAG offline with local models, the package includes defaults for LiquidAI's LFM-2.5 (using llama-cpp-python or an external llama-server process):
from vs_rag.defaults import LFM25Local
# Downloads, caches, and prepares server configuration for LFM-2.5-Instruct GGUF
local_llm = LFM25Local()
# Start local server (requires llama-server binary in PATH)
local_llm.start_server()
# Use the local model in the pipeline
rag = ConflictRAG(
    llm=local_llm.model_string,
    llm_api_key=local_llm.api_key,
    db_path="./local_db",
)

Run the test suite using just or pytest:
# Run formatters, linters, and tests
just check
# Run tests directly
pytest tests/