
TriSQL Framework

A three-stage Text-to-SQL framework inspired by the TriSQL research paper (Nature Scientific Reports, 2026), built entirely from scratch using open-source tools. It converts plain-English questions into executable SQL queries and runs free on a standard laptop, with no GPU required.


Live Demo

Type a question. Get SQL. See the answer.

Q: How many singers are there?
→ SELECT COUNT(*) FROM singer

Q: List all concerts in 2014
→ SELECT concert_name, year FROM concert WHERE year = 2014

Q: What is the average age of singers from the USA?
→ SELECT AVG(age) FROM singer WHERE country = 'USA'

Benchmark Results

Evaluated on the Spider benchmark (Yale University) — the standard academic dataset for Text-to-SQL research.

Metric                     Score
Execution Accuracy (EX)    70%
Executability Rate         100%
Model                      SQLCoder (local, free)
Hardware                   Standard CPU laptop
GPU required               No

100% executability means every single generated SQL query runs without crashing — zero syntax errors across all test questions.
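
For context, an execution-accuracy check compares result sets rather than SQL strings. A minimal sketch of such a check (the function name is illustrative; the repository's actual evaluator is src/evaluator.py):

```python
import sqlite3

def execution_match(pred_sql: str, gold_sql: str, conn: sqlite3.Connection) -> bool:
    """EX check: the prediction is correct when its result set matches the
    gold query's on the same database (row order ignored). A prediction
    that fails to execute scores zero on both metrics."""
    try:
        pred = sorted(conn.execute(pred_sql).fetchall())
    except sqlite3.Error:
        return False
    gold = sorted(conn.execute(gold_sql).fetchall())
    return pred == gold

# Tiny in-memory database for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE singer (name TEXT, age REAL, country TEXT);"
    "INSERT INTO singer VALUES ('A', 30, 'USA'), ('B', 40, 'USA'), ('C', 20, 'UK');"
)
execution_match("SELECT AVG(age) FROM singer WHERE country = 'USA'",
                "SELECT AVG(age) FROM singer WHERE country = 'USA'", conn)  # True
```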


Architecture — Three Stages

This framework implements a TriSQL-inspired three-stage pipeline:

Stage 1 — Question-Guided Schema Selector

Uses sentence-transformers (all-MiniLM-L6-v2) to compute semantic similarity between the user's question and every table in the database. Only relevant tables are forwarded — reducing noise and improving generation quality.
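
The selection idea can be sketched without the model. Here a plain bag-of-words cosine stands in for the all-MiniLM-L6-v2 embeddings, so the ranking logic is visible; the function names and example schema are illustrative:

```python
# The real Stage 1 embeds the question and each table description with
# all-MiniLM-L6-v2 via sentence-transformers; a word-count cosine is
# used here only to show the selection logic.
from collections import Counter
from math import sqrt

def cosine(a: str, b: str) -> float:
    """Cosine similarity between two texts as word-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def select_tables(question: str, tables: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank tables by similarity between the question and 'name + columns'."""
    ranked = sorted(tables, key=lambda t: cosine(question, f"{t} {tables[t]}"), reverse=True)
    return ranked[:top_k]

tables = {
    "singer": "singer_id name country age",
    "concert": "concert_id concert_name year stadium_id",
    "stadium": "stadium_id name capacity",
}
select_tables("What is the average age of singers from the USA?", tables)
```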

Stage 2 — Structure-Aware SQL Generator

Generates SQL in two steps instead of one:

  • Step 1 — identifies which SQL clauses are needed (SELECT, JOIN, GROUP BY, etc.)
  • Step 2 — generates complete SQL guided by those clauses
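
The two steps above can be sketched as a pair of prompts. The wording and the hard-coded clause answer are illustrative, not the repository's actual prompts; in the real generator both prompts go to SQLCoder through Ollama:

```python
def clause_prompt(question: str, schema: str) -> str:
    """Step 1: ask only for the clause skeleton, not full SQL."""
    return (
        f"Schema:\n{schema}\n\nQuestion: {question}\n"
        "Which SQL clauses are needed? Answer with a comma-separated "
        "subset of: SELECT, JOIN, WHERE, GROUP BY, HAVING, ORDER BY."
    )

def sql_prompt(question: str, schema: str, clauses: str) -> str:
    """Step 2: generate complete SQL, constrained to the identified clauses."""
    return (
        f"Schema:\n{schema}\n\nQuestion: {question}\n"
        f"Use exactly these clauses: {clauses}\nSQL:"
    )

schema = "singer(singer_id, name, country, age)"
question = "What is the average age of singers from the USA?"
step1 = clause_prompt(question, schema)
# A model call would return something like "SELECT, WHERE"; hard-coded here.
step2 = sql_prompt(question, schema, "SELECT, WHERE")
```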

A syntax validator catches and automatically recovers from common model errors.
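
Such a validator can be as simple as asking SQLite to compile the statement without executing it. A sketch under that assumption (the helper name is illustrative):

```python
import sqlite3

# EXPLAIN makes SQLite compile the statement without running it, so
# syntax errors and references to missing tables or columns surface
# as exceptions that the pipeline can feed back to the model.
def validate_sql(sql: str, schema_ddl: str) -> tuple[bool, str]:
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        conn.execute(f"EXPLAIN {sql}")
        return True, ""
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()

DDL = "CREATE TABLE singer (singer_id INTEGER, name TEXT, country TEXT, age REAL);"
validate_sql("SELECT COUNT(*) FROM singer", DDL)   # (True, "")
validate_sql("SELECT COUNT(* FROM singer", DDL)    # (False, "<parse error message>")
```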

Stage 3 — Complexity-Aware Refiner

Classifies each generated SQL query as Easy, Medium, or Hard by counting JOINs, subqueries, GROUP BY, HAVING, and set operations. Applies appropriate refinement:

  • Easy — return directly, no extra work needed
  • Medium — validate and fix one error if needed
  • Hard — retry up to two times with specific error feedback sent back to the model

User question
    ↓
Stage 1: Schema Selector     → filters irrelevant tables
    ↓
Stage 2: SQL Generator       → clause identification → SQL generation
    ↓
Stage 3: Complexity Refiner  → classify → refine → validate
    ↓
Executable SQL + results
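
The complexity classification itself needs no model call; counting structural features is enough. A minimal sketch (the exact thresholds are an assumption):

```python
import re

# Illustrative Stage 3 classifier: count structural features and bucket.
# Thresholds assumed here: 0 features -> Easy, 1 -> Medium, 2+ -> Hard.
def classify_complexity(sql: str) -> str:
    s = sql.upper()
    score = (
        len(re.findall(r"\bJOIN\b", s))                          # joins
        + len(re.findall(r"\(\s*SELECT\b", s))                   # subqueries
        + len(re.findall(r"\bGROUP BY\b", s))
        + len(re.findall(r"\bHAVING\b", s))
        + len(re.findall(r"\b(?:UNION|INTERSECT|EXCEPT)\b", s))  # set operations
    )
    if score == 0:
        return "Easy"
    return "Medium" if score == 1 else "Hard"

classify_complexity("SELECT COUNT(*) FROM singer")  # "Easy"
```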

Tech Stack

Component            Technology
LLM                  SQLCoder (via Ollama)
Semantic similarity  sentence-transformers (all-MiniLM-L6-v2)
Database             SQLite
Web interface        FastAPI + Uvicorn
Benchmark dataset    Spider (Yale University)
Language             Python 3.10+

All components are free and open source. No API keys required. No GPU needed.


Project Structure

trisql-framework/
├── app.py                     Web interface (FastAPI)
├── run_eval.py                Spider benchmark runner
├── quick_test.py              Quick sanity test (no Spider needed)
├── requirements.txt           Python dependencies
│
└── src/
    ├── schema.py              SQLite schema parser
    ├── schema_selector.py     Semantic table filter (Stage 1)
    ├── sql_generator.py       Two-step SQL generator (Stage 2)
    ├── complexity_refiner.py  Complexity classifier + refiner (Stage 3)
    ├── pipeline.py            Full pipeline orchestrator
    ├── data_loader.py         Spider dataset loader
    └── evaluator.py           Execution accuracy evaluator

Getting Started

Prerequisites

  • Python 3.10+
  • Ollama installed

Installation

# Clone the repository
git clone https://github.com/YOUR_USERNAME/trisql-framework.git
cd trisql-framework

# Install dependencies
pip install -r requirements.txt

# Download SQLCoder model
ollama pull sqlcoder

Quick Test (No dataset needed)

# Terminal 1 — start Ollama
ollama serve

# Terminal 2 — run quick test
python quick_test.py

Expected output:

QUICK TEST SCORE: 4/4 (100% EX)
All tests passed!

Web Interface

# Terminal 1
ollama serve

# Terminal 2
python app.py

Open your browser at http://localhost:8000

Spider Benchmark Evaluation

# Download Spider dataset from https://yale-lily.github.io/spider
# Place it at data/spider/

# Run evaluation (start small)
python run_eval.py --max 20
python run_eval.py --difficulty easy
python run_eval.py  # full 1034 questions

Inspiration

This framework is inspired by the TriSQL paper:

"TriSQL: A Three-Stage Text-to-SQL Framework with Complexity-Aware Refinement" Nature Scientific Reports, 2026

The original paper used Qwen as the underlying LLM with GPU-based fine-tuning. This implementation adapts the three-stage architecture to run entirely on a local CPU using SQLCoder through Ollama — making the approach accessible without research infrastructure.


Results Comparison

System                        Model                    EX Score
Single-prompt baseline        SQLCoder                 ~50%
This framework (Stage 1+2+3)  SQLCoder (CPU)           70%
TriSQL (paper)                Qwen (GPU, fine-tuned)   82%

The 20-percentage-point improvement over the single-prompt baseline (from ~50% to 70% EX) demonstrates the effectiveness of the three-stage approach even on modest hardware.


Author

Built by Gowtham Venkat Eathamokkala as part of independent research into Text-to-SQL frameworks.


License

MIT License — free to use, modify, and distribute.
