
NaturalQuery

Turn Natural Language into SQL Queries instantly.

NaturalQuery is a Python-based tool that bridges the gap between human language and database systems. It uses Large Language Models (LLMs) to interpret user questions, translate them into SQL, and optionally execute the resulting queries against various database backends.

Docker Quick Start

You can run the entire system using Docker without installing Python locally.

1. Run the API (Default)

Starts the REST API and exposes it at http://localhost:8000.

docker-compose up

How to use it: Once the container is running, you can call the API from your host machine using curl or Postman, just as if it were running locally.

Test it:

curl http://localhost:8000/health

(You should see {"status":"ok", ...})

Generate SQL:

curl -X POST http://localhost:8000/query \
-H "X-API-Key: test-key-123" \
-H "Content-Type: application/json" \
-d '{"question": "How many products are there?"}'

2. Run the Interactive CLI

To chat with your database via command line:

docker-compose run natural-query python main.py --execute

3. Run Tests

To verify everything is working correctly:

docker-compose run natural-query pytest

Features

  • Dual LLM Support: Seamlessly switch between OpenAI (GPT-4o) and Ollama (Llama 3, Mistral, etc., for local inference).
  • Multi-Database Support: Connects to SQLite, PostgreSQL, and Microsoft SQL Server.
  • Context-Aware: Dynamically extracts the database schema to generate accurate queries (see the sketch after this list).
  • Containerized: Fully Dockerized for easy deployment and usage.
  • Interactive CLI: Simple and intuitive command-line interface.
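
To illustrate the context-aware feature, here is a minimal sketch of what schema extraction can look like for the SQLite backend. The function name and the plain-text output format are illustrative assumptions, not the project's actual implementation.

import sqlite3

def extract_schema(db_path: str) -> str:
    """Return a plain-text summary of tables and columns for the LLM prompt."""
    conn = sqlite3.connect(db_path)
    try:
        tables = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'"
        ).fetchall()
        lines = []
        for (table,) in tables:
            # PRAGMA table_info returns (cid, name, type, notnull, default, pk) per column.
            cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
            lines.append(f"{table}(" + ", ".join(f"{c[1]} {c[2]}" for c in cols) + ")")
        return "\n".join(lines)
    finally:
        conn.close()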

⚠️ Security Warning

Caution

Use a Read-Only Database User

This application uses an LLM to generate and execute SQL queries. While we attempt to validate queries, LLMs can be unpredictable.

To guarantee the safety of your data, you MUST configure the application with a database user that has READ-ONLY (SELECT) permissions.

Do NOT use an admin or owner account (like sa or postgres) in production.

Software-Level Validation

The application performs a basic check before executing any query:

if "SELECT" not in sql_query.upper():
    # Blocks execution

Limitation: This is a simple keyword check. It prevents accidental non-SELECT queries but does NOT guarantee security against malicious prompt injection or complex SQL attacks (e.g., chained commands like SELECT 1; DROP TABLE users;).

Always rely on Database Permissions as your primary defense.

Architecture

The system follows a modular architecture designed for extensibility.

graph TD
    User([👤 User]) -->|Natural Language Question| App[💻 Main Application]
    
    subgraph Core Logic
        App --> DB[🗄️ DB Manager]
        App --> LLM[🧠 LLM Client]
    end
    
    subgraph Database
        DB -->|Extract Schema| SQL_DB[(Databases<br/>SQLite / Postgres / MSSQL)]
        DB -->|Execute SQL| SQL_DB
        SQL_DB -->|Results| DB
    end
    
    subgraph Intelligence
        LLM -->|Schema + Question| Provider{LLM Provider}
        Provider -->|OpenAI API| OpenAI["☁️ OpenAI"]
        Provider -->|Ollama API| Ollama["🏠 Ollama (Local)"]
        OpenAI -->|Generated SQL| LLM
        Ollama -->|Generated SQL| LLM
    end

    LLM -->|Return SQL| App
    DB -->|Return Data| App
    App -->|Display Results| User
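
In code, the diagram corresponds roughly to the loop below. This is a minimal sketch; the db and llm objects and their method names are illustrative assumptions, not the project's actual API.

def answer(question: str, db, llm, execute: bool = False):
    # DB Manager: extract the schema so the LLM knows the available tables and columns.
    schema = db.extract_schema()
    # LLM Client: send schema + question to OpenAI or Ollama and get SQL back.
    sql = llm.generate_sql(schema, question)
    if not execute:
        return sql, None  # Safe Mode (dry run): return the SQL without running it
    if "SELECT" not in sql.upper():
        raise ValueError("Safety Block: Only SELECT queries are allowed.")
    # DB Manager: execute the query and return the rows alongside the SQL.
    return sql, db.execute(sql)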

Installation

Prerequisites

  • Python 3.12+ (for local run)
  • Docker & Docker Compose (for containerized run)
  • Ollama (optional, for local LLM)

Option 1: Local Setup

  1. Clone the repository:

    git clone https://github.com/jacobobb/natural-SQL-query.git
    cd natural-SQL-query
  2. Create a Virtual Environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install Dependencies:

    pip install -r requirements.txt

Option 2: Docker Setup

No manual installation is required beyond Docker and Docker Compose.

Configuration

Create a .env file in the root directory by copying the sample:

cp .env.sample .env

Edit the file with your settings:

# --- LLM Settings ---
# Choose 'openai' or 'ollama'
LLM_PROVIDER=ollama

# OpenAI (required if provider is openai)
OPENAI_API_KEY=sk-proj-your-key-here
OPENAI_MODEL=gpt-4o

# Ollama (required if provider is ollama)
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3

# --- Database Settings ---
# Choose 'sqlite', 'postgres', or 'mssql'
DB_TYPE=sqlite

# SQLite Path
SQLITE_DB_PATH=natural_query.db

# Postgres / MSSQL Credentials
# DB_HOST=localhost
# DB_PORT=5432
# DB_USER=postgres
# DB_PASSWORD=yourpassword
# DB_NAME=mydb
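
These variables are read when the application starts. As an illustration only, loading them with python-dotenv typically looks like the sketch below; the variable names match the sample above, but the loading code itself is an assumption about the implementation.

import os
from dotenv import load_dotenv  # provided by the python-dotenv package

load_dotenv()  # read the .env file from the working directory

LLM_PROVIDER = os.getenv("LLM_PROVIDER", "ollama")
DB_TYPE = os.getenv("DB_TYPE", "sqlite")
SQLITE_DB_PATH = os.getenv("SQLITE_DB_PATH", "natural_query.db")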

Usage

Running Locally

By default, the application runs in Safe Mode (Dry Run). It will generate SQL but NOT execute it.

  1. Start in Safe Mode:

    python main.py

    Output: Generated SQL only.

  2. Start in Execution Mode: To actually execute queries against the database, use the --execute flag:

    python main.py --execute

    Output: Generated SQL + Database Results.

Running with Docker

To run in Execution Mode with Docker:

docker-compose run natural-query python main.py --execute

API

The project includes a FastAPI application for integration.

Authentication

The API uses X-API-Key header authentication. Configure keys in .env:

API_KEYS='{"client_app": "secret-key"}'
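
In FastAPI, X-API-Key checks are typically implemented as a dependency on protected routes. The sketch below shows one way to wire this up under that assumption; it is not necessarily the project's exact code.

import json
import os

from fastapi import Depends, Header, HTTPException

# API_KEYS holds a JSON mapping of client names to secret keys, as configured above.
API_KEYS = json.loads(os.getenv("API_KEYS", "{}"))

def require_api_key(x_api_key: str = Header(...)) -> None:
    # FastAPI maps the x_api_key parameter to the X-API-Key request header.
    if x_api_key not in API_KEYS.values():
        raise HTTPException(status_code=401, detail="Invalid API key")

# Applied to protected routes, e.g.:
# @app.post("/query", dependencies=[Depends(require_api_key)])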

Endpoints

  • POST /query: Generate SQL (and optionally execute).
    curl -X POST http://localhost:8000/query \
    -H "X-API-Key: secret-key" \
    -H "Content-Type: application/json" \
    -d '{"question": "How many users?", "execute": false}'

Example Access

Request:

POST /query
{
  "question": "Show top 5 products by price",
  "execute": true
}

Response (Success):

{
  "question": "Show top 5 products by price",
  "generated_sql": "SELECT product_name, price FROM products ORDER BY price DESC LIMIT 5;",
  "results": [
    {"product_name": "Laptop", "price": 1200.0},
    {"product_name": "Smartphone", "price": 800.0}
  ],
  "error": null,
  "execution_time_ms": 45.2
}

Response (Blocked Query):

{
  "question": "Delete all users",
  "generated_sql": "DELETE FROM users;",
  "results": null,
  "error": "Safety Block: Only SELECT queries are allowed.",
  "execution_time_ms": 12.5
}
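
The response shape above maps naturally onto a Pydantic model. The sketch below matches the fields shown in the examples; the model name itself is an assumption.

from typing import Any, Optional

from pydantic import BaseModel

class QueryResponse(BaseModel):
    question: str
    generated_sql: Optional[str] = None
    results: Optional[list[dict[str, Any]]] = None
    error: Optional[str] = None
    execution_time_ms: float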

Running the API

With Docker Compose (default command):

docker-compose up

Or locally:

uvicorn api:app --reload

Testing

Run the integration test to verify the full pipeline:

python test_integration.py

Unit Tests

Run the pytest suite to verify individual components (DB, API, Security):

pytest tests/

License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

You are free to:

  • ✅ Use, copy, and modify the code for non-commercial purposes
  • ✅ Share your modifications under the same license
  • ✅ Use for personal, educational, or research projects

You cannot:

  • ❌ Use this software for commercial purposes without explicit written permission
  • ❌ Sell this software or services based on it

For commercial licensing inquiries, please contact the repository owner.
