Cursor for Data — a local-first AI workspace that understands your datasets, not just runs SQL on them.
QueryPad points an AI at a folder of CSV/Parquet/JSON files, profiles them, discovers how they connect, and helps you analyze them with DuckDB — locally, with no server-side data processing, no account, and no install.
The execution layer is solved (DuckDB does it well). The unsolved problem is that people don't understand their data: which tables exist, what each field means, how datasets connect, which join is correct. QueryPad is built to answer those questions first, then generate and run the SQL.
QueryPad ships as a CLI for dataset understanding and a browser app for
interactive analysis. Both share the same engine-agnostic discovery core; only the
DuckDB binding differs (native @duckdb/node-api for the CLI, DuckDB-Wasm for the web).
              ┌─────────────────────────┐
folder of   → │  Discovery core         │ → .querypad/ artifacts
data files    │  profile → relationships│   (schema + relationships)
              │  → semantic model       │
              └───────────┬─────────────┘
            CLI (Node)    │    Web (Wasm)
            querypad      │    querypad.io
            inspect       │    drop & query
querypad inspect ./data
Scans a folder, profiles every file, and infers foreign-key relationships with confidence scores:
Tables: 3
Relationships: 2
payments.user_id ↳ users.id (100%, many-to-one)
events.user_id ↳ users.id (100%, many-to-one)
Wrote artifacts to ./data/.querypad
It writes machine-readable artifacts that an AI agent (Claude Code, Cursor, …) can read to reason about the dataset instead of guessing its structure with trial-and-error pandas code:
.querypad/
  schema.json          # tables, columns, types, per-column profiles
  relationships.json   # inferred joins with confidence + signals
  semantic-model.yaml  # named business entities (belongs_to / has_many)
  inspect-summary.md   # human- and agent-readable overview
inspect also rolls the relationships into a semantic model of named entities:
# .querypad/semantic-model.yaml
entities:
  - name: User
    table: users
    has_many: [Payment, Event]
  - name: Payment
    table: payments
    belongs_to: [User]
Claude Code + QueryPad + DuckDB
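To make the roll-up from relationships to entities concrete, here is a minimal TypeScript sketch. It is not QueryPad's implementation: the `Relationship` and `Entity` shapes, the `buildEntities` and `entityName` helpers, and the naive singularization rule are all assumptions made for illustration.

```typescript
// Hypothetical input shape; the real relationships.json schema is not shown in this README.
interface Relationship {
  fromTable: string; // table holding the foreign key, e.g. "payments"
  toTable: string;   // table holding the referenced key, e.g. "users"
  cardinality: "many-to-one" | "one-to-one";
}

interface Entity {
  name: string;
  table: string;
  belongs_to: string[];
  has_many: string[];
}

// Naive naming rule (an assumption): drop a trailing "s" and capitalize.
function entityName(table: string): string {
  const singular = table.endsWith("s") ? table.slice(0, -1) : table;
  return singular.charAt(0).toUpperCase() + singular.slice(1);
}

// Each many-to-one relationship yields belongs_to on the child entity
// and has_many on the parent entity.
function buildEntities(tables: string[], rels: Relationship[]): Entity[] {
  const byTable = new Map<string, Entity>();
  for (const t of tables) {
    byTable.set(t, { name: entityName(t), table: t, belongs_to: [], has_many: [] });
  }
  for (const r of rels) {
    const child = byTable.get(r.fromTable);
    const parent = byTable.get(r.toTable);
    if (!child || !parent || r.cardinality !== "many-to-one") continue;
    child.belongs_to.push(parent.name);
    parent.has_many.push(child.name);
  }
  return Array.from(byTable.values());
}

// The two relationships from the inspect output above.
const entities = buildEntities(
  ["users", "payments", "events"],
  [
    { fromTable: "payments", toTable: "users", cardinality: "many-to-one" },
    { fromTable: "events", toTable: "users", cardinality: "many-to-one" },
  ],
);
```

With the example input, `entities` contains a `User` entity whose `has_many` lists `Payment` and `Event`, matching the YAML above. A real implementation would need a sturdier naming rule than stripping a trailing "s".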
For every table, QueryPad computes a statistical profile (row count, null %, distinct count, ranges, top values). It then identifies primary-key candidates (unique, non-null), prunes likely foreign-key pairs by name similarity and type compatibility, and runs a value-overlap query for each survivor. A confidence score blends four signals — value overlap (dominant), name similarity, type match, and cardinality shape — and competition disambiguation keeps a foreign column pointed at its single strongest target, so overlapping integer id ranges don't produce false positives.
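The scoring and disambiguation steps described above can be sketched as follows. This is only an illustration: the `Candidate` shape, the weight values, and the function names are assumptions, since the README does not state the actual weights or types.

```typescript
// Hypothetical shape for one foreign-key candidate that survived pruning.
interface Candidate {
  from: string;            // foreign column, e.g. "payments.user_id"
  to: string;              // candidate target, e.g. "users.id"
  valueOverlap: number;    // share of distinct `from` values found in `to`, 0..1
  nameSimilarity: number;  // 0..1
  typeMatch: boolean;
  targetIsUnique: boolean; // cardinality shape: the target looks like a key
}

// Assumed weights; value overlap dominates, as the text describes.
const WEIGHTS = { overlap: 0.6, name: 0.2, type: 0.1, cardinality: 0.1 };

// Blend the four signals into a single score in 0..1.
function confidence(c: Candidate): number {
  return (
    WEIGHTS.overlap * c.valueOverlap +
    WEIGHTS.name * c.nameSimilarity +
    WEIGHTS.type * (c.typeMatch ? 1 : 0) +
    WEIGHTS.cardinality * (c.targetIsUnique ? 1 : 0)
  );
}

// Competition disambiguation: keep only the strongest target per foreign column.
function disambiguate(candidates: Candidate[]): Candidate[] {
  const best = new Map<string, Candidate>();
  for (const c of candidates) {
    const current = best.get(c.from);
    if (current === undefined || confidence(c) > confidence(current)) {
      best.set(c.from, c);
    }
  }
  return Array.from(best.values());
}

const candidates: Candidate[] = [
  { from: "payments.user_id", to: "users.id", valueOverlap: 1, nameSimilarity: 0.9, typeMatch: true, targetIsUnique: true },
  // Overlapping integer id range: the values overlap fully, but the name does not match.
  { from: "payments.user_id", to: "events.id", valueOverlap: 1, nameSimilarity: 0.1, typeMatch: true, targetIsUnique: true },
];
const kept = disambiguate(candidates);
```

Here both targets have full value overlap, so overlap alone cannot separate them. The name signal breaks the tie, and only the `users.id` candidate remains in `kept`.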
| Layer | What it does | Status |
|---|---|---|
| 1 — Dataset Discovery | Scan folders; detect schema, types, statistics, uniqueness, cardinality | ✅ Built (profile) |
| 2 — Relationship Discovery | Infer joins automatically with confidence scores | ✅ Built (inspect) |
| 3 — Semantic Model | Roll relationships into named business entities (User ├ Payment ├ Event) | ✅ Built (inspect) |
| 4 — AI Analyst | Natural-language questions → SQL → execution → insight (ask) | ✅ Built (ask) |
See ROADMAP.md for the full plan.
export ANTHROPIC_API_KEY=sk-ant-... # or OPENAI_API_KEY with --provider openai
querypad ask "total payment amount by user plan" ./data
ask builds context from the inferred relationships (so the generated SQL joins on the
right keys), runs it on DuckDB, and explains the result:
-- SQL
SELECT u.plan, COUNT(*) AS payment_count, SUM(p.amount) AS total
FROM payments p JOIN users u ON p.user_id = u.id
GROUP BY u.plan ORDER BY u.plan
plan  payment_count  total
----  -------------  ------
paid  8              285.74
Insight: All payments come from paid-plan users.
Generated SQL is read-only-gated (only SELECT/WITH/… execute) and the DB is in-memory,
so source files are never modified. Use --show-sql to preview the SQL without running it.
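A read-only gate of this kind could look roughly like the sketch below. It is a simplified illustration, not QueryPad's actual check: the allowlist contains only the two keywords the text names (the remaining ones are elided above and are not guessed here), and `isReadOnly` and `stripComments` are hypothetical helper names.

```typescript
// Assumed allowlist: the README names SELECT and WITH and elides the rest.
const READ_ONLY_KEYWORDS = ["SELECT", "WITH"];

// Strip SQL comments so a leading comment cannot hide the first keyword.
function stripComments(sql: string): string {
  return sql.replace(/--[^\n]*/g, " ").replace(/\/\*[\s\S]*?\*\//g, " ");
}

// Returns true only for a single statement that starts with an allowed keyword.
function isReadOnly(sql: string): boolean {
  // Drop comments, surrounding whitespace, and one trailing semicolon.
  const cleaned = stripComments(sql).trim().replace(/;\s*$/, "");
  // Conservative simplification: any remaining semicolon is treated as a
  // second statement, even if it sits inside a string literal.
  if (cleaned.indexOf(";") !== -1) return false;
  const first = cleaned.split(/\s+/)[0] ?? "";
  return READ_ONLY_KEYWORDS.indexOf(first.toUpperCase()) !== -1;
}

const allowed = isReadOnly("SELECT u.plan, SUM(p.amount) FROM payments p JOIN users u ON p.user_id = u.id GROUP BY u.plan");
const blocked = isReadOnly("DROP TABLE users");
```

A first-keyword check is a heuristic and errs on the side of rejecting queries. Running against an in-memory database, as described above, gives a second layer of protection for the source files.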
querypad explain ./data
Justifies each inferred relationship from its stored signals, and lists caveats to verify:
payments.user_id ↳ users.id — 100% (many-to-one)
• 100% of distinct payments.user_id values are present in users.id
• column name strongly matches the target
• exact type match
• many-to-one (target key is unique)
Caveats (0)
None.
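As a rough illustration of how stored signals might be turned into the justification lines above, consider the sketch below. The `Signals` shape, the `explain` function, the 0.8 name-similarity threshold, and the caveat wording are all assumptions; only the four reason strings are modeled on the sample output.

```typescript
// Hypothetical record of the signals stored for one relationship.
interface Signals {
  from: string;
  to: string;
  valueOverlap: number;   // 0..1
  nameSimilarity: number; // 0..1
  typeMatch: boolean;
  targetIsUnique: boolean;
}

// Turn each signal into either a supporting reason or a caveat to verify.
function explain(s: Signals): { reasons: string[]; caveats: string[] } {
  const reasons: string[] = [];
  const caveats: string[] = [];
  const pct = Math.round(s.valueOverlap * 100);
  reasons.push(`${pct}% of distinct ${s.from} values are present in ${s.to}`);
  if (pct < 100) {
    caveats.push(`${100 - pct}% of distinct ${s.from} values have no match in ${s.to}`);
  }
  if (s.nameSimilarity >= 0.8) { // assumed threshold
    reasons.push("column name strongly matches the target");
  } else {
    caveats.push("column name does not clearly match the target");
  }
  if (s.typeMatch) {
    reasons.push("exact type match");
  } else {
    caveats.push("column types differ; a cast may be needed");
  }
  if (s.targetIsUnique) {
    reasons.push("many-to-one (target key is unique)");
  } else {
    caveats.push("target column is not unique; the join may duplicate rows");
  }
  return { reasons, caveats };
}

const report = explain({
  from: "payments.user_id",
  to: "users.id",
  valueOverlap: 1,
  nameSimilarity: 0.9,
  typeMatch: true,
  targetIsUnique: true,
});
```

For the `payments.user_id` example, `report.reasons` holds the four bullets shown above and `report.caveats` is empty.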
The browser app at querypad.io is the same OSS app running client-side. Your data stays on your machine unless you explicitly share or collaborate.
- Drag & drop anything — CSV, Parquet, JSON, Excel — drop multiple formats at once and JOIN them
- DuckDB-Wasm SQL — Full analytical SQL in the browser (JOIN, GROUP BY, window functions, …)
- Data profiles — Column-level nulls, distinct counts, ranges, averages, and top values
- Relationship verification — Discover inferred joins in-browser; Accept / Reject / Edit each with a per-signal "why" (verdicts persist)
- Agent context — Copy schema, profiles, active SQL, and latest results for Claude Code or Codex
- AI SQL assistant — Cmd+K for natural language to SQL with Claude or OpenAI BYOK
- Inline charts — One-click Bar, Line, Scatter, Pie from query results
- URL sharing — Compress data + query into a single shareable link
- Sample data on first visit — Start exploring immediately, drop your own files when ready
More web app features
- Monaco Editor — Table/column autocomplete, syntax highlighting, Cmd+Enter to run
- Virtualized table — Smooth rendering up to 10,000 rows
- IndexedDB persistence — Data and queries survive page refresh
- Multi-tab editor — IDE-style tabs with independent queries and results
- Export anywhere — CSV, JSON, Markdown, HTML, Excel, Parquet, clipboard
- S3/HTTP loading — Load remote Parquet/CSV/JSON files by URL
- Transform pipelines — Chain queries with DAG visualization
- Plugin system — Extend with visualizations, exporters, file loaders, SQL macros
- Real-time collaboration — PartyKit + Y.js CRDT with remote cursors
- File size guardrails — 100 MB per file limit with clear warnings
Web app:
npm install
npm run dev
Open http://localhost:3000. Sample data is automatically loaded on first visit.
CLI:
npm install
npm run querypad -- inspect ./fixtures/data # discover relationships
ANTHROPIC_API_KEY=sk-ant-... npm run querypad -- ask "payments by plan" ./fixtures/data
| Area | Technology |
|---|---|
| Query Engine | DuckDB-Wasm (web) · @duckdb/node-api (CLI) |
| Framework | Next.js + TypeScript + Tailwind CSS v4 |
| AI | Anthropic Claude + OpenAI BYOK |
| Editor | Monaco Editor |
| State | Zustand |
| Charts | Recharts |
| Persistence | IndexedDB (idb-keyval) |
| Collaboration | PartyKit + Y.js (optional) |
QueryPad is a local-first tool, not a hosted SaaS. Version numbers mark GitHub release milestones and public product updates. See CHANGELOG.md for release notes.
Contributions are welcome! Feel free to open issues and pull requests. See CONTRIBUTING.md.
MIT
Built by @vericontext