nu-bi/nubi


Nubi

BI that runs in the browser: near-zero cost per dashboard view.

License Apache-2.0 Tests passing PRs welcome Python FastAPI React 19 Vite Stars

Docs · Compare vs Hex/Cube · Quickstart · Roadmap


Nubi dashboard screenshot


What is Nubi?

Nubi is a batteries-included BI and embedded-analytics platform. The structural bet is that the analytics kernel runs in the user's browser by default (DuckDB-WASM / Pyodide), so the marginal cost of a dashboard view is approximately zero; a server kernel (E2B / Modal Firecracker microVM) is only the escape hatch for native wheels and large jobs.

The data plane uses Arrow IPC at every boundary, so data moves between warehouse, edge, browser, and kernel with no serialization tax. The entry wedge is embedding: a host app signs short-lived JWTs, mounts <nubi-dashboard>, and gets live cross-filtering dashboards with server-enforced row-level security at near-zero cost per view.


✨ Why Nubi?

| | Hex | Cube | Nubi |
|---|---|---|---|
| Kernel | Python per session, their cloud ($$$) | n/a | Pyodide in browser; on-demand server kernel only when needed |
| Result transport | JSON via pandas | JSON / SQL API | Arrow IPC, zero serialization tax |
| Viz | Plotly/SVG, chokes past ~50k rows | bring-your-own | WebGL/WebGPU on Arrow buffers, 1M+ points interactive |
| Caching | Per-session | Pre-aggregations in Cube Store | Content-hashed edge cache + auto pre-aggregations |
| Modeling tax | medium | high (cubes first) | low: point at a warehouse and go |
| Embedding | separate product | headless only | core surface; editor embeddable, not just output |
| Free tier | per-seat kernel billing | infra/seat | real free tier: compute is the user's browser |

Key differentiators:

  • Arrow-native data plane: sqlglot planner → PhysicalPlan → executor → Arrow IPC stream, with a frozen cache-key spec and conformance suite so a future Rust executor can swap in without touching call sites.
  • Content-hashed edge cache: N viewers of the same dashboard collapse to one warehouse hit. Cache key: sha256(canonical_json({sql, params, rls_claims})).
  • Auth-as-code + server-side RLS: JWT claims carry row/column policies; the planner injects them as AST-level predicates (never string-concat). Powers internal users, multi-tenant embedding, and Google OAuth from the same primitive.
  • LLM-authorable dashboards + MCP: a dashboard is a sanitized HTML/CSS document of declarative <nubi-kpi>, <nubi-table>, and <nubi-chart> custom elements. LLMs and MCP agents author layout and widget attributes; they never write WebGL or fetch code. Six MCP tools expose the full authoring surface to any agent.
  • Auto-WebGL rendering: <nubi-chart> switches to a regl WebGL scatter path automatically above 20,000 rows; SVG/HTML below. Up to ~1M points at interactive framerates reading Arrow columns directly.
  • SQL-first connector SDK: any fn(plan) -> pyarrow.Table is a first-class connector with declared capabilities. The capability gate enforces the security floor: a connector with predicate_rls=False is refused (501) when policies are active. Built-in connectors: postgres (ADBC), duckdb (in-memory demo and read-only file-backed), http_json, mysql, mariadb, and jdbc (optional drivers). Private databases are reachable via a network_mode='bridge' WebSocket tunnel.
  • Real free tier: compute is the user's browser; Hex can't match it without absorbing kernel cost.
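
The cache-key formula in the list above, sha256(canonical_json({sql, params, rls_claims})), can be sketched with the Python standard library. This is a minimal illustration, not the project's implementation: the canonical form used here (sorted keys, compact separators) and the function names are assumptions, and the authoritative rules live in docs/cache-key-spec.md.

```python
import hashlib
import json


def canonical_json(value) -> str:
    """Serialize with sorted keys and no extra whitespace (assumed canonical form)."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def cache_key(sql: str, params: dict, rls_claims: dict) -> str:
    """Return sha256(canonical_json({sql, params, rls_claims})) as a hex digest."""
    payload = {"sql": sql, "params": params, "rls_claims": rls_claims}
    return hashlib.sha256(canonical_json(payload).encode("utf-8")).hexdigest()


SQL = "SELECT region, SUM(amount) FROM sales GROUP BY region"

# Two viewers with the same query, params, and policies share one key,
# so their requests can collapse to a single warehouse hit.
viewer_a = cache_key(SQL, {"year": 2024}, {"tenant_id": "acme"})
viewer_b = cache_key(SQL, {"year": 2024}, {"tenant_id": "acme"})

# A viewer under a different RLS policy gets a different key,
# so cached rows are never shared across policies.
viewer_c = cache_key(SQL, {"year": 2024}, {"tenant_id": "globex"})
```

Including the RLS claims in the hashed payload is what keeps tenants from reading each other's cached results.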

🚀 Quickstart

Docker Compose (fastest: one command)

The repo ships a docker-compose.yml with two services: db (postgres:16-alpine) and a combined app (built from the root Dockerfile, which builds the Vite SPA and runs FastAPI, serving the SPA and the /api/v1 API on a single origin at port 8000).

# 1. Clone and start the stack
git clone https://github.com/imranparuk/nubi.git
cd nubi
make up          # docker compose up -d --build

# 2. Open the app
#    App (SPA + API): http://localhost:8000
#    API docs:        http://localhost:8000/docs (dev only)

# 3. (Optional) seed a test user
cd backend && DATABASE_URL=postgresql://nubi:nubi@localhost:5432/nubi python seed.py
#    → test@nubi.dev / nubitest123

# 4. Smoke test
make smoke       # scripts/smoke.sh: health + auth + query assertions

The compose stack runs against a local Postgres container. To connect to Neon or another managed Postgres, set DATABASE_URL in your environment before running make up.

Dev path: backend + frontend separately

Prerequisites: Python 3.11+, Node 20+

# ── Backend ───────────────────────────────────────────────────
python3.11 -m venv .venv && source .venv/bin/activate
pip install -r backend/requirements.txt

# Copy and edit env; at minimum set DATABASE_URL and JWT_SECRET
cp .env.example backend/.env

# Run migrations, then start the API
python database/migrate.py
cd backend && uvicorn main:app --reload
# API:  http://localhost:8000
# Docs: http://localhost:8000/docs

# ── Frontend (new terminal, repo root) ────────────────────────
npm install
cp .env.example .env          # set VITE_BACKEND_URL=http://localhost:8000
npm run dev
# Frontend: http://localhost:5173

Seed a test user (optional, with the venv active):

cd backend && DATABASE_URL=postgresql://user:pass@host/db python seed.py
# → test@nubi.dev / nubitest123

Key environment variables (.env.example)

| Variable | Required | Description |
|---|---|---|
| DATABASE_URL | Yes | postgresql://...?sslmode=require (Neon) or local Postgres |
| JWT_SECRET | Yes | HS256 signing secret; generate with openssl rand -hex 32 |
| VITE_BACKEND_URL | Frontend | Base URL of the FastAPI backend |
| GOOGLE_CLIENT_ID | OAuth | Google OAuth client ID |
| GOOGLE_CLIENT_SECRET | OAuth | Google OAuth client secret |
| GOOGLE_REDIRECT_URI | OAuth | Callback URL registered in Google Console |
| FRONTEND_URL | Backend | Where the backend redirects after Google OAuth |
| CORS_ORIGINS | Backend | Comma-separated allowed origins |
| ENV | Backend | development / production (disables /docs in prod) |
| KERNEL_LOCAL_ENABLED | Backend | true to allow local subprocess kernel (dev only) |
| LLM_PROVIDER | Optional | anthropic / openai / gemini + matching API key |
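
To make the semantics of these variables concrete, here is a small stdlib-only sketch of how a process might read and validate them. It is an illustration under stated assumptions: the backend itself uses pydantic-settings, and the Settings class, load_settings function, and the "development" default for ENV are hypothetical names and choices made for this example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    jwt_secret: str
    env: str
    cors_origins: tuple
    kernel_local_enabled: bool

    @property
    def docs_enabled(self) -> bool:
        # Per the table above, /docs is disabled in production.
        return self.env != "production"


def load_settings(environ=os.environ) -> Settings:
    """Read the required and backend variables, failing fast on missing ones."""
    missing = [k for k in ("DATABASE_URL", "JWT_SECRET") if not environ.get(k)]
    if missing:
        raise RuntimeError("missing required env vars: " + ", ".join(missing))
    # CORS_ORIGINS is comma-separated; strip whitespace and drop empty items.
    origins = tuple(
        o.strip() for o in environ.get("CORS_ORIGINS", "").split(",") if o.strip()
    )
    return Settings(
        database_url=environ["DATABASE_URL"],
        jwt_secret=environ["JWT_SECRET"],
        env=environ.get("ENV", "development"),  # default is an assumption
        cors_origins=origins,
        kernel_local_enabled=environ.get("KERNEL_LOCAL_ENABLED", "").lower() == "true",
    )


example = load_settings({
    "DATABASE_URL": "postgresql://nubi:nubi@localhost:5432/nubi",
    "JWT_SECRET": "0" * 64,
    "CORS_ORIGINS": "http://localhost:5173, http://localhost:8000",
    "ENV": "production",
})
```

Passing a plain dict instead of os.environ, as in the last statement, keeps the loader easy to exercise in tests.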

πŸ—οΈ Architecture

                     ┌──────────────────────────────────────────────┐
                     │               Browser / Host page            │
                     │                                              │
                     │  <nubi-dashboard>  ←──  getToken()           │
                     │  <nubi-kpi> <nubi-table> <nubi-chart>        │
                     │  DuckDB-WASM  ←── Arrow IPC (streaming)      │
                     │  regl WebGL scatter (>20k rows auto-switch)  │
                     └─────────────────┬────────────────────────────┘
                                       │ HTTPS / JWT
                     ┌─────────────────▼────────────────────────────┐
                     │            FastAPI backend                   │
                     │                                              │
                     │  /auth/*     email+pw / Google OAuth / JWKS  │
                     │  /query      planner → cache → executor      │
                     │  /compute/run  kernel router                 │
                     │  /ai/*       grounding + dashboard gen       │
                     │  /lineage    SQL lineage graph               │
                     │  /jobs       cron + interval scheduler       │
                     │  REST CRUD   datastores/boards/queries/…     │
                     └────┬──────────────────────┬──────────────────┘
                          │                      │
          ┌───────────────▼──────┐  ┌────────────▼───────────────────┐
          │  Postgres / Neon     │  │  Connector registry            │
          │  (asyncpg, SSL)      │  │  postgres  (ADBC, native Arrow)│
          └──────────────────────┘  │  duckdb    (in-mem + file)     │
                                    │  http_json (post-fetch RLS)    │
                                    │  mysql · mariadb · jdbc (opt)  │
                                    │  + VPC bridge transport        │
                                    └────────────┬───────────────────┘
                                                 │ Arrow IPC
                          ┌──────────────────────▼────────────────┐
                          │  Content-addressed cache (LRU + TTL)  │
                          │  X-Nubi-Cache: HIT | MISS header      │
                          └───────────────────────────────────────┘

 Compute kernel (first-party only; embed tokens → 403):
   LocalSubprocessRunner  (dev; KERNEL_LOCAL_ENABLED=true, ENV!=production)
   E2BRunner / ModalRunner (prod; Firecracker microVM, no host network/secrets)
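
The "Content-addressed cache (LRU + TTL)" box in the diagram can be illustrated with a short sketch. This is not the project's ContentAddressedCache; the class name LruTtlCache, the fetch helper, and the default sizes are assumptions chosen to show how LRU eviction, TTL expiry, and the X-Nubi-Cache HIT/MISS value fit together.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """Tiny LRU + TTL cache keyed by content hash; values are opaque bytes."""

    def __init__(self, max_entries=128, ttl_seconds=60.0, clock=time.monotonic):
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = OrderedDict()  # key -> (expires_at, payload)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as misses
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return payload

    def put(self, key, payload):
        self._items[key] = (self._clock() + self._ttl, payload)
        self._items.move_to_end(key)
        while len(self._items) > self._max:
            self._items.popitem(last=False)  # evict the least recently used


def fetch(cache, key, run_query):
    """Return (payload, header) where header mirrors the X-Nubi-Cache value."""
    hit = cache.get(key)
    if hit is not None:
        return hit, "HIT"
    payload = run_query()  # only a miss reaches the warehouse
    cache.put(key, payload)
    return payload, "MISS"
```

The clock is injectable so expiry can be tested without sleeping; a Redis-backed variant would keep the same get/put surface.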

Tech stack

| Layer | Technologies |
|---|---|
| Backend | FastAPI 0.131, Python 3.11+, uvicorn, pydantic-settings v2 |
| DB | asyncpg (connection pool, raw SQL); Postgres 16 / Neon (SSL required) |
| Auth | argon2-cffi (argon2id), PyJWT HS256, cryptography RS256/ES256 JWKS |
| Data plane | sqlglot (AST planner + RLS injection + dialect validation), pyarrow, DuckDB (in-mem + file), adbc-driver-postgresql; mysql/mariadb/jdbc connectors (optional drivers); VPC bridge tunnel |
| Cache | In-process LRU + TTL (ContentAddressedCache); interface is Redis-swappable |
| Compute | subprocess (dev); e2b-code-interpreter / modal (prod, lazy optional deps) |
| AI / LLM | NullProvider (default, zero network); lazy Anthropic / OpenAI / Gemini via env |
| Frontend | React 19, Vite 7, TailwindCSS, react-router-dom |
| Viz | regl (WebGL scatter, ~1M pts), apache-arrow, @duckdb/duckdb-wasm, ECharts |
| Embed | Custom elements (<nubi-dashboard>, <nubi-kpi>, <nubi-table>, <nubi-chart>), DOMPurify |
| SDK | @nubi/sdk: framework-agnostic ESM; wraps auth + query + resource CRUD + embed |
| CLI | Python typer (nubi login / deploy / run / diff / pull) |
| MCP | Python mcp SDK, stdio transport, 6 tools |
| Self-host | Docker Compose (docker-compose.yml); Makefile: make up/down/migrate/smoke |

Monorepo layout

nubi/
├── backend/          FastAPI app, connectors, planner, compute, auth, AI, jobs
│   ├── app/
│   │   ├── auth/     argon2id, JWT HS256, Google PKCE, JWKS, sessions
│   │   ├── connectors/ sqlglot planner, Arrow executor, cache, pre-agg
│   │   ├── compute/  KernelRunner ABC, LocalSubprocessRunner, E2BRunner, ModalRunner
│   │   ├── ai/       LLMProvider, grounding, dashboard generation
│   │   ├── lineage/  sqlglot AST extractor, LineageGraph
│   │   ├── jobs/     cron + interval scheduler, executor, store
│   │   ├── repos/    asyncpg (prod) + in-memory (test) repository layer
│   │   └── routes/   auth, query, compute, embed, ai, lineage, jobs, resources
│   └── tests/        ~27 test modules + conformance suite (golden Arrow + cache keys)
├── database/         Forward-only SQL migration runner + 6 migrations
├── src/              React 19 frontend (Vite + Tailwind): pages, components, viz
├── embed/            Web components: <nubi-dashboard>, <nubi-kpi>, <nubi-table>, <nubi-chart>
├── sdk/              @nubi/sdk: createNubiClient ESM package
├── cli/              nubi CLI (typer): login / deploy / run / diff / pull
├── mcp/              MCP stdio server: 6 tools for agent authoring
├── docs/             cache-key-spec.md, conformance.md, kernel-security.md, assets/
├── Dockerfile        combined image: Vite SPA build + FastAPI (single origin)
├── docker-compose.yml  db (postgres:16) + app (SPA + API on :8000)
├── Makefile          up / down / migrate / logs / smoke
├── scripts/smoke.sh  End-to-end health + auth + query assertions
└── .env.example      All env vars with comments

📊 Project status

| Milestone | Status | What shipped |
|---|---|---|
| M0: Foundation | ✅ Done | React + FastAPI rebuild on Neon Postgres, email/pw + Google OAuth, migrations |
| M1: Connectors + conformance | ✅ Done | sqlglot planner, PhysicalPlan, Postgres/DuckDB connectors, frozen cache-key spec |
| M2: Streaming + cache + pushdown | ✅ Done | Arrow IPC stream, content-hashed LRU cache, projection/predicate/LIMIT pushdown, pre-agg seed |
| M3: Embed auth + <nubi-dashboard> | ✅ Done | HS256 + JWKS verifier, issuer registry, server-side RLS, origin pinning, web component |
| M4: Local kernel + placement router | ✅ Done | KernelRunner ABC, LocalSubprocessRunner, ComputePlacementRouter, POST /compute/run |
| M4-REMOTE: E2B/Modal sandbox | ✅ Done | E2BRunner (Firecracker microVM), ModalRunner adapter |
| M5: WebGL viz | ✅ Done | regl GPU scatter on Arrow buffers, <nubi-chart> auto-WebGL above 20k rows |
| M6: REST API + SDK + CLI | ✅ Done | asyncpg repo layer, CRUD for datastores/boards/widgets/queries, @nubi/sdk, typer CLI |
| M7: Lineage + AI + MCP | ✅ Done | sqlglot lineage extractor, deterministic grounding, LLMProvider, MCP server (6 tools) |
| M8: LLM-authorable dashboards | ✅ Done | <nubi-kpi>, <nubi-table>, <nubi-chart> widget kit, DOMPurify renderer, POST /ai/dashboard |
| M9: Connector SDK + HTTP/JSON | ✅ Done | FunctionConnector, apply_rls_postfetch, HttpJsonConnector, NoSQL deliberately out of scope |
| Connector breadth | ✅ Done | Registry ships 8 types: postgres, duckdb (in-mem + read-only file-backed), http_json, mysql, mariadb, jdbc, snowflake, bigquery (the last four via optional drivers, lazily imported) |
| VPC bridge | ✅ Done | network_mode='bridge' opens a WebSocket TCP tunnel via BridgeBroker, wired into the query path (resolve_network_async); other modes 501 |
| Builder layer (M13–M22) | ✅ Done | Query workspace + typed params, filter/variable/route-param interactivity, TanStack table + conditional formatting, 9 chart types, exports, scheduled reports, AI-SQL, agentic chat, git sync |
| M10: Docker self-host smoke test | 🔄 In progress | docker-compose.yml ships locally (db + combined app on :8000); live-infra CI smoke test is the remaining capstone |
| M11: Scheduled jobs | ✅ Done | cron + interval scheduler (deterministic now), execute_job, CRUD + run-now + run-history routes |
| M12: Capability-gated RLS | ✅ Done | connector resolution via datastore.config.type, 501 gate when predicate_rls=False + active policies |
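
The capability gate from M12 (a connector with predicate_rls=False is refused with 501 while policies are active) can be sketched as follows. The Capabilities dataclass, the CapabilityError exception, the check_rls_gate function, and the "my_csv" connector name are all hypothetical and exist only for this illustration; only the predicate_rls flag and the 501 status come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    predicate_rls: bool  # can the connector enforce row-level predicates?


class CapabilityError(Exception):
    """Carries the HTTP status an API layer would return."""

    def __init__(self, status: int, detail: str):
        super().__init__(detail)
        self.status = status


def check_rls_gate(connector_type: str, caps: Capabilities, policies: dict) -> None:
    """Refuse (501) a connector that cannot enforce RLS while policies are active."""
    if policies and not caps.predicate_rls:
        raise CapabilityError(
            501, f"connector '{connector_type}' cannot enforce active RLS policies"
        )


# No active policies: any connector passes the gate.
check_rls_gate("my_csv", Capabilities(predicate_rls=False), policies={})

# Active policies + no RLS capability: the request is refused.
try:
    check_rls_gate("my_csv", Capabilities(predicate_rls=False), {"tenant_id": "acme"})
    refused_status = None
except CapabilityError as err:
    refused_status = err.status
```

Failing closed here means a connector that silently ignores policies can never serve a tenant-scoped request.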

Tests: ~27 backend test modules + conformance suite (golden Arrow output + byte-identical cache keys), MCP tests, CLI tests, dashboard sanitizer (node --test), SDK tests.

Experimental / not production-hardened: LocalSubprocessRunner (dev-grade isolation: same OS user, host network); Docker Compose stack not yet smoke-tested against live external infra (Neon SSL, E2B, real Google OAuth).
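
Because the local runner offers only dev-grade isolation, the rules stated in the architecture section (embed tokens are rejected with 403; the local runner needs KERNEL_LOCAL_ENABLED=true and ENV != production) matter in practice. The sketch below encodes just those rules. The choose_runner function is hypothetical and is not the project's ComputePlacementRouter, and falling back to "E2BRunner" rather than "ModalRunner" is an arbitrary choice for the example.

```python
def choose_runner(*, is_embed_token: bool, env: str, kernel_local_enabled: bool) -> str:
    """Return a runner name, or raise PermissionError (mapped to HTTP 403)."""
    if is_embed_token:
        # The compute kernel is first-party only.
        raise PermissionError("embed tokens may not use the compute kernel")
    if kernel_local_enabled and env != "production":
        return "LocalSubprocessRunner"  # dev only: same OS user, host network
    return "E2BRunner"  # assumption: ModalRunner would be an equally valid remote choice


dev_choice = choose_runner(
    is_embed_token=False, env="development", kernel_local_enabled=True
)
prod_choice = choose_runner(
    is_embed_token=False, env="production", kernel_local_enabled=True
)
```

Note that production ignores KERNEL_LOCAL_ENABLED entirely, so a stray flag in a production environment cannot enable the weaker sandbox.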


🔌 Embedding quickstart

<!-- 1. Load the widget bundle -->
<script type="module" src="https://cdn.example.com/nubi-dashboard.js"></script>

<!-- 2. Mount the component; it calls getToken() before each query -->
<nubi-dashboard
  get-token="getToken"
  query="demo_sales_by_region"
  backend="https://api.example.com"
></nubi-dashboard>

CSS custom properties control theming: --nubi-bg, --nubi-fg, --nubi-accent, --nubi-border.

Full embed integration steps

1. Register your issuer in backend/app/auth/issuers.py:

{
  "iss": "https://your-app.example.com",
  "jwks_uri": "https://your-app.example.com/.well-known/jwks.json",
  "aud": "nubi:your-project-id",
  "allowed_origins": ["https://your-app.example.com"],
}

2. Mint short-lived JWTs (≤15 min, RS256 or ES256) on your backend and hand them to the page through getToken():

// Reference: embed/getToken.reference.js
async function getToken() {
  const { token } = await fetch('/your-api/nubi-token').then(r => r.json())
  return token  // signed JWT from your backend
}
window.getToken = getToken

Required JWT claims: iss, sub, aud, org, project, roles[], scope[] (must include "read:*" or narrower), policies (RLS column-value pairs), embed_origin, exp (≤ now + 900), iat.

3. Nubi handles the rest: JWKS verification and RLS enforcement on the backend, Arrow IPC fetch and WebGL rendering in the component.
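
Before signing a token in step 2, a host backend may want to sanity-check the claim set against the requirements listed there. The sketch below does only that, with the standard library. It does not sign or verify anything, and the claim_problems function, the exact origin comparison, and the reading of "read:* or narrower" as "any scope starting with read:" are assumptions made for illustration.

```python
import time

REQUIRED_CLAIMS = (
    "iss", "sub", "aud", "org", "project", "roles",
    "scope", "policies", "embed_origin", "exp", "iat",
)
MAX_LIFETIME_SECONDS = 900  # exp must be <= now + 900


def claim_problems(claims: dict, request_origin: str, now=None) -> list:
    """Return human-readable problems; an empty list means the claims look fine."""
    now = time.time() if now is None else now
    problems = [f"missing claim: {name}" for name in REQUIRED_CLAIMS if name not in claims]
    if problems:
        return problems
    if claims["exp"] <= now:
        problems.append("token expired")
    if claims["exp"] > now + MAX_LIFETIME_SECONDS:
        problems.append("exp is more than 900 s in the future")
    if claims["embed_origin"] != request_origin:
        problems.append("embed_origin does not match the request origin")
    if not any(s.startswith("read:") for s in claims["scope"]):
        problems.append("scope lacks a read:* (or narrower) entry")
    return problems


NOW = 1_700_000_000  # fixed timestamp so the example is deterministic
claims = {
    "iss": "https://your-app.example.com",
    "sub": "user-1",
    "aud": "nubi:your-project-id",
    "org": "acme",
    "project": "demo",
    "roles": ["viewer"],
    "scope": ["read:*"],
    "policies": {"tenant_id": "acme"},
    "embed_origin": "https://your-app.example.com",
    "exp": NOW + 600,
    "iat": NOW,
}
problems = claim_problems(claims, "https://your-app.example.com", now=NOW)
```

Signing with RS256 or ES256 and publishing the matching JWKS still has to be done with a real JWT library on the host backend.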


🧪 Running tests

# Backend: in-memory repo + DuckDB fixtures; no live DB required
cd backend && pytest

# MCP server tests
cd mcp && pytest tests/

# Dashboard sanitizer (Node built-in runner)
npm run test:dash

# JS SDK tests
cd sdk && node --test src/index.test.mjs

# CLI tests
cd cli && pytest tests/

The backend conformance suite (backend/tests/conformance/) asserts the planner produces golden Arrow output and byte-identical cache keys. A future Rust executor must pass the same suite to be swappable.


📦 SDKs & tooling

| Package | Path | Description |
|---|---|---|
| @nubi/sdk | sdk/ | Framework-agnostic ESM: .auth, .query(), .resources.*, .embed.mount() |
| nubi CLI | cli/ | login / deploy / run / diff / pull, with --dry-run |
| MCP server | mcp/ | stdio MCP, 6 tools for agent dashboard authoring |
| Embed bundle | embed/ | <nubi-dashboard> + widget kit custom elements |

📖 Documentation


🤝 Contributing

PRs are welcome. The fastest path:

  1. Fork, create a feature branch.
  2. Run the test suite (cd backend && pytest).
  3. Open a PR: describe the problem and solution; reference any relevant milestone or doc.

Please keep commits small and focused. The conformance suite must stay green; any new connector or planner change needs a corresponding test vector.


License

Apache License 2.0; see the LICENSE file.
