NLumination — Tell your photos how to feel.

Natural-language color grading. In your browser. At full resolution.
Type a feeling. Get a Lightroom-grade edit.

Try NLumination live


Talk to your photos.

Color grading used to mean twelve sliders, three curve panels, and a lot of guesswork. Now you write:

moody, blue shadows, protect highlights, push the blues toward teal

…or:

cottagecore but a bit dreamier
holiday spirit
make it cozy

NLumination figures out which adjustments to move and by how much, picks a cinematic LUT from a 241-entry license-clean library when one fits, and renders the result on a WebGL2 pipeline at native resolution — on your device. Sliders are right there if you want to fine-tune.


Two ways to grade

Slider mode (no account)

A deterministic compositional parser maps 42 intents (101 surface forms) onto your photo. Pure TypeScript, runs in < 1 ms client-side. "Warmer, less contrast in the shadows, push the blues toward teal" moves four sliders in the right directions — not one. Adaptive to the photo: "brighten" is gentle on bright photos, "protect highlights" only fires when there's clipping.

Agents + LUT mode (signed in)

Three agents run in parallel: an emotion analyst (Groq gpt-oss-20b), a vision analyst (Llama-4-Scout VLM looking at the photo), and a LUT retriever (Gemini 512-d embeddings, cosine-matched against the 367 manifest aliases that cover the 241-entry library). An action agent then composes everything into a 30-field grading delta plus an optional LUT pick. It is aware of any LUT already on the photo, so refinement prompts like "warmer still" don't break the look.
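The retrieval step can be pictured with a minimal sketch like the one below. It assumes each LUT has a precomputed embedding vector and ranks candidates by cosine similarity to the prompt embedding. The `LutEntry` shape and the `topK` helper are hypothetical names for illustration, and the three-dimensional vectors in the usage note stand in for the real 512-d embeddings.

```typescript
// Hypothetical shape: one library entry with its precomputed embedding.
interface LutEntry {
  id: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank every candidate against the query and keep the k best (top-3 by default).
function topK(query: number[], luts: LutEntry[], k = 3): { id: string; score: number }[] {
  return luts
    .map((lut) => ({ id: lut.id, score: cosine(query, lut.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

With toy vectors, `topK([1, 0, 0], library, 2)` returns the two entries whose embeddings point most nearly in the same direction as the query, each paired with its cosine score.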

Same UI, same sliders, same gallery. The agents path unlocks plain-language prompts that have no slider-formula equivalent — "Halloween mood for this pumpkin photo", "Y2K aesthetic", "give it a chill vibe".


Why it's different

Compositional

Phrases compose. "Slightly warmer, less contrast in the shadows, push the blues toward teal" moves four sliders. "Cinematic + golden hour + bluer sky" layers three intents. No template matching, no preset roulette.
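As a rough illustration of how clauses might compose, the sketch below splits a prompt into clauses, scales each clause by a strength modifier, and sums the resulting slider deltas. The slider names, the numeric amounts, and the three-entry lexicon are assumptions made for the example; only the 0.45× factor for "subtly" comes from the examples table. It is not the project's parser.

```typescript
// Slider name -> amount to add. Names and magnitudes here are hypothetical.
type Delta = Record<string, number>;

const INTENTS: Record<string, Delta> = {
  warmer: { temperature: 20 },
  moody: { exposure: -0.3, contrast: 15 },
  "lift shadows": { shadows: 25 },
};

// Strength modifiers scale every delta in the clause they appear in.
const MODIFIERS: Record<string, number> = {
  slightly: 0.5,
  subtly: 0.45,
  much: 1.5,
};

function parsePrompt(prompt: string): Delta {
  const out: Delta = {};
  // Clauses are separated by commas or the word "and".
  for (const clause of prompt.toLowerCase().split(/,|\band\b/)) {
    const text = clause.trim();
    let scale = 1;
    for (const [word, factor] of Object.entries(MODIFIERS)) {
      if (new RegExp(`\\b${word}\\b`).test(text)) {
        scale = factor;
        break;
      }
    }
    // Every intent found in the clause contributes; contributions add up.
    for (const [phrase, delta] of Object.entries(INTENTS)) {
      if (!text.includes(phrase)) continue;
      for (const [slider, amount] of Object.entries(delta)) {
        out[slider] = (out[slider] ?? 0) + amount * scale;
      }
    }
  }
  return out;
}
```

Because each clause carries its own modifier, `parsePrompt("subtly warmer and a bit moody")` applies the warm intent at 0.45× and the moody intent at full strength, touching three sliders from one sentence.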

Local-first

Your pixels never leave your machine for grading. Decoding, grading, preview — all client-side via WebGL2. Only a final JPEG you choose to save ever touches the network. BYO API keys are header-only, never logged or persisted.

Reversible

Edits are stored as parameter deltas + LUT references, not flattened pixels. Re-open any saved edit. Keep grading. Undo a year later. Same result. The gallery thumbnails render the actual graded preview, not the raw original.
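To make the idea concrete, here is a minimal sketch of a saved edit as plain data that round-trips through JSON and can be graded further without touching pixels. `GradingParams` is named in this README, but the three fields shown, the `SavedEdit` record, and the helper functions are assumptions for illustration.

```typescript
// A tiny subset of grading parameters; the field names are hypothetical.
interface GradingParams {
  exposure: number;
  contrast: number;
  temperature: number;
}

// Hypothetical saved-edit record: parameters plus a LUT reference, no pixels.
interface SavedEdit {
  photoId: string;
  params: GradingParams;
  lutId: string | null;
}

function serializeEdit(edit: SavedEdit): string {
  return JSON.stringify(edit);
}

function reopenEdit(json: string): SavedEdit {
  return JSON.parse(json) as SavedEdit;
}

// Continue grading: returns a new snapshot and leaves the saved one untouched.
function continueGrading(edit: SavedEdit, delta: Partial<GradingParams>): SavedEdit {
  return {
    ...edit,
    params: {
      exposure: edit.params.exposure + (delta.exposure ?? 0),
      contrast: edit.params.contrast + (delta.contrast ?? 0),
      temperature: edit.params.temperature + (delta.temperature ?? 0),
    },
  };
}
```

Since the original image is never overwritten, re-opening a snapshot and re-rendering it should reproduce the same grade, and any further change is just another small record.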


Examples

| Prompt | What happens |
| --- | --- |
| cinematic | Split-tone shadows → teal, highlights → orange; subtle contrast bump |
| moody, blue shadows | Exposure down · contrast up · split-tone shadow hue → blue |
| subtly warmer and a bit moody | Warm at 0.45× strength + moody preset |
| protect highlights, lift shadows | Highlights pulled down (more if clipping) · shadows opened |
| golden hour, warmer | Sunset-glow HSL boost · WB warmer · stacked compositionally |
| cottagecore but dreamier (agents) | Retrieves v4-cottagecore-soft-pastel, lifts shadows, mutes greens |
| Halloween mood (agents) | Retrieves v4-halloween-orange-purple, deepens shadows, boosts orange + magenta |
| holiday spirit (agents) | Retrieves v4-warm-fireside-holiday, warms WB, boosts reds + amber |
| cyberpunk (agents) | Retrieves a high-saturation magenta-cyan look, crushes blacks |

Type "examples" in the chat at any time to surface 14 more curated prompts, including compound forms.


How it works

Slider mode (no account, every prompt):

┌─────────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Your text  │ ──▶ │  NL parser   │ ──▶ │  Param delta │ ──▶ │ WebGL2 grade │
└─────────────┘     │ (no LLM)     │     │ (JSON)       │     │ (native res) │
                    └──────────────┘     └──────────────┘     └──────────────┘
  1. Parse. Compositional intent parser walks the sentence, matches 42 intents and 6 modifier classes, emits structured deltas. Pure TS, < 1 ms.
  2. Adapt. A 256-px CPU pass on upload (~5 ms) computes mean luminance, std, mean RGB, and 5/95th percentiles. Each intent declares an adaptive scaler so prompt magnitudes scale to the photo.
  3. Render. Two-pass WebGL2 pipeline: WB → exposure → tone → HSL → curves → split-tone → vignette → optional 3D LUT → letterbox.
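The Adapt step above can be sketched as a single pass over RGBA pixel data. The version below is an illustration under stated assumptions: it uses Rec. 709 luma weights, takes percentiles by sorting, and expects the caller to have already downscaled the image (the 256-px resize is left out). `PhotoStats`, `computeStats`, and the `brightenScale` rule are hypothetical, not the project's code.

```typescript
interface PhotoStats {
  meanLuma: number; // 0..1
  stdLuma: number;
  meanRgb: [number, number, number];
  p5: number; // 5th percentile of luminance
  p95: number; // 95th percentile of luminance
}

// rgba holds 4 bytes per pixel, as in ImageData.data.
function computeStats(rgba: Uint8ClampedArray): PhotoStats {
  const n = Math.floor(rgba.length / 4);
  if (n === 0) throw new Error("empty image");
  const lumas = new Float64Array(n);
  let sumR = 0;
  let sumG = 0;
  let sumB = 0;
  let sumL = 0;
  for (let i = 0; i < n; i++) {
    const r = rgba[4 * i] / 255;
    const g = rgba[4 * i + 1] / 255;
    const b = rgba[4 * i + 2] / 255;
    const l = 0.2126 * r + 0.7152 * g + 0.0722 * b; // Rec. 709 weights (assumption)
    lumas[i] = l;
    sumR += r;
    sumG += g;
    sumB += b;
    sumL += l;
  }
  const meanLuma = sumL / n;
  let sq = 0;
  for (let i = 0; i < n; i++) {
    const d = lumas[i] - meanLuma;
    sq += d * d;
  }
  const sorted = lumas.slice().sort(); // typed arrays sort numerically
  const pct = (p: number): number => sorted[Math.min(n - 1, Math.floor(p * n))];
  return {
    meanLuma,
    stdLuma: Math.sqrt(sq / n),
    meanRgb: [sumR / n, sumG / n, sumB / n],
    p5: pct(0.05),
    p95: pct(0.95),
  };
}

// Hypothetical adaptive scaler: "brighten" gets gentler as the photo gets brighter.
function brightenScale(stats: PhotoStats): number {
  return Math.max(0.25, 1 - stats.meanLuma);
}
```

An intent would then multiply its nominal delta by the scaler, so the same word produces a smaller move on a photo that is already bright.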

Agents + LUT mode (signed in, when the parser doesn't have a literal match):

                   ┌───────────────────────┐
                   │   Emotion analyst     │
                   │   Groq gpt-oss-20b    │
                   ├───────────────────────┤
   Your text ────▶ │   Image analyst       │ ──┐
   Your photo ───▶ │   Llama-4-Scout VLM   │   │     ┌──────────────────┐     ┌──────────────┐
                   ├───────────────────────┤   ├──▶  │  Action agent    │ ──▶ │ Param delta  │
                   │   LUT retriever       │   │     │  Groq gpt-oss-20b│     │ + LUT pick   │
                   │   Gemini 512-d cosine │ ──┘     │  + CURRENT_LUT   │     └──────────────┘
                   │   from 367 candidates │         │  awareness       │
                   └───────────────────────┘         └──────────────────┘
  1. Three analysts run in parallel. Wallclock = max(A1, A2, retrieval) ≈ 600 ms warm.
  2. Action agent (A3) receives EMOTION_SUMMARY, IMAGE_SUMMARY, LUT_CANDIDATES (top-3 with cosine scores), and CURRENT_LUT (the LUT already applied from prior turns, if any). It can pick a candidate, keep the current LUT by omitting lutId, or strip the LUT with lutId: null. For refinements like "warmer still" it emits slider deltas only.
  3. Same renderer. A3's output merges with the parser output and the existing sliders into one GradingParams snapshot; the UI sliders stay editable.
  4. Save. Saved edits live in Postgres as parameter snapshots; thumbnails render the actual graded preview via a shared WebGL baker (single GL context for the whole gallery — scales to 100+ photos without context eviction).
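Steps 1 to 3 can be sketched as follows. The analysts are awaited together, so wallclock is the slowest of the three, and the action output is merged so that an omitted `lutId` keeps the current LUT, `null` strips it, and a string picks a new one. The type names, the `runAnalysts` and `mergeAction` helpers, and the additive slider merge are assumptions for illustration only.

```typescript
// Hypothetical action-agent output: slider deltas plus an optional LUT decision.
interface ActionOutput {
  delta: Record<string, number>;
  lutId?: string | null; // omitted = keep current, null = strip, string = pick
}

interface Snapshot {
  sliders: Record<string, number>;
  lutId: string | null;
}

// Run the three analysts concurrently; total latency is the max of the three.
async function runAnalysts<E, I, L>(
  emotion: () => Promise<E>,
  image: () => Promise<I>,
  retrieve: () => Promise<L>,
): Promise<[E, I, L]> {
  return Promise.all([emotion(), image(), retrieve()]);
}

// Merge the action output into the current state without mutating it.
function mergeAction(current: Snapshot, action: ActionOutput): Snapshot {
  const sliders: Record<string, number> = { ...current.sliders };
  for (const [name, amount] of Object.entries(action.delta)) {
    sliders[name] = (sliders[name] ?? 0) + amount; // additive merge (assumption)
  }
  const lutId = action.lutId === undefined ? current.lutId : action.lutId;
  return { sliders, lutId };
}
```

A refinement such as "warmer still" would then arrive as `{ delta: { temperature: 5 } }` with no `lutId` field, which nudges the slider and leaves the applied look in place.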

Quickstart

git clone https://github.com/jiajunl23/nlumination.git
cd nlumination
pnpm install
cp .env.local.example .env.local   # add your keys
pnpm db:push                        # apply schema to Neon
pnpm dev

Open http://localhost:3000. Without env keys, slider mode + keyword chips still work — Clerk runs in keyless dev mode and the parser is fully client-side. The full agents + LUT path needs GROQ_API_KEY + GEMINI_API_KEY set server-side (or a signed-in user supplying their own via the BYO popovers).


Stack

| Layer | Choice | Why |
| --- | --- | --- |
| Framework | Next.js 16 (App Router) · React 19 · TypeScript | Server components for auth-gated pages, RSC-friendly data fetching |
| Styling | Tailwind v4 | Token-driven theme, @theme inline for design system |
| Auth | Clerk | Drop-in, keyless dev mode, themed via appearance overrides |
| Database | Neon + Drizzle ORM | Serverless Postgres, branchable, type-safe queries |
| Storage | Cloudinary | Free 25 GB, signed uploads, on-the-fly transforms |
| Pixels | WebGL2 + custom GLSL | Native-res, GPU-accelerated, fully local; single shared context for the gallery |
| Slider-mode prompts | In-house parser | Deterministic, < 1 ms, no API call |
| Agents | Groq gpt-oss-20b + Llama-4-Scout | A1 emotion, A2 VLM, A3 action; 200K TPD free tier, BYO key to bypass |
| LUT retrieval | Gemini gemini-embedding-001 @ 512-d Matryoshka | Asymmetric RETRIEVAL_QUERY/DOCUMENT, 100 RPM + 1000 RPD free tier, BYO key to bypass |
| LUT library | 241 entries · 367 manifest aliases | 137 license-clean film stocks + 104 v4-generated colloquial looks (CC0) |

Service setup

Clerk (auth, optional in dev)
  1. Create an app at https://dashboard.clerk.com.
  2. Copy the publishable + secret keys into .env.local.
  3. The app creates DB user rows lazily on first authenticated request — no webhook needed.
Neon (Postgres, required for the gallery)
  1. Create a project at https://console.neon.tech.
  2. Copy the pooled connection string (with ?sslmode=require) into DATABASE_URL.
  3. Run pnpm db:push to create users, photos, edits, llmUsage, embeddingUsage.
Cloudinary (image CDN, free tier, no card)
  1. Create a free account at https://cloudinary.com.
  2. From the Dashboard, copy Cloud name, API Key, and API Secret into CLOUDINARY_CLOUD_NAME, CLOUDINARY_API_KEY, CLOUDINARY_API_SECRET. Set NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME to the same cloud name.
  3. No CORS or bucket setup. Free tier: 25 GB storage, 25 GB monthly bandwidth, 25k transformations. When you hit a limit, Cloudinary stops serving — no surprise bills.
Groq (LLM for the agents pipeline, free tier, no card)
  1. Create an account at https://console.groq.com.
  2. Generate an API key at https://console.groq.com/keys.
  3. Set GROQ_API_KEY=gsk_... in .env.local.
  4. Free tier: 200K TPD on gpt-oss-20b and Llama-4-Scout; the app caps each user at 100 LLM calls/day against that bucket.
  5. Users can bring their own Groq key in-app via the BYO key popover — bypasses the shared 100/day cap. Keys are header-only (X-Groq-Key), never logged or persisted.
Gemini (LUT retrieval embeddings, free tier, no card)
  1. Visit https://aistudio.google.com/apikey.
  2. Generate an API key.
  3. Set GEMINI_API_KEY=AIza... in .env.local.
  4. Free tier: 100 RPM (per-item) + 1000 RPD on gemini-embedding-001; the app caps each user at 20 LUT retrievals/day against that bucket.
  5. Users can bring their own Gemini key in-app via the BYO Gemini popover — bypasses the shared 20/day cap. Same header-only pattern (X-Gemini-Key), never logged or persisted.
  6. When the shared LUT quota is exhausted (no BYO Gemini key), agents mode degrades gracefully to slider-only deltas — the request never fails closed.
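The per-user caps and the graceful degradation described above might be modeled as in the sketch below. The caps of 100 LLM calls and 20 LUT retrievals per day come from this section; the function names, the usage counters, and the returned plan shape are assumptions for illustration.

```typescript
// Shared-key daily caps per user, as stated in the setup notes.
const SHARED_CAPS = { llm: 100, lut: 20 } as const;
type CallKind = keyof typeof SHARED_CAPS;

// A BYO key bypasses the shared cap; otherwise the user must be under it.
function mayCall(kind: CallKind, usedToday: number, hasByoKey: boolean): boolean {
  return hasByoKey || usedToday < SHARED_CAPS[kind];
}

interface RequestPlan {
  agents: boolean; // run the LLM agents at all
  lutRetrieval: boolean; // include LUT candidates; false = slider-only deltas
}

function planRequest(
  usage: { llm: number; lut: number },
  byo: { groq: boolean; gemini: boolean },
): RequestPlan {
  const agents = mayCall("llm", usage.llm, byo.groq);
  // LUT retrieval is only useful when the agents run; when its quota is spent,
  // the request still proceeds with slider-only deltas.
  const lutRetrieval = agents && mayCall("lut", usage.lut, byo.gemini);
  return { agents, lutRetrieval };
}
```

For a user who has used all 20 shared retrievals and has no Gemini key, `planRequest` yields `{ agents: true, lutRetrieval: false }`, which corresponds to the slider-only fallback rather than a failed request.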

BYO keys

Both Groq and Gemini support per-user BYO keys directly in the UI. Click the BYO key / BYO Gemini pill in the editor, paste a key, tick the no-billing safety checkbox, save. Both popovers carry a Get a key → link to the provider's console.

  • Where keys live: localStorage on your device only. Never sent to NLumination's backend except as a one-shot request header on the way to the upstream provider.
  • What BYO unlocks: Unlimited daily calls (subject to whatever quota your own account has). The UI badge switches to Using your key (unlimited).
  • What happens at quota / rate-limit: Shared-key users see a notice with the failure reason ("Daily limit reached", "Groq may be rate-limited", "Gemini embedder unavailable") plus a suggestion to either wait or add a BYO key. Both providers can fail independently; if both fail, both BYO suggestions surface.
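A minimal sketch of the header-only pattern follows. The header names `X-Groq-Key` and `X-Gemini-Key` come from the setup notes above; the storage key names, the `KeyStore` interface, and both helpers are hypothetical. In a browser, `window.localStorage` satisfies `KeyStore`.

```typescript
// Minimal read-only view of a key-value store such as localStorage.
interface KeyStore {
  getItem(key: string): string | null;
}

interface ByoKeys {
  groq?: string;
  gemini?: string;
}

// Storage key names are assumptions for this example.
function readByoKeys(store: KeyStore): ByoKeys {
  const keys: ByoKeys = {};
  const groq = store.getItem("byo-groq-key");
  const gemini = store.getItem("byo-gemini-key");
  if (groq) keys.groq = groq;
  if (gemini) keys.gemini = gemini;
  return keys;
}

// Attach keys as one-shot request headers; absent keys add no header.
function byoHeaders(keys: ByoKeys): Record<string, string> {
  const headers: Record<string, string> = {};
  if (keys.groq) headers["X-Groq-Key"] = keys.groq;
  if (keys.gemini) headers["X-Gemini-Key"] = keys.gemini;
  return headers;
}
```

The result can be spread into the `headers` option of a `fetch` call, so a key travels with that one request and is never written anywhere else by the client.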

Scripts

| Command | What it does |
| --- | --- |
| pnpm dev | Local dev server |
| pnpm build · pnpm start | Production build / start |
| pnpm lint | Run ESLint |
| pnpm typecheck | tsc --noEmit over the whole repo |
| pnpm db:generate | Drizzle: generate a migration from the schema diff |
| pnpm db:push | Drizzle: push current schema to the configured DB |
| pnpm db:migrate | Drizzle: apply pending SQL migrations |
| pnpm db:studio | Open Drizzle Studio |
| pnpm test:parser | Smoke-test the NL parser with built-in cases |

Keyboard shortcuts

| Key | Action |
| --- | --- |
| B (hold) | View original; release to return to graded |
| + S | Save edit to gallery |
| + E | Export current grade as JPG |


Built for photographers — and everyone else — who'd rather describe a feeling than chase a slider.

NLumination is a love letter to color, written in TypeScript and shaders.
