Skip to content

feat: resurrect combined image + smooth install + config-only secrets#40

Merged
VarunGitGood merged 4 commits into
mainfrom
feat/d2b-resurrect
May 30, 2026
Merged

feat: resurrect combined image + smooth install + config-only secrets#40
VarunGitGood merged 4 commits into
mainfrom
feat/d2b-resurrect

Conversation

@VarunGitGood
Copy link
Copy Markdown
Owner

Summary

Three layered changes that make path 1b (`docker compose up` against the GHCR image) actually work end-to-end and tighten the security model around the LLM API key.

1. Resurrect d2b (`8ff363f`). PR #39 was merged into the d2a branch (not main), so the combined UI+API image work + lazy LLM init never reached `main` and the published `:latest` is API-only. Cherry-picks the d2b commit onto main.

2. Smooth install (`600d064`).

  • Lazy SentenceTransformer: API boot went from ~10s (torch + transformers eagerly imported, model loaded in lifespan) to <1s. Model is loaded on first `/ingest` or `/investigate` with a clear log line. Tests also got ~5x faster.
  • Drop `all-in-one` profile: `docker compose up` now brings up `db + redis + app` by default. No more `--profile all-in-one` footgun.

3. Config-only secrets (`f9da711`).

  • `pydantic-settings` was silently reading env vars. A stale repo-root `.env` + compose substitution was leaking a real Mistral key into the container even when `.repi/config.json` was empty. Overrode `settings_customise_sources` so `Settings` reads only init kwargs (i.e., only what `get_settings()` loads from `.repi/config.json`).
  • Dropped the `environment:` block on the app service in compose. The image now ships `docker/config.docker.json` with docker-aware infra URLs; entrypoint seeds it into a `repi_config` named volume on first start. User-saved config persists across `docker compose down` (lost only on `down -v`).
  • Removed the last two direct `os.getenv` reads (`container.py` log-config block, `cli.py` `_is_prod()`).
  • Fixed a latent `PUT /config` bug: it used to validate `Settings(**new_config)` and write the full dict, which clobbered `DATABASE_URL` with the localhost default on partial PUTs. Now merges on top of existing config.json before validating.
  • Added `tests/test_config_isolation.py` to lock the no-env-leak invariant — fails loudly if anyone re-enables env sources.

Test plan

  • `uv run pytest tests/ -v` — 74/74 pass (was 72; +2 new isolation tests)
  • `docker compose up` with no host env vars → `/health` returns `llm_configured: false` and `/investigate` returns 409 with actionable message
  • `MISTRAL_API_KEY=leak docker compose up` → container still reports `llm_configured: false` (env did NOT reach the app)
  • UI on `:3000` serves `/repi/docs`, `/investigations`, `/config`; `/api/*` proxies through to colocated uvicorn
  • Partial `PUT /api/config` with only `{LLM_PROVIDER, MISTRAL_API_KEY}` preserves docker `DATABASE_URL` and `REDIS_URL` in the stored config
  • `docker compose down` then `up` → user-saved key persists (volume retains config.json, entrypoint does not re-seed)
  • `docker compose down -v` then `up` → config wiped, soft-fail returns

🤖 Generated with Claude Code

Combined-image layout
---------------------
The published image now ships the FastAPI backend (:8000) and the Next.js
UI (:3000) in a single container. Three-stage Dockerfile:

  py-builder   → uv sync into /app/.venv (carried over from d2a)
  web-builder  → node:20-bookworm-slim, npm ci --include=dev, next build
                 (standalone output)
  runtime      → python:3.11-slim + node binary copied from
                 node:20-bookworm-slim + venv + repi/ + db/schema.sql +
                 .next/standalone + .next/static + public/

`docker/entrypoint.sh` supervises both processes — either child exiting
brings the container down so the orchestrator restarts cleanly.

Browser hits only port 3000: `next.config.ts` rewrites `/api/:path*` to
the colocated uvicorn (REPI_API_INTERNAL_URL, default 127.0.0.1:8000).
The UI is reachable out-of-the-box on remote hosts without rebuilding
NEXT_PUBLIC_API_URL; port 8000 is still exposed for direct CLI/curl use.

Image size: 1.28 GB (+160 MB vs d2a's 1.12 GB — node binary + standalone
bundle). uv-cache and dev artefacts stay out via the .dockerignore tweak.

Lazy LLM init (fixes first-pull crash)
--------------------------------------
A fresh `docker pull` has no API key, so the previous lifespan handler
crashed with ValueError from create_provider_from_env(). Now:

  - Container.__init__ logs a warning instead of raising when there are
    no credentials, leaving llm_provider=None and llm_init_error set.
  - Container.require_llm() raises HTTPException 409 with an actionable
    message pointing the user at POST /config.
  - Routes that actually need the LLM (POST /investigate, the SSE
    stream) call require_llm() up front so the client sees a clean 409
    rather than a mid-stream error.
  - PUT /config now calls Container.refresh_llm() after persisting and
    settings.reload(), so the next request just works.
  - New /health endpoint exposes llm_configured + llm_init_error so the
    UI can nudge the user to /config when needed.

OOTB verification (compose `--profile all-in-one` with all keys empty)
----------------------------------------------------------------------
  $ curl /health         → {status:ok, llm_configured:false, ...}
  $ curl -XPOST /investigate → 409 with "POST /config with your provider"
  $ curl -XPUT  /config -d '{"LLM_PROVIDER":"openai","OPENAI_API_KEY":"…"}'
                         → 200, hot-reload succeeds
  $ curl /health         → {status:ok, llm_configured:true}
  $ curl -XPOST /investigate → 200
  $ curl /api/health (via UI on :3000) → proxies correctly to uvicorn

Other touches
-------------
- compose: redis no longer publishes 6379 to the host (only the app
  container needs it via compose DNS — avoids clashing with a host
  redis on dev boxes); app service merged from the old `api` profile.
- docker workflow: rebuild on web/** and docker/** path changes too.
- .dockerignore: stop excluding web/ outright; only exclude
  web/node_modules, web/.next, web/.env.local so the web-builder stage
  gets a clean install + build.
Two small fixes that smoothen path 1b (docker compose up) on top of the
d2b combined-image work resurrected in 8ff363f.

1. Lazy SentenceTransformer (repi/core/container.py)

The ~10s startup hit was dominated by `from sentence_transformers
import SentenceTransformer` at module import (drags in torch +
transformers) and the eager `SentenceTransformer("all-MiniLM-L6-v2")`
in Container.__init__. Both happen in the FastAPI lifespan, so the
container couldn't even accept connections on port 8000 until they
finished — users saw ~10s of "Connection refused" on first request.

The fix:
- Move the import inside the `model` property so it only fires the
  first time something asks for an embedding.
- Replace `self.model = SentenceTransformer(...)` with
  `self._model: Optional[...] = None` plus a property that lazy-loads.

Result: /health and /config answer in <1s; the ~10s load is paid by
the first /ingest or /investigate, with a clear log line so the
slowdown isn't a mystery. Tests also drop from ~5s → ~1s since the
import no longer happens at collection time.

2. Drop the all-in-one compose profile (docker-compose.yml)

The published image already bundles the API and UI in one container
(d2b). The `profiles: ["all-in-one"]` gate made `docker compose up`
silently skip the app — users had to know to add `--profile
all-in-one`. Removing the gate so `docker compose up` brings up
db + redis + app by default; comment block updated to match.
Today, pydantic-settings reads env vars by default. Calling `Settings()`
silently picks up any matching env var — MISTRAL_API_KEY, OPENAI_API_KEY,
etc. — with no warning. We verified the leak end-to-end: a stale .env in
the repo root + docker-compose's auto-substitution + the app service's
`environment:` block was injecting a real key into a container that
otherwise had no `.repi/config.json`. Result: surprising precedence and a
recurring "why is it picking up that key?" debugging episode.

Make config.json the only source.

1. Strip env source from Settings (repi/core/config.py)

Override settings_customise_sources to return only init_settings; drop
env / dotenv / file_secret sources. Now Settings() with no kwargs returns
class defaults — the only way real values arrive is the init kwargs path
that get_settings() uses after reading config.json. Add a LOG_LEVEL field
so logging config flows through Settings too.

2. Remove the last two direct env reads

- repi/core/container.py: drop the os.getenv("LOG_LEVEL") /
  os.getenv("ENV") block at module import; read settings.LOG_LEVEL and
  settings.REPI_ENV instead. Drop the now-unused `import os`.
- repi/cli.py: _is_prod() no longer falls back to os.environ; reads
  REPI_ENV from .repi/config.json directly.

3. Docker stack reads only from config.json

- docker-compose.yml: drop the entire `environment:` block on the app
  service. Add a `repi_config` named volume mounted at /app/.repi so
  user-saved config persists across `docker compose down` (lost only on
  `down -v`).
- Ship docker/config.docker.json as the baked-in default (docker-aware
  DATABASE_URL + REDIS_URL, empty LLM key). Dockerfile COPYs it to
  /app/config.docker.json.
- docker/entrypoint.sh seeds /app/.repi/config.json from the baked
  default on first start only (skipped if the volume already has one).

4. Fix a latent PUT /config bug (repi/api/config.py)

PUT /config used to validate the user's payload via
`Settings(**new_config)` and write the resulting dict verbatim. With env
fallback gone, a partial PUT like {"MISTRAL_API_KEY": "sk-..."} would
clobber DATABASE_URL/REDIS_URL with their localhost class defaults,
breaking the running container instantly under docker. Merge the payload
on top of the existing config.json (or empty dict if absent) before
validation, so partial updates only touch the fields they name.

5. tests/test_config_isolation.py locks the invariant

Two new tests: (a) Settings() with MISTRAL_API_KEY / OPENAI_API_KEY /
etc. set in the shell env returns the class defaults — no leak. (b) Init
kwargs still win when supplied. If anyone re-enables env sources later,
both tests fail loudly.

Verified end-to-end:
- 74/74 tests pass.
- `MISTRAL_API_KEY=leak docker compose up` → container reports
  llm_configured: false.
- Partial PUT /config preserves DATABASE_URL: docker hostname survives.
- `docker compose down` then `up` → user-saved config persists.
- `docker compose down -v` then `up` → config wiped, soft-fail returns.
@vercel
Copy link
Copy Markdown

vercel Bot commented May 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
repi Ready Ready Preview, Comment May 30, 2026 12:31pm

@VarunGitGood VarunGitGood merged commit 823c6f6 into main May 30, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant