feat: resurrect combined image + smooth install + config-only secrets#40
Merged
Conversation
Combined-image layout
---------------------
The published image now ships the FastAPI backend (:8000) and the Next.js
UI (:3000) in a single container. Three-stage Dockerfile:
py-builder → uv sync into /app/.venv (carried over from d2a)
web-builder → node:20-bookworm-slim, npm ci --include=dev, next build
(standalone output)
runtime → python:3.11-slim + node binary copied from
node:20-bookworm-slim + venv + repi/ + db/schema.sql +
.next/standalone + .next/static + public/
`docker/entrypoint.sh` supervises both processes — either child exiting
brings the container down so the orchestrator restarts cleanly.
Browser hits only port 3000: `next.config.ts` rewrites `/api/:path*` to
the colocated uvicorn (REPI_API_INTERNAL_URL, default 127.0.0.1:8000).
The UI is reachable out-of-the-box on remote hosts without rebuilding
NEXT_PUBLIC_API_URL; port 8000 is still exposed for direct CLI/curl use.
Image size: 1.28 GB (+160 MB vs d2a's 1.12 GB — node binary + standalone
bundle). uv-cache and dev artefacts stay out via the .dockerignore tweak.
Lazy LLM init (fixes first-pull crash)
--------------------------------------
A fresh `docker pull` has no API key, so the previous lifespan handler
crashed with ValueError from create_provider_from_env(). Now:
- Container.__init__ logs a warning instead of raising when there are
no credentials, leaving llm_provider=None and llm_init_error set.
- Container.require_llm() raises HTTPException 409 with an actionable
message pointing the user at POST /config.
- Routes that actually need the LLM (POST /investigate, the SSE
stream) call require_llm() up front so the client sees a clean 409
rather than a mid-stream error.
- PUT /config now calls Container.refresh_llm() after persisting and
settings.reload(), so the next request just works.
- New /health endpoint exposes llm_configured + llm_init_error so the
UI can nudge the user to /config when needed.
OOTB verification (compose `--profile all-in-one` with all keys empty)
----------------------------------------------------------------------
$ curl /health → {status:ok, llm_configured:false, ...}
$ curl -XPOST /investigate → 409 with "POST /config with your provider"
$ curl -XPUT /config -d '{"LLM_PROVIDER":"openai","OPENAI_API_KEY":"…"}'
→ 200, hot-reload succeeds
$ curl /health → {status:ok, llm_configured:true}
$ curl -XPOST /investigate → 200
$ curl /api/health (via UI on :3000) → proxies correctly to uvicorn
Other touches
-------------
- compose: redis no longer publishes 6379 to the host (only the app
container needs it via compose DNS — avoids clashing with a host
redis on dev boxes); app service merged from the old `api` profile.
- docker workflow: rebuild on web/** and docker/** path changes too.
- .dockerignore: stop excluding web/ outright; only exclude
web/node_modules, web/.next, web/.env.local so the web-builder stage
gets a clean install + build.
Two small fixes that smoothen path 1b (docker compose up) on top of the d2b combined-image work resurrected in 8ff363f. 1. Lazy SentenceTransformer (repi/core/container.py) The ~10s startup hit was dominated by `from sentence_transformers import SentenceTransformer` at module import (drags in torch + transformers) and the eager `SentenceTransformer("all-MiniLM-L6-v2")` in Container.__init__. Both happen in the FastAPI lifespan, so the container couldn't even accept connections on port 8000 until they finished — users saw ~10s of "Connection refused" on first request. The fix: - Move the import inside the `model` property so it only fires the first time something asks for an embedding. - Replace `self.model = SentenceTransformer(...)` with `self._model: Optional[...] = None` plus a property that lazy-loads. Result: /health and /config answer in <1s; the ~10s load is paid by the first /ingest or /investigate, with a clear log line so the slowdown isn't a mystery. Tests also drop from ~5s → ~1s since the import no longer happens at collection time. 2. Drop the all-in-one compose profile (docker-compose.yml) The published image already bundles the API and UI in one container (d2b). The `profiles: ["all-in-one"]` gate made `docker compose up` silently skip the app — users had to know to add `--profile all-in-one`. Removing the gate so `docker compose up` brings up db + redis + app by default; comment block updated to match.
Today, pydantic-settings reads env vars by default. Calling `Settings()`
silently picks up any matching env var — MISTRAL_API_KEY, OPENAI_API_KEY,
etc. — with no warning. We verified the leak end-to-end: a stale .env in
the repo root + docker-compose's auto-substitution + the app service's
`environment:` block was injecting a real key into a container that
otherwise had no `.repi/config.json`. Result: surprising precedence and a
recurring "why is it picking up that key?" debugging episode.
Make config.json the only source.
1. Strip env source from Settings (repi/core/config.py)
Override settings_customise_sources to return only init_settings; drop
env / dotenv / file_secret sources. Now Settings() with no kwargs returns
class defaults — the only way real values arrive is the init kwargs path
that get_settings() uses after reading config.json. Add a LOG_LEVEL field
so logging config flows through Settings too.
2. Remove the last two direct env reads
- repi/core/container.py: drop the os.getenv("LOG_LEVEL") /
os.getenv("ENV") block at module import; read settings.LOG_LEVEL and
settings.REPI_ENV instead. Drop the now-unused `import os`.
- repi/cli.py: _is_prod() no longer falls back to os.environ; reads
REPI_ENV from .repi/config.json directly.
3. Docker stack reads only from config.json
- docker-compose.yml: drop the entire `environment:` block on the app
service. Add a `repi_config` named volume mounted at /app/.repi so
user-saved config persists across `docker compose down` (lost only on
`down -v`).
- Ship docker/config.docker.json as the baked-in default (docker-aware
DATABASE_URL + REDIS_URL, empty LLM key). Dockerfile COPYs it to
/app/config.docker.json.
- docker/entrypoint.sh seeds /app/.repi/config.json from the baked
default on first start only (skipped if the volume already has one).
4. Fix a latent PUT /config bug (repi/api/config.py)
PUT /config used to validate the user's payload via
`Settings(**new_config)` and write the resulting dict verbatim. With env
fallback gone, a partial PUT like {"MISTRAL_API_KEY": "sk-..."} would
clobber DATABASE_URL/REDIS_URL with their localhost class defaults,
breaking the running container instantly under docker. Merge the payload
on top of the existing config.json (or empty dict if absent) before
validation, so partial updates only touch the fields they name.
5. tests/test_config_isolation.py locks the invariant
Two new tests: (a) Settings() with MISTRAL_API_KEY / OPENAI_API_KEY /
etc. set in the shell env returns the class defaults — no leak. (b) Init
kwargs still win when supplied. If anyone re-enables env sources later,
both tests fail loudly.
Verified end-to-end:
- 74/74 tests pass.
- `MISTRAL_API_KEY=leak docker compose up` → container reports
llm_configured: false.
- Partial PUT /config preserves DATABASE_URL: docker hostname survives.
- `docker compose down` then `up` → user-saved config persists.
- `docker compose down -v` then `up` → config wiped, soft-fail returns.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
f9da711 to
0b1b9a2
Compare
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three layered changes that make path 1b (`docker compose up` against the GHCR image) actually work end-to-end and tighten the security model around the LLM API key.
1. Resurrect d2b (`8ff363f`). PR #39 was merged into the d2a branch (not main), so the combined UI+API image work + lazy LLM init never reached `main` and the published `:latest` is API-only. Cherry-picks the d2b commit onto main.
2. Smooth install (`600d064`).
3. Config-only secrets (`f9da711`).
Test plan
🤖 Generated with Claude Code