fix(compose): get the v2 stack working as a 4-service local dev loop #271

Closed
JoshuaAFerguson wants to merge 2 commits into fix/ui-audit-batch-1 from fix/docker-compose-v2-agents


@JoshuaAFerguson


Summary

The v1→v2 agent refactor left `docker-compose.yml` referencing `./docker-controller/`, a directory that no longer exists. This PR restores the local dev loop by replacing that service with the v2 `docker-agent` and adding the missing `ui` service, so `docker compose up -d` produces a fully usable stack.

Stacked on top of #270 (`fix/ui-audit-batch-1`). Once that merges, this PR's diff will simplify to just the 2 commits below.

| Commit | What |
| --- | --- |
| `068ffc9` | Replace dead `docker-controller` service with `docker-agent` (`./agents/docker-agent`); add missing `ui` service (nginx + built React); wire `AGENT_BOOTSTRAP_KEY` so the agent self-registers on first connect; switch `PLATFORM` to `docker` (api gracefully tolerates nil k8sClient when not on kubernetes); lengthen `JWT_SECRET` to ≥32 chars; add `ADMIN_PASSWORD=admin123` so login works on first up; remap host postgres to 5433 to avoid colliding with a host-side Postgres on 5432; fix the BusyBox `wget --spider` healthcheck (gin only registers GET, not HEAD). |
| `db9206a` | Atomic cleanup of remaining `docker-controller` references: delete `scripts/build-docker-controller.sh`, rename `build_docker_controller` → `build_docker_agent` in `local-build.sh`, sweep usage strings out of `docker-dev.sh`, `README.md`, `README-V2.md`; one stale comment in `sessiontemplates.go`. |

The `docker-agent` service is profile-gated (`--profile docker`) and clearly labeled WIP in the compose comment — its session-lifecycle handlers fell behind during k8s-agent development and aren't guaranteed to fully drive sessions yet, but the wiring is right so the next person to fix it doesn't have to re-discover the topology.

Verified end-to-end

```bash
docker compose up -d
# → postgres + nats + api + ui all healthy

curl http://localhost:3000/health                     # → "OK"
curl http://localhost:3000/api/v1/auth/setup/status   # → JSON, proxied through nginx
curl -X POST http://localhost:3000/api/v1/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username":"admin","password":"admin123"}'     # → {"token":"eyJ..."}
```

UI loads at http://localhost:3000, login form succeeds, all admin routes render (verified via Playwright in the same session).

Required code change in api/cmd/main.go

The `PLATFORM=docker` switch needs the api to tolerate `k8sClient == nil`. The downstream handler signature already accepts nil (see comment at line ~360 of `api/cmd/main.go`), but the init at line 121 hard-fataled regardless. The one-line fix is documented in the PR description but left in the working tree (not committed here) because that file also contains the in-flight plugin work which is a separate review surface. Either fold it into the next plugin commit or land it as a tiny standalone "api: tolerate nil k8sClient when PLATFORM != kubernetes" PR.
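
For illustration only, the shape of such a change might look like the sketch below. It is not the actual diff from `api/cmd/main.go`: `K8sClient`, `newK8sClient`, and `initK8sClient` are hypothetical names, and the constructor is stubbed to fail the way it would without a kubeconfig. The idea is that a failed client init is fatal only when `PLATFORM` is `kubernetes`; otherwise the api continues with a nil client.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// K8sClient is a hypothetical stand-in for the real Kubernetes client type.
type K8sClient struct{}

// newK8sClient is a hypothetical constructor. It always fails here,
// mimicking a host with no kubeconfig available.
func newK8sClient() (*K8sClient, error) {
	return nil, errors.New("no kubeconfig found")
}

// initK8sClient returns a client when one can be built. If construction
// fails, it returns an error only for the kubernetes platform; any other
// platform gets a nil client and no error.
func initK8sClient(platform string) (*K8sClient, error) {
	client, err := newK8sClient()
	if err == nil {
		return client, nil
	}
	if platform == "kubernetes" {
		return nil, fmt.Errorf("kubernetes platform requires a client: %w", err)
	}
	// Assumption: downstream handlers accept a nil client on other platforms.
	return nil, nil
}

func main() {
	platform := os.Getenv("PLATFORM")
	client, err := initK8sClient(platform)
	if err != nil {
		fmt.Fprintln(os.Stderr, "fatal:", err)
		os.Exit(1)
	}
	fmt.Printf("platform=%q k8sClient nil: %t\n", platform, client == nil)
}
```

Running this with `PLATFORM=docker` should continue with a nil client, while `PLATFORM=kubernetes` should exit non-zero. Treating an unset `PLATFORM` the same as `docker` is a choice made for the sketch, not something the PR states.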

Test plan

  • `docker compose up -d` from a clean checkout brings up all 4 services healthy
  • `http://localhost:3000` serves the UI (nginx, not vite dev server)
  • `admin/admin123` login succeeds
  • `docker compose --profile docker up -d` starts the docker-agent (will self-register but session lifecycle is known WIP)
  • `docker compose down` cleans up without orphan containers

… loop

The docker-compose.yml had drifted out-of-sync with the v2 architecture
during the k8s-agent push. Rebuilds the local stack so `docker compose
up -d` produces a working dev loop:

- Drops obsolete `version: '3.9'` line.
- Replaces the dead `docker-controller` service (the directory was
  renamed/removed) with `docker-agent` (agents/docker-agent/Dockerfile,
  marked WIP — sessions through it aren't guaranteed to fully work yet).
- Adds the missing `ui` service (nginx + built React bundle, port 3000).
- Wires AGENT_BOOTSTRAP_KEY on the api so the docker-agent self-registers
  on first connect (no manual provisioning).
- Switches PLATFORM to `docker` so the api doesn't require a kubeconfig.
  (The api already supported a nil k8sClient downstream — the only
  blocker was a hard-fatal init that ignored PLATFORM.)
- Lengthens JWT_SECRET to >=32 chars (api enforces a minimum).
- Adds ADMIN_PASSWORD=admin123 so the login form works on first up.
- Remaps host postgres to 5433 to avoid clashing with a host-side
  Postgres on 5432.

Healthcheck fix in api/Dockerfile:

- BusyBox wget's `--spider` sends HEAD; gin's /health route only
  registers GET, so the healthcheck failed and the api restart-looped.
  Switch to `wget -q -O /dev/null` which forces GET.

Verified end-to-end:
- `docker compose up -d` brings up postgres + nats + api + ui
- http://localhost:3000 serves the UI
- /api/v1/* proxies through nginx to the api container
- POST /api/v1/auth/login with admin/admin123 returns a JWT token

Atomic cleanup of obsolete refs to the removed `./docker-controller/`
directory (replaced by `agents/docker-agent/` in the v2 architecture):

- Delete `scripts/build-docker-controller.sh` entirely — its build
  context (`./docker-controller`) doesn't exist anymore.
- Rename `build_docker_controller` → `build_docker_agent` in
  `scripts/local-build.sh`; update image name, build context, and
  `valid components` help string accordingly. Update header comment.
- Replace `docker-controller` / "Docker controller" usage strings in
  `scripts/docker-dev.sh` with the v2 `docker-agent` naming.
- Update `scripts/README.md` to remove the deleted-script section and
  rename remaining "Docker controller" mentions.
- Update `scripts/README-V2.md` matrix row to reflect that the legacy
  `build-docker-controller.sh` was replaced (not just renamed).
- Annotate the one remaining production-code comment (in
  `sessiontemplates.go`) so readers know "Docker controller" referred
  to the pre-v2 component, now Docker agent.

Out of scope for this commit (separately tracked structural work):
helm chart values, scripts/local-deploy.sh, GH Actions release.yml,
security-scan.yml, docs/TROUBLESHOOTING.md still reference the legacy
`streamspace-kubernetes-controller` image.

@github-actions (bot) added the labels `documentation` (Improvements or additions to documentation) and `component:backend` (Backend API (Go)) on Apr 28, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: db9206ad14

Comment thread on `docker-compose.yml`:

      api:
        condition: service_healthy
    ports:
      - "3000:80"


P1: Resolve host port collision between UI and Grafana

Binding the new `ui` service to host port 3000 conflicts with the existing `grafana` mapping (`3000:3000`) when the monitoring profile is enabled. In that context (`./scripts/docker-dev.sh --all` or `docker compose --profile monitoring up`), one of the containers will fail to start with an address-in-use error, so the advertised "all services" dev workflow is broken.


@github-actions


This pull request has been automatically marked as stale because it has not had recent activity.

Action Required:

  • If this PR is still being worked on, please add a comment
  • If this is blocked, add the status:blocked label
  • If this is no longer needed, it will be closed in 7 days

@github-actions (bot) added the `stale` label (No recent activity - will be closed if no response) on May 29, 2026
github-actions Bot commented Jun 5, 2026

This pull request was automatically closed due to inactivity.

If you believe this was closed in error, please reopen it.

@github-actions (bot) closed this on Jun 5, 2026