
Harden production deployment configuration #23

Open
fakechris wants to merge 2 commits into `main` from `fix/production-deploy-hardening`

Conversation

@fakechris
Owner

@fakechris fakechris commented Apr 21, 2026

Summary

  • standardize production image compose around .env.production and direct DATABASE_URL
  • add Google OAuth fail-fast validation and production smoke checks
  • add Ansible Vault examples and deploy-time OAuth smoke validation
  • update production docs and backup script defaults

Verification

  • pnpm typecheck
  • pnpm test --run
  • docker compose --env-file .env.production.example -f docker-compose.prod.images.yml config
  • Ansible syntax check for ops/ansible/playbooks/deploy.yml
  • pre-commit secret scan for long token/key patterns: 0 findings
  • production smoke passed for https://involute.edai100.com

Summary by CodeRabbit

  • New Features

    • Added configurable Google OAuth requirement flag for production deployments.
    • Introduced production smoke testing to verify deployment health.
  • Documentation

    • Updated deployment runbooks with production environment configuration guidance.
  • Chores

    • Enhanced production deployment infrastructure with improved Docker image support and Ansible automation.
    • Added parameterized volume and service binding configuration for production environments.

@coderabbitai

coderabbitai Bot commented Apr 21, 2026

Warning

Rate limit exceeded

@fakechris has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 53 minutes and 50 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 53 minutes and 50 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 15b7e3d3-7da7-45fe-96b7-d7f9fdbdb6dc

📥 Commits

Reviewing files that changed from the base of the PR and between 47b3999 and 046bc2d.

📒 Files selected for processing (1)
  • ops/ansible/playbooks/deploy.yml
📝 Walkthrough


This PR extends production deployment infrastructure by introducing database URL and bind-address configuration, Docker image deployment identifiers, and a Google OAuth requirement flag. It updates deployment automation (GitHub Actions, Ansible) to conditionally validate OAuth credentials, refactors Docker Compose production files to use pre-built images, and adds production smoke testing to verify health and OAuth configuration.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Environment Configuration: `.env.production.example`, `ops/ansible/templates/env.production.j2` | Added `DATABASE_URL`, `SERVER_BIND_ADDRESS`, `WEB_BIND_ADDRESS`, image registry/namespace/tag variables, and the `REQUIRE_GOOGLE_OAUTH` flag. Updated `POSTGRES_PASSWORD` placeholder guidance and made Google OAuth credential placeholders explicit. |
| Docker Compose Production: `docker-compose.prod.images.yml`, `package.json` | Externalized `DATABASE_URL` from hardcoded construction to a required environment variable; added `REQUIRE_GOOGLE_OAUTH` and bind-address port mappings; parameterized volume names; added a `caddy` service profile. Removed the `compose:prod:build` script and updated compose commands to target `docker-compose.prod.images.yml` without `--build`. |
| Ansible Deployment Infrastructure: `ops/ansible/playbooks/deploy.yml`, `ops/ansible/inventory/hosts.yml.example`, `ops/ansible/group_vars/all/...` | Changed the default stack profile from `tailscale` to `production` and database seeding from `true` to `false`. Added conditional Google OAuth validation; updated compose file selection to `.images.yml` variants; added smoke-test URL verification and post-deploy OAuth configuration checks. Added a vault example template and `group_vars` structure. |
| GitHub Actions Deployment: `.github/workflows/deploy.yml` | Added `INVOLUTE_IMAGE_TAG` and `INVOLUTE_REQUIRE_GOOGLE_OAUTH` environment variables with conditional secret validation for OAuth credentials when Google OAuth is required. |
| Server Environment Parsing: `packages/server/src/environment.ts`, `packages/server/src/environment.test.ts` | Added a `requireGoogleOAuth` boolean field to `ServerEnvironment`; implemented parsing of `GOOGLE_OAUTH_*` credentials with validation that all OAuth fields are present when the requirement is enabled. Added test cases for the OAuth validation logic. |
| Production Smoke Testing: `scripts/prod-smoke.sh`, `README.md` | Added a production smoke test script that validates service health, OAuth session configuration, and the Google OAuth redirect; updated the README with a deployment runbook for images-based compose, vault setup instructions, and updated smoke-check commands. |
| Git & Package Configuration: `.gitignore`, `scripts/postgres-backup.sh` | Added patterns to ignore Ansible vault files and `group_vars`; updated the backup script to target `docker-compose.prod.images.yml`. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Add automated deployment tooling #9: Introduces the Ansible deployment and Docker Compose production infrastructure that this PR extends with database URL, bind-address, image configuration, and OAuth requirement handling.
  • Add Google OAuth sessions and team access UI #8: Adds Google OAuth environment validation and parsing logic in packages/server/src/environment.ts, which this PR builds upon with the requireGoogleOAuth flag and conditional validation.
  • Harden auth and add CI merge gates #3: Modifies packages/server/src/environment.ts to add new ServerEnvironment fields, directly related to the OAuth environment parsing changes in this PR.

Poem

🐰 A hop through the ops, with profiles galore,
Database URLs and OAuth's encore,
From Tailscale to production, the stack profile spins,
With smoke tests ascending—let deployment begin!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped because CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Harden production deployment configuration' directly summarizes the main objective of the PR: improving the security and robustness of production deployment through standardized compose files, environment configuration, validation checks, and documentation updates. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request standardizes the production deployment process by introducing Ansible Vault for secret management, implementing automated smoke checks, and refining the Docker Compose configuration. Key changes include the addition of a DATABASE_URL environment variable, a new REQUIRE_GOOGLE_OAUTH flag with associated server-side validation, and a dedicated smoke test script. Feedback focuses on improving Ansible deployment reliability by adding descriptive failure messages to credential assertions and using more robust file existence checks when cleaning up legacy stacks.

Comment on lines +81 to +100
```yaml
- >-
  (
    (involute_stack_profile | default(involute_default_stack_profile)) != 'production'
  ) or (
    involute_postgres_password is defined
    and involute_postgres_password | length > 0
    and (involute_postgres_password is match('^[A-Za-z0-9._~-]+$'))
  )
- >-
  (
    (involute_stack_profile | default(involute_default_stack_profile)) != 'production'
    or not (involute_require_google_oauth | default(true) | bool)
  ) or (
    involute_google_oauth_client_id is defined
    and involute_google_oauth_client_id | length > 0
    and involute_google_oauth_client_secret is defined
    and involute_google_oauth_client_secret | length > 0
    and involute_google_oauth_redirect_uri is defined
    and involute_google_oauth_redirect_uri | length > 0
  )
```

Severity: medium

The validation logic for involute_postgres_password and Google OAuth credentials is a significant improvement for deployment reliability. However, the ansible.builtin.assert task lacks a fail_msg. If the regex check for the password fails (e.g., if it contains characters like # or !), the operator will see a generic 'Assertion failed' error. Adding a descriptive fail_msg would help the operator understand that only URL-safe characters are permitted for the password to ensure DATABASE_URL consistency.
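For illustration, a `fail_msg` along those lines might look like this (the task name and message wording are hypothetical; only the variable names come from the snippet above):

```yaml
- name: Validate production deployment inputs
  ansible.builtin.assert:
    that:
      - involute_postgres_password is match('^[A-Za-z0-9._~-]+$')
    fail_msg: >-
      involute_postgres_password must contain only URL-safe characters
      (letters, digits, '.', '_', '~', '-') so the derived DATABASE_URL stays
      parseable, and the Google OAuth client id/secret/redirect URI must all
      be set when involute_require_google_oauth is true.
```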

Owner Author


Fixed in 046bc2d by adding an explicit fail_msg that calls out the URL-safe Postgres password requirement, required Google OAuth variables, and Tailscale admin prerequisites.

Comment on lines +143 to +159
```yaml
- name: Stop legacy source-build stack if present
  ansible.builtin.shell: >
    docker compose
    --env-file .env
    -f docker-compose.yml
    down --remove-orphans || true
  args:
    chdir: "{{ involute_deploy_path | default(involute_default_deploy_path) }}"

- name: Stop legacy production source-build stack if present
  ansible.builtin.shell: >
    docker compose
    --env-file .env.production
    -f docker-compose.prod.yml
    down --remove-orphans || true
  args:
    chdir: "{{ involute_deploy_path | default(involute_default_deploy_path) }}"
```

Severity: medium

These tasks use shell with || true to clean up legacy stacks. This approach can be noisy in Ansible logs if the files are missing. A more robust and cleaner pattern is to use the ansible.builtin.stat module to check for the existence of docker-compose.yml and docker-compose.prod.yml and then use the when conditional to run the cleanup only when necessary.
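A minimal sketch of that stat-plus-when pattern, assuming the same path variables as the tasks above (task and register names are illustrative):

```yaml
- name: Check whether the legacy compose file exists
  ansible.builtin.stat:
    path: "{{ involute_deploy_path | default(involute_default_deploy_path) }}/docker-compose.yml"
  register: legacy_compose

- name: Stop legacy source-build stack
  ansible.builtin.command:
    cmd: docker compose --env-file .env -f docker-compose.yml down --remove-orphans
    chdir: "{{ involute_deploy_path | default(involute_default_deploy_path) }}"
  when: legacy_compose.stat.exists
```

The `|| true` suffix becomes unnecessary because the task simply does not run when the file is absent.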

Owner Author


Fixed in 046bc2d by adding stat checks for the legacy compose files and running the cleanup tasks conditionally instead of relying on || true.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
README.md (1)

297-303: ⚠️ Potential issue | 🟡 Minor

Document the actual GitHub Actions OAuth toggle name.

The workflow reads vars.INVOLUTE_REQUIRE_GOOGLE_OAUTH, but this section refers to REQUIRE_GOOGLE_OAUTH=true. Operators trying to disable the requirement via REQUIRE_GOOGLE_OAUTH=false will still get the workflow default.

Docs patch
```diff
-- `INVOLUTE_GOOGLE_OAUTH_CLIENT_ID`, `INVOLUTE_GOOGLE_OAUTH_CLIENT_SECRET`, `INVOLUTE_GOOGLE_OAUTH_REDIRECT_URI` when `REQUIRE_GOOGLE_OAUTH=true`
+- `INVOLUTE_GOOGLE_OAUTH_CLIENT_ID`, `INVOLUTE_GOOGLE_OAUTH_CLIENT_SECRET`, `INVOLUTE_GOOGLE_OAUTH_REDIRECT_URI` when `INVOLUTE_REQUIRE_GOOGLE_OAUTH=true`
 - optional: `INVOLUTE_ADMIN_EMAIL_ALLOWLIST`, `INVOLUTE_IMAGE_TAG`

 Recommended repository variables:

 - `INVOLUTE_DEPLOY_ON_MAIN=false` to keep deploy manual by default
 - `INVOLUTE_DEPLOY_PROFILE=tailscale` for the current private test phase
+- `INVOLUTE_REQUIRE_GOOGLE_OAUTH=true` for production OAuth enforcement, or `false` only for deployments that intentionally disable Google sign-in
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 297 - 303, The README references REQUIRE_GOOGLE_OAUTH
but the GitHub Actions workflow uses vars.INVOLUTE_REQUIRE_GOOGLE_OAUTH; update
the documentation to mention INVOLUTE_REQUIRE_GOOGLE_OAUTH (and its expected
true/false values) wherever REQUIRE_GOOGLE_OAUTH appears (including the list of
required env vars and any examples), so the documented toggle matches the
workflow variable name and operators can correctly override the default.
🧹 Nitpick comments (5)
ops/ansible/group_vars/all/.gitkeep (1)

1-1: LGTM! The .gitkeep file correctly preserves the directory.

The file serves its purpose of keeping the ops/ansible/group_vars/all/ directory in Git. While a single blank line is functional, you might optionally consider either leaving it completely empty or adding a brief comment explaining its purpose (e.g., # Keep this directory for Ansible group vars), but the current approach works fine.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ops/ansible/group_vars/all/.gitkeep` at line 1, The .gitkeep file is correct
and no change is required; keep the existing file as-is to preserve the
directory, or optionally replace the single blank line with a short explanatory
comment in the .gitkeep to document its purpose (e.g., indicating it preserves
the directory for Ansible group vars).
scripts/prod-smoke.sh (1)

15-30: Temp file can leak on failure.

SESSION_RESPONSE is created at line 15, but with set -eu any failure between the mktemp and line 30 (e.g. curl connection error) will abort the script without running the later rm -f. Register a trap so the file is always removed.

🧹 Suggested cleanup
```diff
 SESSION_RESPONSE="$(mktemp)"
+trap 'rm -f "$SESSION_RESPONSE"' EXIT
 SESSION_STATUS="$(
```

Then drop the explicit rm -f "$SESSION_RESPONSE" calls on lines 25 and 30.
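As a self-contained illustration of that trap pattern (the file name and contents here are hypothetical, not taken from prod-smoke.sh):

```shell
# Create a temp file and guarantee cleanup on every exit path,
# including early aborts under `set -eu`.
set -eu
TMP_RESPONSE="$(mktemp)"
trap 'rm -f "$TMP_RESPONSE"' EXIT

printf 'session-body' > "$TMP_RESPONSE"
CONTENT="$(cat "$TMP_RESPONSE")"
echo "$CONTENT"
```

The EXIT trap fires whether the script finishes normally or a command fails partway, so no explicit `rm -f` calls are needed afterward.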

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/prod-smoke.sh` around lines 15 - 30, The temporary file created in
SESSION_RESPONSE via mktemp can leak on errors; after creating SESSION_RESPONSE
call mktemp, immediately register a cleanup trap (e.g. trap 'rm -f
"$SESSION_RESPONSE"' EXIT) so the file is removed on script exit, and then
remove the later explicit rm -f "$SESSION_RESPONSE" calls; ensure the trap is
set while SESSION_RESPONSE is defined and before running curl (referencing
SESSION_RESPONSE, mktemp, and the curl/SESSION_STATUS block).
ops/ansible/templates/env.production.j2 (2)

7-7: DATABASE_URL relies on password being URL-safe.

The default embeds involute_postgres_password verbatim without percent-encoding. This is currently safe because deploy.yml enforces ^[A-Za-z0-9._~-]+$ for the production profile, but the coupling is implicit. Consider either:

  • Documenting in a template comment that involute_postgres_password must be URL-safe when involute_database_url is not overridden; or
  • Applying Jinja's urlencode filter to the password so the default is robust even if the regex constraint is ever relaxed.
♻️ Suggested robustness tweak
```diff
-DATABASE_URL={{ involute_database_url | default('postgresql://' ~ (involute_postgres_user | default('involute')) ~ ':' ~ involute_postgres_password ~ '@db:5432/' ~ (involute_postgres_db | default('involute')) ~ '?schema=public') }}
+DATABASE_URL={{ involute_database_url | default('postgresql://' ~ (involute_postgres_user | default('involute') | urlencode) ~ ':' ~ (involute_postgres_password | urlencode) ~ '@db:5432/' ~ (involute_postgres_db | default('involute')) ~ '?schema=public') }}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ops/ansible/templates/env.production.j2` at line 7, DATABASE_URL default
currently embeds involute_postgres_password verbatim which assumes the password
is URL-safe; update the template so the password is percent-encoded to be
robust: change the default expression that builds DATABASE_URL (the one using
involute_database_url | default('postgresql://' ~ (involute_postgres_user |
default('involute')) ~ ':' ~ involute_postgres_password ~ '@db:5432/' ~
(involute_postgres_db | default('involute')) ~ '?schema=public')) to apply
Jinja's urlencode filter to involute_postgres_password (e.g.,
involute_postgres_password|urlencode) so the generated URL safely encodes
special characters, or alternatively add a clear template comment above
DATABASE_URL stating that involute_postgres_password must be URL-safe when not
overriding involute_database_url.

11-13: Missing volume-name variables emitted by template.

docker-compose.prod.images.yml consumes POSTGRES_VOLUME_NAME, CADDY_DATA_VOLUME_NAME, and CADDY_CONFIG_VOLUME_NAME (lines 129-133), but this template does not emit them. Ansible-rendered .env.production will therefore always fall back to compose defaults, whereas operators editing .env.production.example can override them — the two paths diverge. Either emit these with matching defaults here, or drop them from .env.production.example to keep parity.

♻️ Suggested addition
```diff
 INVOLUTE_IMAGE_TAG={{ involute_image_tag | default('latest') }}
+POSTGRES_VOLUME_NAME={{ involute_postgres_volume_name | default('involute_postgres-prod-data') }}
+CADDY_DATA_VOLUME_NAME={{ involute_caddy_data_volume_name | default('involute_caddy-data') }}
+CADDY_CONFIG_VOLUME_NAME={{ involute_caddy_config_volume_name | default('involute_caddy-config') }}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ops/ansible/templates/env.production.j2` around lines 11 - 13, The template
is missing the volume name variables used by docker-compose.prod.images.yml: add
POSTGRES_VOLUME_NAME, CADDY_DATA_VOLUME_NAME, and CADDY_CONFIG_VOLUME_NAME to
the env.production.j2 template with sensible defaults (e.g., match the defaults
used in .env.production.example or compose defaults) so rendered .env.production
contains the same keys; locate where INVOLUTE_IMAGE_REGISTRY /
INVOLUTE_IMAGE_NAMESPACE / INVOLUTE_IMAGE_TAG are defined and add analogous
lines emitting POSTGRES_VOLUME_NAME={{ postgres_volume_name | default('...') }},
CADDY_DATA_VOLUME_NAME={{ caddy_data_volume_name | default('...') }}, and
CADDY_CONFIG_VOLUME_NAME={{ caddy_config_volume_name | default('...') }} to
restore parity.
.env.production.example (1)

6-8: Keep POSTGRES_PASSWORD and DATABASE_URL in sync.

The password placeholder appears in two places, so operators must remember to update both or the DB connection will silently fail. Consider a short inline comment to call this out.

✏️ Suggested comment
```diff
 POSTGRES_PASSWORD=replace-with-a-long-url-safe-random-password
+# Keep password in sync with POSTGRES_PASSWORD above; characters must be URL-safe.
 DATABASE_URL=postgresql://involute:replace-with-a-long-url-safe-random-password@db:5432/involute?schema=public
```
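One way to make the sync mechanical is to derive the URL from a single password variable at deploy time; a hedged shell sketch (the values are the example placeholders, and the helper itself is not part of the PR):

```shell
set -eu
POSTGRES_USER="involute"
POSTGRES_PASSWORD="replace-with-a-long-url-safe-random-password"
POSTGRES_DB="involute"
# The URL is derived, never hand-edited, so the two values cannot drift.
DATABASE_URL="postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db:5432/${POSTGRES_DB}?schema=public"
echo "$DATABASE_URL"
```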
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.env.production.example around lines 6 - 8, The POSTGRES_PASSWORD and
DATABASE_URL placeholders are duplicated and can fall out of sync; update
.env.production.example so operators don't have to edit two places—either change
DATABASE_URL to reference the POSTGRES_PASSWORD variable (so the password is
sourced from a single env var) or add a short inline comment next to
POSTGRES_PASSWORD and/or DATABASE_URL (mentioning POSTGRES_PASSWORD and
DATABASE_URL by name) explicitly telling maintainers to keep them in sync if
they manually replace the placeholder.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/deploy.yml:
- Line 43: The workflow currently falls back to a mutable default by setting
INVOLUTE_IMAGE_TAG: ${{ vars.INVOLUTE_IMAGE_TAG || 'latest' }}, so update the
deploy job to disallow using 'latest' for production: remove the 'latest'
fallback and add a validation step (or job-level if check) that fails when the
environment is production and INVOLUTE_IMAGE_TAG is empty or equals 'latest';
alternatively set INVOLUTE_IMAGE_TAG from the image-publish job output (e.g., a
sha-<short-sha> tag) before running the deploy job. Ensure the check references
the INVOLUTE_IMAGE_TAG variable and the production environment name so deploys
cannot proceed with a mutable tag.

In `@ops/ansible/inventory/hosts.yml.example`:
- Around line 3-13: The example inventory's production profile is inconsistent:
remove or replace the Tailscale CGNAT values so templates don't set
SERVER_BIND_ADDRESS/WEB_BIND_ADDRESS to 100.x.y.z; either delete
involute_bind_address from the involute_production block (letting
env.production.j2 fall back to 127.0.0.1) or add explicit keys
involute_server_bind_address and involute_web_bind_address with
loopback/realistic production values, and change ansible_host from
tailnet-host.example.ts.net to a realistic host name or IP; update the
involute_production block accordingly so env.production.j2 and
.env.production.example are consistent.

In `@ops/ansible/playbooks/deploy.yml`:
- Around line 218-236: The two unconditional smoke tasks ("Verify auth session
reports Google OAuth configured" and "Verify Google OAuth start redirects to
Google") must be gated by the deployment flag so deployments with
involute_require_google_oauth=false don't fail; update both tasks to include a
when condition such as using the involute_require_google_oauth variable (e.g.
when: involute_require_google_oauth | default(true) | bool) so they run only
when OAuth is required, leaving the existing register names
involute_auth_session_smoke and involute_google_start_smoke and their
failed_when logic unchanged.

In `@package.json`:
- Line 12: The production smoke script invoked by the "smoke:prod" npm task
calls scripts/prod-smoke.sh which currently always asserts googleOAuthConfigured
and the /auth/google/start redirect; change scripts/prod-smoke.sh to gate those
OAuth-specific assertions behind an environment variable (e.g.,
REQUIRE_GOOGLE_OAUTH="${REQUIRE_GOOGLE_OAUTH:-true}") so the script always
performs generic checks (/health and /auth/session shape/status) but only
verifies googleOAuthConfigured=true and the /auth/google/start redirect if
REQUIRE_GOOGLE_OAUTH="true"; update deployment invocation to pass the rendered
REQUIRE_GOOGLE_OAUTH value (or rely on the default) so no-OAuth deployments can
pass the smoke test.

In `@README.md`:
- Around line 279-281: The README is missing the step to create the
ANSIBLE_VAULT_PASSWORD_FILE used by the deploy command; add an explicit
instruction to create ops/ansible/vault-password.txt containing the same vault
password (or a secure symlink to your password manager output) and set strict
permissions (e.g., chmod 600) before running
ANSIBLE_VAULT_PASSWORD_FILE=ops/ansible/vault-password.txt pnpm deploy:prod so
the deploy can read the vault password; reference the vault file creation step
(cp ops/ansible/group_vars/all/vault.yml.example ...) and the
ANSIBLE_VAULT_PASSWORD_FILE environment variable in the same block.

---

Outside diff comments:
In `@README.md`:
- Around line 297-303: The README references REQUIRE_GOOGLE_OAUTH but the GitHub
Actions workflow uses vars.INVOLUTE_REQUIRE_GOOGLE_OAUTH; update the
documentation to mention INVOLUTE_REQUIRE_GOOGLE_OAUTH (and its expected
true/false values) wherever REQUIRE_GOOGLE_OAUTH appears (including the list of
required env vars and any examples), so the documented toggle matches the
workflow variable name and operators can correctly override the default.

---

Nitpick comments:
In @.env.production.example:
- Around line 6-8: The POSTGRES_PASSWORD and DATABASE_URL placeholders are
duplicated and can fall out of sync; update .env.production.example so operators
don't have to edit two places—either change DATABASE_URL to reference the
POSTGRES_PASSWORD variable (so the password is sourced from a single env var) or
add a short inline comment next to POSTGRES_PASSWORD and/or DATABASE_URL
(mentioning POSTGRES_PASSWORD and DATABASE_URL by name) explicitly telling
maintainers to keep them in sync if they manually replace the placeholder.

In `@ops/ansible/group_vars/all/.gitkeep`:
- Line 1: The .gitkeep file is correct and no change is required; keep the
existing file as-is to preserve the directory, or optionally replace the single
blank line with a short explanatory comment in the .gitkeep to document its
purpose (e.g., indicating it preserves the directory for Ansible group vars).

In `@ops/ansible/templates/env.production.j2`:
- Line 7: DATABASE_URL default currently embeds involute_postgres_password
verbatim which assumes the password is URL-safe; update the template so the
password is percent-encoded to be robust: change the default expression that
builds DATABASE_URL (the one using involute_database_url |
default('postgresql://' ~ (involute_postgres_user | default('involute')) ~ ':' ~
involute_postgres_password ~ '@db:5432/' ~ (involute_postgres_db |
default('involute')) ~ '?schema=public')) to apply Jinja's urlencode filter to
involute_postgres_password (e.g., involute_postgres_password|urlencode) so the
generated URL safely encodes special characters, or alternatively add a clear
template comment above DATABASE_URL stating that involute_postgres_password must
be URL-safe when not overriding involute_database_url.
- Around line 11-13: The template is missing the volume name variables used by
docker-compose.prod.images.yml: add POSTGRES_VOLUME_NAME,
CADDY_DATA_VOLUME_NAME, and CADDY_CONFIG_VOLUME_NAME to the env.production.j2
template with sensible defaults (e.g., match the defaults used in
.env.production.example or compose defaults) so rendered .env.production
contains the same keys; locate where INVOLUTE_IMAGE_REGISTRY /
INVOLUTE_IMAGE_NAMESPACE / INVOLUTE_IMAGE_TAG are defined and add analogous
lines emitting POSTGRES_VOLUME_NAME={{ postgres_volume_name | default('...') }},
CADDY_DATA_VOLUME_NAME={{ caddy_data_volume_name | default('...') }}, and
CADDY_CONFIG_VOLUME_NAME={{ caddy_config_volume_name | default('...') }} to
restore parity.

In `@scripts/prod-smoke.sh`:
- Around line 15-30: The temporary file created in SESSION_RESPONSE via mktemp
can leak on errors; after creating SESSION_RESPONSE call mktemp, immediately
register a cleanup trap (e.g. trap 'rm -f "$SESSION_RESPONSE"' EXIT) so the file
is removed on script exit, and then remove the later explicit rm -f
"$SESSION_RESPONSE" calls; ensure the trap is set while SESSION_RESPONSE is
defined and before running curl (referencing SESSION_RESPONSE, mktemp, and the
curl/SESSION_STATUS block).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fb49f198-f2ed-482a-8de0-67f3943ffcd7

📥 Commits

Reviewing files that changed from the base of the PR and between 30942c5 and 47b3999.

📒 Files selected for processing (15)
  • .env.production.example
  • .github/workflows/deploy.yml
  • .gitignore
  • README.md
  • docker-compose.prod.images.yml
  • ops/ansible/group_vars/all/.gitkeep
  • ops/ansible/group_vars/all/vault.yml.example
  • ops/ansible/inventory/hosts.yml.example
  • ops/ansible/playbooks/deploy.yml
  • ops/ansible/templates/env.production.j2
  • package.json
  • packages/server/src/environment.test.ts
  • packages/server/src/environment.ts
  • scripts/postgres-backup.sh
  • scripts/prod-smoke.sh

```yaml
INVOLUTE_GOOGLE_OAUTH_CLIENT_ID: ${{ secrets.INVOLUTE_GOOGLE_OAUTH_CLIENT_ID }}
INVOLUTE_GOOGLE_OAUTH_CLIENT_SECRET: ${{ secrets.INVOLUTE_GOOGLE_OAUTH_CLIENT_SECRET }}
INVOLUTE_GOOGLE_OAUTH_REDIRECT_URI: ${{ secrets.INVOLUTE_GOOGLE_OAUTH_REDIRECT_URI }}
INVOLUTE_IMAGE_TAG: ${{ vars.INVOLUTE_IMAGE_TAG || 'latest' }}
```

⚠️ Potential issue | 🟠 Major

Avoid defaulting production deploys to mutable latest.

Line 43 makes the deployed image depend on whatever latest points to at deploy time, which can race Docker publishing or redeploy an unintended build. Prefer requiring INVOLUTE_IMAGE_TAG for production, or deploy an immutable tag such as the published sha-<short-sha> tag.

Suggested direction
```diff
-      INVOLUTE_IMAGE_TAG: ${{ vars.INVOLUTE_IMAGE_TAG || 'latest' }}
+      INVOLUTE_IMAGE_TAG: ${{ vars.INVOLUTE_IMAGE_TAG }}
```

Then fail validation for production when it is unset, or set it from a workflow that runs after the image publish completes.
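A hedged sketch of such a validation step in the workflow (the step name and exact condition are illustrative, not from the PR):

```yaml
- name: Refuse mutable image tags for production
  if: ${{ env.INVOLUTE_IMAGE_TAG == '' || env.INVOLUTE_IMAGE_TAG == 'latest' }}
  run: |
    echo "INVOLUTE_IMAGE_TAG must be an immutable tag (e.g. a sha-<short-sha> tag) for production" >&2
    exit 1
```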

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/deploy.yml at line 43, The workflow currently falls back
to a mutable default by setting INVOLUTE_IMAGE_TAG: ${{ vars.INVOLUTE_IMAGE_TAG
|| 'latest' }}, so update the deploy job to disallow using 'latest' for
production: remove the 'latest' fallback and add a validation step (or job-level
if check) that fails when the environment is production and INVOLUTE_IMAGE_TAG
is empty or equals 'latest'; alternatively set INVOLUTE_IMAGE_TAG from the
image-publish job output (e.g., a sha-<short-sha> tag) before running the deploy
job. Ensure the check references the INVOLUTE_IMAGE_TAG variable and the
production environment name so deploys cannot proceed with a mutable tag.

Comment on lines +3 to +13
```diff
 involute_production:
   ansible_host: tailnet-host.example.ts.net
   ansible_user: root
+  involute_stack_profile: production
+  involute_bind_address: 100.x.y.z
-  involute_stack_profile: tailscale
-  involute_app_origin: http://100.x.y.z:4201
-  involute_seed_database: true
   involute_admin_email_allowlist: "first-admin@example.com"
   involute_auth_token: change-me
   involute_viewer_assertion_secret: change-me
+  involute_app_domain: involute.example.com
+  involute_app_origin: https://involute.example.com
+  involute_smoke_base_url: https://involute.example.com
+  involute_seed_database: false
+  involute_require_google_oauth: true
+  # Put real secrets in ops/ansible/group_vars/all/vault.yml.
```

⚠️ Potential issue | 🟡 Minor

Example still carries Tailscale-style bind address under the production profile.

ansible_host: tailnet-host.example.ts.net and involute_bind_address: 100.x.y.z are CGNAT/Tailscale values, but the profile is now production with https://involute.example.com as origin. Via the template fallback chain in env.production.j2, this will set both SERVER_BIND_ADDRESS and WEB_BIND_ADDRESS to 100.x.y.z, which conflicts with the 127.0.0.1 defaults shown in .env.production.example and with a typical production layout (Caddy fronting loopback-bound services).

Either drop involute_bind_address here (so the template falls through to 127.0.0.1) or set involute_server_bind_address/involute_web_bind_address explicitly and use a realistic ansible_host, so the example isn't self-contradictory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ops/ansible/inventory/hosts.yml.example` around lines 3 - 13, The example
inventory's production profile is inconsistent: remove or replace the Tailscale
CGNAT values so templates don't set SERVER_BIND_ADDRESS/WEB_BIND_ADDRESS to
100.x.y.z; either delete involute_bind_address from the involute_production
block (letting env.production.j2 fall back to 127.0.0.1) or add explicit keys
involute_server_bind_address and involute_web_bind_address with
loopback/realistic production values, and change ansible_host from
tailnet-host.example.ts.net to a realistic host name or IP; update the
involute_production block accordingly so env.production.j2 and
.env.production.example are consistent.
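Following that prompt, a self-consistent production example might look like the block below. This is a sketch under the prompt's assumptions: the host name and the explicit `involute_server_bind_address`/`involute_web_bind_address` keys are illustrative, chosen so the rendered env file matches the `127.0.0.1` defaults in `.env.production.example`.

```yaml
involute_production:
  ansible_host: prod-host.example.com
  ansible_user: root
  involute_stack_profile: production
  # Bind services to loopback; Caddy fronts them publicly.
  involute_server_bind_address: 127.0.0.1
  involute_web_bind_address: 127.0.0.1
  involute_app_domain: involute.example.com
  involute_app_origin: https://involute.example.com
  involute_smoke_base_url: https://involute.example.com
  involute_seed_database: false
  involute_require_google_oauth: true
```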

Comment on lines +218 to +236
- name: Verify auth session reports Google OAuth configured
ansible.builtin.uri:
url: "{{ involute_smoke_base_url }}/auth/session"
method: GET
return_content: true
status_code:
- 200
- 401
register: involute_auth_session_smoke
failed_when: not (involute_auth_session_smoke.json.googleOAuthConfigured | default(false) | bool)

- name: Verify Google OAuth start redirects to Google
ansible.builtin.uri:
url: "{{ involute_smoke_base_url }}/auth/google/start"
method: GET
follow_redirects: none
status_code: 302
register: involute_google_start_smoke
failed_when: "'accounts.google.com' not in (involute_google_start_smoke.location | default(''))"

⚠️ Potential issue | 🟠 Major

Smoke tasks contradict the require_google_oauth=false escape hatch.

The validation block at lines 89-100 explicitly permits production deployments with involute_require_google_oauth=false, and docker-compose.prod.images.yml (line 62) + environment.ts honor that flag. However, both smoke tasks here are unconditional and assert googleOAuthConfigured=true plus a redirect to accounts.google.com, so any deployment that intentionally disables OAuth will fail the playbook at this step.

Gate both tasks on the same flag, e.g.:

🛠️ Suggested fix
     - name: Verify auth session reports Google OAuth configured
       ansible.builtin.uri:
         url: "{{ involute_smoke_base_url }}/auth/session"
         method: GET
         return_content: true
         status_code:
           - 200
           - 401
       register: involute_auth_session_smoke
       failed_when: not (involute_auth_session_smoke.json.googleOAuthConfigured | default(false) | bool)
+      when: involute_require_google_oauth | default(true) | bool

     - name: Verify Google OAuth start redirects to Google
       ansible.builtin.uri:
         url: "{{ involute_smoke_base_url }}/auth/google/start"
         method: GET
         follow_redirects: none
         status_code: 302
       register: involute_google_start_smoke
       failed_when: "'accounts.google.com' not in (involute_google_start_smoke.location | default(''))"
+      when: involute_require_google_oauth | default(true) | bool

Note the same logic is effectively baked into scripts/prod-smoke.sh, which is fine for its intended default use against production where OAuth is required, but consider documenting that limitation alongside the script if the flag gains broader use.

🧰 Tools
🪛 Checkov (3.2.519)

[medium] 218-230: Ensure that HTTPS url is used with uri

(CKV2_ANSIBLE_1)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ops/ansible/playbooks/deploy.yml` around lines 218 - 236, The two
unconditional smoke tasks ("Verify auth session reports Google OAuth configured"
and "Verify Google OAuth start redirects to Google") must be gated by the
deployment flag so deployments with involute_require_google_oauth=false don't
fail; update both tasks to include a when condition such as using the
involute_require_google_oauth variable (e.g. when: involute_require_google_oauth
| default(true) | bool) so they run only when OAuth is required, leaving the
existing register names involute_auth_session_smoke and
involute_google_start_smoke and their failed_when logic unchanged.

Comment thread package.json
"deploy:bootstrap": "sh scripts/ansible-playbook.sh ops/ansible/playbooks/bootstrap-host.yml",
"deploy:prod": "INVOLUTE_STACK_PROFILE=production sh scripts/ansible-playbook.sh ops/ansible/playbooks/deploy.yml",
"deploy:tailscale": "INVOLUTE_STACK_PROFILE=tailscale sh scripts/ansible-playbook.sh ops/ansible/playbooks/deploy.yml",
"smoke:prod": "scripts/prod-smoke.sh",

⚠️ Potential issue | 🟠 Major

Make production smoke respect disabled Google OAuth.

smoke:prod currently runs scripts/prod-smoke.sh, and that script always requires googleOAuthConfigured=true plus a Google redirect. That conflicts with the new INVOLUTE_REQUIRE_GOOGLE_OAUTH=false / REQUIRE_GOOGLE_OAUTH=false path, so a valid no-OAuth deployment will still fail smoke.

Suggested direction

Gate the OAuth-specific checks in scripts/prod-smoke.sh on an env var, for example:

REQUIRE_GOOGLE_OAUTH="${REQUIRE_GOOGLE_OAUTH:-true}"

# Always check /health and /auth/session shape/status.
# Only require googleOAuthConfigured=true and /auth/google/start redirect when:
if [ "$REQUIRE_GOOGLE_OAUTH" = "true" ]; then
  # existing googleOAuthConfigured and redirect assertions
fi

Then call it with the rendered production flag during Ansible/GitHub deploy.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package.json` at line 12, The production smoke script invoked by the
"smoke:prod" npm task calls scripts/prod-smoke.sh which currently always asserts
googleOAuthConfigured and the /auth/google/start redirect; change
scripts/prod-smoke.sh to gate those OAuth-specific assertions behind an
environment variable (e.g.,
REQUIRE_GOOGLE_OAUTH="${REQUIRE_GOOGLE_OAUTH:-true}") so the script always
performs generic checks (/health and /auth/session shape/status) but only
verifies googleOAuthConfigured=true and the /auth/google/start redirect if
REQUIRE_GOOGLE_OAUTH="true"; update deployment invocation to pass the rendered
REQUIRE_GOOGLE_OAUTH value (or rely on the default) so no-OAuth deployments can
pass the smoke test.
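A minimal sketch of that gating in POSIX sh, assuming the `REQUIRE_GOOGLE_OAUTH` variable name from the suggestion above; the HTTP assertions themselves are stubbed out as comments, since only the gating logic is at issue here.

```shell
#!/bin/sh
# Sketch of gating OAuth-specific assertions in prod-smoke.sh.
# REQUIRE_GOOGLE_OAUTH defaults to true, matching the current behavior.
REQUIRE_GOOGLE_OAUTH="${REQUIRE_GOOGLE_OAUTH:-true}"

oauth_checks_enabled() {
  [ "$REQUIRE_GOOGLE_OAUTH" = "true" ]
}

# Generic checks (/health status, /auth/session shape) would always run here.

if oauth_checks_enabled; then
  # googleOAuthConfigured=true and /auth/google/start redirect assertions
  # would go here in the real script.
  echo "oauth checks: enabled"
else
  echo "oauth checks: skipped"
fi
```

The deploy pipeline would then export the rendered flag (e.g. `REQUIRE_GOOGLE_OAUTH=false scripts/prod-smoke.sh`) so a deliberate no-OAuth deployment can still pass smoke.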

Comment thread README.md
Comment on lines +279 to +281
cp ops/ansible/group_vars/all/vault.yml.example ops/ansible/group_vars/all/vault.yml
ansible-vault encrypt ops/ansible/group_vars/all/vault.yml
ANSIBLE_VAULT_PASSWORD_FILE=ops/ansible/vault-password.txt pnpm deploy:prod

⚠️ Potential issue | 🟡 Minor

Include the vault password-file creation step.

This flow encrypts vault.yml, then deploys with ANSIBLE_VAULT_PASSWORD_FILE=ops/ansible/vault-password.txt, but never tells the operator to create that file with the same vault password. The deploy command will fail if the file is missing or contains a different password.

Docs patch
 cp ops/ansible/group_vars/all/vault.yml.example ops/ansible/group_vars/all/vault.yml
-ansible-vault encrypt ops/ansible/group_vars/all/vault.yml
+printf '%s\n' '<choose-a-vault-password>' > ops/ansible/vault-password.txt
+chmod 600 ops/ansible/vault-password.txt
+ANSIBLE_VAULT_PASSWORD_FILE=ops/ansible/vault-password.txt \
+  ansible-vault encrypt ops/ansible/group_vars/all/vault.yml
 ANSIBLE_VAULT_PASSWORD_FILE=ops/ansible/vault-password.txt pnpm deploy:prod
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-cp ops/ansible/group_vars/all/vault.yml.example ops/ansible/group_vars/all/vault.yml
-ansible-vault encrypt ops/ansible/group_vars/all/vault.yml
-ANSIBLE_VAULT_PASSWORD_FILE=ops/ansible/vault-password.txt pnpm deploy:prod
+cp ops/ansible/group_vars/all/vault.yml.example ops/ansible/group_vars/all/vault.yml
+printf '%s\n' '<choose-a-vault-password>' > ops/ansible/vault-password.txt
+chmod 600 ops/ansible/vault-password.txt
+ANSIBLE_VAULT_PASSWORD_FILE=ops/ansible/vault-password.txt \
+  ansible-vault encrypt ops/ansible/group_vars/all/vault.yml
+ANSIBLE_VAULT_PASSWORD_FILE=ops/ansible/vault-password.txt pnpm deploy:prod
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 279 - 281, The README is missing the step to create
the ANSIBLE_VAULT_PASSWORD_FILE used by the deploy command; add an explicit
instruction to create ops/ansible/vault-password.txt containing the same vault
password (or a secure symlink to your password manager output) and set strict
permissions (e.g., chmod 600) before running
ANSIBLE_VAULT_PASSWORD_FILE=ops/ansible/vault-password.txt pnpm deploy:prod so
the deploy can read the vault password; reference the vault file creation step
(cp ops/ansible/group_vars/all/vault.yml.example ...) and the
ANSIBLE_VAULT_PASSWORD_FILE environment variable in the same block.


@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 potential issue.

View 5 additional findings in Devin Review.


Comment on lines +237 to +255
- name: Verify auth session reports Google OAuth configured
  ansible.builtin.uri:
    url: "{{ involute_smoke_base_url }}/auth/session"
    method: GET
    return_content: true
    status_code:
      - 200
      - 401
  register: involute_auth_session_smoke
  failed_when: not (involute_auth_session_smoke.json.googleOAuthConfigured | default(false) | bool)

- name: Verify Google OAuth start redirects to Google
  ansible.builtin.uri:
    url: "{{ involute_smoke_base_url }}/auth/google/start"
    method: GET
    follow_redirects: none
    status_code: 302
  register: involute_google_start_smoke
  failed_when: "'accounts.google.com' not in (involute_google_start_smoke.location | default(''))"

🔴 Google OAuth smoke tests run unconditionally, breaking tailscale deployments

The two new smoke-test tasks ("Verify auth session reports Google OAuth configured" and "Verify Google OAuth start redirects to Google") have no when guard, so they execute for every deployment profile including tailscale. Tailscale deployments intentionally leave Google OAuth unconfigured (env.tailscale.j2 sets all three OAuth env vars to empty strings), which means:

  1. /auth/session returns googleOAuthConfigured: false → failed_when at line 246 triggers.
  2. /auth/google/start returns HTTP 503 (see packages/server/src/auth-routes.ts:96-101) → status_code: 302 at line 253 fails the task before failed_when is even evaluated.

Every other OAuth-conditional check in this PR is correctly gated: the Ansible variable validation (ops/ansible/playbooks/deploy.yml:89-100), the GitHub Actions secret check (.github/workflows/deploy.yml:73-80), and the server startup guard (packages/server/src/environment.ts:57-61). These two tasks are the only ones missing the condition.

Prompt for agents
The two Google OAuth smoke-test tasks at lines 237-255 of ops/ansible/playbooks/deploy.yml need a `when` condition to only run when Google OAuth is expected. They should mirror the same condition used in the variable validation block at lines 89-100.

Add the following to both tasks:

  when: >
    (involute_stack_profile | default(involute_default_stack_profile)) == 'production'
    and (involute_require_google_oauth | default(true) | bool)

This ensures the smoke tests only run for production deployments with involute_require_google_oauth enabled, which is consistent with the tailscale env template (env.tailscale.j2) leaving Google OAuth env vars empty, and with the GitHub Actions validation (.github/workflows/deploy.yml:66-80) which gates OAuth checks on DEPLOY_PROFILE=production and INVOLUTE_REQUIRE_GOOGLE_OAUTH=true.

