feat: supabase cost tracking #20
Persistence layer for per-review-run token + USD cost data, written by the review container directly to a self-hosted Supabase project via PostgREST. Read surface: Supabase Studio + a ./wrily costs CLI. New ./wrily persistence init/migrate/status subcommands wrap the official supabase CLI to create the project, write .env, and apply migrations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sixteen tasks across six phases: env + cost capture, schema migrations, persistence HTTP client, workflow step, CLI subcommands (costs + persistence init/migrate/status), bash entrypoint wiring, opt-in integration test, and docs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CodeQL flagged two clear-text password leaks in `wrily persistence init`:
the generated DB password was printed to stdout and also passed via the
`--db-password` / `--password` flags (visible via `ps` and included in
error messages built from `args.join(' ')`).
Drop the console.log entirely and route the password via the
SUPABASE_DB_PASSWORD env var, which the supabase CLI reads for both
`projects create` and `link`. runSupabase now accepts an `env` option
so secrets can be passed to the child process without touching argv.
The password is never used by Wrily at runtime (writes go through the
service-role key) — operators who need SQL-editor access can reset it
from the dashboard.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`runSupabase` was always piping stdio, which broke `supabase login`: the CLI bails with "Cannot use automatic login flow inside non-TTY environments" because it can't drive the browser handoff prompt. Add an `interactive: true` option that inherits the parent stdio. `ensureLoggedIn` now uses it for the login fallback and surfaces the SUPABASE_ACCESS_TOKEN env-var workaround as the fast path so headless runs don't need a TTY at all. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The supabase CLI's `projects create` MarkFlagRequired check fires before it consults env vars, so the previous attempt to route the password exclusively via SUPABASE_DB_PASSWORD broke project creation with 'required flag(s) "db-password" not set'. Restore --db-password on the create call but keep error-message exposure contained: runSupabase now accepts a redactFlags option that masks the immediately-following value before interpolating args into its error string. The `link` and `db push` calls still receive the password via SUPABASE_DB_PASSWORD env so only one invocation puts the value on argv (and only for the few seconds project creation runs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
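The masking described in this commit could look roughly like the sketch below. The PR text names the `redactFlags` option but does not show its code, so `redactArgs`, its signature, and the `***` placeholder are assumptions made for illustration only.

```typescript
// Hypothetical helper: masks the value that immediately follows any flag
// listed in `redactFlags`, so the result is safe to interpolate into an
// error string. It never changes the argv handed to the child process.
function redactArgs(
  args: readonly string[],
  redactFlags: readonly string[],
): string[] {
  const shown: string[] = [];
  let maskNext = false;
  for (const arg of args) {
    if (maskNext) {
      shown.push("***"); // placeholder is an assumption
      maskNext = false;
      continue;
    }
    shown.push(arg);
    maskNext = redactFlags.includes(arg);
  }
  return shown;
}

// Example: building the text of an error message for a failed invocation.
const shownArgs = redactArgs(
  ["projects", "create", "demo", "--db-password", "s3cret", "--org-id", "my-org"],
  ["--db-password"],
).join(" ");
```

This sketch only covers the space-separated `--flag value` form that the commit message describes; a `--flag=value` argument would pass through unmasked and would need its own handling.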
The switch to --output-format=stream-json made stdout an NDJSON event log instead of the model's reply text. Downstream extractFindings looks for a \`\`\`json fence in the model output and bailed with "No \`\`\`json fence found in model reply" on every run. Walk the NDJSON, concatenate text blocks from every assistant event, and return that as AgentResult.stdout. The cost parser still reads the raw event stream for the final result event so token usage capture is unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
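A minimal sketch of the reassembly step, under stated assumptions: the event shape (`type: "assistant"` carrying `message.content` blocks with `type: "text"`) is assumed rather than quoted from the PR, and `extractAssistantText` is a hypothetical name. Joining blocks with no separator is also an assumption.

```typescript
// Assumed event shape; the PR does not reproduce the stream-json schema.
interface StreamEvent {
  type?: string;
  message?: { content?: Array<{ type?: string; text?: string }> };
}

// Walks an NDJSON event log and concatenates the text blocks of every
// assistant event. Non-JSON lines and other event types are skipped.
function extractAssistantText(ndjson: string): string {
  const parts: string[] = [];
  for (const line of ndjson.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // tolerate stray non-JSON output
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const event = parsed as StreamEvent;
    if (event.type !== "assistant") continue;
    const content = event.message?.content;
    if (!Array.isArray(content)) continue;
    for (const block of content) {
      if (block.type === "text" && typeof block.text === "string") {
        parts.push(block.text);
      }
    }
  }
  return parts.join("");
}
```

The raw stream is left untouched by this function, so a separate cost parser can still read the final result event from it.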
Wrily-on-Wrily review flagged four bugs:
1. `--since` was ignored when `--by repo|model` routed through the pre-built 30-day views.
2. `--by day` returned raw review_runs rows, not a per-day rollup.
3. `--repo` was silently dropped when combined with `--by model` (the model view has no github_repo column).
4. `deriveRunStatus` only emitted success/failed; budget_exceeded / timeout never landed in review_runs.status, defeating the dashboard distinction promised by the schema CHECK and the spec verification plan.
queryCosts now hits review_runs directly with an inserted_at filter, client-side aggregating by the requested axis. The 30d views remain in the schema for Studio convenience but the CLI no longer relies on them. `--repo` + `--by model` is rejected at parse time. A new persist/failure.ts classifies AgentBudgetExceededError / AgentTimeoutError (including one level of err.cause wrapping) and writes a row from main.ts catch blocks. Success path still goes through persistUsageStep at end of the workflow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
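The client-side rollup could be sketched as below. `aggregateRuns` is a name the review mentions, but its signature here is a guess. Only `github_repo` and `inserted_at` are column names taken from the PR text; `model` and `cost_usd` are hypothetical. The day bucket assumes `inserted_at` is an ISO 8601 timestamp in UTC, and the `--since` filter is assumed to have been applied in the query already.

```typescript
// Hypothetical row shape for rows already filtered by inserted_at.
interface RunRow {
  github_repo: string;
  model: string; // assumed column name
  inserted_at: string; // ISO 8601, assumed UTC
  cost_usd: number; // assumed column name
}

type Axis = "repo" | "model" | "day";

interface Rollup {
  runs: number;
  costUsd: number;
}

// Groups rows by the requested axis and sums run count and cost per group.
function aggregateRuns(rows: readonly RunRow[], by: Axis): Map<string, Rollup> {
  const totals = new Map<string, Rollup>();
  for (const row of rows) {
    const key =
      by === "repo"
        ? row.github_repo
        : by === "model"
          ? row.model
          : row.inserted_at.slice(0, 10); // YYYY-MM-DD bucket
    const current = totals.get(key) ?? { runs: 0, costUsd: 0 };
    current.runs += 1;
    current.costUsd += row.cost_usd;
    totals.set(key, current);
  }
  return totals;
}
```

Because every axis reads the same filtered rows, `--since` applies uniformly and `--by day` yields one entry per calendar day rather than one per run.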
barryroodt
left a comment
test review via script
barryroodt
left a comment
test review with comments
barryroodt
left a comment
test 422 reproduction
Dogfood run on PR #20 failed with 'Variable $commitOID of type GitObjectID was provided invalid value' from GitHub's REST endpoint. Root cause: review takes minutes (clone + agent + extract + route), and the commit SHA captured at the start of the bash entrypoint can go stale before the post step runs (force-push, follow-up commit, etc.). GitHub then rejects the review POST.
Fix:
- postToGitHubStep refreshes the head SHA via octokit.rest.pulls.get immediately before constructing the post payload, falling back to the original env-supplied SHA on lookup failure.
- postReview's body-only 422 fallback now retries once more without commit_id at all, so a body-only prose post still lands even when the SHA is rejected for a reason the refresh didn't catch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
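The refresh-with-fallback behaviour might be sketched like this. `resolveCommitId` is a hypothetical name, and the injected `lookupHeadSha` callback stands in for the `octokit.rest.pulls.get` call so the sketch carries no dependency on the real client. Treating an empty result as a failed lookup is an added assumption.

```typescript
// Hypothetical helper: prefer a freshly looked-up head SHA, but never let a
// failed lookup block the post; fall back to the SHA captured at startup.
async function resolveCommitId(
  lookupHeadSha: () => Promise<string>,
  envSha: string,
): Promise<string> {
  try {
    const fresh = await lookupHeadSha();
    // Assumption: an empty string counts as "no usable answer".
    return fresh !== "" ? fresh : envSha;
  } catch {
    return envSha;
  }
}
```

A caller would invoke this immediately before building the review payload, which keeps the window in which the SHA can go stale as short as possible.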
Previously persistUsageStep ran last in the workflow, so any failure in postToGitHubStep (e.g. stale commit_id 422) prevented the cost row from being written even though the agent had already burned the spend. Move persistUsageStep ahead of postToGitHubStep — cost rows are now written as soon as agent results + findings are available, independent of GitHub response. deriveRunStatus drops the fallbackUsed check (unknown at this point); post-step issues remain tracked in workflow logs without polluting the cost dashboard status enum. A new persist/state.ts module exposes markUsagePersisted / wasUsagePersisted. The persistUsageStep flips the flag after a successful write so main.ts catch blocks skip persistFailureRun (and its duplicate zero-cost row) when the cost row already exists. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
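The dedupe flag can be pictured as a small piece of module-level state. `markUsagePersisted` and `wasUsagePersisted` are the names given in the commit message; their bodies below, and the `shouldWriteFailureRow` helper, are assumptions for illustration.

```typescript
// Sketch of a persist/state.ts-style module. In a real module the two
// accessors would be exported; they are plain functions here so the
// snippet runs as a standalone script.
let usagePersisted = false;

// Called after the cost row has been written successfully.
function markUsagePersisted(): void {
  usagePersisted = true;
}

function wasUsagePersisted(): boolean {
  return usagePersisted;
}

// Hypothetical catch-block guard: only write a zero-cost failure row when
// no cost row exists yet for this run.
function shouldWriteFailureRow(): boolean {
  return !wasUsagePersisted();
}
```

A process-wide flag is enough under the assumption that each process handles exactly one review run; a long-lived process serving several runs would need the state keyed per run.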
barryroodt
left a comment
Wrily Review: PR #20
Overall Verdict: With fixes
Summary
0 critical, 2 important. Supabase cost tracking is well-tested and architecturally sound (best-effort persist, redacted argv, success/failure-path dedupe). Two gaps: bash wrapper skips .env sourcing when shell auth is set so persistence creds silently drop, and the new .env written by persistence init lands with default umask perms while holding the service_role_key. 5 minor findings hidden — set sensitivity: minor in .wrily.yml to see.
Confidence rating skipped — declare an application criticality tier in CLAUDE.md or AGENTS.md to enable.
Critical
None.
Important
- L137: `wrily`: `.env` is only sourced when ANTHROPIC_API_KEY/CLAUDE_CODE_OAUTH_TOKEN are both unset, but SUPABASE_URL/SUPABASE_SERVICE_ROLE_KEY live in the same file. A user with auth in shell env gets persistence silently disabled (empty `-e SUPABASE_URL=` at lines 224-225). Move the `.env` source above the auth gate, or add a second guard: `if { [[ -z "${SUPABASE_URL:-}" ]] || [[ -z "${SUPABASE_SERVICE_ROLE_KEY:-}" ]]; } && [[ -f "${SCRIPT_DIR}/.env" ]]; then source "${SCRIPT_DIR}/.env"; fi`.
- L32: `src/cli/persistence/dotenv.ts`: New `.env` written via `writeFileSync`/`appendFileSync` inherits umask (typically 0644, world-readable). The file holds SUPABASE_SERVICE_ROLE_KEY, which bypasses RLS = full DB admin. On shared/multi-user hosts any local user can exfiltrate it. After writing, `chmodSync(path, 0o600)` (and when appending, ensure the existing file is already 0600 or tighten it).
Minor
None.
Strengths
- Retry-then-fail-soft + `markUsagePersisted` dedupe between success and failure paths keeps observability from ever blocking a review.
- `redactFlags` and `SUPABASE_DB_PASSWORD` env-var path keep the generated DB password out of argv and error messages.
- Strong unit + integration test coverage (env, retry, aggregateRuns, supabase stub binary, stream-json reassembly).
Suppressions
None.
Two findings from the dogfood Wrily review on PR #20:
1. The bash wrapper only sourced .env when ANTHROPIC_API_KEY / CLAUDE_CODE_OAUTH_TOKEN were both unset. Users with shell-exported auth had SUPABASE_URL / SUPABASE_SERVICE_ROLE_KEY silently dropped, so the container started with empty Supabase env and persistence stayed off without any indication. The .env source now runs whenever any of those keys is missing in the current shell env. set -a is used briefly so KEY=val lines export into the env we pass to docker.
2. appendDotEnv inherited the shell's umask, so .env landed with typical 0644 perms while holding the service_role key (which bypasses RLS = full DB admin). New files are created with mode 0o600; existing files are tightened to 0o600 after every append so a pre-existing world-readable file gets fixed on the next write. No-op on win32 where the chmod semantics differ.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Summary
Adds opt-in persistence of per-review token + USD cost to a self-hosted Supabase project. Reviews still work without it enabled — purely additive.
Wired up:
- `claude` CLI now invoked with `--output-format=stream-json --verbose`; final `result` event parsed into `AgentTokenUsage` (was `null` before).
- `src/persist/` module with `recordReviewRun` (retry-then-fail-soft) + `queryCosts`. No new runtime deps: plain `fetch` against PostgREST.
- `persistUsageStep` appended to the review workflow, no-op when env vars absent.
- `review_runs` + `review_subagent_runs` tables, `spend_by_repo_30d` + `spend_by_model_30d` views.
- `./wrily persistence {init,migrate,status}` subcommands wrap the official `supabase` CLI to create a project, write `.env`, and apply migrations.
- `./wrily costs [--since 30d] [--by repo|model|day] [--repo X] [--json]` for spend rollups.
- `trigger_source=local_cli`; GitHub App runs collapse to `github_app`.

Spec + plan:
- `docs/superpowers/specs/2026-05-18-supabase-cost-tracking-design.md`
- `docs/superpowers/plans/2026-05-18-supabase-cost-tracking.md`

Test Plan
- `pnpm test`: 243 passing, 1 integration test skipped (opt-in via `WRILY_INT_SUPABASE_URL`)
- `pnpm typecheck`: clean
- `pnpm build`: clean
- `./wrily --help` shows the new subcommands
- `./wrily persistence status` reports `disabled` when env unset
- `./wrily persistence init` against a throwaway Supabase project: verify tables + views land
- `SUPABASE_URL` set: verify a row appears in `review_runs`
- `./wrily costs --since 7d` against a project with rows

🤖 Generated with Claude Code