Added an environment and deployment runbook #19
ogazboiz left a comment:
thanks for taking on an ops runbook, the shape is genuinely useful (env vars, local dev, deploy, healthchecks, rollback, troubleshooting). but a lot of the specifics don't match the actual codebase, and a runbook that's wrong on the specifics is worse than none, so this needs grounding against the real backend first. i checked against remitlend-backend's .env.example and package.json:
- package manager: the whole doc uses pnpm (`pnpm install`, `pnpm migrate`, `pnpm dev`), but the project uses npm, there's a package-lock.json and the scripts are npm + node-pg-migrate. switch the commands to npm (`npm install`, `npm run migrate`, `npm run dev`).
- migrations: it says migrations live in `backend/src/db/migrations/` and references a `schema_migrations` table. neither is right, migrations live at the repo root `migrations/` and run via `npm run migrate` (= node-pg-migrate up), and node-pg-migrate tracks state in a `pgmigrations` table, not schema_migrations. (`src/db/migrations/` was an orphaned dir that was just removed.)
- env var names are mostly invented. the real ones from .env.example:
  - there is no single `CONTRACT_ID`. there are four: LOAN_MANAGER_CONTRACT_ID, LENDING_POOL_CONTRACT_ID, REMITTANCE_NFT_CONTRACT_ID, MULTISIG_GOVERNANCE_CONTRACT_ID.
  - email is SendGrid, not SMTP: SENDGRID_API_KEY + FROM_EMAIL + ADMIN_EMAIL (and TWILIO_* for sms). there is no SMTP_HOST/SMTP_PORT/SMTP_USER/SMTP_PASS.
  - the webhook timeout is WEBHOOK_REQUEST_TIMEOUT_MS, not WEBHOOK_TIMEOUT_MS, and i don't see a WEBHOOK_SECRET var at all.
  - the indexer batch size is INDEXER_BATCH_SIZE, not MAX_EVENTS_PER_POLL (INDEXER_POLL_INTERVAL_MS is right).
  - the backend network var is STELLAR_NETWORK (the frontend one is NEXT_PUBLIC_STELLAR_NETWORK).
- service layout: it describes separate `notifier/` and `indexer/` services/dirs with their own ports (`cd notifier && pnpm dev`, `dev:indexer`) and separate staging subdomains. the project is split into remitlend-backend / remitlend-frontend / remitlend-contracts repos, and the indexer + notifications run inside the backend, there's no notifier/ dir or repo. describe the real layout.
- the staging section (Trivy scanning, kubectl/k8s manifests, registry.example.com, per-service /health subdomains, auto-rollback pipeline) reads as aspirational rather than what's set up. if that pipeline doesn't exist yet, drop it or clearly mark it as a proposed/target setup rather than current state.
i'd rather have a shorter runbook that's 100% accurate than a comprehensive one that sends an operator down the wrong path. treat the backend .env.example as the source of truth for the variable names. happy to re-review once the commands, paths and env vars match.
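One way to make "treat .env.example as the source of truth" checkable is a small script that lists every env-style name the runbook mentions but .env.example does not define. The following is only a minimal sketch: the function names are hypothetical, the sample values (`testnet`, `100`, `5000`) are made up for illustration, and the name pattern only catches upper-case names containing at least one underscore, so single-word variables would be missed.

```python
import re

# Matches names like SENDGRID_API_KEY: upper-case words joined by underscores.
# Assumption: single-word names (no underscore) are intentionally not matched.
ENV_NAME = re.compile(r"\b[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+\b")


def defined_vars(env_example_text: str) -> set[str]:
    """Collect names from KEY=value lines, including commented-out ones."""
    names = set()
    for line in env_example_text.splitlines():
        match = re.match(r"\s*#?\s*(?:export\s+)?([A-Z][A-Z0-9_]*)\s*=", line)
        if match:
            names.add(match.group(1))
    return names


def unknown_vars(runbook_text: str, env_example_text: str) -> list[str]:
    """Return env-style names used in the runbook that .env.example lacks."""
    known = defined_vars(env_example_text)
    mentioned = set(ENV_NAME.findall(runbook_text))
    return sorted(mentioned - known)


if __name__ == "__main__":
    # Hypothetical stand-ins for the real .env.example and runbook.md contents.
    env_example = (
        "STELLAR_NETWORK=testnet\n"
        "INDEXER_BATCH_SIZE=100\n"
        "WEBHOOK_REQUEST_TIMEOUT_MS=5000\n"
    )
    runbook = (
        "Set STELLAR_NETWORK, MAX_EVENTS_PER_POLL and WEBHOOK_TIMEOUT_MS "
        "before starting."
    )
    print(unknown_vars(runbook, env_example))
    # prints ['MAX_EVENTS_PER_POLL', 'WEBHOOK_TIMEOUT_MS']
```

In practice the two strings would be read from the backend's .env.example and from runbook.md. Expect some false positives: any upper-case identifier that is not an env var, or a frontend-only name such as NEXT_PUBLIC_STELLAR_NETWORK, would be reported against the backend file and needs a manual look.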
if you want to keep contributing, join us on Telegram: https://t.me/+DOylgFv1jyJlNzM0
Runbook & Operational Documentation
This PR adds missing operational documentation required to run, deploy, and maintain the system reliably.
What’s included
- Environment Variables
- Staging Deployment
- Local Development
- Troubleshooting
Files added/updated
- runbook.md: full operational guide
- wiki/README.md: quick links and setup entry point
Out of scope
Closes #8