
Added an environment and deployment runbook #19

Open
dubemoyibe-star wants to merge 1 commit into LabsCrypt:main from dubemoyibe-star:docs/environment-deployment-runbook


@dubemoyibe-star


Runbook & Operational Documentation

This PR adds missing operational documentation required to run, deploy, and maintain the system reliably.


What’s included

Environment Variables

  • Lists all required env vars per service
  • Clearly marks mandatory vs optional variables
  • Notes startup-critical dependencies

Staging Deployment

  • Docker image build and deployment flow
  • Healthcheck validation process
  • Rollback procedure to previous image
  • Notes on Trivy scanning in CI/CD

Local Development

  • Database + Redis setup instructions
  • How to run migrations
  • Steps to start services locally

Troubleshooting

  • Common startup failures and fixes
  • Missing env vars
  • DB connection / migration issues
  • Docker healthcheck failures

Files added/updated

  • runbook.md — full operational guide
  • wiki/README.md — quick links and setup entry point

Out of scope

  • Terraform / infra provisioning
  • Secret management tooling

Closes #8

@ogazboiz ogazboiz left a comment

Contributor


thanks for taking on an ops runbook, the shape is genuinely useful (env vars, local dev, deploy, healthchecks, rollback, troubleshooting). but a lot of the specifics don't match the actual codebase, and a runbook that's wrong on the specifics is worse than none, so this needs grounding against the real backend first. i checked against remitlend-backend's .env.example and package.json:

  1. package manager: the whole doc uses pnpm (pnpm install, pnpm migrate, pnpm dev), but the project uses npm: there's a package-lock.json and the scripts are npm + node-pg-migrate. switch the commands to npm (npm install, npm run migrate, npm run dev).
  2. migrations: it says migrations live in backend/src/db/migrations/ and references a schema_migrations table. neither is right: migrations live in migrations/ at the repo root and run via npm run migrate (= node-pg-migrate up), and node-pg-migrate tracks state in a pgmigrations table, not schema_migrations. (src/db/migrations/ was an orphaned dir that was just removed.)
  3. env var names are mostly invented. the real ones from .env.example:
    • there is no single CONTRACT_ID. there are four: LOAN_MANAGER_CONTRACT_ID, LENDING_POOL_CONTRACT_ID, REMITTANCE_NFT_CONTRACT_ID, MULTISIG_GOVERNANCE_CONTRACT_ID.
    • email is SendGrid, not SMTP: SENDGRID_API_KEY + FROM_EMAIL + ADMIN_EMAIL (and TWILIO_* for sms). there is no SMTP_HOST/SMTP_PORT/SMTP_USER/SMTP_PASS.
    • the webhook timeout is WEBHOOK_REQUEST_TIMEOUT_MS, not WEBHOOK_TIMEOUT_MS, and i don't see a WEBHOOK_SECRET var at all.
    • the indexer batch size is INDEXER_BATCH_SIZE, not MAX_EVENTS_PER_POLL (INDEXER_POLL_INTERVAL_MS is right).
    • the backend network var is STELLAR_NETWORK (the frontend one is NEXT_PUBLIC_STELLAR_NETWORK).
  4. service layout: it describes separate notifier/ and indexer/ services/dirs with their own ports (cd notifier && pnpm dev, dev:indexer) and separate staging subdomains. the project is actually split into three repos (remitlend-backend, remitlend-frontend, remitlend-contracts), and the indexer + notifications run inside the backend; there's no notifier/ dir or repo. describe the real layout.
  5. the staging section (Trivy scanning, kubectl/k8s manifests, registry.example.com, per-service /health subdomains, auto-rollback pipeline) reads as aspirational rather than what's set up. if that pipeline doesn't exist yet, drop it or clearly mark it as a proposed/target setup rather than current state.

i'd rather have a shorter runbook that's 100% accurate than a comprehensive one that sends an operator down the wrong path. treat the backend .env.example as the source of truth for the variable names. happy to re-review once the commands, paths and env vars match.
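One way to check variable names against .env.example before re-review is sketched below. It is a minimal illustration only: the function names (parseEnvNames, findRunbookNames, findUnknownNames) and the sample values are hypothetical and not taken from the repo, and the name-matching heuristic (ALL_CAPS identifiers containing an underscore) is an assumption that may pick up non-variable tokens in a real runbook.

```typescript
// Sketch: flag env var names a runbook mentions that .env.example does not define.
// All function names and sample inputs here are hypothetical, not from the repo.

/** Collect variable names from .env-style text (NAME=value lines; # comments skipped). */
function parseEnvNames(envText: string): Set<string> {
  const names = new Set<string>();
  for (const rawLine of envText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    if (match) names.add(match[1]);
  }
  return names;
}

/** Heuristic: treat ALL_CAPS identifiers with at least one underscore as env var mentions. */
function findRunbookNames(runbookText: string): Set<string> {
  const hits = runbookText.match(/\b[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+\b/g) ?? [];
  return new Set(hits);
}

/** Names mentioned in the runbook but absent from .env.example, sorted. */
function findUnknownNames(runbookText: string, envExampleText: string): string[] {
  const known = parseEnvNames(envExampleText);
  return Array.from(findRunbookNames(runbookText))
    .filter((name) => !known.has(name))
    .sort();
}

// Sample inputs: the values (testnet, 5000, 100) are made up for illustration.
const sampleEnvExample = [
  "# backend settings",
  "STELLAR_NETWORK=testnet",
  "WEBHOOK_REQUEST_TIMEOUT_MS=5000",
  "INDEXER_BATCH_SIZE=100",
  "SENDGRID_API_KEY=",
].join("\n");

const sampleRunbook =
  "Set WEBHOOK_TIMEOUT_MS and MAX_EVENTS_PER_POLL; STELLAR_NETWORK selects the network.";

// Logs the two names that the sample .env.example does not define.
console.log(findUnknownNames(sampleRunbook, sampleEnvExample));
```

In practice the two strings would be read from the real runbook.md and the backend's .env.example; any name the check reports is either a typo, an invented variable, or a false positive from the heuristic, so the output still needs a quick manual pass.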

if you want to keep contributing, join us on Telegram: https://t.me/+DOylgFv1jyJlNzM0


Development

Successfully merging this pull request may close these issues.

[Docs] Add an environment and deployment runbook (env vars, staging deploy, healthchecks, rollback)

2 participants