docs(deploy): add comprehensive blue/green operator runbook (issue #287) #319

Merged
mikewheeleer merged 1 commit into Talenttrust:main from Abolax123:docs/blue-green-runbook
Jun 2, 2026
Conversation

@Abolax123

Closes #287

Blue/Green Deploy Runbook Documentation

Issue

#287: Document blue/green deploy runbook for deploy.ts switch-green and rollback

Summary

This PR adds comprehensive operator documentation for the blue/green deployment flow driven by src/deploy.ts and the npm run deploy:* scripts. The runbook covers the complete deployment lifecycle including health gating, auto-rollback behavior, and troubleshooting procedures.

Changes

New Files

  • docs/deploy.md (645 lines) — Comprehensive blue/green deployment operator runbook

Documentation Coverage

1. Overview & Topology

  • Blue/green instance architecture (blue on port 3001, green on port 3002, router on port 3000)
  • Traffic routing via port 3000 router
  • Active/standby instance roles
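
For reviewers who prefer code to prose, the topology can be sketched as a small lookup. The port numbers are the ones stated in this PR. Every identifier below (`INSTANCE_PORTS`, `standbyOf`, `upstreamFor`) and the default host are hypothetical and are not taken from src/deploy.ts.

```typescript
// Illustrative sketch only: ports come from this PR, all names are hypothetical.
type Color = "blue" | "green";

const INSTANCE_PORTS: Record<Color, number> = { blue: 3001, green: 3002 };
const ROUTER_PORT = 3000; // clients always talk to the router, never to an instance

// The standby instance is whichever color is not currently active.
function standbyOf(active: Color): Color {
  return active === "blue" ? "green" : "blue";
}

// The upstream the router on ROUTER_PORT would forward to for a given active color.
// The default host is an assumption for local use.
function upstreamFor(active: Color, host = "127.0.0.1"): string {
  return `http://${host}:${INSTANCE_PORTS[active]}`;
}
```

Under these assumptions, promoting green only changes what `upstreamFor` returns. Clients keep talking to port 3000 throughout.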

2. Quick Start Procedures

  • npm run deploy:status — Check deployment state
  • npm run deploy:switch-green — Promote green to active
  • npm run deploy:rollback — Return to blue

3. Health Gate Behavior

  • 30-second polling timeout with 1-second intervals
  • Health check endpoint: GET /health/ready
  • Automatic rollback thresholds:
    • Error rate spike >50%
    • Database connection failures >5 consecutive
    • Memory exhaustion >90%
    • Service unavailable >10s
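
A minimal sketch of how the timing and thresholds above could be evaluated. Only the numbers come from this PR. The metric shape, the function names, and the reading of "error rate spike >50%" as an absolute error rate above 0.5 are assumptions, and the probe and sleep functions are injected so the sketch does not depend on any particular HTTP client.

```typescript
// Hypothetical names throughout; only the numeric limits come from the list above.
interface GateMetrics {
  errorRate: number;             // assumed: fraction of failed requests, 0..1
  consecutiveDbFailures: number; // database connection failures in a row
  memoryUsage: number;           // assumed: fraction of available memory, 0..1
  unavailableMs: number;         // how long the service has been unreachable
}

const LIMITS = { errorRate: 0.5, consecutiveDbFailures: 5, memoryUsage: 0.9, unavailableMs: 10_000 };

// Returns every threshold that is strictly exceeded; an empty list means "keep going".
function rollbackReasons(m: GateMetrics): string[] {
  const reasons: string[] = [];
  if (m.errorRate > LIMITS.errorRate) reasons.push("error-rate");
  if (m.consecutiveDbFailures > LIMITS.consecutiveDbFailures) reasons.push("db-failures");
  if (m.memoryUsage > LIMITS.memoryUsage) reasons.push("memory");
  if (m.unavailableMs > LIMITS.unavailableMs) reasons.push("unavailable");
  return reasons;
}

// Number of probes that fit into the polling window (30 with the documented defaults).
function probeBudget(timeoutMs = 30_000, intervalMs = 1_000): number {
  return Math.ceil(timeoutMs / intervalMs);
}

// Polls an injected readiness probe (for example, one that calls GET /health/ready)
// until it reports ready or the probe budget is spent.
async function waitForReady(
  probe: () => Promise<boolean>,
  sleep: (ms: number) => Promise<void>,
  timeoutMs = 30_000,
  intervalMs = 1_000,
): Promise<boolean> {
  for (let i = 0; i < probeBudget(timeoutMs, intervalMs); i++) {
    if (await probe()) return true;
    await sleep(intervalMs);
  }
  return false;
}
```

The comparisons are strict, so a value sitting exactly on a limit does not trigger a rollback in this sketch. Whether the real implementation treats the boundary the same way should be checked against src/deploy.ts.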

4. Pre-Deployment Checklist

  • Green instance health validation
  • Blue instance production health
  • Router accessibility
  • Environment variable verification

5. Complete Deployment Procedure

  • Step 1: Deploy green with new code
  • Step 2: Promote green to active
  • Step 3: Validate new instance
  • Step 4: Drain old instance

6. Status Output Interpretation

  • activeColor — Currently active instance
  • lastSwitch — Timestamp of last switch
  • blueHealth.healthy / greenHealth.healthy — Instance health
  • switchInProgress — Deployment in flight
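
As a reading aid for these fields, the sketch below turns a status object into a one-line summary. The field names follow the list above. The types (for example, `lastSwitch` as a nullable string) and the `summarizeStatus` helper are assumptions and are not copied from the `getStatus()` return type.

```typescript
// Field names follow the list above; the types are assumptions, not taken from src/deploy.ts.
interface DeployStatus {
  activeColor: "blue" | "green";
  lastSwitch: string | null; // assumed: timestamp string, null if no switch has happened
  blueHealth: { healthy: boolean };
  greenHealth: { healthy: boolean };
  switchInProgress: boolean;
}

// Hypothetical helper: condenses the status into a single operator-facing line.
function summarizeStatus(s: DeployStatus): string {
  if (s.switchInProgress) return "WAIT: a switch is in flight";
  const active = s.activeColor === "blue" ? s.blueHealth : s.greenHealth;
  const standby = s.activeColor === "blue" ? s.greenHealth : s.blueHealth;
  if (!active.healthy) return `ALERT: active ${s.activeColor} is unhealthy`;
  if (!standby.healthy) return `OK: ${s.activeColor} is serving, standby is unhealthy`;
  return `OK: ${s.activeColor} is serving, standby is healthy`;
}
```

Checking `switchInProgress` first reflects the idea that health flags may be in flux while a switch is running. That ordering is a design choice of this sketch.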

7. Drain & Shutdown

  • Graceful drain (up to 30s completion)
  • Forced shutdown procedures
  • In-flight request handling
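
The drain behavior can be pictured as a small decision function: exit once no requests are in flight, and force the shutdown when the 30-second budget runs out. This is a sketch under that reading of the bullets above. `nextDrainStep`, its parameters, and the step names are hypothetical, and the authoritative timing is in shutdown.ts.

```typescript
// Hypothetical decision helper; only the 30-second budget comes from this PR.
type DrainStep = "wait" | "exit" | "force-exit";

const DRAIN_BUDGET_MS = 30_000;

function nextDrainStep(
  inFlight: number,   // requests still being served by the old instance
  elapsedMs: number,  // time since the drain started
  budgetMs = DRAIN_BUDGET_MS,
): DrainStep {
  if (inFlight <= 0) return "exit";               // nothing left to finish: shut down cleanly
  if (elapsedMs >= budgetMs) return "force-exit"; // budget spent: stop waiting
  return "wait";                                  // keep serving in-flight requests
}
```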

8. Rollback Decision Tree

  • Severity-based decision (CRITICAL/HIGH/MEDIUM/LOW)
  • Automatic vs. manual rollback triggers
  • Monitoring guidelines

9. Troubleshooting Guide

  • Green won't become healthy
  • Switch hangs at health gate
  • Router not directing traffic
  • Diagnosis and resolution steps

10. Monitoring & Alerts

  • Key metrics (error rate, latency, connections, memory)
  • Recommended alert thresholds
  • Auto-rollback event logging

11. Reference

  • Environment variables
  • CLI commands
  • Related documentation cross-references
  • Security notes on authentication and secrets

12. Example Workflow

  • Complete bash script for safe deployment
  • Pre/post validation steps

Security Coverage

  • ✅ Authentication required for deploy endpoints (JWT/API Key)
  • ✅ No hard-coded credentials
  • ✅ Environment variable-based configuration
  • ✅ Audit logging for all deploy operations
  • ✅ Credential redaction in logs

Testing

Validation Against src/deploy.ts

  • ✅ State persistence to .deployment-state.json
  • ✅ Idempotent operations (already green → no-op)
  • ✅ Health check integration
  • ✅ Concurrent switch prevention (mutex guard)
  • ✅ CLI entry points (switch-green, rollback, status)
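
To make the idempotency and concurrency claims concrete, here is an in-memory sketch of a switch that is a no-op when the target is already active and that rejects a second switch while one is in flight. It illustrates the two properties and is not the code in src/deploy.ts: the names are hypothetical, the health gate is an injected synchronous callback, and persistence to .deployment-state.json is left out.

```typescript
// Hypothetical in-memory model; the real state lives in .deployment-state.json.
type Slot = "blue" | "green";
interface SwitchState { activeColor: Slot; switchInProgress: boolean; }
type SwitchResult = "switched" | "noop" | "rejected" | "health-gate-failed";

function switchTo(state: SwitchState, target: Slot, gate: () => boolean): SwitchResult {
  if (state.switchInProgress) return "rejected"; // guard: only one switch at a time
  if (state.activeColor === target) return "noop"; // idempotent: already on the target
  state.switchInProgress = true;
  try {
    if (!gate()) return "health-gate-failed";      // target not ready: keep the current color
    state.activeColor = target;
    return "switched";
  } finally {
    state.switchInProgress = false;                // always release the guard
  }
}
```

Releasing the guard in `finally` means a failed or throwing health gate cannot leave the state stuck with `switchInProgress` set to true.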

Runbook Procedures Validated

  • ✅ Status output matches getStatus() return type
  • ✅ Health gate 30-second timeout matches implementation
  • ✅ Port configuration (3001/3002/3000) correct
  • ✅ Drain timing and shutdown behavior accurate

Documentation Quality

  • ✅ Quick start section for operators
  • ✅ Detailed procedures with expected outputs
  • ✅ Troubleshooting guide with diagnosis steps
  • ✅ Decision tree for rollback scenarios
  • ✅ Example bash scripts for automation
  • ✅ Cross-references to health.ts, shutdown.ts, auth docs
  • ✅ Security and audit logging notes
  • ✅ Monitoring and alerting recommendations

Related Documentation

Cross-references to:

  • docs/health.md — Readiness and liveness probes
  • docs/shutdown.md — Drain timing and connection cleanup
  • docs/api-keys.md — Admin authentication for deploy endpoints
  • docs/observability.md — Metrics and alerting
  • docs/troubleshooting.md — Common operational issues

Usage

For operators:

# Check current state
npm run deploy:status

# Promote green to active (with automatic health gate)
npm run deploy:switch-green

# Return to blue if issues detected
npm run deploy:rollback

…lenttrust#287)

- Document deploy:switch-green, deploy:rollback, deploy:status procedures
- Include health-gate behavior (30s timeout, automatic rollback thresholds)
- Cross-reference health.ts readiness and shutdown.ts drain timing
- Explain blue (3001)/green (3002)/router (3000) topology
- Provide troubleshooting guide and decision tree
- Include security notes on authentication and credential handling
- Add example deployment workflow script
- Document environment variables and CLI commands
- Include monitoring and alert recommendations

Runbook validated against src/deploy.ts implementation and test coverage.
@drips-wave

drips-wave Bot commented Jun 1, 2026

@Abolax123 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

@mikewheeleer mikewheeleer merged commit 7a8bbe2 into Talenttrust:main Jun 2, 2026
2 of 5 checks passed