5.5 — Configure auto-scaling rules #64

@davidortinau

Description

Configure Container Apps auto-scaling rules based on resource usage and HTTP metrics, so that replicas are added automatically to absorb load spikes and removed to reduce cost during quiet periods.

Dependencies

Acceptance Criteria

  • Scaling rules configured for each service:
    • API: Scale on HTTP concurrent requests (>50 per instance)
    • WebApp: Scale on CPU (>70%)
    • Workers: Scale on queue depth (if applicable)
    • Marketing: Fixed 1 replica (no scaling)
  • Min/max replica settings:
    • Staging: 1-2 replicas
    • Production: 2-5 replicas (minimum 2 for HA)
  • Tested: scale-up triggered under load, scale-down when idle
  • Cost optimized: appropriate SKU and scaling ranges
  • Dashboards show scaling events
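
As a rough illustration of the criteria above, the following Python sketch builds a `scale` block per service and environment. The field names (`minReplicas`, `maxReplicas`, `rules`, `http.metadata.concurrentRequests`, and a `custom` rule of type `cpu`) follow the Container Apps scale schema as I understand it and should be checked against the current documentation before use. The rule names and the helper functions are hypothetical. Workers are left out because the issue does not say which queue technology is involved.

```python
import json

# Replica ranges taken from the acceptance criteria in this issue.
REPLICA_RANGES = {"staging": (1, 2), "production": (2, 5)}


def http_rule(concurrent_requests: int) -> dict:
    """Scale out when concurrent requests per replica exceed the target."""
    return {
        "name": "http-concurrency",  # hypothetical rule name
        "http": {"metadata": {"concurrentRequests": str(concurrent_requests)}},
    }


def cpu_rule(utilization_percent: int) -> dict:
    """Scale out when average CPU utilization exceeds the target."""
    return {
        "name": "cpu-utilization",  # hypothetical rule name
        "custom": {
            "type": "cpu",
            "metadata": {"type": "Utilization", "value": str(utilization_percent)},
        },
    }


def scale_block(service: str, environment: str) -> dict:
    """Return an assumed `scale` section for one service in one environment."""
    if service == "marketing":
        # Fixed single replica: min == max, so no rule can change the count.
        return {"minReplicas": 1, "maxReplicas": 1, "rules": []}

    rules_by_service = {
        "api": [http_rule(50)],    # >50 concurrent requests per instance
        "webapp": [cpu_rule(70)],  # >70% CPU
    }
    if service not in rules_by_service:
        raise ValueError(f"no scaling rule sketched for service {service!r}")

    min_replicas, max_replicas = REPLICA_RANGES[environment]
    return {
        "minReplicas": min_replicas,
        "maxReplicas": max_replicas,
        "rules": rules_by_service[service],
    }


if __name__ == "__main__":
    print(json.dumps(scale_block("api", "production"), indent=2))
```

Keeping the thresholds and ranges in one place like this makes it easier to review staging and production side by side. The generated dictionaries could then be rendered into whatever deployment format the project uses.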

Technical Notes

  • HTTP-based scaling is most common for web services
  • CPU-based scaling for compute-bound services
  • Queue-based scaling for background workers
  • Scale-down requires a grace period so that in-flight connections can drain before a replica is removed
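
To make the notes above concrete, here is a simplified model of target-based scaling: the desired replica count is the total metric value divided by the per-replica target, rounded up and clamped to the configured range. This is a sketch for reasoning about thresholds, not the platform's actual scaler. It ignores polling intervals, stabilization windows, and the scale-down grace period. The function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(metric_total: float, target_per_replica: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Replicas needed to keep the per-replica metric at or below the target."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    raw = math.ceil(metric_total / target_per_replica)
    # Clamp to the configured range (for example 2-5 in production).
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # API in production: target of 50 concurrent requests per replica, range 2-5.
    for total in (0, 180, 400):
        print(total, desired_replicas(total, 50, 2, 5))
```

With the production range of 2-5 and a target of 50, 180 concurrent requests map to 4 replicas, 400 requests hit the ceiling of 5, and an idle service stays at the minimum of 2.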

Phase: 5 | Size: M | Owner: Zoe (Lead)

Metadata

Assignees

No one assigned

    Labels

    • phase:5-hardening (Phase 5: Production Hardening)
    • size:M (Medium task, 3-5 days)
    • squad (Squad triage inbox; Lead will assign to a member)
    • squad:zoe (Assigned to Zoe (Tester))
