Implement Phase 11: Auto-Scaler #14

Merged
richardkiene merged 1 commit into main from feature/phase-11-auto-scaler on Jan 17, 2026
@richardkiene

Summary

Implements Phase 11 (Auto-Scaler): per-instance resource limits, request and latency metrics collection, and an auto-scaler that adjusts replica counts based on those metrics.

Resource Limits

  • ResourceLimits config struct for memory and table size limits
  • Integration with Wasmtime's StoreLimits for enforcement
  • Applied at Store creation in CLI, HTTP, and TCP runtimes
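
A minimal, std-only sketch of what a limits config with "unset means unlimited" semantics could look like. The field and method names here are assumptions for illustration, not the actual contents of `fabricks-runtime/src/limits.rs`. In the PR itself, enforcement is delegated to Wasmtime's `StoreLimits` (built via `StoreLimitsBuilder`, which exposes `memory_size` and `table_elements`), so the checks below only model the allow/deny behavior a limiter applies when a memory or table tries to grow.

```rust
/// Hypothetical limits config; field names are assumptions, not the PR's code.
/// `None` means "no limit configured".
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct ResourceLimits {
    /// Maximum linear memory size, in bytes.
    pub max_memory_bytes: Option<usize>,
    /// Maximum number of elements in any one table.
    pub max_table_elements: Option<usize>,
}

impl ResourceLimits {
    /// Returns true if a memory may grow to `desired_bytes`.
    pub fn allows_memory(&self, desired_bytes: usize) -> bool {
        self.max_memory_bytes
            .map_or(true, |max| desired_bytes <= max)
    }

    /// Returns true if a table may grow to `desired_elements`.
    pub fn allows_table_elements(&self, desired_elements: usize) -> bool {
        self.max_table_elements
            .map_or(true, |max| desired_elements <= max)
    }
}
```

Keeping the config as plain optional fields means one struct can be shared by the CLI, HTTP, and TCP runtimes and translated into the engine's limiter at Store creation.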

Metrics Collection

  • MetricsCollector tracks request counts, latencies, and instance counts
  • Load calculated from latency relative to a configurable baseline (default 50 ms)
  • Latency percentile calculations (avg, p50, p99)
  • API endpoints: GET /v1/metrics, GET /v1/services/{id}/metrics
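
The load and percentile calculations above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's `MetricsCollector`: the `LatencyWindow` type is hypothetical, load is assumed to be average latency divided by the baseline (so 1.0 means "at baseline"), and percentiles use the nearest-rank method. The PR does not say which definitions it uses.

```rust
use std::time::Duration;

/// Hypothetical per-service latency window (not the PR's MetricsCollector).
#[derive(Debug, Default)]
pub struct LatencyWindow {
    samples_ms: Vec<f64>,
}

impl LatencyWindow {
    /// Records one request's latency.
    pub fn record(&mut self, latency: Duration) {
        self.samples_ms.push(latency.as_secs_f64() * 1000.0);
    }

    /// Number of requests recorded so far.
    pub fn request_count(&self) -> usize {
        self.samples_ms.len()
    }

    /// Mean latency in milliseconds, or None if nothing was recorded.
    pub fn average_ms(&self) -> Option<f64> {
        if self.samples_ms.is_empty() {
            return None;
        }
        Some(self.samples_ms.iter().sum::<f64>() / self.samples_ms.len() as f64)
    }

    /// Nearest-rank percentile for `p` in (0, 100].
    pub fn percentile_ms(&self, p: f64) -> Option<f64> {
        if self.samples_ms.is_empty() || !(p > 0.0 && p <= 100.0) {
            return None;
        }
        let mut sorted = self.samples_ms.clone();
        sorted.sort_by(|a, b| a.total_cmp(b));
        let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
        Some(sorted[rank.max(1) - 1])
    }

    /// Assumed load definition: average latency / baseline.
    /// Returns 0.0 when there are no samples or the baseline is not positive.
    pub fn load(&self, baseline_ms: f64) -> f64 {
        match self.average_ms() {
            Some(avg) if baseline_ms > 0.0 => avg / baseline_ms,
            _ => 0.0,
        }
    }
}
```

With ten samples of 10, 20, ..., 100 ms and the 50 ms default baseline, this sketch yields an average of 55 ms, a p50 of 50 ms, a p99 of 100 ms, and a load of about 1.1.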

Auto-Scaler

  • AutoScaler monitors metrics and triggers scale up/down decisions
  • Cooldown mechanism prevents scaling thrashing
  • Respects min/max replica configuration per service
  • ServiceAutoScaled event published on scaling actions
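
One way the scaling decision described above could fit together is sketched below. All names (`ScalerConfig`, `ScalerState`, `decide`), the one-replica step size, and the load thresholds are assumptions for illustration; the PR only states that scaling respects min/max replicas and a cooldown.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScaleDecision {
    Up(u32),   // new replica count
    Down(u32), // new replica count
    Hold,
}

/// Hypothetical per-service scaling config.
pub struct ScalerConfig {
    pub min_replicas: u32,
    pub max_replicas: u32,
    pub scale_up_load: f64,   // scale up when load exceeds this
    pub scale_down_load: f64, // scale down when load falls below this
    pub cooldown: Duration,   // minimum time between scaling actions
}

#[derive(Default)]
pub struct ScalerState {
    pub last_scaled_at: Option<Instant>,
}

/// Decides the next action for one service; steps by one replica at a time.
pub fn decide(
    cfg: &ScalerConfig,
    state: &mut ScalerState,
    current: u32,
    load: f64,
    now: Instant,
) -> ScaleDecision {
    // Cooldown: refuse to act again too soon after the last scaling action.
    if let Some(last) = state.last_scaled_at {
        if now.saturating_duration_since(last) < cfg.cooldown {
            return ScaleDecision::Hold;
        }
    }
    let wanted = if load > cfg.scale_up_load {
        current.saturating_add(1)
    } else if load < cfg.scale_down_load {
        current.saturating_sub(1)
    } else {
        current
    };
    // Respect the configured replica bounds.
    let target = wanted.max(cfg.min_replicas).min(cfg.max_replicas);
    if target == current {
        return ScaleDecision::Hold;
    }
    // Only an actual scaling action restarts the cooldown.
    state.last_scaled_at = Some(now);
    if target > current {
        ScaleDecision::Up(target)
    } else {
        ScaleDecision::Down(target)
    }
}
```

Passing `now` in as a parameter, rather than reading the clock inside `decide`, keeps the cooldown logic deterministic in tests. A real implementation would presumably publish the `ServiceAutoScaled` event whenever the result is not `Hold`.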

New Files

  • fabricks-runtime/src/limits.rs - Resource limits configuration
  • fabricksd/src/scaler/ - Scaler module (mod.rs, types.rs, metrics.rs, autoscaler.rs)
  • fabricksd/src/api/handlers/metrics.rs - Metrics API handlers
  • fabricks-e2e/tests/autoscaler.rs - E2E tests

Test plan

  • All 243 tests pass (200 unit + 43 e2e)
  • Clippy passes with no warnings
  • 12 new E2E tests for metrics and auto-scaling
  • Tests verify registration, request tracking, load calculation
  • Tests verify auto-scaler respects replica limits

@richardkiene merged commit dc46c0f into main on Jan 17, 2026
0 of 3 checks passed
@richardkiene deleted the feature/phase-11-auto-scaler branch on January 17, 2026 at 21:41