Production-ready load balancer that routes traffic based on real-time CPU, RAM, disk, and GPU availability instead of round-robin guesswork. Each request is sent to the best backend based on actual resource metrics, geography, and configurable tier requirements.
Key Features:
- 🎯 Smart Routing - Resource-aware (CPU, RAM, disk, GPU) + geography-based routing
- 📦 Simple Deployment - Two binaries (server + client), Docker support, YAML config, systemd integration
- 🛡️ Production Security - API keys, IP whitelisting, rate limiting, TLS, request size limits, timeouts
- 🔄 High Reliability - Circuit breaker, exponential backoff, panic recovery, graceful shutdown
- 🎯 Sticky Sessions - Configurable session affinity via custom headers or client IP
- 🚀 Built-in Reverse Proxy - WebSocket + SSE/streaming support, automatic tier detection, path preservation
- ⚡ Performance - <15µs routing (100 backends), in-memory decisions, connection pooling
curl -sSL https://raw.githubusercontent.com/cyqlelabs/opsen/main/scripts/install.sh | bash
This automatically detects your platform and installs pre-built binaries to /usr/local/bin.
Pre-built images are available from GitHub Container Registry (no authentication required):
ghcr.io/cyqlelabs/opsen-server:latest
ghcr.io/cyqlelabs/opsen-client:latest
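For example, to pull the images ahead of time:
docker pull ghcr.io/cyqlelabs/opsen-server:latest
docker pull ghcr.io/cyqlelabs/opsen-client:latest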
Quick Start with Docker Compose (Recommended for testing):
git clone https://github.com/cyqlelabs/opsen.git
cd opsen
docker compose up -d
# Verify it's running
curl http://localhost:8080/health
This starts a server and two example clients. The server runs on localhost:8080.
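Once it's up, you can also list the registered example clients (the /clients endpoint is described under API Endpoints below):
curl http://localhost:8080/clients | jq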
Production Deployment:
# Build images
docker compose build
# Start with custom environment variables
OPSEN_SERVER_PORT=9000 docker compose up -d
# Or use custom config files
docker compose -f docker-compose.production.yml up -d
Individual Containers (using pre-built images):
# Server
docker run -d \
-p 8080:8080 \
-v opsen-data:/data \
-e OPSEN_SERVER_PORT=8080 \
-e OPSEN_SERVER_DATABASE=/data/opsen.db \
ghcr.io/cyqlelabs/opsen-server:latest
# Client
docker run -d \
-v /proc:/host/proc:ro \
-v /sys:/host/sys:ro \
-e OPSEN_CLIENT_SERVER_URL=http://opsen-server:8080 \
-e OPSEN_CLIENT_WINDOW_MINUTES=15 \
ghcr.io/cyqlelabs/opsen-client:latest
Build from source (optional):
# Server
docker build -f Dockerfile.server -t opsen-server .
# Client
docker build -f Dockerfile.client -t opsen-client .
Environment Variables:
Server:
- OPSEN_SERVER_PORT - Port to listen on (default: 8080)
- OPSEN_SERVER_HOST - Host to bind to (default: 0.0.0.0)
- OPSEN_SERVER_DATABASE - Database path (default: /data/opsen.db)
- OPSEN_SERVER_STALE_TIMEOUT - Client stale timeout in minutes (default: 5)
Client:
- OPSEN_CLIENT_SERVER_URL - Load balancer server URL (required)
- OPSEN_CLIENT_WINDOW_MINUTES - Metrics averaging window (default: 15)
- OPSEN_CLIENT_INTERVAL_SECONDS - Report interval (default: 60)
- OPSEN_CLIENT_DISK_PATH - Disk path to monitor (default: /)
For advanced options (TLS, auth, logging, sticky sessions, etc.), mount a config file:
docker run -v ./server.yml:/etc/opsen/config.yml:ro ghcr.io/cyqlelabs/opsen-server:latest
GPU Support:
For GPU monitoring, use NVIDIA Container Runtime:
docker run --gpus all -e OPSEN_CLIENT_SERVER_URL=http://server:8080 ghcr.io/cyqlelabs/opsen-client:latest
Image Sizes:
- Server: ~33MB (Alpine-based)
- Client: ~100MB (Debian-based, required for GPU support)
See docker-compose.yml for complete examples.
Download pre-built binaries from GitHub Releases:
# Linux AMD64
wget https://github.com/cyqlelabs/opsen/releases/latest/download/opsen-server_VERSION_linux_amd64.tar.gz
wget https://github.com/cyqlelabs/opsen/releases/latest/download/opsen-client_VERSION_linux_amd64.tar.gz
tar xzf opsen-server_VERSION_linux_amd64.tar.gz
tar xzf opsen-client_VERSION_linux_amd64.tar.gz
sudo mv opsen-server opsen-client /usr/local/bin/
Build from source (requires Go 1.23+):
git clone https://github.com/cyqlelabs/opsen.git
cd opsen
make all
sudo make install
opsen-server -version
opsen-client -version
- Quick Installation
- Architecture
- Building from Source
- Scripts
- GeoIP Setup (Optional)
- Usage
- API Endpoints
- Routing Algorithm
- Systemd Integration
- Application Integration
- Database Schema
- Monitoring
- Performance
- Health Checks & Latency Tracking
- Security Features
- Reliability Features
- License
Server (opsen-server) - Central routing coordinator that receives metrics and makes routing decisions based on resource availability, geography, and tier requirements.
Client (opsen-client) - Runs on each backend, collects CPU/RAM/disk/GPU metrics (15min avg), reports to server every 60s. Supports NVIDIA GPUs via NVML (gracefully disabled if absent). Automatically downloads and uses MaxMind GeoIP database for location detection.
Tiers - Fully customizable resource specifications (vCPU, memory, storage, optional GPU + VRAM). Define tiers matching your infrastructure and pricing model.
| Tier | vCPU | Memory | Storage | GPU | GPU Memory |
|---|---|---|---|---|---|
| small | 1 | 1 GB | 5 GB | - | - |
| medium | 2 | 4 GB | 20 GB | - | - |
| large | 4 | 8 GB | 30 GB | - | - |
| gpu-inference | 8 | 32 GB | 100 GB | 1 | 16 GB |
| gpu-training | 16 | 64 GB | 500 GB | 2 | 48 GB |
# Build both client and server
make all
# Build server only
make build-server
# Build client only
make build-client
# Install binaries and systemd services
sudo make install
# Download Go dependencies
make deps
Binaries are output to bin/:
- bin/opsen-server - Load balancer server
- bin/opsen-client - Metrics collector client
The repository includes helpful scripts for common setup tasks:
Downloads the MaxMind GeoLite2-City database for geographic routing from Opsen's S3 mirror.
./scripts/download-geoip.sh [TARGET_PATH]
Note: The client automatically downloads this database on first run. This script is only needed for:
- Manual server-side GeoIP setup (optional, for routing request geolocation)
- Updating the database (recommended monthly)
- Custom installation paths
Source: https://cyqle-opsen.s3.us-east-2.amazonaws.com/GeoLite2-City.mmdb (no authentication required)
Generates self-signed TLS certificates with Subject Alternative Names (SANs) for development/testing.
./scripts/generate-tls-cert.sh [cert_dir] [domain] [days]
# Examples:
./scripts/generate-tls-cert.sh # Default: ./certs, lb.cyqle.local, 365 days
./scripts/generate-tls-cert.sh ./ssl lb.example.com 730 # Custom directory, domain, and validity
Outputs server.crt and server.key ready for use in your server configuration.
Client geolocation is automatic - the client downloads the GeoIP database on first run to ./GeoLite2-City.mmdb.
Server geolocation is optional - only needed if you want distance calculation from routing request origin:
# Download for server (optional - only for multi-datacenter routing)
./scripts/download-geoip.sh
# Configure in server YAML
geoip_db_path: ./GeoLite2-City.mmdb
When server GeoIP is needed:
- ✓ Multi-datacenter deployments with routing requests from different regions
- ✗ Single datacenter (backends already have location via auto-download)
Update monthly (first Tuesday) for best accuracy.
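If you want to automate the monthly refresh, a cron entry along these lines is one option (a sketch; the installation paths are illustrative, and % must be escaped in crontab):
# 03:00 on the first Tuesday of each month (day-of-month 1-7, weekday checked in the command)
0 3 1-7 * * [ "$(date +\%u)" = "2" ] && /opt/opsen/scripts/download-geoip.sh /opt/opsen/GeoLite2-City.mmdb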
Create server.yml (all settings shown with defaults):
# Basic
port: 8080
host: 0.0.0.0
database: /opt/opsen/opsen.db
stale_minutes: 5
log_level: info # debug, info, warn, error, fatal
json_logging: false
# Security
server_key: "" # Client auth (opsen-client must match)
api_keys: [] # Additional API keys for other integrations
whitelisted_ips: [] # CIDR ranges (empty = allow all)
rate_limit_per_minute: 60 # Requests per minute per IP (0 = disabled)
rate_limit_burst: 120 # Burst capacity
max_request_body_bytes: 10485760 # 10MB
request_timeout_seconds: 30 # WebSocket + SSE exempt
idle_timeout_seconds: 120 # Keep-alive timeout
read_header_timeout_seconds: 10 # Slowloris protection
disable_security_headers: false # Disable X-Frame-Options, X-XSS-Protection, etc. (e.g., when using WAF)
# TLS
tls_cert_file: "" # Empty = HTTP only
tls_key_file: ""
tls_insecure_skip_verify: false # For self-signed certs (dev only!)
# CORS
enable_cors: false
cors_allowed_origins: []
# Reverse Proxy
proxy_endpoints: [] # e.g., ["/api", "/browse"]
proxy_sse_flush_interval_ms: -1 # -1=immediate (SSE), 0=disabled, >0=interval
# Geolocation
geoip_db_path: "" # Path to GeoLite2-City.mmdb
# Sticky Sessions
sticky_header: "" # e.g., "X-Session-ID", "X-User-ID" (empty = disabled)
sticky_by_ip: false # Use client IP when header not present
sticky_affinity_enabled: true
pending_allocation_timeout_seconds: 120
# Tier Detection
tier_field_name: "tier" # JSON body field
tier_header: "X-Tier" # HTTP header
# Database
db_max_open_conns: 25
db_max_idle_conns: 5
db_conn_max_lifetime: 300
cleanup_interval_seconds: 60
shutdown_timeout_seconds: 30
# Tiers (customize to your infrastructure)
tiers:
  - name: small
    vcpu: 1
    memory_gb: 1.0
    storage_gb: 5
  - name: medium
    vcpu: 2
    memory_gb: 4.0
    storage_gb: 20
  - name: gpu-inference # GPU example
    vcpu: 8
    memory_gb: 32.0
    storage_gb: 100
    gpu: 1
    gpu_memory_gb: 16.0
Run:
./bin/opsen-server -config server.yml
# CLI flags override YAML
./bin/opsen-server -config server.yml -port 9000 -stale 10
Create client.yml:
# Basic
server_url: http://lb.example.com:8080
server_key: "" # Must match server's server_key (if set)
endpoint_url: "" # Override (default: http://{local_ip}:11000)
# Metrics
window_minutes: 15 # Averaging window
report_interval_seconds: 60
disk_path: /
# Identity
client_id: "" # Auto-generated UUID if empty
hostname: "" # Uses system hostname if empty
# Geolocation (auto-downloads GeoIP database on first run)
skip_geolocation: false # Skip entirely (fastest)
geoip_db_path: "" # Auto-downloads to ./GeoLite2-City.mmdb if not set
# Logging & TLS
log_level: info
insecure_tls: false # Dev only - skip cert verification
Important: endpoint_url Configuration
The endpoint_url defines where this backend accepts traffic. The load balancer uses this URL to route requests and perform health checks.
- Format: http://hostname:port or https://hostname:port
- Must be accessible from the load balancer server
- Run one client per backend, each with a unique endpoint_url
Example: Multiple Backends
# backend-1.yml
endpoint_url: http://backend-1.internal:8000
# backend-2.yml
endpoint_url: http://backend-2.internal:8000
# backend-3.yml
endpoint_url: http://backend-3.internal:9000
Each client monitors its own resources and reports to the same load balancer server.
Path-Based Routing (Optional):
Instead of a single endpoint_url, you can configure multiple endpoints with path-based routing. Supports exact matches, prefix matching, and wildcards:
endpoints:
  - url: https://backend:11000
    paths: ["/v1", "/api", "/auth"] # Prefix: matches /api, /api/users, etc.
  - url: https://backend:8002
    paths: ["/monitor/*"] # Wildcard: matches /monitor/anything
  - url: https://backend:9000
    paths: ["/api/*/users"] # Pattern: matches /api/v1/users, /api/v2/users
  - url: https://backend:7000
    paths: ["/*"] # Catch-all: matches everything else
Pattern Types:
- Exact match: /api/users matches only /api/users
- Prefix match: /api matches /api, /api/users, /api/v1/sessions, etc.
- Wildcard: /api/* matches any path starting with /api/ (e.g., /api/anything)
- Pattern: /api/*/users matches /api/v1/users, /api/v2/users, etc.
- Catch-all: /* matches any path
Routing Priority:
- Exact match (highest priority)
- Prefix match (medium priority)
- Wildcard match (lower priority)
- Within each category: longer/more specific patterns win
- If no match: falls back to first endpoint
Example: Request to /api/users will match exact /api/users over prefix /api over wildcard /api/* over catch-all /*.
Run:
./bin/opsen-client -config client.yml
# CLI flags override YAML
./bin/opsen-client -config client.yml -server http://lb.example.com:9000 -window 20
All endpoints except /health support API key auth (X-API-Key header). Rate limited per IP (60/min, burst 120 by default; can be disabled). Security headers included automatically.
Register backend. Required before stats reporting or routing.
Request: client_id, hostname, public_ip, local_ip, latitude, longitude, country, city, total_cpu, total_memory_gb, total_storage_gb, optional: total_gpus, gpu_models, endpoint_url
Response: {"status": "registered"}
Report metrics (every 60s default).
Request: client_id, hostname, timestamp, cpu_cores, cpu_usage_avg (per-core array), memory_*, disk_*, optional: gpus[] (deviceid, name, utilization_pct, memory*, temperature_c, power_draw_w)
Response: {"status": "ok"}
Get routing decision.
Request: tier, client_ip, optional: client_lat, client_lon
Headers: Optional sticky session header (e.g., X-Session-ID)
Response: client_id, endpoint, hostname, distance_km
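An illustrative response body (the values below are made up; the fields are the ones listed above):
{
  "client_id": "a1b2c3d4-0000-0000-0000-000000000000",
  "endpoint": "http://backend-1.internal:8000",
  "hostname": "backend-1",
  "distance_km": 42.3
}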
Server health (no auth required).
Response: status, timestamp, total_clients, active_clients
List backends with current metrics.
Response: Array of: client_id, hostname, endpoint, location, cpu_*, memory_*, disk_*, gpus[], last_seen, is_active
The server uses a weighted scoring algorithm with sticky session support to select the optimal backend:
The load balancer supports session affinity via two methods:
- Header-based stickiness (sticky_header): Uses a custom HTTP header as the sticky identifier
- IP-based stickiness (sticky_by_ip): Uses the client IP address as the sticky identifier
When enabled, the load balancer provides session affinity:
- First request: Standard routing algorithm selects the best server and creates an assignment (sticky_id, tier) → server
- Subsequent requests: The same sticky_id + tier always routes to the assigned server (if healthy)
- Affinity mode (sticky_affinity_enabled: true): Different tiers from the same sticky_id prefer the same server
- Automatic fallback: If the assigned server is unavailable or overloaded, a new server is selected
Configuration options:
- sticky_header: "X-Session-ID" + sticky_by_ip: false - Header-based only (authenticated users)
- sticky_header: "" + sticky_by_ip: true - IP-based only (anonymous users, no session tracking)
- Both enabled - Header takes precedence; IP is used as a fallback when the header is not present
Use cases:
- X-Session-ID: Per-session stickiness (different sessions can go to different servers)
- X-User-ID: All sessions from the same user prefer the same server (when affinity is enabled)
- X-Device-ID: All sessions from the same device prefer the same server
- IP-based: Anonymous users without session IDs (e.g., public APIs, CDN origins)
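For illustration, assuming sticky_header is set to "X-Session-ID", a routing request that opts into stickiness could look like this (the session ID and API key values are placeholders); repeating it with the same header value and tier should return the same backend while it stays healthy:
curl -X POST http://localhost:8080/route \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-secret-api-key-here" \
  -H "X-Session-ID: session-abc-123" \
  -d '{"tier": "medium", "client_ip": "203.0.113.45"}'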
The server uses a weighted scoring algorithm to select the optimal backend:
score = distance_km + (avg_cpu_usage_pct * 1.0) + (memory_usage_pct * 1.0) + (gpu_usage_pct * 1.5) + latency_ms
Where:
- avg_cpu_usage_pct = Average usage of the N least-loaded cores (representing what a new session would experience)
- memory_usage_pct = Total memory usage percentage (used / total * 100)
- gpu_usage_pct = Average GPU utilization across all GPUs (if the tier requires GPUs)
- latency_ms = Round-trip latency to the backend from health checks (EWMA smoothed, 0 if health checks are disabled)
- GPU gets a higher weight (1.5x) because GPU workloads are more sensitive to resource contention
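For example, with illustrative numbers (a backend 120 km away, 25% average usage on its least-loaded cores, 40% memory usage, 10% GPU utilization, 8 ms probe latency):
score = 120 + (25 * 1.0) + (40 * 1.0) + (10 * 1.5) + 8 = 208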
Lower scores are better. The algorithm:
1. Filters clients with insufficient resources:
   - CPU: At least N cores with <80% average usage (accounting for pending allocations)
   - Memory: At least N GB available (accounting for pending allocations)
   - Disk: At least N GB available (accounting for pending allocations)
   - GPU: At least N GPUs available with sufficient VRAM (if the tier requires GPUs)
2. Calculates distance from the end user to the backend (Haversine formula)
3. Computes the score combining distance and resource utilization:
   - CPU scoring: Uses the average of the N least-loaded cores (sorted by usage)
   - Memory scoring: Uses total memory usage percentage (not accounting for pending allocations)
   - Note: Pending allocations affect filtering (step 1) but not scoring (step 3)
4. Selects the client with the lowest score
5. Reserves resources immediately to prevent race conditions
Pending allocations prevent concurrent requests from all selecting the same overloaded server:
- When a server is selected, resources are immediately reserved in-memory
- Subsequent requests see reduced available capacity (actual + pending allocations)
- Reservations expire after pending_allocation_timeout_seconds (default: 120s)
- Duplicate allocations for the same sticky_id + tier are automatically deduplicated
CPU Availability Details:
- A CPU core is considered "available" if its average usage over the time window is <80%
- For scoring, the algorithm selects the N least-loaded cores and averages their usage
- This represents the actual CPU resources a new session would consume
Example: For a medium tier (2 vCPU, 4GB RAM, 20GB storage):
- Server has 8GB free RAM, 0 pending allocations → Available: 8GB
- Request A reserves 4GB → Available: 4GB (for concurrent requests)
- Request B reserves 4GB → Available: 0GB
- Request C finds different server (race condition prevented!)
Backend must have:
- ≥2 cores with <80% usage (minus pending CPU allocations)
- ≥4GB free memory (minus pending memory allocations)
- ≥20GB free disk space (minus pending disk allocations)
After running make install, manage services with systemd:
# Server
sudo systemctl start opsen-server
sudo systemctl enable opsen-server
sudo systemctl status opsen-server
journalctl -u opsen-server -f
# Client (on each backend)
sudo systemctl start opsen-client
sudo systemctl enable opsen-client
sudo systemctl status opsen-client
journalctl -u opsen-client -f
Edit service files at:
- /etc/systemd/system/opsen-server.service
- /etc/systemd/system/opsen-client.service
After changes: sudo systemctl daemon-reload
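For example, one way to point the server unit at a custom config file is a systemd drop-in (a sketch; the binary and config paths are illustrative, and -config is the same flag shown under Usage):
sudo systemctl edit opsen-server
# In the editor that opens, add:
#   [Service]
#   ExecStart=
#   ExecStart=/usr/local/bin/opsen-server -config /etc/opsen/server.yml
sudo systemctl daemon-reload
sudo systemctl restart opsen-server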
Configure which paths to proxy and point your frontend at the load balancer. Zero code changes needed.
Server:
proxy_endpoints: ["/api", "/v1"] # Or "/*" for all paths
sticky_header: "X-Session-ID" # Optional
proxy_sse_flush_interval_ms: -1 # SSE support: -1=immediate, 0=disabled, >0=interval
idle_timeout_seconds: 300 # Increase for long-lived WebSocket connections (default: 120)
Frontend:
// Change base URL only - all existing API calls work unchanged
const API_BASE = "https://lb.example.com:8080"; // Was: https://backend1.example.com
fetch(`${API_BASE}/api/users`, {
  headers: {
    "X-Session-ID": sessionId, // Optional: sticky sessions
  },
  body: JSON.stringify({ tier: "medium", ...data }), // Tier auto-detected
});
// SSE/streaming and WebSocket work automatically
const eventSource = new EventSource(`${API_BASE}/events/stream`);
const ws = new WebSocket(`wss://lb.example.com:8080/ws`);
Tier Detection (priority order):
1. JSON body field (tier_field_name, default: "tier")
2. Query parameter (?tier=medium)
3. HTTP header (tier_header, default: "X-Tier")
4. Default: "lite"
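For example, assuming /api is listed in proxy_endpoints, either of these proxied requests selects the medium tier without touching the JSON body:
# Tier via query parameter
curl "https://lb.example.com:8080/api/users?tier=medium"
# Tier via header
curl -H "X-Tier: medium" https://lb.example.com:8080/api/users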
Customize field names in server.yml:
tier_field_name: "subscription_level"
tier_header: "X-Subscription-Level"
Benefits: Path preservation, SSE support, sticky sessions, no routing logic needed
Call the /route endpoint from your app, then forward the request to the returned endpoint.
async function handleRequest(req, res) {
  const { endpoint } = await fetch("http://lb.example.com:8080/route", {
    method: "POST",
    body: JSON.stringify({ tier, client_ip, client_lat, client_lon })
  }).then(r => r.json());
  const result = await fetch(`${endpoint}/api/resource`, { ... });
  res.json(await result.json());
}
Use when: Custom routing logic, own proxy layer, or request modification needed
SQLite database stores:
clients table:
- client_id (TEXT, PRIMARY KEY)
- hostname (TEXT)
- public_ip (TEXT)
- latitude, longitude (REAL)
- country, city (TEXT)
- total_cpu, total_memory, total_storage (INTEGER/REAL)
- total_gpus (INTEGER) - Total number of GPUs (0 if none)
- gpu_models (TEXT) - JSON array of GPU model names
- endpoint (TEXT) - HTTP endpoint for this backend
- created_at, last_seen (TIMESTAMP)
stats table:
- id (INTEGER, PRIMARY KEY)
- client_id (TEXT, FOREIGN KEY)
- timestamp (TIMESTAMP)
- cpu_cores (INTEGER)
- cpu_usage_json (TEXT) - JSON array of per-core usage
- memory_total, memory_used, memory_avail (REAL)
- disk_total, disk_used, disk_avail (REAL)
- gpu_stats_json (TEXT) - JSON array of GPU metrics
sticky_assignments table:
- sticky_id (TEXT, NOT NULL) - Value from sticky header
- tier (TEXT, NOT NULL)
- client_id (TEXT, FOREIGN KEY)
- created_at, last_used (TIMESTAMP)
- PRIMARY KEY: (sticky_id, tier)
Indexes:
- idx_stats_client_time on stats(client_id, timestamp DESC)
- idx_clients_last_seen on clients(last_seen)
- idx_sticky_last_used on sticky_assignments(last_used)
- idx_sticky_client on sticky_assignments(client_id)
- idx_sticky_id on sticky_assignments(sticky_id)
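To inspect the schema or data directly (assuming the default database path from the server config and the sqlite3 CLI):
sqlite3 /opt/opsen/opsen.db '.schema clients'
sqlite3 /opt/opsen/opsen.db 'SELECT client_id, hostname, last_seen FROM clients;'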
curl http://localhost:8080/health
curl http://localhost:8080/clients | jq
curl -X POST http://localhost:8080/route \
-H "Content-Type: application/json" \
-H "X-API-Key: your-secret-api-key-here" \
-d '{
"tier": "medium",
"client_ip": "203.0.113.45",
"client_lat": 40.7128,
"client_lon": -74.0060
}' | jq
# Server logs
journalctl -u opsen-server -f
# Client logs
journalctl -u opsen-client -f
Benchmarked on AMD Ryzen 9 5900X (24 cores), verified across 5 runs:
| Metric | Value | Notes |
|---|---|---|
| Server Latency | 140 ns → 14.8 µs | Scales O(n): 0.14 µs (1 client), 1.5 µs (10 clients), 14.8 µs (100 clients) |
| Concurrent Routing | 3.7 µs | 1000 concurrent requests, 100% success rate (5/5 runs identical) |
| Scalability | 150 clients tested | 1000 requests with 100% success, no race conditions |
| Client Overhead | 0.33% | Combined CPU+RAM+disk+2×GPU at 1 sample/sec (5/5 runs identical) |
| Memory (Server) | ~8 MB | Baseline heap allocation, in-memory routing cache |
| Memory (Client) | ~3-7 MB | Varies by CPU core count and GPU monitoring |
| Database | SQLite + WAL | Off critical path - persistence only, not routing |
All routing decisions are in-memory with no database I/O on the critical path.
Reproduce benchmarks:
# Routing latency benchmarks (no test output)
go test ./server -bench=BenchmarkRoutingLatency -benchmem -run='^$'
# Scalability tests (includes test output)
go test ./server -run TestScalability -v
# Run all tests with race detector
go test -race ./...
Active health checks verify backends are reachable and measure latency. Enabled by default.
Configuration:
health_check_enabled: true # Enable active probes (default: true)
health_check_type: "tcp" # "tcp" or "http" (default: tcp)
health_check_interval_seconds: 10 # Probe interval (default: 10)
health_check_timeout_seconds: 2 # Probe timeout (default: 2)
health_check_path: "/health" # HTTP path (default: /health)
health_check_unhealthy_threshold: 3 # Failures before unhealthy (default: 3)
health_check_healthy_threshold: 2 # Successes before healthy (default: 2)
Behavior:
- TCP probes - Verify backend port is accepting connections (fast, lightweight)
- HTTP probes - GET request to endpoint + health_check_path, expects 2xx/3xx status
- Latency - Measured on each probe, uses EWMA (exponential weighted moving average) for smoothing
- Routing impact - Unhealthy backends excluded, latency added to routing score (lower = better)
- Sticky sessions - Automatically removed for unhealthy backends, reassigned on next request
- Status transitions - unknown → healthy (after 2 successes) → unhealthy (after 3 failures) → healthy (recoverable)
View health status:
curl http://localhost:8080/clients | jq '.[] | {hostname, health_status, latency_ms}'
Example output:
{
"hostname": "backend-1",
"health_status": "healthy",
"latency_ms": "12.5"
}
When a backend goes down:
- Health checks fail (3 consecutive failures)
- Status changes to
unhealthy - Sticky assignments removed automatically
- Backend excluded from routing
- Requests fail over to healthy backends
Recovery:
- Backend comes back online
- Health checks succeed (2 consecutive successes)
- Status changes to
healthy - Backend rejoins routing pool
API Key Authentication - api_keys[], server_key in server.yml. Clients send X-API-Key header. Use 32+ char random keys, rotate periodically.
IP Whitelisting - whitelisted_ips[] (CIDR ranges). Empty = allow all.
Rate Limiting - Token bucket per IP with continuous token refill. rate_limit_per_minute: 60, rate_limit_burst: 120. Returns 429 on excess. Set rate_limit_per_minute: 0 to disable (useful for trusted networks, internal APIs, or when rate limiting is handled by upstream WAF/CDN).
Request Size Limits - max_request_body_bytes: 10485760 (10MB). Returns 413 on excess.
Timeout Enforcement - request_timeout_seconds: 30 (WebSocket/SSE exempt), idle_timeout_seconds: 120 (keep-alive), read_header_timeout_seconds: 10 (Slowloris protection).
TLS/HTTPS - tls_cert_file, tls_key_file. tls_insecure_skip_verify: false (backend verification, dev only).
CORS - enable_cors: true, cors_allowed_origins[].
Security Headers - Auto-added: X-Content-Type-Options, X-Frame-Options, X-XSS-Protection, Strict-Transport-Security (HTTPS). Can be disabled via disable_security_headers: true (e.g., when using a WAF/reverse proxy that manages headers).
Input Validation - Content-Type, path traversal, host injection, IP formats, tier names.
Circuit Breaker (Client) - CLOSED → OPEN (5 failures) → HALF-OPEN (30s) → CLOSED. Prevents cascading failures.
Retry Logic - Exponential backoff: 5 attempts, 1s → 2s → 4s → 8s → 16s (max 30s).
Panic Recovery - Server: 500 error + stack trace. Client: logs + continues.
Graceful Shutdown - shutdown_timeout_seconds: 30. Waits for in-flight requests, cancels goroutines, closes DB.
Database Pooling - db_max_open_conns: 25, db_max_idle_conns: 5, db_conn_max_lifetime: 300.
Structured Logging - log_level: info, json_logging: true. JSON or plain text with timestamp, level, file, line, data.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Copyright 2026 Opsen Contributors