Architecture

joshuaaferguson edited this page Apr 25, 2026 · 3 revisions

This page describes the runtime architecture of a StreamSpace deployment. For the full technical reference (CRDs, sequence diagrams, security boundaries) see docs/ARCHITECTURE.md in the main repo.

Control Plane + Agent

┌──────────────────────────────────────────────────────────────┐
│  Control Plane (single deployment, multi-pod)                │
│  ┌────────┐  ┌──────────────┐  ┌──────────────────────────┐  │
│  │ Web UI │  │   API (Gin)  │  │ Agent WebSocket Hub      │  │
│  └────────┘  │  REST + WS   │  │ Heartbeats, commands,    │  │
│              │              │  │ status updates           │  │
│              │  Selkies     │  └──────────────────────────┘  │
│              │  HTTP/WebRTC │                                │
│              │  Proxy       │  ┌──────────────────────────┐  │
│              └──────────────┘  │ PostgreSQL  (sessions,   │  │
│                     │          │ users, templates)        │  │
│                     │          └──────────────────────────┘  │
└─────────────────────┼────────────────────────────────────────┘
                      │  WebSocket (wss://)
            ┌─────────┴─────────┐
            ▼                   ▼
    ┌──────────────┐    ┌──────────────┐
    │  K8s Agent   │    │ Docker Agent │
    └───────┬──────┘    └───────┬──────┘
            │                   │
            ▼                   ▼
    ┌──────────────┐    ┌──────────────┐
    │   Session    │    │   Session    │
    │     Pod      │    │  Container   │
    │   Selkies    │    │   Selkies    │
    │    :8080     │    │    :8080     │
    └──────────────┘    └──────────────┘

Components

Control Plane API

  • Auth — JWT issuance/refresh, MFA, SSO bridges (SAML/OIDC/OAuth2)
  • Session orchestration — CRUD on Session resources, dispatching commands to the right agent based on the session's chosen platform
  • Agent WebSocket Hub — bidirectional channel: control plane → agent (commands), agent → control plane (heartbeats, status, lifecycle events)
  • Selkies HTTP/WebRTC proxy — /api/v1/http/<session-id>/ validates the bearer token, looks up the session's in-cluster Service, and reverse-proxies to the Selkies endpoint on port 8080
  • Multi-tenancy — every request is org-scoped via JWT claims; cross-tenant access is rejected at the handler
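
To make the org-scoping and platform-dispatch bullets concrete, here is a minimal sketch in Python (not the project's Gin code). The `org_id` claim name, the `platform` values, and the `pick_agent` helper are all assumptions made for illustration; the real handler may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    id: str
    org_id: str
    platform: str  # assumed values: "kubernetes" or "docker"


class CrossTenantError(PermissionError):
    """Raised when the caller's org does not own the session."""


def pick_agent(session: Session, claims: dict, agents: dict[str, list[str]]) -> str:
    """Return the ID of an agent that should receive commands for this session."""
    # Org scoping: the (assumed) org_id claim must match the session's org.
    if claims.get("org_id") != session.org_id:
        raise CrossTenantError(f"session {session.id} belongs to another org")
    # Dispatch: only agents registered for the session's platform are candidates.
    candidates = agents.get(session.platform, [])
    if not candidates:
        raise LookupError(f"no connected agent for platform {session.platform!r}")
    # Simplest possible policy for the sketch: take the first connected agent.
    return candidates[0]


agents = {"kubernetes": ["k8s-agent-1"], "docker": ["docker-agent-1"]}
s = Session(id="sess-1", org_id="acme", platform="docker")
print(pick_agent(s, {"org_id": "acme"}, agents))  # docker-agent-1
```

Rejecting the request before any agent lookup keeps a cross-tenant caller from learning which platforms or agents exist.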

K8s Agent

  • Watches the Session CRD for new sessions and materializes each one into a Deployment, Service, and PVC
  • Manages the session lifecycle: create, hibernate, wake, terminate
  • Reports back via the WebSocket Hub: streamingReady once the Selkies endpoint is up
  • Leader election for HA (multi-replica deployment)
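
The lifecycle verbs above (create, hibernate, wake, terminate) can be read as a small state machine. The sketch below is one plausible reading; the state names and the exact set of allowed transitions are assumptions, not taken from the agent's source.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names chosen for this sketch.
    PENDING = "pending"
    RUNNING = "running"
    HIBERNATED = "hibernated"
    TERMINATED = "terminated"


# Allowed (current state, command) -> next state.
TRANSITIONS = {
    (State.PENDING, "create"): State.RUNNING,
    (State.RUNNING, "hibernate"): State.HIBERNATED,
    (State.HIBERNATED, "wake"): State.RUNNING,
    (State.PENDING, "terminate"): State.TERMINATED,
    (State.RUNNING, "terminate"): State.TERMINATED,
    (State.HIBERNATED, "terminate"): State.TERMINATED,
}


def apply(state: State, command: str) -> State:
    """Return the next state, or raise ValueError for an illegal transition."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"cannot {command} a session in state {state.value}") from None


state = State.PENDING
for cmd in ["create", "hibernate", "wake", "terminate"]:
    state = apply(state, cmd)
    print(cmd, "->", state.value)
```

Keeping the transitions in one table makes it easy to reject commands that arrive out of order, for example a wake for a session that was already terminated.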

Docker Agent

  • Equivalent of the K8s Agent for Docker hosts
  • HA via swappable backend: file, Redis, or Docker Swarm

Web UI

  • React + TypeScript + Material-UI
  • Embeds the Selkies stream via <iframe src="/api/v1/http/<session-id>/?token=…">
  • Admin pages: users, agents, plugins, templates, audit log, monitoring

Database

  • PostgreSQL — single source of truth for sessions, users, templates, tokens, audit events
  • Redis — optional, used by the Agent Hub when the control plane runs multi-pod (for cross-pod agent routing)
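
Cross-pod agent routing can be pictured as a shared lookup from agent ID to the pod that holds its WebSocket. The sketch below uses an in-memory dictionary as a stand-in for Redis and a direct method call as a stand-in for pub/sub delivery; `SharedRegistry`, `HubPod`, and their methods are hypothetical names, and the real hub may be organized differently.

```python
class SharedRegistry:
    """Stand-in for Redis: maps agent ID -> ID of the pod holding its WebSocket."""

    def __init__(self) -> None:
        self._owner: dict[str, str] = {}

    def register(self, agent_id: str, pod_id: str) -> None:
        self._owner[agent_id] = pod_id

    def owner(self, agent_id: str):
        return self._owner.get(agent_id)


class HubPod:
    """One control-plane pod running the Agent WebSocket Hub."""

    def __init__(self, pod_id: str, registry: SharedRegistry, peers: dict) -> None:
        self.pod_id = pod_id
        self.registry = registry
        self.peers = peers    # pod_id -> HubPod; stands in for pub/sub delivery
        self.delivered = []   # (agent_id, command) pairs written to local sockets
        peers[pod_id] = self

    def agent_connected(self, agent_id: str) -> None:
        # Record that this pod now owns the agent's connection.
        self.registry.register(agent_id, self.pod_id)

    def send(self, agent_id: str, command: str) -> str:
        """Deliver a command and return the ID of the pod that wrote it to the socket."""
        owner = self.registry.owner(agent_id)
        if owner is None:
            raise LookupError(f"agent {agent_id} is not connected")
        if owner == self.pod_id:
            self.delivered.append((agent_id, command))
        else:
            # Forward to the pod that actually holds the connection.
            self.peers[owner].send(agent_id, command)
        return owner


registry, peers = SharedRegistry(), {}
pod_a = HubPod("pod-a", registry, peers)
pod_b = HubPod("pod-b", registry, peers)
pod_b.agent_connected("k8s-agent-1")
print(pod_a.send("k8s-agent-1", "hibernate"))  # pod-b
```

With a single control-plane pod every agent is local, which is consistent with Redis being optional in that setup.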

Streaming data flow

1. User opens session in the UI
2. UI requests /api/v1/http/<session>/?token=<jwt>
3. API validates the token, looks up the session's in-cluster Service
4. API reverse-proxies the HTTP request to the session pod (Selkies UI assets)
5. Browser establishes a WebRTC peer connection (signaling through the proxy)
6. Once connected, video/audio/input flow browser ↔ pod directly via WebRTC

The control plane never sees raw video frames. Its job is auth, signaling, and asset delivery.
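
Steps 2 to 4 can be sketched as a single function that validates the token, finds the session's Service, and builds the upstream URL. This is an illustration only: the page says the token is a JWT but not which algorithm is used, so HS256 is an assumption, as are the `org_id` claim, the `services` lookup table, and the Service hostname shown.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 JWT (used here only to create a demo token)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now: float) -> dict:
    """Check signature and expiry; return the claims or raise PermissionError."""
    try:
        header, payload, sig = token.split(".")
        expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig)):
            raise PermissionError("bad signature")
        if json.loads(_b64url_decode(header)).get("alg") != "HS256":
            raise PermissionError("unexpected algorithm")
        claims = json.loads(_b64url_decode(payload))
    except ValueError as exc:  # wrong part count, bad base64, bad JSON
        raise PermissionError("malformed token") from exc
    if claims.get("exp", 0) <= now:
        raise PermissionError("token expired")
    return claims


def resolve_upstream(path, token, secret, services, now=None):
    """Map /api/v1/http/<session-id>/<rest> to the session's Selkies URL."""
    prefix = "/api/v1/http/"
    if not path.startswith(prefix):
        raise LookupError("not a proxy path")
    session_id, _, rest = path[len(prefix):].partition("/")
    claims = verify_hs256(token, secret, time.time() if now is None else now)
    # Hypothetical lookup keyed by (org, session), so another org's session is never found.
    service = services.get((claims.get("org_id"), session_id))
    if service is None:
        raise LookupError("unknown session for this org")
    return f"http://{service}:8080/{rest}"


secret = b"demo-secret"
services = {("acme", "sess-1"): "sess-1.streamspace.svc"}  # assumed Service name
token = sign_hs256({"org_id": "acme", "exp": 2_000}, secret)
print(resolve_upstream("/api/v1/http/sess-1/index.html", token, secret, services, now=1_000))
```

The upstream URL is only ever used inside the control plane; the browser keeps talking to the `/api/v1/http/<session-id>/` path.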

High availability

  • Control Plane — multi-pod deployment behind a load balancer with sticky sessions (required for the WebSocket hub and the Selkies signaling channel)
  • K8s Agent — multi-replica with leader election; only the elected leader acts on commands while the other replicas stand by
  • Docker Agent — file/Redis/Swarm backend selectable per environment
  • Database — standard PostgreSQL HA (your provider of choice)
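
The leader-election behaviour described for the agents can be illustrated with a lease that one replica holds and must keep renewing. The sketch below is a simplified, single-process model: `LeaseStore` stands in for whatever shared record the replicas coordinate through, the 15-second TTL is arbitrary, and a real implementation would need an atomic compare-and-swap on that shared record.

```python
class LeaseStore:
    """In-memory stand-in for a shared lease record visible to all replicas."""

    def __init__(self) -> None:
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, candidate: str, now: float, ttl: float) -> bool:
        """Grant or renew the lease if it is free, expired, or already ours."""
        if self.holder in (None, candidate) or now >= self.expires_at:
            self.holder = candidate
            self.expires_at = now + ttl
            return True
        return False


class Replica:
    """One agent replica; only the current lease holder acts on commands."""

    def __init__(self, name: str, store: LeaseStore, ttl: float = 15.0) -> None:
        self.name = name
        self.store = store
        self.ttl = ttl

    def handle(self, command: str, now: float):
        # Try to take or renew the lease; standbys ignore the command.
        if not self.store.try_acquire(self.name, now, self.ttl):
            return None
        return f"{self.name} handled {command}"


store = LeaseStore()
a, b = Replica("agent-a", store), Replica("agent-b", store)
print(a.handle("create", now=0))   # agent-a handled create
print(b.handle("create", now=1))   # None (agent-a holds the lease)
print(b.handle("create", now=20))  # agent-b handled create (lease expired at 15)
```

Failover time in this model is bounded by the TTL: a standby can take over only once the previous leader has stopped renewing and its lease has expired.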

Security boundaries

  • Browser ↔ API — TLS-enforced ingress, JWT-authenticated, rate-limited
  • API ↔ Agent — outbound WebSocket from agent (firewall-friendly), API key or mTLS authentication
  • API ↔ Session pod — in-cluster only; the Selkies proxy never exposes a session's address externally
  • Network policies — sessions cannot reach each other or the control plane unless explicitly allowed
