Flox

Real-time fault intelligence for HVAC actuators. Flox ingests live telemetry from Belimo actuators — torque, motor position, temperature, signal quality — runs continuous fault classification, and surfaces actionable insights through a facility dashboard and a conversational AI operations agent.

What it does

Belimo actuators collect rich internal signals during operation — torque demand, motor position feedback, internal temperature, and control signal quality — but this data is rarely used beyond basic device status.

Flox closes that gap:

Telemetry ingest — actuator signals are ingested in real time and persisted with full history per variable per device.
Fault classification — a Celery worker runs a continuous diagnosis cycle. Heuristic rules detect known failure modes (stiction, high-torque anomaly, temperature drift, signal loss). An optional ML inference server extends this with trained classifiers.
Fault propagation — device-level faults roll up through the node hierarchy (actuator → AHU → plant), so system-level health reflects the worst downstream condition.
Facility dashboard — a live map view shows zone health, device positions, and active faults across the building. An issues panel lists all open faults ranked by severity with diagnosis context and recommended actions.
Operations agent — a Claude-powered agent answers natural-language questions about faults, runs diagnosis on demand, retrieves fault history, and can execute corrective actions with explicit operator approval before any write is committed.

Fault types detected

Kind	Severity
`stiction_suspected`	Critical
`high_torque_anomaly`	Warning
`temperature_drift`	Warning
`signal_loss`	Critical
`weak_signal`	Warning

ML-based classifiers (when enabled) extend coverage beyond rule thresholds.

Quick start

cp .env.example .env      # set NAME, ANTHROPIC_API_KEY, and database credentials
make init                 # create venv, sync Python deps, link env files
make up                   # start postgres, redis, fastapi backend, classifier worker
make dev                  # start Vite frontend at http://localhost:3000

The frontend connects to the FastAPI backend at /api/status. If the backend is not running the dashboard will show a connection error.

make doctor               # verify toolchain
make help                 # list all targets

Data pipeline

flowchart TD
    A([Actuator]) -->|telemetry stream| B[POST /api/ingest\nFastAPI]
    B --> C[(PostgreSQL\ntelemetry history\n+ latest values)]
    C --> D[Celery beat worker\nrun_diagnosis_cycle]
    D -->|heuristic + ML classifier| E{Fault?}
    E -->|yes| F[Attach / update fault\nset node status]
    E -->|no| G[Clear fault\nmark healthy]
    F --> H[Propagate status\nup node hierarchy]
    G --> H
    H --> C
    C --> I[GET /api/status\nFastAPI]
    I --> J[React dashboard\nmap · issues · telemetry charts]

Repository layout

mindmap
  root((Flox))
    apps
      webapp
        Facility map
        Issues dashboard
        Device telemetry charts
        AI agent panel
      backend
        fastapi
          Telemetry ingest
          Status endpoint
          Fault resolution
          Agent chat
          Document upload
      worker
        Celery beat
        Classification loop
    ml
      models
        Architecture
        Training loop
      data
        ETL pipeline
        Processed artifacts
      inference.py
        ML inference server
      configs
        Hyperparameter YAML
    shacklib
      diagnosis_engine.py
        Fault classification
        State management
        Payload builders
      agent.py
        Claude integration
      backend_state.py
        Postgres read/write
      node_simulator.py
        Actuator signal simulator
      mock_facility.py
        Seed data
      logger.py
        Structured JSON logging
    database
      SQL init files
    docker
      Per-service Dockerfiles
    scripts
      Seed and migration helpers

Operations agent

The agent is powered by Claude and has access to platform tools: querying live device status, fetching fault history for a specific node, running the diagnosis cycle, and resolving faults.

Destructive actions require explicit operator approval before execution. The frontend surfaces an approval prompt; the agent does not proceed until the operator confirms.

# The agent is exposed at POST /api/agent/chat
# The frontend sends the full conversation history on each turn.
# Tool events are returned alongside the reply so the UI can display what ran.

To interact via the UI, open the Operations Agent panel and type a question. Use @NODE_ID to attach a specific device to your message. Quick prompts are generated automatically from the current top fault.

Example prompts:

Give me a live system overview and top active faults.
Why is node BEL-VLV-003 reporting stiction_suspected?
Show fault history for node BEL-AHU-001.
Run diagnosis for BEL-VLV-003 now.
Resolve fault fault-a3b2c1d0 with note "validated on site".

Environment variables

Copy .env.example to .env and fill in the values relevant to your deployment.

Variable	Description
`NAME`	Project name, used as Docker container prefix
`ANTHROPIC_API_KEY`	Required for the operations agent
`BACKEND_PORT`	FastAPI listen port (default: 5000)
`POSTGRES_*`	Database connection settings
`REDIS_PORT`	Redis port
`ML_URL`	URL of the ML inference service
`CLASSIFIER_INTERVAL_SECONDS`	How often the classifier runs (default: 5)
`BACKEND_STARTUP_SEED_MODE`	Seed mode on startup: `always` or `once`
`VITE_REQUIRE_AUTH`	Enable Supabase session auth on the frontend
`LOKI_PORT` / `GRAFANA_PORT`	Enable remote log aggregation

Make targets

Target	Description
`make init`	First-time setup: venv, deps, env linking
`make dev`	Start Vite frontend
`make up`	Start core services (postgres, redis, backend, classifier, worker)
`make down`	Stop all services
`make run.backend`	Start FastAPI backend only
`make run.worker`	Start Celery worker only
`make run.ml`	Start ML inference server
`make lift.ml`	Core services + ML inference
`make lift.sim`	Core services + node simulator
`make lift.logging`	Add Loki + Grafana log stack
`make lift.mlflow`	Add MLflow experiment tracking
`make etl`	Run ETL pipeline
`make train`	Run model training
`make fmt`	Format Python with black
`make lint`	Lint with ruff
`make type`	Type-check with mypy
`make test`	Run pytest
`make clean`	Remove caches and build artifacts
`make doctor`	Verify toolchain (Python, uv, Bun, Docker)

Optional services

Profile	Services	Command
(default)	postgres, redis, backend, classifier, worker	`make up`
`ml`	+ ML inference	`make lift.ml`
`sim`	+ node simulator	`make lift.sim`
`minio`	+ MinIO object storage	`make lift.minio`
`tensorboard`	+ TensorBoard	`make lift.tensorboard`
`mlflow`	+ MLflow	`make lift.mlflow`
`logging`	+ Loki + Grafana	`make lift.logging`
`database`	+ MongoDB	`make lift.database`

Database schema

Application state is stored in normalized Postgres tables. A legacy JSONB snapshot in backend_state (id = 1) is maintained for backward compatibility. read_state() reconstructs the full JSON contract; update_state() writes both representations atomically.

erDiagram
    backend_nodes {
        TEXT id PK
        TEXT label
        TEXT type
        TEXT status
        DOUBLE position
        TEXT latest_fault_id
        TEXT updated_at
    }

    backend_node_latest_telemetry {
        TEXT node_id FK
        TEXT metric
        JSONB value
    }

    backend_node_history {
        TEXT node_id FK
        TEXT metric
        INTEGER ordinal
        TEXT point_time
        JSONB value
    }

    backend_faults {
        TEXT id PK
        TEXT node_id
        TEXT state
        TEXT kind
        DOUBLE probability
        TEXT summary
        TEXT recommended_action
        TEXT opened_at
        TEXT resolved_by
        TEXT note
    }

    backend_catalog_device_templates {
        TEXT id PK
        TEXT name
        TEXT model
        TEXT type
        TEXT zone_id
    }

    backend_agent_audit_log {
        BIGINT ordinal PK
        JSONB payload
    }

    backend_nodes ||--o{ backend_node_latest_telemetry : "latest telemetry"
    backend_nodes ||--o{ backend_node_history : "history"
    backend_nodes ||--o{ backend_faults : "faults"
    backend_catalog_device_templates ||--o| backend_catalog_fault_meta : "impact metadata"
    backend_agent_meta ||--o{ backend_agent_audit_log : "audit log"

Name		Name	Last commit message	Last commit date
Latest commit History 167 Commits
.claude/commands		.claude/commands
.cursor		.cursor
.github/workflows		.github/workflows
apps		apps
docker		docker
docs		docs
ml		ml
scripts		scripts
shacklib		shacklib
src		src
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
FAQ.md		FAQ.md
Makefile		Makefile
README.md		README.md
bun.lock		bun.lock
docker-compose.yml		docker-compose.yml
nx.json		nx.json
package.json		package.json
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Flox

What it does

Fault types detected

Quick start

Data pipeline

Repository layout

Operations agent

Environment variables

Make targets

Optional services

Database schema

Bayesian network rendering (Python)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Flox

What it does

Fault types detected

Quick start

Data pipeline

Repository layout

Operations agent

Environment variables

Make targets

Optional services

Database schema

Bayesian network rendering (Python)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages