Build a Metabase Dashboard on Postgres — from scratch

A hands-on, fully-dockerized tutorial. A small sensor simulator streams synthetic data into a Postgres warehouse, and you build an operational KPI dashboard in Metabase — step by step, with screenshots for every click.

Why synthetic data? Because the point is to mimic a real-time data-producing pipeline without needing real plant access, credentials, or a big download. Simulated "machines" continuously write production telemetry into Postgres — exactly the shape of data a real PLC/sensor system produces — so the whole stack comes alive with one command and you can practise the analyst's real job: turning a live data stream into KPIs, dashboards, and decisions.

The example domain is manufacturing (machines, sensors, OEE), chosen to mirror a Data Analyst brief: exploratory analysis of operational data, dashboards, KPI/metric definition, descriptive + diagnostic views, and functional data-quality validation.

Reading the dashboard (an example insight)

Note: the data here is synthetic, so the figures are illustrative — not a real finding about real machines. The point of this section is to show how an analyst reads a dashboard and turns it into a decision. The actual conclusions are always specific to the real data and business you're working with; the same chart in a different context would tell a different story.

Reading the dashboard top-down tells a clear operational story. All three machines show healthy Quality (~0.93) and Performance (~0.92) — they're not producing bad parts or running slowly. The low OEE (0.38–0.45, well below the ~0.85 world-class benchmark) is driven almost entirely by Availability: the machines are stopped too often. The failure breakdown locates the cause — stoppages are dominated by heat-dissipation (HDF) and power (PWF) failures, and PRESS-01 carries a PWF problem the others don't. The takeaway: focus maintenance on availability, not quality or speed — specifically, investigate PRESS-01's power supply and the heat-dissipation conditions affecting the line.

That single sentence — what's fine, what's wrong, the specific cause, the action — is the actual deliverable; the dashboard is the evidence for it. (More on this workflow in docs/02-data-analyst-playbook.md.)

Companion reading — a short course, in order

docs/01-what-is-a-data-analyst.md — what the role is, what a KPI is, and the reports different industries (fintech, IoT, e-commerce) ask for.
docs/02-data-analyst-playbook.md — the craft: how a business question becomes a metric, becomes SQL, becomes a chart, becomes a decision.
docs/03-data-platforms-explained.md — why this repo uses one Postgres, and how operational databases, data warehouses, and data lakes differ in the real world.
docs/04-setup-runbook.md — the full step-by-step build, with a screenshot at every step and a "configure-as-code" section.

Architecture

machine-cnc01   ┐
machine-cnc02   ├──►  Postgres ("plant" DB)  ◄──  Metabase  ──►  dashboard
machine-press01 ┘        (warehouse)              (BI layer)     localhost:3000
   (sensor sim)                                                  (decisions)

All three machines run the same image; only their command-line arguments differ. That is the Dockerfile ENTRYPOINT (the script) + CMD (per-machine args, overridden in compose) pattern — one build, many machines.

In a real system the sensors would write to a live operational/time-series database, and a pipeline would move that data into a separate warehouse for analytics. Here a single Postgres plays the warehouse role for simplicity — see docs/03-data-platforms-explained.md.

Stack

Postgres 16 — the warehouse the sensors write to and Metabase reads from.
Metabase — open-source BI; SQL questions assembled into a dashboard.
Python sensor simulator — AI4I-style telemetry (temperatures, speed, torque, tool wear) plus five failure modes and OEE-ready production fields.
Docker Compose — brings the whole thing up with one command.

Quickstart

docker compose up --build

Each machine backfills ~2000 historical cycles instantly, then streams live, so dashboards have data within seconds. Check the warehouse directly:

docker compose exec postgres psql -U sensors -d plant \
  -c "SELECT machine_id, count(*) FROM sensor_readings GROUP BY 1;"

Run it a few times — the counts climb (here 2.6k → 3.5k → 93k across runs), which proves the sensor stream is genuinely live, not a one-off load:

Then connect Metabase and build the dashboard by following docs/04-setup-runbook.md. In short: open http://localhost:3000, add the Postgres database (host postgres, db plant, user sensors, password sensors_pw), create one SQL question per file in sql/kpi/, and arrange them on a dashboard.

"Live" dashboards: the sensor stream is genuinely continuous, but a Metabase chart runs its SQL once on load. Turn on the dashboard's auto-refresh (clock icon → 1 minute) for a continuously-updating view. It's near-real-time (1-minute floor), which is what "real-time" means for an operations dashboard in practice.

KPIs

One file per question in sql/kpi/:

Query	Question it answers	Chart
OEE by Machine	How effective is each machine overall?	bar
Failure Breakdown by Type	Which failure modes dominate, per machine?	bar
Downtime % by Hour	When and how much are machines stopped?	line
Tool Wear Trend	How does tool wear evolve between changes?	line
Data Quality Checks	Is the data internally consistent? (all 0 = healthy)	table

Repo structure

.
├── README.md
├── docker-compose.yml          # Postgres + Metabase + 3 machines
├── sensor/
│   ├── sensor.py               # the simulator (ENTRYPOINT target)
│   ├── Dockerfile              # single-stage (prebuilt psycopg2 wheel)
│   ├── Dockerfile.multistage   # learning variant: builder/runtime split
│   └── requirements.txt
├── sql/kpi/                    # one SQL file per Metabase question
└── docs/
    ├── 01-what-is-a-data-analyst.md     # the role, KPIs, industry reports
    ├── 02-data-analyst-playbook.md      # the analyst craft, explained
    ├── 03-data-platforms-explained.md   # DB vs warehouse vs lake
    ├── 04-setup-runbook.md              # illustrated step-by-step build
    └── *.png                            # screenshots

Notes

Data is synthetic by design (see the "why" above). The simulator is modelled on the UCI AI4I 2020 Predictive Maintenance dataset, so the columns and failure modes mirror a real machine log. Swap in the real CSV to validate distributions.
Tweak a machine by editing its command: args in compose, e.g. --interval 0.2 (rows/sec), --down-prob 0.08, --backfill 20000.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Build a Metabase Dashboard on Postgres — from scratch

Reading the dashboard (an example insight)

Companion reading — a short course, in order

Architecture

Stack

Quickstart

KPIs

Repo structure

Notes

About

Uh oh!

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
docs		docs
sensor		sensor
sql/kpi		sql/kpi
.gitignore		.gitignore
README.md		README.md
docker-compose.yml		docker-compose.yml

Folders and files

Latest commit

History

Repository files navigation

Build a Metabase Dashboard on Postgres — from scratch

Reading the dashboard (an example insight)

Companion reading — a short course, in order

Architecture

Stack

Quickstart

KPIs

Repo structure

Notes

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages