[Epic] Persist valuable telemetry as time-series — stop discarding block / share-health / disk-growth / XvB / per-worker data

> **Data-capture audit + tracking epic.** A review of the dashboard's persistence layer found we historize only a fraction of the telemetry we fetch — the rest is rendered live and thrown away. This epic tracks persisting the valuable, low-cost series we currently discard, so operators get *trends* (not just last-value) and history survives restarts/disconnects.

## What's persisted today

Only three things become history in SQLite (`build/dashboard/mining_dashboard/service/storage_service.py` → `_create_tables`):

| Table | Stores | Cadence |
|---|---|---|
| `history` | total / P2Pool / XvB **hashrate** | 1 row per 30 s poll |
| `shares` | P2Pool share events (`ts`, `difficulty`) | event-driven (#129) |
| `kv_store` | a few XvB scalars **+ the entire `latest_data` dict as one `snapshot_latest_data` blob, overwritten every 30 s** | per cycle (overwrite) |

Everything else the data loop fetches survives only as last-value inside that overwritten snapshot → **no trend, gaps lost**.

## Design principles (shared across all sub-items)

- **Dedicated small tables**, not extra `history` columns — independent retention, and avoids multiplying `history` rows for per-worker/per-series data.
- **Retention + pruning** mirroring `HISTORY_RETENTION_SEC` + the existing probabilistic prune (`storage_service.py:240`).
- **Cumulative counters** (shares, blocks, proxy results) → store **deltas/events**, and detect a **counter reset** (proxy/p2pool restart → counter goes backwards) as a segment break, *not* a negative rate. Reuse the `_shares_to_record` pattern (`data_service.py:176`).
- **Throttle** high-cardinality / slow-moving series (per-worker ~5 min, network hourly) rather than the 30 s loop rate.
- **Expose** new series via `/api/state`; the actual charts/badges are tracked in the consuming feature issues (below).
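
The counter-reset rule above can be sketched as a small tracker. This is a minimal sketch, not the project's `_shares_to_record` (its body is not shown in this issue); `CounterTracker` and `update` are hypothetical names, and treating the post-restart value as the first delta of the new segment is an assumption that each slice may want to revisit.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CounterTracker:
    """Turn a cumulative counter into per-poll deltas (hypothetical helper).

    A value lower than the previous sample is treated as an upstream restart
    (segment break), never as a negative delta.
    """

    last: Optional[int] = None

    def update(self, value: int) -> Tuple[int, bool]:
        """Return (delta, reset). The first sample only sets the baseline."""
        if self.last is None:
            self.last = value
            return 0, False
        if value < self.last:
            # Counter went backwards: proxy/p2pool restarted.
            # Assumption: everything counted since the restart is new.
            self.last = value
            return value, True
        delta = value - self.last
        self.last = value
        return delta, False


tracker = CounterTracker()
for sample in (10, 13, 2, 5):
    delta, reset = tracker.update(sample)
    print(sample, delta, "segment break" if reset else "")
```

The `reset` flag is what a storage layer could persist as a segment marker, so a chart can draw a gap instead of a spike.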

## Captures

### Tier 1 — high value, negligible/low cost
- [ ] **P2Pool block-found events** — `blocks_found` / `last_block_found` (height) / `last_block_ts` (`collector/pools.py:74`). Detect the `totalBlocksFound` increment (mirror `_shares_to_record`) → `blocks(ts REAL PK, height INT, effort REAL)`. Blocks are the actual **payout** events; enables a found-blocks timeline + effort-per-block. Volume is on the order of rows per *week*. → feeds **#99**.
- [ ] **Pool-wide share-health series** (accepted / rejected / invalid / expired) — `data_service.py:116` (`_parse_proxy_summary`). **Already specced in #116** — design & implement there; listed here so the audit is complete. Enables reject-rate trends (the #1 "something's wrong" signal).
- [ ] **monerod DB size + disk used** — `db_size` from `client/monero/monero_client.py:82` (already rides in `monero_sync`) + `collector/system.py` `get_disk_usage()`. Hourly `disk_growth(ts, monero_db_bytes, disk_used_gb, disk_total_gb)`. The chains grow monotonically and a full disk **corrupts monerod mid-write**; a growth trend forecasts "days until full." → extends **#138** (closed) and feeds **#104** (low-disk badge).
- [ ] **XvB credited history** (avg_1h / avg_24h / fail_count / donation_fraction) — currently **last-value only** in `kv_store` (`storage_service.py:290`). Append `xvb_history(ts, avg_1h, avg_24h, fail_count, donation_fraction)` on each external sync (~5 min, `data_service.py:497`). The donation controller steers off these — a credited-vs-routed history shows credit-factor & tier-holding over time.
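
As an illustration of the "days until full" forecast that the disk-growth capture would enable, here is a rough sketch under stated assumptions. The table columns follow the `disk_growth(...)` shape proposed above; the column types, the `days_until_full` function, and the simple first-to-last slope are assumptions, not an agreed design.

```python
import sqlite3
from typing import Optional

# Column names come from the proposal above; the types are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS disk_growth (
    ts REAL PRIMARY KEY,
    monero_db_bytes INTEGER,
    disk_used_gb REAL,
    disk_total_gb REAL
)
"""


def days_until_full(conn: sqlite3.Connection) -> Optional[float]:
    """Estimate days until the disk fills from the first and last samples.

    Returns None when there is too little history or usage is not growing.
    """
    rows = conn.execute(
        "SELECT ts, disk_used_gb, disk_total_gb FROM disk_growth ORDER BY ts"
    ).fetchall()
    if len(rows) < 2:
        return None
    (t0, used0, _), (t1, used1, total) = rows[0], rows[-1]
    elapsed_days = (t1 - t0) / 86400
    if elapsed_days <= 0:
        return None
    gb_per_day = (used1 - used0) / elapsed_days
    if gb_per_day <= 0:
        return None
    return (total - used1) / gb_per_day


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO disk_growth VALUES (?, ?, ?, ?)",
    [
        (0.0, 100_000_000_000, 100.0, 200.0),
        (10 * 86400.0, 110_000_000_000, 110.0, 200.0),
    ],
)
print(days_until_full(conn))
```

A least-squares fit over the retained window would be less sensitive to a single noisy sample; the two-point slope is used here only to keep the sketch short.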

### Tier 2 — valuable, throttle to avoid bloat
- [ ] **Per-worker history** (hashrate `h15` + accepted / rejected) — `data_service.py:40` (`_parse_proxy_list_worker`). Per-worker data is **lost entirely when a worker disconnects** today. New `worker_history(ts, name, h15, accepted, rejected)` at **~5 min** cadence (not 30 s) + retention. Cardinality = workers × cadence → throttled & pruned. Intersects **#169 / #182** (v1.0 — but those are solvable with in-memory `last_active` and don't depend on this) and the per-worker stretch of **#116**. **Do #144 first** (remove the dead `known_workers` table) so we don't build on orphaned code.
- [ ] **Network series** (difficulty / height / reward + pool hashrate) — `collector/pools.py` `get_network_stats` / `get_p2pool_stats`. Hourly `network_history(...)`. Slow-moving → cheap at hourly; lets earnings estimates be back-tested against real difficulty.
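
The throttle-plus-prune pattern for the Tier 2 tables could look roughly like the sketch below. It assumes the `worker_history` columns proposed above; `WorkerHistoryWriter`, `maybe_record`, the retention window, the prune probability, and the composite primary key are all hypothetical, and the existing probabilistic prune in `storage_service.py` may be shaped differently.

```python
import random
import sqlite3
import time

WORKER_INTERVAL_SEC = 300         # ~5 min cadence, per the text above
WORKER_RETENTION_SEC = 7 * 86400  # hypothetical retention window
PRUNE_PROBABILITY = 0.01          # hypothetical; real value not shown here


class WorkerHistoryWriter:
    """Append per-worker samples at a throttled cadence (illustrative only)."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self._last_write = None
        conn.execute(
            "CREATE TABLE IF NOT EXISTS worker_history ("
            "ts REAL, name TEXT, h15 REAL, accepted INTEGER, rejected INTEGER, "
            "PRIMARY KEY (ts, name))"
        )

    def maybe_record(self, workers, now=None, rng=random.random) -> bool:
        """Write one row per worker if the interval has elapsed.

        Called from the 30 s loop; returns False when the call is throttled.
        """
        now = time.time() if now is None else now
        if (
            self._last_write is not None
            and now - self._last_write < WORKER_INTERVAL_SEC
        ):
            return False
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(
                "INSERT OR REPLACE INTO worker_history VALUES (?, ?, ?, ?, ?)",
                [
                    (now, w["name"], w["h15"], w["accepted"], w["rejected"])
                    for w in workers
                ],
            )
            # Probabilistic prune: only occasionally pay for the DELETE.
            if rng() < PRUNE_PROBABILITY:
                self.conn.execute(
                    "DELETE FROM worker_history WHERE ts < ?",
                    (now - WORKER_RETENTION_SEC,),
                )
        self._last_write = now
        return True
```

Passing `now` and `rng` as parameters is a testing convenience: storage-layer tests can drive the clock and force or suppress the prune deterministically.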

### Explicitly NOT captured as a series (would bloat / low value)
- **CPU% / RAM% / load average** (`collector/system.py`) — high-frequency, low long-term value; keep as live gauges (hourly aggregates only if ever needed).
- **HugePages used/free** — effectively static after boot; live gauge only.
- **Raw stratum passthrough JSON** (`get_stratum_stats`) — large, redundant with the structured fields.
- **The full `latest_data` snapshot** — correctly stays a single overwritten last-value blob.

## Relationships
- **#116** — pool-wide (and per-worker stretch) share-stats time-series → the share-health slice above; dedupe/track design there.
- **#99** — consumes block-found + outage events for chart flagging / alerts.
- **#104** — consumes disk-growth for the low-disk badge.
- **#144** — remove the dead `known_workers` / `workers` table **before** adding `worker_history`.
- **#169 / #182** — per-worker accuracy (v1.0); informed by, but not blocked on, per-worker history.
- **#138** (closed) — runtime disk guard; this adds the growth *trend*.

## Milestone rationale
These captures back v1.1 trend/charting features (**#116, #99, #104**), and the v1.0 dashboard-accuracy issues (#169/#182) don't require persistence — so this epic is milestoned **v1.1**. Any individual slice can be pulled into v1.0 if a release-blocking issue comes to depend on it.

## Acceptance criteria
- [ ] Each captured series in its own table with retention + **reset-safe** delta handling.
- [ ] Cadence throttles applied (per-worker ~5 min, network hourly; Tier-1 on the 30 s / external-sync cadence as noted).
- [ ] New series available via `/api/state`.
- [ ] Storage-layer tests (insert / prune / counter-reset) mirroring the existing `tests/service/` coverage.
- [ ] **#144** landed before `worker_history`.

