roadmap A–I: logic + agent tools + GUI + firing (TDD, CI-green) #80
Draft
caezium wants to merge 49 commits into
…s (B7) Render the latest snapshot as Prometheus text exposition so a dev with Grafana can scrape their own Mac in minutes. New pure MetricsPrometheus.exposition: gauges for CPU/memory/health/uptime/battery, per-disk and per-interface labeled series, GPU omitted when unavailable (-1 on Apple Silicon), and Prometheus label-value escaping. QueryServer.route now returns (body, contentType): every route stays JSON except the new endpoint, which serves text/plain; version=0.0.4 (the de-facto scrape content type). A missing or undecodable snapshot yields a comment line rather than error JSON, so scrapers tolerate an empty target. Tests: 4 formatter cases + 2 route cases; existing route tests migrated to the (body, contentType) tuple.
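A minimal sketch of the exposition step described above, assuming a simplified snapshot shape. `SnapshotSketch`, the `burrow_*` metric names, and the exact comment text are hypothetical; only the escaping rules (backslash, double quote, newline) and the "omit GPU when -1" and "comment line on missing snapshot" behaviours come from the commit message.

```swift
import Foundation

/// Escapes a Prometheus label value: backslash, double quote, and newline.
func escapeLabelValue(_ value: String) -> String {
    var out = ""
    for scalar in value.unicodeScalars {
        switch scalar {
        case "\\": out += "\\\\"
        case "\"": out += "\\\""
        case "\n": out += "\\n"
        default: out.unicodeScalars.append(scalar)
        }
    }
    return out
}

/// Hypothetical snapshot shape; the real type is not shown in this PR text.
struct SnapshotSketch {
    var cpuPercent: Double
    var gpuPercent: Double               // -1 means "unavailable"
    var diskFreeBytes: [String: Double]  // mount point -> free bytes
}

/// Renders gauges as text exposition; a missing snapshot yields a comment line.
func exposition(_ snapshot: SnapshotSketch?) -> String {
    guard let snap = snapshot else { return "# no snapshot available\n" }
    var lines = ["# TYPE burrow_cpu_percent gauge",
                 "burrow_cpu_percent \(snap.cpuPercent)"]
    if snap.gpuPercent >= 0 {            // omit the series when GPU is unavailable
        lines.append("# TYPE burrow_gpu_percent gauge")
        lines.append("burrow_gpu_percent \(snap.gpuPercent)")
    }
    if !snap.diskFreeBytes.isEmpty {
        lines.append("# TYPE burrow_disk_free_bytes gauge")
    }
    for (mount, free) in snap.diskFreeBytes.sorted(by: { $0.key < $1.key }) {
        lines.append("burrow_disk_free_bytes{mount=\"\(escapeLabelValue(mount))\"} \(free)")
    }
    return lines.joined(separator: "\n") + "\n"
}
```

Sorting the mounts keeps the output stable between scrapes, which makes the formatter easy to test as a pure function.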
Pure (ts, free-bytes) series → projection. Honest by construction: no date under a week of history, none when free space is flat or growing, and none past a 5-year horizon. Robust to single-sample cliffs (a temp file briefly eating space) via a Theil–Sen median slope rather than least-squares, which one outlier can drag arbitrarily far. Tracer-bullet core for roadmap A.3; the MCP tool + Home/History tile wiring follow as their own slices.
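The projection above could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's `DiskForecast`: the `Sample` type and function names are hypothetical, the latest sample's free space is used as the starting point, and the all-pairs slope computation is O(n²), so it only suits small series.

```swift
import Foundation

/// Hypothetical sample type: timestamp in seconds, free space in bytes.
struct Sample { let ts: Double; let freeBytes: Double }

func median(_ xs: [Double]) -> Double? {
    guard !xs.isEmpty else { return nil }
    let s = xs.sorted()
    let mid = s.count / 2
    return s.count % 2 == 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2
}

/// Theil–Sen estimator: the median of the slopes over all sample pairs.
func theilSenSlope(_ samples: [Sample]) -> Double? {
    var slopes: [Double] = []
    for i in 0..<samples.count {
        for j in (i + 1)..<samples.count {
            let dt = samples[j].ts - samples[i].ts
            if dt != 0 {
                slopes.append((samples[j].freeBytes - samples[i].freeBytes) / dt)
            }
        }
    }
    return median(slopes)
}

/// Seconds until free space reaches zero, or nil when no honest answer exists:
/// under a week of history, flat or growing free space, or past the horizon.
func secondsUntilFull(_ samples: [Sample],
                      minHistory: Double = 7 * 86_400,
                      horizon: Double = 5 * 365 * 86_400) -> Double? {
    guard let first = samples.min(by: { $0.ts < $1.ts }),
          let last = samples.max(by: { $0.ts < $1.ts }),
          last.ts - first.ts >= minHistory,
          let slope = theilSenSlope(samples), slope < 0 else { return nil }
    let eta = last.freeBytes / -slope
    return eta <= horizon ? eta : nil
}
```

With ten daily samples losing 1 GB per day and one outlier cliff, the median slope stays at the steady rate, so the single cliff does not move the projected date.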
Pure stats: percentile/median primitives, a sustained-CPU-exceedance rule (recent median clears baseline p95 + a minimum effect size so near-idle noise can't trip it), and battery-drain regression (OLS %/hr per session, flagged when a recent discharge runs >=factor x the baseline median). The Maintenance pass writing burrow.findings + the Home card are integration.
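One way to read the sustained-CPU rule is "recent median > baseline p95 + minimum effect size". The sketch below assumes that reading, linear-interpolated percentiles, and illustrative defaults (`minEffect` of 5 percentage points, 5 samples per window); the actual rule and thresholds in the PR may differ.

```swift
/// Linear-interpolated percentile, p in 0...100. Returns nil for an empty input.
func percentile(_ xs: [Double], _ p: Double) -> Double? {
    guard !xs.isEmpty else { return nil }
    let s = xs.sorted()
    let rank = p / 100 * Double(s.count - 1)
    let lo = Int(rank.rounded(.down))
    let hi = Int(rank.rounded(.up))
    return s[lo] + (s[hi] - s[lo]) * (rank - Double(lo))
}

/// True when the recent median clears the baseline p95 by at least `minEffect`.
/// The margin keeps near-idle noise (0.1% -> 0.5%) from tripping the rule.
func sustainedExceedance(baseline: [Double],
                         recent: [Double],
                         minEffect: Double = 5,
                         minSamples: Int = 5) -> Bool {
    guard baseline.count >= minSamples, recent.count >= minSamples,
          let p95 = percentile(baseline, 95),
          let recentMedian = percentile(recent, 50) else { return false }
    return recentMedian > p95 + minEffect
}
```

Using the median of the recent window rather than its maximum is what makes the exceedance "sustained": a single spike cannot move the median.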
Shared kernel for burrow_diff(since:) and the new-LaunchAgent watcher: sorted, de-duplicated added/removed over two identifier lists (apps, login items, LaunchAgents, ports, top-process membership).
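A set-based sketch of such a diff kernel follows. The type and function names are hypothetical; the behaviour shown (sorted, de-duplicated added/removed over two identifier lists) is what the commit message describes.

```swift
/// Hypothetical result type for the diff.
struct InventoryChange: Equatable {
    let added: [String]
    let removed: [String]
}

/// Sets remove duplicates; sorting makes the output deterministic.
func inventoryDiff(previous: [String], current: [String]) -> InventoryChange {
    let before = Set(previous)
    let after = Set(current)
    return InventoryChange(added: after.subtracting(before).sorted(),
                           removed: before.subtracting(after).sorted())
}
```

For example, `inventoryDiff(previous: ["a", "b", "b"], current: ["b", "d", "c", "c"])` reports `added` as `["c", "d"]` and `removed` as `["a"]`.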
Parses dirty/ahead/upstream/detached into a needs-attention verdict so the purge checklist can badge repos with uncommitted or unpushed work. Conservative: untracked files count, and a never-pushed branch is unpushed. Running git (timeout + concurrency cap) and badging the UI are integration.
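As an illustration, the verdict could be derived from `git status --porcelain=v2 --branch` output. The input format is an assumption (the commit message does not say which git output is parsed), and `RepoStatusSketch` and `parseStatus` are hypothetical names. The sketch treats any branch without an upstream, including a detached HEAD, as needing attention, which may be stricter than the real parser.

```swift
struct RepoStatusSketch {
    var dirty = false        // any tracked change or untracked file
    var ahead = 0            // commits not yet on the upstream
    var hasUpstream = false
    var detached = false
    /// Conservative: uncommitted work, unpushed commits, or no upstream at all.
    var needsAttention: Bool { dirty || ahead > 0 || !hasUpstream }
}

func parseStatus(_ output: String) -> RepoStatusSketch {
    var status = RepoStatusSketch()
    for line in output.split(separator: "\n") {
        if line.hasPrefix("# branch.head ") {
            status.detached = line.hasSuffix("(detached)")
        } else if line.hasPrefix("# branch.upstream ") {
            status.hasUpstream = true
        } else if line.hasPrefix("# branch.ab ") {
            // Header shape: "# branch.ab +<ahead> -<behind>"
            let parts = line.split(separator: " ")
            if parts.count >= 3, let n = Int(parts[2].dropFirst()) { status.ahead = n }
        } else if !line.hasPrefix("#") && !line.hasPrefix("!") {
            // "1", "2", "u" entries are changes; "?" entries are untracked files.
            status.dirty = true
        }
    }
    return status
}
```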
System facts -> ranked ok/warn/fail checks (engine, Full Disk Access, memory pressure, disk headroom, recent errors). Level.rawValue orders worst-first. Gathering the facts + the Help-menu sheet are integration.
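A small sketch of worst-first ranking via raw values. The concrete raw values (`fail = 0`) and the `Check` shape are assumptions; the commit message only states that `Level.rawValue` orders worst-first.

```swift
/// Assumed numbering: lower raw value means worse, so an ascending sort is worst-first.
enum Level: Int { case fail = 0, warn = 1, ok = 2 }

struct Check {
    let name: String
    let level: Level
}

/// Worst checks first; ties broken by name so the order is deterministic.
func ranked(_ checks: [Check]) -> [Check] {
    checks.sorted { ($0.level.rawValue, $0.name) < ($1.level.rawValue, $1.name) }
}
```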
Pure value-folding step(): fire on crossing high, re-arm only after recovery below low, and a cooldown caps fires across episodes — one alert per episode, not per sample. Shared by notifications (D.12), SSE (B.6), the report (A.4). Rule thresholds, Sampler wiring, and UNUserNotification delivery are integration.
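The hysteresis-plus-cooldown fold can be sketched as a pure function from (state, sample) to (state, fire). The names `AlertState`, `AlertRule`, and the `>=` comparison at the high threshold are assumptions for illustration.

```swift
struct AlertState: Equatable {
    var armed = true
    var lastFired: Double? = nil   // timestamp of the last fire, in seconds
}

struct AlertRule {
    let high: Double       // fire when the value reaches this
    let low: Double        // re-arm only after the value drops below this
    let cooldown: Double   // minimum seconds between fires
}

func step(_ state: AlertState, value: Double, at now: Double,
          rule: AlertRule) -> (state: AlertState, fire: Bool) {
    var next = state
    if state.armed {
        guard value >= rule.high else { return (next, false) }
        next.armed = false   // the episode has started; stay quiet until recovery
        if let last = state.lastFired, now - last < rule.cooldown {
            return (next, false)   // a new episode, but still inside the cooldown
        }
        next.lastFired = now
        return (next, true)
    }
    if value < rule.low { next.armed = true }   // recovered: ready for the next episode
    return (next, false)
}
```

Because the gap between `low` and `high` must be crossed in full before re-arming, a value that hovers around the high threshold produces one alert rather than one per sample.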
Pure facts -> markdown (cleanup freed, disk forecast phrased in weeks, top energy, battery delta, new-startup-items security note). Same artifact for the Home card and a burrow_report tool. Reuses Fmt.bytes; only names a fill date when DiskForecast was willing to. DB gathering is integration.
Pure: latest-backup + local-snapshot date tokens from tmutil, token->UTC Date. Backs 'Last backup: N days ago' in Clean/Purge sheets + the purgeable note. SMART/IOKit disk-health half of D.14 lands separately; tmutil invocation is integration.
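A sketch of the token-to-date step, assuming the token has the shape `YYYY-MM-DD-HHMMSS` and appears somewhere in a path printed by `tmutil`. The function names and the example path are hypothetical; interpreting the token as UTC follows the commit message.

```swift
import Foundation

/// Finds the first "YYYY-MM-DD-HHMMSS" token in a line and reads it as UTC.
func backupDate(fromTmutilLine line: String) -> Date? {
    guard let range = line.range(of: #"\d{4}-\d{2}-\d{2}-\d{6}"#,
                                 options: .regularExpression) else { return nil }
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")   // fixed-format parsing
    formatter.timeZone = TimeZone(identifier: "UTC")
    formatter.dateFormat = "yyyy-MM-dd-HHmmss"
    return formatter.date(from: String(line[range]))
}

/// Whole days elapsed, clamped at zero for clock skew.
func daysAgo(_ date: Date, now: Date = Date()) -> Int {
    max(0, Int(now.timeIntervalSince(date) / 86_400))
}
```

Matching the token with a pattern rather than taking the last path component keeps the parser tolerant of suffixes after the token.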
Pure: recorded removals + a filesystem probe -> per-item restorable verdict. Only Trash-based removals are recoverable; a re-created origin is a collision, not an overwrite. Finder put-back + UI are integration.
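The verdict logic might look like the following, with the filesystem probe injected as a closure so the function stays pure. All type names, the `trashedAt` field, and the three-way verdict are illustrative assumptions.

```swift
enum RemovalKind { case movedToTrash, deletedPermanently }
enum RestoreVerdict: Equatable { case restorable, collision, unrecoverable }

/// Hypothetical record of one removal.
struct Removal {
    let origin: String       // where the item used to live
    let trashedAt: String?   // where it sits in the Trash, if it was trashed
    let kind: RemovalKind
}

/// `exists` is the filesystem probe; tests can pass a fake.
func restoreVerdict(_ removal: Removal, exists: (String) -> Bool) -> RestoreVerdict {
    guard removal.kind == .movedToTrash,
          let trashed = removal.trashedAt,
          exists(trashed) else { return .unrecoverable }
    // A re-created origin is reported as a collision rather than overwritten.
    return exists(removal.origin) ? .collision : .restorable
}
```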
…(A.3) Adds MetricsStore.diskFreeSeries (total-used per snapshot for a mount, or the largest volume) and a burrow_disk_forecast tool that runs DiskForecast over it — agents get an honest 'fills in ~N days' (or null) without the GUI. Integration test seeds declining/flat disks through the real dispatch path.
collected_at needs fractional seconds to match mo's ISO8601 decoder; without them the seeded rows dropped and the forecast saw an empty series. Also surface the tool JSON in the unwrap message for faster diagnosis.
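The failure mode can be reproduced with Foundation's `ISO8601DateFormatter`. Whether mo's decoder is built on this class is an assumption; the sketch only shows that a formatter configured with `.withFractionalSeconds` rejects timestamps that lack them, which would silently drop seeded rows. `seedTimestamp` is a hypothetical helper.

```swift
import Foundation

let strict = ISO8601DateFormatter()
strict.formatOptions = [.withInternetDateTime, .withFractionalSeconds]

// Without fractional seconds the strict formatter returns nil.
let withoutFraction = strict.date(from: "2026-06-10T12:00:00Z")
// With them, the same instant parses.
let withFraction = strict.date(from: "2026-06-10T12:00:00.000Z")

/// Seeding fixtures through the same formatter guarantees a matching shape.
func seedTimestamp(_ date: Date) -> String {
    strict.string(from: date)
}
```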
Diffs top-process membership (via InventoryDiff) and free-space delta between the snapshot nearest `since` and the latest. v1 scope: app/login-item/port inventories aren't persisted across time yet, so they're out — noted in the reply. Integration-tested through the dispatch path.
Composes WeeklyReport from real snapshot data (disk forecast + top energy by CPU-seconds); cleanup/battery/login sections are nil='unavailable' until their sources land, never faked. spaceReclaimedBytes is now optional so 'unknown' is distinct from a genuine zero. Returns markdown as the tool text.
Composes Doctor.report from on-hand signals: mo presence (MoEngine.availability), Full Disk Access (Privacy), and memory-pressure + disk headroom from the latest snapshot, plus decode-drift count. Returns ok/warn/fail checks as JSON. Integration-tested on the snapshot-derived (deterministic) verdicts.
StartupWatcher.newlyAppeared diffs a previous inventory against the current StartupInventory scan (via InventoryDiff) → the 'new LaunchAgent/login item appeared' items. Baseline persistence + notification firing are integration.
One history→Input gather behind both burrow_report and the Home card, so the two can't drift. burrow_report now calls it. Tested by seeding snapshots.
New Home segment rendering the weekly digest from ReportComposer. zh-Hans/Hant 'Report' added. Compile-verified; needs hand-test against a real history DB.
Renders Doctor.report (same verdicts as burrow_doctor) from the latest snapshot + live FDA/engine checks, ok/warn/fail with colour. zh added. Compile-verified; needs hand-test.
Lists each ecosystem's (DevHygiene.catalog) existing cache roots with size (off-main recursive scan), biggest first, + reveal-in-Finder. Stage-2 per-item delete is a follow-up. zh added. Compile-verified; needs hand-test. NOTE: Home now has 6 segments — nav may want restructuring (your call).
Per-pid fd scan → socket fdinfo → TCP-listen + bound-UDP ports, feeding PortInspector (sort + kill-safety). No lsof, no elevation for own processes. NOTE: native C-interop, runtime-unverifiable in CI; hand-test vs lsof -i -P.
Lists listening sockets + owning process via PortEnumerator/PortInspector. Read-only. Shape-tested through dispatch (specific ports are machine-dependent).
New .ports Tool (top-nav pill) rendering PortEnumerator.listening() with a confirm-gated Quit (SIGTERM) on the user's own processes only. zh added. Compile-verified; needs hand-test (native enum + real kill).
StartupWatcher.check folds a scanLive() against a persisted baseline (Store.startupBaselineJSON), first-run-suppressing. BurrowNotifier.checkStartupWatcher runs it on the existing reminder timer (default-on Store.watchStartupItems, independent of smart reminders) and posts once per appearance. The fold is unit-tested; firing is compile-verified (hand-test the notification).
MetricsStore.processCPUSamples (name → CPU%/snapshot) feeds AnomalyScan.cpuFindings, which applies the Anomaly rule per process (recent vs baseline window) and ranks regressions. Pure + tested; the Maintenance pass writing burrow.findings + the Home Changes card are integration.
ThresholdAlerts folds CPU/memory through AlertEngine (hysteresis + cooldown, one fire per episode) given prior per-rule states. Pure + tested. Per-sample evaluation (Sampler), state persistence, and posting are integration; disk-low stays in ReminderRules.
ThresholdMonitor evaluates ThresholdAlerts on each LiveFeed.applySnapshot, holding per-rule AlertState and posting via BurrowNotifier.thresholdAlert. Off-by-default (Store.thresholdAlertsEnabled), inert under XCTest. Completes D.12's threshold half (watcher half already fires).
call() now dispatches then records a burrow.agent_audit row for the mutating tools (clean/optimize/uninstall/purge/installer): tool, dry-run flag, duration, ok, summary, args. Read tools aren't audited. db.insert is serialized + busy-timed, safe alongside the GUI writer. Tested through dispatch.
…eport (A.2)
ThresholdMonitor posts via Task { @MainActor } (thresholdAlert is main-actor
isolated). AnomalyScan.scan pulls recent-24h vs prior-14d per-process CPU;
ReportComposer folds findings into the report's new Changes section. Tested.
Confirm-gated Move-to-Trash on each Dev Hygiene cache row, then re-scan. Compile-verified; needs hand-test (real Trash move).
New .tuneup Tool gathering safe (cache-clear) + review (startup) recommendations via TuneUp's tested selection logic, with a one-click Run-safe-set (trashes caches). Shared DevHygiene.directorySize. zh added. Compile-verified; hand-test.
findRangeSampled buckets by (ts-since)/stride; 60s-apart rows collapsed to one sample, below the 5-sample minimum. Spread baseline 2h / recent 10min apart so 8 survive each window. Feature was correct; the fixture clustered.
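The fixture problem is easy to see with a toy version of stride bucketing. This is not `findRangeSampled` itself; the function name, the keep-first-row-per-bucket policy, and the 600-second stride are assumptions chosen to mirror the description.

```swift
/// Keeps the first row of each (ts - since) / stride bucket.
func sampleByStride(_ timestamps: [Double], since: Double, stride: Double) -> [Double] {
    var seenBuckets = Set<Int>()
    var kept: [Double] = []
    for ts in timestamps.sorted() where ts >= since {
        let bucket = Int((ts - since) / stride)
        if seenBuckets.insert(bucket).inserted { kept.append(ts) }
    }
    return kept
}

// Eight rows 60 s apart all land in bucket 0 of a 600 s stride.
let clustered = (0..<8).map { Double($0) * 60 }
// Eight rows 600 s apart each get their own bucket.
let spread = (0..<8).map { Double($0) * 600 }
```

With a 600-second stride, `clustered` collapses to a single sample while `spread` keeps all eight, which is why spacing the fixture rows out fixed the test without touching the feature.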
RestorePlan.parseLog turns Mole's deletion log into restore candidates; RestoreView builds the plan and offers a ~/.Trash put-back per Trash-based item (caches shown locked, permanent). New .restore Tool. parseLog tested; the Trash move is compile-verified, needs hand-test.
…(C.11) FolderGrowth.diff ranks per-folder size growth between two scans. GitSweep walks up to the containing repo and runs git status (bounded) → GitRepoStatus verdict. Diff + walk-up tested; the mo-analyze scan, git subprocess, and purge-checklist badges are integration.
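A sketch of a growth-ranking diff between two scans, each modelled as a path-to-bytes dictionary. The function name, the dictionary representation, and the choice to drop folders that shrank or stayed flat are assumptions.

```swift
/// Ranks folders by how much they grew between two scans, largest growth first.
/// A folder absent from `before` counts as having grown from zero.
func folderGrowth(before: [String: Int64],
                  after: [String: Int64]) -> [(path: String, delta: Int64)] {
    after
        .map { (path: $0.key, delta: $0.value - (before[$0.key] ?? 0)) }
        .filter { $0.delta > 0 }
        .sorted { $0.delta != $1.delta ? $0.delta > $1.delta : $0.path < $1.path }
}
```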
BackupStatus runs tmutil latestbackup → days-ago via the tested TimeMachine parser; Doctor gains a Backups check (ok / stale-warn / no-backup-warn), surfaced in the Doctor pane + burrow_doctor. SMART/IOKit half is separate.
Additive chartOverlay on the line chart maps a drag to a [t0,t1] window and opens SpikeSheet, which ranks processes via the tested MetricsStore.processWindow. Overlay-only — doesn't touch the chart's marks/axes. zh added. Compile-verified.
DiskCard takes an optional db and annotates the Status disk tile with the DiskForecast over 30d of free-space history (silent unless a date is honest). Exposed StatusModel.db. Additive line; zh added.
Maintenance persists a time-indexed login-item/LaunchAgent inventory (burrow.startup_inv); burrow_diff now reports login_items_added/removed since <time> via InventoryDiff. Tested through dispatch. Apps/ports still untracked.
DiskHealth reads the NVMe SMART verdict from system_profiler (no private API); Doctor gains a Disk-health check (verified=ok / failing=fail / unreadable=ok). Surfaced in the Doctor pane + burrow_doctor. wear%/temp still need IOKit.
runBrewStreaming pipes brew output line-by-line; upgrade()/upgradeAll() feed each line through BrewProgress.phrase and publish brewPhrase, shown live in the row instead of a bare spinner. NOTE (hand-test): exercises a real brew upgrade.
MoItemRow runs GitSweep (repo-walk + git status, off-main) per row and shows a warning badge when the candidate's folder has uncommitted/unpushed work. Read-only — never changes the selection. zh added.
EventHub holds open event-stream connections and fans out SSE frames (dead ones self-evict). QueryServer serves GET /events?token=… (Store.queryAuthToken; loopback-only server), exempting the stream from the idle-cancel timeout. Threshold + new-startup-item alerts broadcast onto it. Token parse unit-tested.
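For reference, a server-sent event is plain text: optional `event:` and one `data:` line per payload line, terminated by a blank line. The helper below is a hypothetical formatter, not the PR's EventHub; the event name and JSON payload in the usage note are made up for illustration.

```swift
/// Formats one SSE frame. Multi-line payloads become multiple "data:" lines.
func sseFrame(event: String, data: String) -> String {
    var frame = "event: \(event)\n"
    for line in data.split(separator: "\n", omittingEmptySubsequences: false) {
        frame += "data: \(line)\n"
    }
    return frame + "\n"   // the blank line dispatches the event on the client
}
```

For example, `sseFrame(event: "threshold", data: "{\"rule\":\"cpu\"}")` yields an `event: threshold` line, a `data:` line carrying the JSON, and the terminating blank line.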
'SSE' resolved as an out-of-scope symbol in Notifications.swift (imports AppKit); EventHub/QueryServer (no AppKit) were fine. Renamed the type to disambiguate.
Tracks A, B, C, D, F, H, I of feature-roadmap-2026-06-10, TDD'd, every commit CI-green. Logic + agent surface complete; most features have a working GUI pane and/or live firing; a small surgical/native/gated tail remains (listed last).

Done end-to-end (logic + agent + GUI/firing, all green)
- `AnomalyScan` over history, surfaced as a Changes section in the report.
- `DiskForecast` + `burrow_disk_forecast` + in the report; growth-attribution diff (`FolderGrowth`).
- `ReportComposer` + `burrow_report` + Report Home section.
- `burrow.agent_audit` rows.
- `/metrics?format=prometheus`.
- `burrow_diff` v1 (processes + disk).
- `PortEnumerator` + `burrow_ports` + Ports Tool pane (kill).
- `GitRepoStatus` parser + `GitSweep` repo-walk + bounded git runner.
- `RestorePlan` + deletion-log parse + Restore Tool pane (~/.Trash put-back).
- `BackupStatus`/tmutil → Backups Doctor check.
- `Doctor` + `burrow_doctor` + Doctor Home section.

Remaining tail (surgery on shipped views / native / gated)
- `db` threaded into DiskCard (StatusView surgery).
- `/events`: gated on query-server auth (security).
- `mo` signal that doesn't exist.

Each remaining item has its tested seam already in (logic done); what's left is the view-surgery/native/gated shell.