NetBinds: track service closes with debounce + transient labeling#822

Open
lacraig2 wants to merge 2 commits into main from netbinds-lifecycle
@lacraig2 lacraig2 commented Jun 6, 2026

Collaborator

Summary

Penguin tracks when guest services bind ports but not when they close: releases mutated an in-memory list that was never written out, so closes were invisible. Some firmware also opens and closes the same port in rapid succession. This PR adds a debounced socket lifecycle to NetBinds, plus the bug fixes and test-harness revival needed to exercise it.

Feature — pyplugins/analysis/netbinds.py

  • Per-socket lifecycle state machine keyed on (ipvn, sock_type, ip, port) recording open/close/reopen with timestamps.
  • Debounce (debounce_period, default 2.0s): a release is held pending; a re-bind within the window is a flap, not a close. Pending closes finalize on later events and at uninit.
  • Transient labeling (transient_threshold, default 3): flap count >= threshold => transient.
  • New outputs netbind_events.csv (open/flap/close log) and netbinds_lifecycle.csv (per-socket summary). Existing netbinds.csv / netbinds_summary.csv and the on_bind PPP event are unchanged.
  • Fixes two latent bugs: seen_binds permanently suppressing re-binds, and IPv6 keys never matching releases (bracket normalization).
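The debounce and transient rules above can be sketched as a small host-side tracker. This is an illustrative sketch, not the code in netbinds.py: the class and method names (`LifecycleTracker`, `on_bind`, `on_release`, `finish`) are hypothetical, timestamps are passed in explicitly, and only `debounce_period` and `transient_threshold` come from the PR.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SocketState:
    is_open: bool = False
    # Time of a release that is still inside the debounce window, if any.
    pending_close_at: Optional[float] = None
    flaps: int = 0
    transient: bool = False
    events: list = field(default_factory=list)


class LifecycleTracker:
    """Hypothetical tracker keyed on (ipvn, sock_type, ip, port)."""

    def __init__(self, debounce_period=2.0, transient_threshold=3):
        self.debounce_period = debounce_period
        self.transient_threshold = transient_threshold
        self.sockets = {}

    def _close(self, st):
        # Record the close at the time the release was first seen.
        st.events.append(("close", st.pending_close_at))
        st.is_open = False
        st.pending_close_at = None

    def _finalize_expired(self, now):
        # Pending closes whose window has elapsed become real closes.
        for st in self.sockets.values():
            if (st.pending_close_at is not None
                    and now - st.pending_close_at >= self.debounce_period):
                self._close(st)

    def on_bind(self, key, now):
        self._finalize_expired(now)
        st = self.sockets.setdefault(key, SocketState())
        if st.pending_close_at is not None:
            # Re-bind inside the window: a flap, not a close.
            st.pending_close_at = None
            st.flaps += 1
            st.events.append(("flap", now))
            if st.flaps >= self.transient_threshold:
                st.transient = True
        elif not st.is_open:
            st.is_open = True
            st.events.append(("open", now))

    def on_release(self, key, now):
        self._finalize_expired(now)
        st = self.sockets.get(key)
        if st is not None and st.is_open and st.pending_close_at is None:
            st.pending_close_at = now  # hold the close pending

    def finish(self):
        # Equivalent of uninit: flush every close still pending.
        for st in self.sockets.values():
            if st.pending_close_at is not None:
                self._close(st)
```

Finalizing expired closes lazily, on the next event and at shutdown, avoids needing a timer thread; the trade-off is that a close is only recorded once something else happens or the plugin unloads.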

Pairs with rehosting/igloo_driver#75 (guest hook only emits TCP releases for listening sockets).

Drive-by fixes — explore was broken (uncaught due to bit-rotted suite)

  • common.py: int_to_hex_representer passed a raw int to represent_scalar for values > 10, crashing every config dump with yamlcore (e.g. modes 73/493).
  • graph_search.py: single-threaded Worker(...) omitted the timeout arg.
  • utils.py: get_mitigation_providers only caught ValueError; a missing flat plugin file raises FileNotFoundError.
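The utils.py fix amounts to widening an `except` clause. A minimal sketch of the pattern, assuming a flat-file plugin layout: `read_provider` and `available_providers` are hypothetical stand-ins, not the real `get_mitigation_providers` code.

```python
from pathlib import Path


def read_provider(plugin_dir: Path, name: str) -> str:
    """Hypothetical loader: rejects bad names, reads a flat plugin file."""
    if not name.isidentifier():
        raise ValueError(f"invalid plugin name: {name!r}")
    # read_text raises FileNotFoundError when the file is missing.
    return (plugin_dir / f"{name}.py").read_text()


def available_providers(plugin_dir: Path, names):
    providers = {}
    for name in names:
        try:
            providers[name] = read_provider(plugin_dir, name)
        except (ValueError, FileNotFoundError):
            # Catching only ValueError would let a missing file
            # propagate and abort the whole scan.
            continue
    return providers
```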

Test + harness

  • tests/comprehensive/netbinds/ — stable / flapping-transient / cleanly-closed listeners + assert_netbinds.
  • Revived the comprehensive harness against current penguin: makeImage.sh -> penguin explore, modern project layout (base/ + static/InitFinder.yaml), PENGUIN_IMAGE override, conditional TTY, forward n_iters.

Validation

  • Host lifecycle logic: full unit simulation (debounce, flap->transient, clean close, IPv6) passes.
  • Builds into the image (local-packages); the netbinds plugin loads and writes its new CSVs in the real runtime.
  • Not yet end-to-end in-guest: capturing real bind/release hypercalls is blocked by a kernel/driver vermagic mismatch in the available linux_builder artifacts and synthetic-config boot bit-rot (/igloo/init shebang vs current preinit flow). Follow-up: rebuild a matched kernel+driver and finish reviving the harness.

Draft: the three drive-by fixes are independent of the feature and can be split into their own PR on request.

lacraig2 commented Jun 6, 2026

Collaborator Author

Paired guest-side change: rehosting/igloo_driver#75 (only emit TCP releases for listening sockets).

lacraig2 force-pushed the netbinds-lifecycle branch from f860b43 to 694b95d on June 6, 2026 20:11
lacraig2 marked this pull request as ready for review on June 11, 2026 17:08
lacraig2 force-pushed the netbinds-lifecycle branch from 694b95d to a9ec3bf on June 11, 2026 17:08
lacraig2 enabled auto-merge on June 11, 2026 17:08
lacraig2 force-pushed the netbinds-lifecycle branch from a9ec3bf to b4fb610 on June 11, 2026 23:26
…labeling

NetBinds previously only persisted bind events; socket releases mutated an
in-memory list that was never written out, so closes were invisible. Add a
per-socket lifecycle state machine that records open/close/reopen with a
configurable debounce window: a re-bind within `debounce_period` is a flap
rather than a close, and a socket that flaps `transient_threshold` times is
labelled transient (emitting a `transient` lifecycle event).

Adds `debounce_period` and `transient_threshold` to the plugin Args, writes
netbind_events.csv (open/flap/transient/close log) and netbinds_lifecycle.csv
(per-socket summary). Existing netbinds.csv / netbinds_summary.csv output and
the on_bind PPP event are unchanged. Also fixes two latent bugs: seen_binds
permanently suppressing re-binds, and IPv6 keys never matching releases due to
a bracket mismatch.
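
The IPv6 bracket mismatch can be illustrated with a key-normalization helper. This is a sketch under assumptions: it supposes one event path reports the address as `[::]` and the other as `::`, and the names `normalize_ip` and `socket_key` are hypothetical rather than taken from the plugin.

```python
def normalize_ip(ip: str) -> str:
    """Strip surrounding brackets so '[::1]' and '::1' compare equal."""
    ip = ip.strip()
    if ip.startswith("[") and ip.endswith("]"):
        ip = ip[1:-1]
    return ip


def socket_key(ipvn, sock_type, ip, port):
    # Build the (ipvn, sock_type, ip, port) key the same way for
    # binds and releases, so a release finds the bind it belongs to.
    return (int(ipvn), sock_type, normalize_ip(ip), int(port))
```

With this in place, `socket_key(6, "tcp", "[::]", 8080)` and `socket_key(6, "tcp", "::", 8080)` produce the same tuple, while IPv4 addresses pass through unchanged.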
lacraig2 force-pushed the netbinds-lifecycle branch from b4fb610 to a7d5f0b on June 12, 2026 02:02
Add patches/tests/netbinds_lifecycle.yaml: a micropython scenario that binds,
listens, and closes a socket on 8401 three times (two flaps -> transient) and
cleanly closes 8402 once. Driving the lifecycle from one process with explicit
close()/bind() makes releases and re-binds deterministic regardless of
emulation speed -- unlike killing a background daemon, where under load the
port is not released before the next bind ("Address in use").
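
The single-process approach described above can be sketched with the standard socket API. This is a CPython approximation for illustration, not the micropython script in netbinds_lifecycle.yaml; the function names and the loopback default are assumptions.

```python
import socket


def bind_listen_close(port, host="127.0.0.1"):
    """Bind, listen, and close one TCP socket in the calling process."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((host, port))
        s.listen(1)
    finally:
        # An explicit close releases the port before the next bind,
        # so the ordering does not depend on emulation speed.
        s.close()


def run_scenario(flap_port=8401, clean_port=8402, cycles=3):
    """Cycle flap_port `cycles` times, then open and close clean_port once.

    Returns the number of re-binds on flap_port (cycles - 1).
    """
    for _ in range(cycles):
        bind_listen_close(flap_port)
    bind_listen_close(clean_port)
    return cycles - 1
```

Calling `run_scenario()` with the defaults performs three bind/close cycles on 8401, which is two re-binds, followed by a single open and close on 8402.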

Verifier conditions cover netbind_events.csv (flap, transient) and
netbinds_lifecycle.csv (closed). Uses a large debounce_period so every re-bind
lands inside the window and counts as a flap.
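
A rough equivalent of those verifier conditions, written as a standalone check. The column names (`port`, `event`, `state`) are assumptions made for this sketch; the PR does not show the CSV headers, and the real conditions live in the YAML test.

```python
import csv
import io


def column_values(csv_text, match_col, match_val, value_col):
    """Return value_col for every row whose match_col equals match_val."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[value_col] for row in reader if row[match_col] == match_val]


def check_lifecycle(events_csv, lifecycle_csv):
    # Assumed layout: netbind_events.csv has 'port' and 'event' columns,
    # netbinds_lifecycle.csv has 'port' and 'state' columns.
    events_8401 = column_values(events_csv, "port", "8401", "event")
    assert "flap" in events_8401, "expected a flap on 8401"
    assert "transient" in events_8401, "expected a transient event on 8401"
    states_8402 = column_values(lifecycle_csv, "port", "8402", "state")
    assert "closed" in states_8402, "expected 8402 to end closed"
    return True
```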
lacraig2 force-pushed the netbinds-lifecycle branch from a7d5f0b to 52c0ec4 on June 12, 2026 02:59