fix(node): reconcile stale node versions on daemon startup #56

Merged
jacderida merged 1 commit into WithAutonomi:rc-2026.4.2 from jacderida:fix-node_upgrade_status2
Apr 23, 2026
@jacderida
Summary

Follow-up to #53. A daemon that ran through an auto-upgrade under an older ant version left its node registry with stale version fields — the on-disk binary had been swapped but the registry was never updated. #53 only writes the new version when the supervisor itself drives the restart, so users who were already in the broken state would still see the wrong version after upgrading to the fixed daemon.

This PR adds a one-time reconciliation pass on daemon startup: for each registered node, re-read <binary> --version and persist any difference. Missing binaries and transient --version failures are silently skipped so daemon startup never aborts on this.
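
The pass described above can be sketched roughly as follows. This is a minimal illustration, not the code from `server.rs`: `NodeEntry`, `reconcile_versions`, and `probe_version` are hypothetical names, the registry is modelled as a plain map, and the assumption that the version is the last whitespace-separated token of the `--version` output is a guess about the binary's output format.

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Hypothetical stand-in for one registry entry; the real registry type
/// is not shown in this PR description.
#[derive(Debug, Clone, PartialEq)]
struct NodeEntry {
    binary_path: PathBuf,
    version: String,
}

/// Runs `<binary> --version` and returns the reported version, or `None`
/// if the binary is missing, fails to run, or prints nothing usable.
/// Assumption: the version is the last whitespace-separated token.
fn probe_version(binary: &Path) -> Option<String> {
    let output = Command::new(binary).arg("--version").output().ok()?;
    if !output.status.success() {
        return None;
    }
    String::from_utf8_lossy(&output.stdout)
        .split_whitespace()
        .last()
        .map(str::to_string)
}

/// Updates every entry whose recorded version differs from what `probe`
/// reports. Entries for which `probe` returns `None` are skipped, so a
/// missing binary never aborts the pass. Returns the number of updates,
/// which a caller could use to decide whether to save the registry.
fn reconcile_versions<F>(nodes: &mut BTreeMap<u32, NodeEntry>, probe: F) -> usize
where
    F: Fn(&Path) -> Option<String>,
{
    let mut updated = 0;
    for entry in nodes.values_mut() {
        let Some(actual) = probe(&entry.binary_path) else {
            continue; // missing binary or transient failure: leave as-is
        };
        if actual != entry.version {
            entry.version = actual;
            updated += 1;
        }
    }
    updated
}
```

Passing the probe as a closure keeps the drift logic testable without spawning processes; a caller would pass `probe_version` at daemon startup and persist the registry only when the returned count is non-zero.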

Changes

  • ant-core/src/node/daemon/server.rs
    • start() takes mut registry: NodeRegistry and calls a new reconcile_registry_versions helper before the supervisor comes up.
    • reconcile_registry_versions iterates the registry, runs extract_version on each binary, updates and saves on drift. Errors and missing binaries are skipped.
    • 3 new unit tests (#[cfg(all(test, unix))], using a shell-script fake binary via tempdir): stale version gets updated and persisted; matching version is left alone; missing binary is skipped without panicking.
  • ant-cli/src/commands/node/status.rs
    • Align the Version column header width (:<10 → :<18) with the value rows so the table stays straight once versions like 0.10.11-rc.1 or current → pending appear. Divider width bumped from 44 to 52 chars to match.
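
To illustrate the header alignment fix, here is a small sketch of padding the header and value rows with the same format widths. Only the `:<18` Version width and the 52-character divider come from this PR; the function names and the ID and Name widths (4 and 14) are assumptions inferred from the sample output below, not taken from `status.rs`.

```rust
/// Header row. The Version column uses the same `:<18` width as the
/// value rows, so the Status column lines up.
fn header_row() -> String {
    format!("  {:<4} {:<14} {:<18} {}", "ID", "Name", "Version", "Status")
}

/// One value row, padded with the same widths as the header.
fn node_row(id: u32, name: &str, version: &str, status: &str) -> String {
    format!("  {:<4} {:<14} {:<18} {}", id, name, version, status)
}

/// Divider line under the header (52 box-drawing characters, indented
/// by two spaces like the rows).
fn divider() -> String {
    format!("  {}", "─".repeat(52))
}
```

With a `:<10` header and `:<18` rows, the Status header would sit eight columns to the left of its values, which matches the 44 to 52 divider change.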

Test plan

  • cargo test --package ant-core --lib -- --test-threads=1 — 144 pass (includes the 3 new reconciliation tests)
  • cargo clippy --all-targets --all-features -- -D warnings — clean
  • cargo fmt --all -- --check — clean

Manual smoke test on the local machine (registry containing 6 nodes, some upgraded by the old daemon without the registry being updated):

$ ant node daemon stop && ant node daemon start
$ ant node status
  ID   Name           Version            Status
  ────────────────────────────────────────────────────
  1    node1          0.10.11-rc.1       ● Stopped
  2    node2          0.10.11-rc.1       ● Stopped
  3    node3          0.10.11-rc.1       ● Stopped
  4    node4          0.10.1             ● Stopped
  5    node5          0.10.1             ● Stopped
  6    node6          0.10.11-rc.1       ● Stopped

Before this change, nodes 1–3 and 6 would have continued to report 0.10.1 from the registry even though the on-disk binary had already been swapped to 0.10.11-rc.1.

🤖 Generated with Claude Code

Users who ran an older daemon through an auto-upgrade may have a registry
with a stale `version` field: the on-disk binary was replaced but the
previous daemon never refreshed the registry.

On startup, before bringing the supervisor up, re-read each node's binary
with `--version` and persist any differences. Missing binaries and
transient `--version` failures are silently skipped so daemon startup
never aborts on this pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jacderida jacderida merged commit 7295422 into WithAutonomi:rc-2026.4.2 Apr 23, 2026