feat(daemon): manual restart + auto-restart on install update #51

Open
Nic-dorman wants to merge 3 commits into main from feat/daemon-restart

@Nic-dorman
Contributor

Summary

Add a user-facing way to restart the node daemon, plus a silent auto-restart when the bundled sidecar has been updated since the current daemon started. Without this, users who install an app update stay on the previous daemon process until they manually kill it — a "works on my machine, breaks after the next release" trap flagged during the v0.6.7-rc.1 testing cycle.

Changes

  • restart_daemon Tauri command (src-tauri/src/lib.rs): calls ant_core::node::daemon::client::stop directly, waits up to 10 s for the daemon to release its port and PID files, then reuses the existing ensure_daemon_running spawn path. No shell-out to the ant CLI.
  • is_daemon_stale: an mtime heuristic that compares the find_daemon_binary() mtime against the data_dir/daemon.pid mtime. If the sidecar is newer by more than 5 s, the daemon predates the current install and needs to be swapped. The 5 s tolerance absorbs filesystem-timestamp granularity and install-time clock skew.
  • ensure_daemon_running hook: when the existing liveness probe confirms a daemon is running, it checks is_daemon_stale and silently restarts the daemon if it is stale. Startup remains a no-op in the common case.
  • Settings → Advanced → Node daemon: new block with a "Restart daemon" button, a spinner while the request is in flight, and a toast on result.
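
The staleness check and the bounded wait from the list above can be sketched with the standard library alone. This is a minimal illustration, not the code in src-tauri/src/lib.rs: the helper names is_stale and wait_until, the signatures, and the blocking thread::sleep polling are assumptions (the real command presumably runs on an async runtime), and the sidecar and PID-file paths are passed in rather than resolved via find_daemon_binary().

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;
use std::time::{Duration, Instant, SystemTime};

/// Tolerance quoted in the PR description (5 s).
const STALE_TOLERANCE: Duration = Duration::from_secs(5);

/// Hypothetical pure helper: true when the sidecar binary is newer than
/// the PID file by strictly more than `tolerance`.
fn is_stale(sidecar_mtime: SystemTime, pid_mtime: SystemTime, tolerance: Duration) -> bool {
    match sidecar_mtime.duration_since(pid_mtime) {
        Ok(delta) => delta > tolerance,
        // The sidecar is older than the PID file: the daemon started after the install.
        Err(_) => false,
    }
}

/// Filesystem wrapper (assumed signature): reads both mtimes and applies
/// the 5 s tolerance. Errors if either file is missing or unreadable.
fn is_daemon_stale(sidecar: &Path, pid_file: &Path) -> io::Result<bool> {
    let sidecar_mtime = fs::metadata(sidecar)?.modified()?;
    let pid_mtime = fs::metadata(pid_file)?.modified()?;
    Ok(is_stale(sidecar_mtime, pid_mtime, STALE_TOLERANCE))
}

/// Hypothetical bounded wait: polls `done` every `interval` until it
/// returns true or `timeout` elapses. Returns whether `done` succeeded.
fn wait_until(timeout: Duration, interval: Duration, mut done: impl FnMut() -> bool) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if done() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(interval);
    }
}
```

A restart would then stop the daemon, call something like wait_until(Duration::from_secs(10), Duration::from_millis(100), || !pid_file.exists()), and spawn a fresh daemon once the files are gone. Because duration_since returns Err when the sidecar is older than daemon.pid, a daemon started after the install is never treated as stale.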

Why nodes don't get interrupted

From ant-core/src/node/process/spawn.rs:41-43:

// Nodes intentionally survive daemon restarts. The daemon re-discovers
// running nodes on startup via the registry and PID checks.
.kill_on_drop(false);

Node child processes are deliberately decoupled from the daemon's lifecycle. client::stop sends SIGTERM only to the daemon PID (client.rs:77), leaving node processes running. The new daemon reattaches to them via the registry on startup. As a result, neither the manual button nor the auto-restart interrupts running nodes.

Test plan

  • cargo fmt --check — clean
  • cargo clippy --all-targets -- -D warnings — clean
  • cargo check — clean
  • npm run test:run — 21/21 pass
  • Manual: run daemon via CLI, click Restart, verify new PID and nodes keep running
  • Manual: touch sidecar binary to bump its mtime past daemon.pid, relaunch app, verify auto-restart log line appears and daemon PID changes

Deferred to follow-up

  • Surfacing the daemon's own version in DaemonInfo/DaemonStatus would let us check version equality instead of mtimes, which is more robust if macOS/Windows installers ever preserve sidecar mtimes. This requires an upstream ant-core change.

Nic-dorman and others added 3 commits April 23, 2026 11:57
Adds a "Restart daemon" control in Settings > Advanced, and makes
`ensure_daemon_running` transparently stop a stale daemon (one that
predates the current app install) before spawning a fresh one.

- `restart_daemon` Tauri command: stop + wait-for-shutdown + spawn.
  Uses `ant_core::node::daemon::client::stop` directly — no subprocess.
- `is_daemon_stale`: compares bundled sidecar mtime against
  `daemon.pid` mtime with a 5s tolerance. If the sidecar is newer,
  the daemon was started by a previous app install and needs to be
  swapped.
- No interruption to running nodes. `ant-core/src/node/process/spawn.rs`
  sets `kill_on_drop(false)` on node child processes, so they keep
  running across daemon restarts; the new daemon re-discovers them
  via the registry + PID checks on startup.

Without this, users who update the app stay on the old daemon until
they manually kill the process — a silent "works locally, broken on
the next release" trap. The mtime heuristic handles the common case
invisibly; the Settings button covers explicit user intent (e.g. to
pick up a config change).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The manual restart used to flash the red "Cannot connect to node daemon"
error on pages/index.vue during the ~1 s window between the old daemon
shutting down and the new one accepting connections. The next poll
recovered automatically, but the flicker looked like a failure.

- `nodesStore.restarting` gates the transition. Set while the restart is
  in flight; polling errors during that window don't flip
  `daemonConnected`, so the disconnected panel never renders.
- New `restartDaemon` action on the nodes store wraps `invoke` with the
  flag + a forced re-fetch so the UI snaps back to the post-restart
  state without waiting for the next poll tick.
- pages/index.vue renders a dedicated "Restarting node daemon… / Your
  nodes keep running." panel, prioritised over both `initializing` and
  disconnected.
- Settings button delegates to `nodesStore.restartDaemon()` rather than
  invoking the command directly.

Top-right node count was already stable across restarts (it renders
from cached state, not the live poll) — no change there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ant-client PR #53 ("fix(node): handle ant-node auto-upgrade
transparently") added a new `NodeStatus::UpgradeScheduled` variant, a
`pending_version` field on `NodeStatusSummary`, and two new lifecycle
events (`UpgradeScheduled` / `NodeUpgraded`). Without consuming them
on the GUI side, nodes mid-upgrade rendered as an unknown-status grey
tile with a stale version string, and the upgrade itself only became
visible on the next poll cycle after completion.

- `utils/daemon-api.ts`: extend `NodeStatus` union with
  `upgrade_scheduled`, add `pending_version` to `NodeStatusSummary`,
  add `pending_version`/`old_version`/`new_version` to `NodeEvent`.
- `stores/nodes.ts`: carry `pending_version` through `summaryToNodeInfo`;
  handle `upgrade_scheduled` (set status + pending_version) and
  `node_upgraded` (promote `new_version` to `version`, clear pending,
  bounce status back to `running` if still `upgrade_scheduled`) in the
  SSE event switch.
- `components/StatusBadge.vue`: new "Upgrading" label with blue half-dot.
- `components/nodes/NodeTile.vue`: blue dot + ping animation for the
  `upgrade_scheduled` state; version row renders "v0.10.1 → v0.10.2"
  while pending.
- `components/nodes/NodeDetailPanel.vue` + `NodeDetail.vue`: same
  version-arrow treatment in the detail views.

Behaviour-preserving for every other status. Existing nodes are
unaffected; the new fields are all optional.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>