feat(daemon): manual restart + auto-restart on install update #51
Open
Nic-dorman wants to merge 3 commits into main
Adds a "Restart daemon" control in Settings > Advanced, and makes `ensure_daemon_running` transparently stop a stale daemon (one that predates the current app install) before spawning a fresh one.
- `restart_daemon` Tauri command: stop + wait-for-shutdown + spawn. Uses `ant_core::node::daemon::client::stop` directly; no subprocess.
- `is_daemon_stale`: compares bundled sidecar mtime against `daemon.pid` mtime with a 5s tolerance. If the sidecar is newer, the daemon was started by a previous app install and needs to be swapped.
- No interruption to running nodes. `ant-core/src/node/process/spawn.rs` sets `kill_on_drop(false)` on node child processes, so they keep running across daemon restarts; the new daemon re-discovers them via the registry + PID checks on startup.
Without this, users who update the app stay on the old daemon until they manually kill the process: a silent "works locally, broken on the next release" trap. The mtime heuristic handles the common case invisibly; the Settings button covers explicit user intent (e.g. to pick up a config change).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
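The wait-for-shutdown step between stop and spawn could be sketched roughly as below. This is only an illustration: it assumes the daemon removes its pid file on exit, and `wait_for_shutdown` with its parameters is a hypothetical helper, not the function in `src-tauri/src/lib.rs`. The real command may well be async rather than a blocking poll.

```rust
use std::path::Path;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: poll until `pid_file` disappears or `timeout` elapses.
/// Returns `true` if the file was gone before the deadline, `false` otherwise.
fn wait_for_shutdown(pid_file: &Path, timeout: Duration, poll: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        // Assumption: the daemon deletes its pid file once it has shut down.
        if !pid_file.exists() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(poll);
    }
}
```

Under these assumptions a caller would pass a 10 s timeout, and only hand over to the spawn path once the helper returns `true`.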
The manual restart used to flash the red "Cannot connect to node daemon" error on pages/index.vue during the ~1 s window between the old daemon shutting down and the new one accepting connections. The next poll recovered automatically, but the flicker looked like a failure.
- `nodesStore.restarting` gates the transition. Set while the restart is in flight; polling errors during that window don't flip `daemonConnected`, so the disconnected panel never renders.
- New `restartDaemon` action on the nodes store wraps `invoke` with the flag + a forced re-fetch so the UI snaps back to the post-restart state without waiting for the next poll tick.
- pages/index.vue renders a dedicated "Restarting node daemon… / Your nodes keep running." panel, prioritised over both `initializing` and disconnected.
- Settings button delegates to `nodesStore.restartDaemon()` rather than invoking the command directly.
Top-right node count was already stable across restarts (it renders from cached state, not the live poll), so no change there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ant-client PR #53 ("fix(node): handle ant-node auto-upgrade
transparently") added a new `NodeStatus::UpgradeScheduled` variant, a
`pending_version` field on `NodeStatusSummary`, and two new lifecycle
events (`UpgradeScheduled` / `NodeUpgraded`). Without consuming them
on the GUI side, nodes mid-upgrade rendered as an unknown-status grey
tile with a stale version string, and the upgrade itself only became
visible on the next poll cycle after completion.
- `utils/daemon-api.ts`: extend `NodeStatus` union with
`upgrade_scheduled`, add `pending_version` to `NodeStatusSummary`,
add `pending_version`/`old_version`/`new_version` to `NodeEvent`.
- `stores/nodes.ts`: carry `pending_version` through `summaryToNodeInfo`;
handle `upgrade_scheduled` (set status + pending_version) and
`node_upgraded` (promote `new_version` to `version`, clear pending,
bounce status back to `running` if still `upgrade_scheduled`) in the
SSE event switch.
- `components/StatusBadge.vue`: new "Upgrading" label with blue half-dot.
- `components/nodes/NodeTile.vue`: blue dot + ping animation for the
`upgrade_scheduled` state; version row renders "v0.10.1 → v0.10.2"
while pending.
- `components/nodes/NodeDetailPanel.vue` + `NodeDetail.vue`: same
version-arrow treatment in the detail views.
Behaviour-preserving for every other status. Existing nodes are
unaffected; the new fields are all optional.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Add a user-facing way to restart the node daemon, plus a silent auto-restart when the bundled sidecar has been updated since the current daemon started. Without this, users who install an app update stay on the previous daemon process until they manually kill it — a "works on my machine, breaks after the next release" trap flagged during the v0.6.7-rc.1 testing cycle.
Changes
- `restart_daemon` Tauri command (`src-tauri/src/lib.rs`): calls `ant_core::node::daemon::client::stop` directly, waits up to 10 s for the daemon to release its port/pid files, then re-uses the existing `ensure_daemon_running` spawn path. No shell-out to the `ant` CLI.
- `is_daemon_stale`: mtime heuristic that compares the `find_daemon_binary()` mtime against the `data_dir/daemon.pid` mtime. If the sidecar is newer by more than 5 s, the daemon predates the current install and needs a swap. The 5 s tolerance absorbs filesystem-timestamp granularity and install-time clock skew.
- `ensure_daemon_running` hook: when the existing liveness probe confirms a daemon is running, checks `is_daemon_stale` and silently restarts if true. Startup remains a no-op in the common case.
Why nodes don't get interrupted
From `ant-core/src/node/process/spawn.rs:41-43`: node child processes are deliberately decoupled from the daemon's lifecycle (`kill_on_drop(false)` is set on them).
`client::stop` only SIGTERMs the daemon PID (`client.rs:77`), leaving node processes running. The new daemon reattaches to them via the registry on startup. So both the manual button and the auto-restart are invisible from a node-operator perspective.
Test plan
- `cargo fmt --check`: clean
- `cargo clippy --all-targets -- -D warnings`: clean
- `cargo check`: clean
- `npm run test:run`: 21/21 pass
- `daemon.pid`, relaunch app, verify auto-restart log line appears and daemon PID changes
Deferred to follow-up
- A version field in `DaemonInfo`/`DaemonStatus` would let us check version equality instead of mtimes (more robust if macOS/Windows installers ever preserve sidecar mtimes). This needs an upstream ant-core change.
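For comparison, the mtime heuristic that a version check would replace can be sketched as follows. This is a minimal sketch under stated assumptions, not the PR's `is_daemon_stale`: the names `sidecar_is_newer`, `is_stale_by_mtime`, and `STALE_TOLERANCE` are hypothetical, and how errors are handled in the real code is not known from the description.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// Tolerance from the PR description (5 s); the constant name is hypothetical.
const STALE_TOLERANCE: Duration = Duration::from_secs(5);

/// True when the sidecar was modified more than `tolerance` after the pid file.
fn sidecar_is_newer(sidecar: SystemTime, pid: SystemTime, tolerance: Duration) -> bool {
    match sidecar.duration_since(pid) {
        Ok(delta) => delta > tolerance,
        // `duration_since` fails when the sidecar is older than the pid file.
        Err(_) => false,
    }
}

/// Hypothetical wrapper: read both mtimes from disk and compare them.
fn is_stale_by_mtime(sidecar: &Path, pid_file: &Path) -> io::Result<bool> {
    let sidecar_mtime = fs::metadata(sidecar)?.modified()?;
    let pid_mtime = fs::metadata(pid_file)?.modified()?;
    Ok(sidecar_is_newer(sidecar_mtime, pid_mtime, STALE_TOLERANCE))
}
```

Keeping the comparison separate from the filesystem reads makes the tolerance logic easy to test. It also shows the weak spot noted above: an installer that preserves sidecar mtimes would make the check return `false` after an update.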