
Delete should also attempt graceful stop before hard kill#109

Open
sjmiller609 wants to merge 5 commits into main from codex/describe-hypeman-stop-behavior

Conversation

@sjmiller609 (Collaborator) commented Feb 26, 2026

Summary

  • reduce the default stop timeout to 5s and centrally resolve it via resolveStopTimeout
  • share the graceful shutdown attempt between stop and delete flows by trying a guest shutdown before forcing the hypervisor to die
  • clarify the README: when the graceful stop times out, we fall back to a hypervisor shutdown

Testing

  • stop and delete functionality is covered by the existing tests

I asked how well the existing tests cover this:

Short answer: partially yes, but not fully for the new fallback edge cases.

What is covered well:

What is not strongly covered:

  • No focused test in stop.go that forces the exact fallback chain (graceful timeout -> hypervisor shutdown -> SIGKILL -> reap).
  • No deterministic test around timeout behavior/config (DefaultStopTimeout/resolveStopTimeout) in stop.go:18.

Confidence level:

  • High for normal lifecycle behavior.
  • Medium for “stuck/hung VM process” fallback behavior.
    So I would not say “fully confident” for that edge case without targeted tests.

Note

Medium Risk
Touches instance lifecycle shutdown/delete paths and adds SIGKILL/Wait4 logic, which could affect VM termination and resource cleanup if edge cases are missed. CI now enforces gofmt, reducing risk of formatting-only drift but not validating runtime behavior.

Overview
Updates instance lifecycle behavior so delete attempts a graceful guest shutdown (via the guest-agent) when the VM is Running, using a centrally resolved stop timeout.

Refactors stop/delete to share resolveStopTimeout + tryGracefulGuestShutdown, reduces the default stop timeout to 5s, and strengthens stop fallback behavior to shut down the hypervisor and, if the process is still alive, SIGKILL and reap it; docs are updated to reflect this.

Adds a CI step on Linux and macOS to fail builds on unformatted Go code (gofmt -l), with the remainder of the diff being formatting-only changes.

Written by Cursor Bugbot for commit edf0161. This will update automatically on new commits.

@sjmiller609 sjmiller609 changed the title Align stop timeout and delete behavior for graceful shutdown Delete should also attempt graceful stop before hard kill Feb 26, 2026

@cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

```go
	if !gracefulShutdown {
		log.DebugContext(ctx, "graceful shutdown before delete did not complete", "instance_id", id)
	}
}
```


Delete flow leaks gRPC connection back into pool

Low Severity

Step 3 calls guest.CloseConn(dialer.Key()) to remove the pooled gRPC connection before killing the hypervisor. But step 4's tryGracefulGuestShutdown then calls guest.ShutdownInstance, which calls guest.GetOrCreateConn, creating a new connection and adding it back into connPool. After the instance is deleted, this new connection is never removed from the pool. Each delete of a running instance leaks one gRPC connection entry (with associated reconnection goroutines) that accumulates for the lifetime of the server process.

Additional Locations (1)
