Skip to content

Fix guest cmd errors and improve resiliency#17

Open
lacraig2 wants to merge 3 commits into
mainfrom
fix_guest_cmd
Open

Fix guest cmd errors and improve resiliency#17
lacraig2 wants to merge 3 commits into
mainfrom
fix_guest_cmd

Conversation

@lacraig2

@lacraig2 lacraig2 commented Nov 5, 2025

Copy link
Copy Markdown
Contributor

This PR fixes #16.

It also handles ConnectionResetError by trying again if there is a connection reset.

This behavior may not be ideal if it's possible for us to re-run commands.

We might consider a lesser behavior where we do slightly different things if the transaction breaks at different points in the transaction. @zestrada

@lacraig2 lacraig2 requested a review from zestrada November 5, 2025 18:01

@zestrada zestrada left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds retry logic to handle transient connection failures when communicating with a guest via Unix socket. The function now automatically retries on ConnectionResetError exceptions, improving resilience.

  • Added retry mechanism with configurable max retries and delay parameters
  • Improved socket resource management with proper null checks
  • Enhanced error handling to distinguish between recoverable (ConnectionResetError) and non-recoverable (OSError) errors

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread guest_cmd.py Outdated
Comment thread guest_cmd.py
Comment on lines +45 to +48
except ConnectionResetError as e:
print(f"Connection reset by peer (attempt {attempt+1}/{max_retries}): {e}", file=sys.stderr)
attempt += 1
time.sleep(retry_delay)

Copilot AI Nov 5, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The retry logic only increments attempt for ConnectionResetError, but the while loop condition checks attempt < max_retries at the start of each iteration. This means after the last retry fails, the loop will execute one more time unnecessarily. Consider moving the increment to the beginning of the loop or adjusting the loop condition.

Copilot uses AI. Check for mistakes.
Comment thread guest_cmd.py
Comment on lines +52 to +53
except SystemExit as e:
sys.exit(e.code)

Copilot AI Nov 5, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catching and re-raising SystemExit inside the retry loop causes the process to exit on the first retry, preventing any subsequent retry attempts. The SystemExit exception should be re-raised outside the retry loop or handled differently to allow retries to occur.

Suggested change
except SystemExit as e:
sys.exit(e.code)
# Let SystemExit propagate naturally; do not catch it here.

Copilot uses AI. Check for mistakes.
Comment thread guest_cmd.py
Comment on lines +57 to +59
else:
print("Failed to connect after multiple attempts.", file=sys.stderr)
sys.exit(1)

Copilot AI Nov 5, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The while-else clause will execute even when the loop is exited via break statements (lines 43, 51), not just when max_retries is exceeded. This means success cases that break out of the loop will incorrectly trigger the failure message and exit. The else clause should only execute when retries are exhausted.

Copilot uses AI. Check for mistakes.
Comment thread guest_cmd.py
import time


def run_guest(unix_socket, port, command, use_stdio=True, max_retries=3, retry_delay=1):

Copilot AI Nov 5, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixing implicit and explicit returns may indicate an error, as implicit returns always return None.

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Error during guest command

3 participants