
fix(sandbox): summarize cleanup_all failures with per-agent error reporting#4131

Open
UGVicV wants to merge 17 commits into
orchestration-agent:mainfrom
UGVicV:fix/bounty-4066-cleanup-failures


UGVicV commented May 25, 2026

Description

This PR fixes AgentSandbox.cleanup_all() failing silently: when a sandbox cleanup failed, nothing reported which one, so operators could not tell which temporary workspaces remained after bulk cleanup.

Fix

  • Changed cleanup_all() return type from None to Dict[str, List] with succeeded and failed keys.
  • Each failed entry includes agent_id and error message for operator diagnostics.
  • Individual sandbox destroy failures are caught and collected, so one failure does not prevent cleanup of the other sandboxes.
  • Added logger.error() for each individual failure and logger.warning() when any failures occur.
  • Replaced print() with logger.error() throughout the file.
  • Removed unused os import.
  • Fixed all PEP 8 violations (all lines within the 79-character limit).
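
The collect-and-continue behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Sandbox` class, its `broken` flag, the `register()` helper, and the choice to store plain agent IDs in `succeeded` are all assumptions made for the example. Only the `cleanup_all()` name, the `succeeded`/`failed` keys, and the `agent_id` plus error message in each failed entry come from the description.

```python
import logging
from typing import Dict, List

logger = logging.getLogger(__name__)


class Sandbox:
    """Hypothetical stand-in for one agent's temporary workspace."""

    def __init__(self, agent_id: str, broken: bool = False) -> None:
        self.agent_id = agent_id
        self.broken = broken  # illustrative switch to force a failure

    def destroy(self) -> None:
        if self.broken:
            raise OSError(f"cannot remove workspace for {self.agent_id}")


class AgentSandbox:
    """Hypothetical registry; only cleanup_all() follows the PR text."""

    def __init__(self) -> None:
        self._sandboxes: Dict[str, Sandbox] = {}

    def register(self, sandbox: Sandbox) -> None:
        self._sandboxes[sandbox.agent_id] = sandbox

    def cleanup_all(self) -> Dict[str, List]:
        result: Dict[str, List] = {"succeeded": [], "failed": []}
        # Loop over a snapshot so entries can be removed while iterating.
        for agent_id, sandbox in list(self._sandboxes.items()):
            try:
                sandbox.destroy()
            except Exception as exc:
                # Record the failure and keep going with the rest.
                logger.error("cleanup failed for %s: %s", agent_id, exc)
                result["failed"].append(
                    {"agent_id": agent_id, "error": str(exc)}
                )
            else:
                del self._sandboxes[agent_id]
                result["succeeded"].append(agent_id)
        if result["failed"]:
            logger.warning(
                "%d sandbox cleanup(s) failed", len(result["failed"])
            )
        return result
```

With three registered sandboxes where the middle one is marked `broken`, the other two still land in `succeeded`, and the `failed` list carries a single entry naming the broken agent and its error text. Catching a broad `Exception` is a choice made for this sketch; the PR may catch a narrower set.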

Verification (Proof)

Added 3 regression tests in a new TestCleanupAll class:

  • test_cleanup_all_success: Verifies that all sandboxes are cleaned up and returned in the succeeded list.
  • test_cleanup_all_partial_fail: Verifies that a failed sandbox (mocked OSError) is collected in the failed list.
  • test_cleanup_all_empty: Verifies that cleanup_all() with no sandboxes returns an empty result.
  • Test run -> 72 passed in 0.07s
  • flake8 src/agent/sandbox.py tests/test_sandbox.py -> Passed (0 errors)
  • git diff --check -> Passed (0 errors)
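
For readers unfamiliar with the mocking approach, the partial-failure and empty cases might look something like the sketch below. It is not the test code from this PR: `FakeSandbox` and the free function `cleanup_all()` are hypothetical, simplified stand-ins so the example runs on its own, and the `"disk busy"` message is invented. Only the test names, the `TestCleanupAll` class name, and the mocked `OSError` come from the description.

```python
from typing import Dict, List
from unittest import mock


class FakeSandbox:
    """Hypothetical stand-in for one agent's sandbox."""

    def __init__(self, agent_id: str) -> None:
        self.agent_id = agent_id

    def destroy(self) -> None:
        """Pretend to delete the workspace; succeeds by default."""


def cleanup_all(sandboxes: Dict[str, FakeSandbox]) -> Dict[str, List]:
    """Simplified stand-in for the method under test."""
    result: Dict[str, List] = {"succeeded": [], "failed": []}
    for agent_id, sandbox in sandboxes.items():
        try:
            sandbox.destroy()
        except Exception as exc:
            result["failed"].append(
                {"agent_id": agent_id, "error": str(exc)}
            )
        else:
            result["succeeded"].append(agent_id)
    return result


class TestCleanupAll:
    def test_cleanup_all_partial_fail(self) -> None:
        sandboxes = {name: FakeSandbox(name) for name in ("a", "b", "c")}
        # Make only sandbox "b" raise when destroyed.
        with mock.patch.object(
            sandboxes["b"], "destroy", side_effect=OSError("disk busy")
        ):
            result = cleanup_all(sandboxes)
        assert result["succeeded"] == ["a", "c"]
        assert result["failed"] == [
            {"agent_id": "b", "error": "disk busy"}
        ]

    def test_cleanup_all_empty(self) -> None:
        assert cleanup_all({}) == {"succeeded": [], "failed": []}
```

Patching `destroy` on a single instance keeps the failure local to one agent, which is what lets the test confirm that the remaining sandboxes are still processed.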

Closes #4066.

Vic added 17 commits May 22, 2026 11:55


Development

Successfully merging this pull request may close these issues.

[ Bounty $4k ] [ Sandbox ] Summarize cleanup_all failures — bulk cleanup
