Skip to content

Fix event handler leak in ExecuteLocal by unregistering handlers on cleanup#951

Open
kshinjo-pfr wants to merge 1 commit intoros2:rollingfrom
kshinjo-pfr:fix-event-handler-leak
Open

Fix event handler leak in ExecuteLocal by unregistering handlers on cleanup#951
kshinjo-pfr wants to merge 1 commit intoros2:rollingfrom
kshinjo-pfr:fix-event-handler-leak

Conversation

@kshinjo-pfr
Copy link

@kshinjo-pfr kshinjo-pfr commented Mar 4, 2026

Description

Fixes #565.

ExecuteLocal registers 6 event handlers in execute() but never unregisters them when the process exits normally. This causes a memory leak proportional to the number of processes launched over the lifetime of a LaunchService, as the handlers (and all objects they reference, including the action itself) remain in context._event_handlers indefinitely.

Additionally, _shutdown_process() registers a new OnProcessStart handler each time it is called before the process has started, which can also accumulate.

Measured impact: On a robotics system running ROS 2 Jazzy with 12 ros2 launch processes (each managing composable node containers), we observed ~56 MB/h memory growth from Python launch processes alone, leading to OOM kills every 17-29 hours.

Problem demonstration

The new tests confirm the leak exists on unpatched code:

Test Without fix With fix Description
test_event_handlers_cleaned_up_after_process_exit FAIL (2→8, 6 handlers leaked) PASS A single process leaks 6 handlers after exit
test_multiple_processes_handler_cleanup FAIL (2→32, 30 handlers leaked) PASS 5 processes leak 6 handlers each (5×6=30)
test_event_handlers_stable_during_respawn PASS PASS Handlers are registered once per action, not per respawn cycle

The first two tests fail without this fix, proving that event handlers accumulate in context._event_handlers and are never cleaned up. Each ExecuteLocal instance leaks exactly 6 handlers.

Changes

  • Store registered event handlers and context as instance variables
  • Add __unregister_event_handlers() helper method
  • Unregister handlers in __flush_buffers / __flush_cached_buffers after all ProcessExited handlers (including on_exit) have had a chance to fire
  • When process fails to start (no ProcessExited event), unregister handlers directly
  • Fix _shutdown_process() deferred handler accumulation: track and replace the OnProcessStart handler instead of creating new ones
  • Add 3 tests verifying handler cleanup and stability during respawn

Design note

Handlers cannot be unregistered in __cleanup() because __cleanup() runs after emit_event(ProcessExited) enqueues the event but before the event is actually processed. Unregistering there would prevent on_exit callbacks from firing. Instead, unregistration is deferred to __flush_buffers / __flush_cached_buffers, which are themselves OnProcessExit handlers and run after all other ProcessExited handlers (including on_exit). They check __completed_future.done() to determine if the action is truly complete (i.e., not respawning).

Testing

colcon test --packages-select launch --return-code-on-test-failure --event-handlers=console_direct+ --pytest-args -k "test_execute_local"

Is this user-facing behavior change?

No. This is an internal cleanup that prevents memory leaks. All existing behavior (respawn, on_exit callbacks, shutdown) is preserved.

Did you use Generative AI?

Yes. Claude (Anthropic) was used to assist with code analysis, implementation, and test writing. All changes were reviewed and verified by a human.

Additional Information

The 6 event handlers registered per ExecuteProcess instance are:

  1. EventHandler for ShutdownProcess
  2. EventHandler for SignalProcess
  3. OnProcessIO for stdin/stdout/stderr
  4. OnShutdown for graceful shutdown
  5. OnProcessExit for user's on_exit callback
  6. OnProcessExit for buffer flushing

These were never unregistered, keeping the ExecuteLocal instance and its I/O buffers alive for the entire lifetime of the LaunchService.

@kshinjo-pfr kshinjo-pfr marked this pull request as draft March 4, 2026 13:06
@kshinjo-pfr kshinjo-pfr force-pushed the fix-event-handler-leak branch from b4fe2ad to b3ecefd Compare March 4, 2026 13:21
…leanup

ExecuteLocal registers 6 event handlers in execute() but never
unregisters them when the process exits. This causes a memory leak
proportional to the number of processes launched over the lifetime of
a LaunchService, as the handlers and all objects they reference remain
in context._event_handlers indefinitely.

This change:
- Stores registered event handlers and context as instance variables
- Adds __unregister_event_handlers() helper that safely removes all
  handlers registered by this action
- Unregisters handlers in __flush_buffers/__flush_cached_buffers after
  all ProcessExited handlers (including on_exit) have fired
- Unregisters handlers when process fails to start (no ProcessExited)
- Fixes _shutdown_process() deferred handler accumulation by tracking
  and replacing the OnProcessStart handler instead of creating new ones
- Adds tests verifying handler cleanup and stability during respawn

Fixes ros2#565

Signed-off-by: Koki Shinjo <kshinjo@pfrobotics.jp>
@kshinjo-pfr kshinjo-pfr force-pushed the fix-event-handler-leak branch from b3ecefd to 54a8377 Compare March 4, 2026 13:22
@kshinjo-pfr kshinjo-pfr marked this pull request as ready for review March 4, 2026 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Actions (particularly ExecuteProcess) exist indefinitely

1 participant