WsClient: stale _wsa reference causes permanent RuntimeError on reconnect #147

@tobsh

Hi @Voyz, first off thank you for maintaining ibind 🙏. It's become the
backbone of our IBKR integration and we really appreciate the work you put
into it.

I think I've tracked down a reconnect bug in WsClient and wanted to
share what we found in case it's useful.

Describe the bug

After a transient WebSocket disconnect, WsClient._new_websocket_app()
raises "RuntimeError: WebSocketApp should be closed before attempting to
create a new one" on every reconnect attempt, because the stale _wsa
reference is never cleared. The client never recovers: it enters a retry
loop that emits repeated "Thread already running: ws_client_thread-None"
warnings until the process is restarted.

Two code paths seem to combine to cause this.

1. _new_websocket_app raises instead of recovering
(ibind/base/ws_client.py line 188 on master, v0.1.23):

if self._wsa is not None:
    raise RuntimeError(f'{self}: WebSocketApp should be closed before attempting to create a new one')

2. _handle_on_close returns early without clearing _wsa
(ibind/base/ws_client.py lines 308–310 on master):

if not self._connected:
    _LOGGER.info(f'{self}: on_close event while disconnected')
    return   # self._wsa still points at the dead WSA

A close event that arrives while the client is disconnected leaves _wsa
stale. The next _new_websocket_app() call then sees that _wsa is not None
and raises, so the reconnect never proceeds.
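
To make the interaction easier to follow, here is a minimal toy model of the two branches. ToyWsClient and reproduce are hypothetical names, not ibind code, and the non-early-return branch of _handle_on_close (clearing _wsa on a normal close) is an assumption about what the real method does.

```python
class ToyWsClient:
    """Hypothetical stand-in for WsClient; models only the two quoted branches."""

    def __init__(self):
        self._wsa = None
        self._connected = False

    def _new_websocket_app(self):
        if self._wsa is not None:
            raise RuntimeError('WebSocketApp should be closed before attempting to create a new one')
        self._wsa = object()  # placeholder for a real WebSocketApp
        return True

    def _handle_on_close(self):
        if not self._connected:
            return  # early return: self._wsa is left untouched
        # Assumed normal path: a close while connected clears the reference.
        self._connected = False
        self._wsa = None


def reproduce():
    """Return the error message raised by the second connection attempt, if any."""
    client = ToyWsClient()
    client._new_websocket_app()  # first attempt creates a WSA
    client._handle_on_close()    # close arrives while _connected is still False
    try:
        client._new_websocket_app()  # reconnect attempt
    except RuntimeError as error:
        return str(error)
    return None


print(reproduce())
```

In this model every later reconnect attempt fails the same way, because nothing ever resets _wsa.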

Steps to Reproduce

  1. Connect IbkrWsClient to a gateway.
  2. Cause a disconnect (kill the gateway, let IBeam auth expire, or pull the network).
  3. Wait for restart_on_close to trigger a reconnect.
  4. Observe RuntimeError: WebSocketApp should be closed before attempting to create a new one.
  5. Observe repeated WARNING: Thread already running: ws_client_thread-None.
  6. Client never recovers. Only a process restart fixes it.

Expected Behaviour

Reconnection after a transient disconnect should succeed. A stale _wsa
reference from the previous connection cycle shouldn't block future
reconnects.

Additional context

We've been running a local patch with the change below for several weeks
in a live trading daemon. Before patching, we saw repeated bursts of
Thread already running warnings after every disconnect; after, the
reconnect path is clean and the client recovers on its own. Nothing else
appears to change.

We also have unit tests covering the patched behaviour (stale-WSA
force-close, close-failure handling, happy path, and the
_handle_on_close cleanup branch). Happy to port them if that's useful.

I also came across some earlier reports that mention overlapping symptoms
(#25, #81, #129). They were closed due to inactivity, so I wasn't sure
whether to resurrect one of those or open something new. Happy to do
whichever is easier on your end.

Possible Solution

A couple of small changes seem to address it.

In _new_websocket_app, force-close the stale WSA and join the dead
thread instead of raising:

def _new_websocket_app(self) -> bool:
    if self._wsa is not None:
        _LOGGER.warning(f'{self}: Force-closing stale WebSocketApp before reconnect')
        try:
            self._wsa.close()
        except Exception:
            _LOGGER.debug(f'{self}: Stale WSA close failed, abandoning', exc_info=True)
        self._wsa = None
        if self._thread is not None:
            self._thread.join(timeout=5)
            self._thread = None
    # ... proceed with existing WSA-creation logic
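
As a rough illustration of how the stale-WSA force-close and close-failure cases could be exercised, here is a sketch against a toy class rather than the real WsClient. PatchedToyWsClient and both check functions are hypothetical, and the "fresh" WSA is a MagicMock standing in for whatever the existing creation logic builds.

```python
import logging
import threading
from unittest.mock import MagicMock

_LOGGER = logging.getLogger('toy_ws_client')


class PatchedToyWsClient:
    """Hypothetical stand-in that copies only the proposed guard logic."""

    def __init__(self):
        self._wsa = None
        self._thread = None

    def _new_websocket_app(self):
        if self._wsa is not None:
            _LOGGER.warning('Force-closing stale WebSocketApp before reconnect')
            try:
                self._wsa.close()
            except Exception:
                _LOGGER.debug('Stale WSA close failed, abandoning', exc_info=True)
            self._wsa = None
            if self._thread is not None:
                self._thread.join(timeout=5)
                self._thread = None
        # Placeholder for the existing WSA-creation logic.
        self._wsa = MagicMock(name='fresh_wsa')
        return True


def check_stale_wsa_is_force_closed():
    client = PatchedToyWsClient()
    stale = MagicMock(name='stale_wsa')
    client._wsa = stale
    worker = threading.Thread(target=lambda: None)  # finishes immediately
    worker.start()
    client._thread = worker
    ok = client._new_websocket_app()
    stale.close.assert_called_once()
    return ok and client._wsa is not stale and client._thread is None


def check_close_failure_is_swallowed():
    client = PatchedToyWsClient()
    stale = MagicMock(name='stale_wsa')
    stale.close.side_effect = OSError('socket already gone')
    client._wsa = stale
    return client._new_websocket_app() and client._wsa is not stale
```

The join uses a timeout so that a thread that refuses to exit cannot block the reconnect indefinitely; in that case the old thread is simply abandoned.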

In _handle_on_close, clear _wsa on the early-return path when the
client isn't running:

if not self._connected:
    _LOGGER.info(f'{self}: on_close event while disconnected')
    if not self._running:
        self._wsa = None
    return
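
A small sketch of what this branch does in each state, again using a hypothetical toy class (ToyCloseHandler) and an assumed normal-close path:

```python
class ToyCloseHandler:
    """Hypothetical stand-in modelling only the patched on-close branch."""

    def __init__(self, running):
        self._wsa = object()  # placeholder for a dead WebSocketApp
        self._connected = False
        self._running = running

    def _handle_on_close(self):
        if not self._connected:
            if not self._running:
                self._wsa = None
            return
        # Assumed normal path: a close while connected clears the reference.
        self._connected = False
        self._wsa = None


stopped = ToyCloseHandler(running=False)
stopped._handle_on_close()

still_running = ToyCloseHandler(running=True)
still_running._handle_on_close()
```

After these calls, stopped._wsa is None while still_running._wsa is kept. In the still-running case the stale reference is left for the force-close in _new_websocket_app to handle, so the two changes are meant to work together.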

If this looks right to you, I'd be glad to put up a PR. Just let me know
whether you'd prefer that or to take it from here. Either works!

Your Environment

  • IBind version: 0.1.22 (also reproduces against master, v0.1.23)
  • Python version: 3.13
  • Authentication method: IBeam
  • Operating System and version: macOS (launchd-managed daemon)

Thanks again for all your work on this library 🙏
