Hi @Voyz, first off thank you for maintaining ibind 🙏. It's become the
backbone of our IBKR integration and we really appreciate the work you put
into it.
I think I've tracked down a reconnect bug in WsClient and wanted to
share what we found in case it's useful.
Describe the bug
After a transient WebSocket disconnect, WsClient._new_websocket_app()
raises RuntimeError: WebSocketApp should be closed before attempting to create a new one on every reconnect attempt, because the stale _wsa
reference is never cleared. The client never recovers. It enters a retry
loop emitting repeated Thread already running: ws_client_thread-None
warnings until the process is restarted.
Two code paths seem to combine to cause this.
1. _new_websocket_app raises instead of recovering
(ibind/base/ws_client.py line 188 on master, v0.1.23):
    if self._wsa is not None:
        raise RuntimeError(f'{self}: WebSocketApp should be closed before attempting to create a new one')
2. _handle_on_close returns early without clearing _wsa
(ibind/base/ws_client.py lines 308–310 on master):
    if not self._connected:
        _LOGGER.info(f'{self}: on_close event while disconnected')
        return  # self._wsa still points at the dead WSA
A close-while-disconnected leaves _wsa stale. The next
_new_websocket_app() call then sees _wsa is not None and raises, so the
reconnect never proceeds.
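To make the interaction concrete, here is a stripped-down model of the two paths. It is not ibind code: ToyWsClient and count_failed_reconnects are hypothetical stand-ins that only mirror the two snippets quoted above, and the non-early-return branch of _handle_on_close is an assumption about what a normal close does.

```python
class ToyWsClient:
    """Hypothetical stand-in that mirrors the two code paths quoted above."""

    def __init__(self):
        self._wsa = None
        self._connected = False

    def _new_websocket_app(self):
        # Path 1: raise instead of recovering when a WSA reference exists.
        if self._wsa is not None:
            raise RuntimeError('WebSocketApp should be closed before attempting to create a new one')
        self._wsa = object()  # placeholder for a real WebSocketApp
        return True

    def _handle_on_close(self):
        # Path 2: early return that leaves _wsa untouched.
        if not self._connected:
            return
        # Assumption: a normal close resets the connection state.
        self._connected = False
        self._wsa = None


def count_failed_reconnects(client, attempts=3):
    """Try to create a new WSA several times and count the failures."""
    failures = 0
    for _ in range(attempts):
        try:
            client._new_websocket_app()
        except RuntimeError:
            failures += 1
    return failures


client = ToyWsClient()
client._new_websocket_app()  # a WSA is created, but _connected is still False
client._handle_on_close()    # close arrives while disconnected: early return
failed = count_failed_reconnects(client)
print(f'{failed} of 3 reconnect attempts failed')
```

In this model every later attempt fails, because nothing on either path ever resets _wsa.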
Steps to Reproduce
- Connect IbkrWsClient to a gateway.
- Cause a disconnect (kill the gateway, let IBeam auth expire, or pull the network).
- Wait for restart_on_close to trigger a reconnect.
- Observe RuntimeError: WebSocketApp should be closed before attempting to create a new one.
- Observe repeated WARNING: Thread already running: ws_client_thread-None.
- Client never recovers. Only a process restart fixes it.
Expected Behaviour
Reconnection after a transient disconnect should succeed. A stale _wsa
reference from the previous connection cycle shouldn't block future
reconnects.
Additional context
We've been running a local patch with the change below for several weeks
in a live trading daemon. Before patching, we saw repeated bursts of
Thread already running warnings after every disconnect; after, the
reconnect path is clean and the client recovers on its own. Nothing else
appears to change.
We also have unit tests covering the patched behaviour (stale-WSA
force-close, close-failure handling, happy path, and the
_handle_on_close cleanup branch). Happy to port them if that's useful.
I also came across some earlier reports that mention overlapping symptoms
(#25, #81, #129). They were closed due to inactivity, so I wasn't sure
whether to resurrect one of those or open something new. Happy to do
whichever is easier on your end.
Possible Solution
A couple of small changes seem to address it.
In _new_websocket_app, force-close the stale WSA and join the dead
thread instead of raising:
    def _new_websocket_app(self) -> bool:
        if self._wsa is not None:
            _LOGGER.warning(f'{self}: Force-closing stale WebSocketApp before reconnect')
            try:
                self._wsa.close()
            except Exception:
                _LOGGER.debug(f'{self}: Stale WSA close failed, abandoning', exc_info=True)
            self._wsa = None
            if self._thread is not None:
                self._thread.join(timeout=5)
                self._thread = None
        # ... proceed with existing WSA-creation logic
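As a rough illustration of the close-failure case, the sketch below runs the same logic against a fake WSA whose close() always raises. FailingWsa and PatchedToyClient are hypothetical names, not ibind classes, and the object() placeholder stands in for the existing WSA-creation logic.

```python
import logging
import threading

_LOGGER = logging.getLogger(__name__)


class FailingWsa:
    """Hypothetical WSA stand-in whose close() always fails."""

    def close(self):
        raise OSError('socket already gone')


class PatchedToyClient:
    """Hypothetical client carrying only the patched method."""

    def __init__(self):
        self._wsa = None
        self._thread = None

    def _new_websocket_app(self) -> bool:
        if self._wsa is not None:
            _LOGGER.warning('Force-closing stale WebSocketApp before reconnect')
            try:
                self._wsa.close()
            except Exception:
                # A failed close is logged and the stale WSA is abandoned.
                _LOGGER.debug('Stale WSA close failed, abandoning', exc_info=True)
            self._wsa = None
            if self._thread is not None:
                self._thread.join(timeout=5)
                self._thread = None
        self._wsa = object()  # placeholder for the existing WSA-creation logic
        return True


client = PatchedToyClient()
client._wsa = FailingWsa()  # simulate a stale WSA left over from the last cycle
client._thread = threading.Thread(target=lambda: None)
client._thread.start()      # simulate a worker thread that has already finished
recovered = client._new_websocket_app()
```

The bounded join(timeout=5) keeps a hung thread from blocking the reconnect indefinitely, at the cost of possibly abandoning a thread that is still winding down.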
In _handle_on_close, clear _wsa on the early-return path when the
client isn't running:
    if not self._connected:
        _LOGGER.info(f'{self}: on_close event while disconnected')
        if not self._running:
            self._wsa = None
        return
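A small sketch of how this branch behaves in both states, using a hypothetical ToyCloseHandler rather than the real WsClient. The line after the early return is an assumption about the normal close path and is not exercised here.

```python
class ToyCloseHandler:
    """Hypothetical stand-in exercising only the patched early-return branch."""

    def __init__(self, running):
        self._wsa = object()   # pretend a WSA from the last cycle is still set
        self._connected = False
        self._running = running

    def _handle_on_close(self):
        if not self._connected:
            if not self._running:
                self._wsa = None  # client is stopped: safe to drop the reference
            return
        self._connected = False   # assumption: normal close path, not reached here


stopped = ToyCloseHandler(running=False)
stopped._handle_on_close()

still_running = ToyCloseHandler(running=True)
still_running._handle_on_close()

cleared_when_stopped = stopped._wsa is None
kept_when_running = still_running._wsa is not None
```

While the client is still running, the reference is left in place on purpose; that case relies on the force-close in _new_websocket_app during the next reconnect.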
If this looks right to you, I'd be glad to put up a PR. Just let me know
whether you'd prefer that or to take it from here. Either works!
Your Environment
- IBind version: 0.1.22 (also reproduces against master, v0.1.23)
- Python version: 3.13
- Authentication method: IBeam
- Operating System and version: macOS (launchd-managed daemon)
Thanks again for all your work on this library 🙏