Trap exit for graceful shutdown #317

Open
FOT-BOT wants to merge 2 commits into commanded:master from STUDITEMPS:master
FOT-BOT commented Jan 28, 2026

In our test suite we observed the following noisy log lines: `[info] Postgrex.Protocol (#PID<0.867.0>) disconnected: ** (DBConnection.ConnectionError) client #PID<0.868.0> exited`.

They seem to be caused by calling `Application.stop(:eventstore)` at the end of our test case. The root cause appears to be that `AdvisoryLocks` is killed without getting a chance to check in a `DBConnection` it currently holds. Trapping exits is sufficient to allow a normal connection check-in before termination.
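
For illustration, the mechanism described above can be sketched with a minimal GenServer. This is not the diff in this PR and not the actual `AdvisoryLocks` code: `LockHolderSketch` and its `:notify` option are hypothetical names, and the message sent from `terminate/2` merely stands in for checking a held connection back in.

```elixir
defmodule LockHolderSketch do
  # Hypothetical stand-in for a process that holds a resource
  # (for example a checked-out database connection).
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    # Without this flag, the :shutdown exit signal sent by the supervisor
    # terminates the process immediately and terminate/2 is never invoked.
    Process.flag(:trap_exit, true)

    {:ok, %{notify: Keyword.fetch!(opts, :notify)}}
  end

  @impl true
  def terminate(reason, %{notify: pid}) do
    # Placeholder for the real cleanup (releasing the held resource).
    send(pid, {:cleaned_up, reason})
    :ok
  end
end
```

When such a process is started under a supervisor and the supervisor is stopped, `terminate/2` runs with reason `:shutdown`. The cleanup has to finish within the child's shutdown timeout (5000 ms by default for workers); after that the supervisor kills the child unconditionally.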

@SilvanCodes

Ah, sorry, I was logged in with the team bot account. This is a change made by me, not anything automated. ^^

drteeth (Contributor) commented Feb 4, 2026

We've seen this fix proposed a few times, so I understand that it is addressing something that people are actually seeing. Thanks for providing some context into where you are seeing the error and possible causes.

I get that this code solves your issue, but I don't understand the scope of the problem in the first place. Before accepting this I'd really want to understand what's going on.

Here's my current understanding of the advisory lock code:

  • EventStore.Supervisor starts an AdvisoryLock process and passes it a database connection to handle locks.

  • On startup, AdvisoryLock asks MonitoredServer to let it know when the connection dies.

  • MonitoredServer will attempt to restart that process in the event of a crash.

  • When a subscription is requested, SubscriptionFsm asks AdvisoryLock to take a lock.

  • AdvisoryLock does so and adds it to a list of locks that it is tracking.

  • AdvisoryLock monitors the SubscriptionFsm process that asked for the lock. When that process goes down for any reason, the lock is released.

So AdvisoryLock only monitors SubscriptionFsm and (indirectly) the PG connection. As far as I can tell, nobody links to AdvisoryLock other than the EventStore.Supervisor that started it.

If that's the case, then the :EXIT signals are coming from the parent, which means the whole tree is coming down. I think this matches what you are seeing. Unless I'm mistaken, it would mean AdvisoryLocks is getting a :shutdown message from the supervisor.

I don't understand the fix though. Who is sending :normal as a shutdown reason, and is trying to ignore it the best solution? Presumably this will be killed brutally in 5000ms?

Long story short, while the request seems simple, I'm not yet seeing how this all plays together and I don't want to move forward until I do.

  1. Where are you seeing this?
     • In tests? dev? prod?

  2. Why is :normal treated differently?

  3. Can you reproduce easily?

  4. Can you figure out which processes correspond to 0.867.0 and 0.868.0?

  5. Help me understand why :normal is being treated differently.

  6. Can you add a failing test as the first commit or add a comment here that shows how to reproduce the error being logged?

cc @cdegroot @TylerPachal
