-
Notifications
You must be signed in to change notification settings - Fork 848
Fix NetAcceptAction::cancel() use-after-free race condition #12803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Fix a race condition between NetAcceptAction::cancel() and
NetAccept::acceptEvent() where the server pointer could be dereferenced
after the NetAccept object was deleted.
The race occurred as follows:
1. Thread T4 calls NetAcceptAction::cancel(), sets cancelled=true
2. Thread T3 running acceptEvent() sees cancelled==true
3. Thread T3 deletes the NetAccept (including embedded Server)
4. Thread T4 tries to call server->close() on freed memory
The fix uses std::atomic<Server*> with atomic exchange to ensure only
one thread can successfully obtain and use the server pointer. Both
cancel() and the cleanup paths before delete this atomically exchange
the pointer with nullptr - whichever succeeds first closes the server,
the other becomes a no-op.
This addresses the TODO comment that was in the code:
"// TODO fix race between cancel accept and call back"
ASAN report this fixes (seen intermittently on rocky CI builds):
==8850==ERROR: AddressSanitizer: heap-use-after-free on address 0x616000028cb4 at pc 0x000001346739 bp 0x7fa40fd2f580 sp 0x7fa40fd2f570
WRITE of size 4 at 0x616000028cb4 thread T4 ([ET_NET 2])
#0 0x1346738 in UnixSocket::close() ../src/iocore/eventsystem/UnixSocket.cc:138
#1 0x12b44ed in Server::close() ../src/iocore/net/Server.cc:88
#2 0x121fb95 in NetAcceptAction::cancel(Continuation*) ../src/iocore/net/P_NetAccept.h:71
#3 0x7fa41686d082 in TSActionCancel(tsapi_action*) ../src/api/InkAPI.cc:5828
...
0x616000028cb4 is located 308 bytes inside of 576-byte region [0x616000028b80,0x616000028dc0)
freed by thread T3 ([ET_NET 1]) here:
#0 0x7fa416d2036f in operator delete(void*, unsigned long)
#1 0x12593c4 in NetAccept::~NetAccept() ../src/iocore/net/P_NetAccept.h:128
#2 0x12bebf0 in NetAccept::acceptEvent(int, void*) ../src/iocore/net/UnixNetAccept.cc:484
...
2d0b259 to
6743be0
Compare
| Server *s = action_->server.exchange(nullptr, std::memory_order_acq_rel); | ||
| if (s != nullptr) { | ||
| s->close(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is repetitive and could be an issue if the repeated code needs to change. I recommend adding a function to clear and close, or perhaps calling cancel on the action is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a race condition between NetAcceptAction::cancel() and NetAccept::acceptEvent() that could result in a use-after-free when the server pointer is dereferenced after the NetAccept object is deleted. The fix uses std::atomic<Server*> with atomic exchange operations to ensure only one thread can successfully obtain and close the server.
Changes:
- Converted
NetAcceptAction::serverfrom a raw pointer tostd::atomic<Server*> - Implemented atomic exchange pattern in all cleanup paths to prevent use-after-free
- Updated initialization to use atomic store with proper memory ordering
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/iocore/net/P_NetAccept.h | Changed server member to atomic pointer, updated cancel() to use atomic exchange, removed TODO comment about race condition |
| src/iocore/net/UnixNetProcessor.cc | Updated server pointer initialization to use atomic store with release semantics |
| src/iocore/net/QUICNetProcessor.cc | Updated server pointer initialization to use atomic store with release semantics |
| src/iocore/net/UnixNetAccept.cc | Added atomic exchange before server.close() in all event handler cleanup paths (acceptEvent, acceptFastEvent, acceptLoopEvent) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Fix a race condition between NetAcceptAction::cancel() and NetAccept::acceptEvent() where the server pointer could be dereferenced after the NetAccept object was deleted.
The race occurred as follows:
The fix uses std::atomic<Server*> with atomic exchange to ensure only one thread can successfully obtain and use the server pointer. Both cancel() and the cleanup paths before delete this atomically exchange the pointer with nullptr - whichever succeeds first closes the server, the other becomes a no-op.
This addresses the TODO comment that was in the code: "// TODO fix race between cancel accept and call back"
ASAN report this fixes (seen intermittently on rocky CI builds):