The current implementation of Eio.Net.run_server provides no error handling at all if the call to "accept" fails. This means that the whole server is terminated and all running response fibers are canceled.
Is this on purpose?
The typical cause of such a failure is "too many open files", which in this case means "too many open/accepted sockets". A typical soft limit is 1024 and a typical hard limit is 2048. However, the default value for max_connections is max_int, which means every naive use of Eio.Net.run_server is vulnerable to a simple denial-of-service attack as soon as more than 1024 or 2048 connections are open in parallel.
Naively I would have expected (but maybe there are reasons against doing so?):
- that either max_connections has a low enough default value that this won't happen so easily (unless request handlers open lots of files or connections of their own),
- or that the call to accept or accept_fork is guarded so that on an exception only that single connection is lost. The server would then call on_error (or a separate on_accept_error), wait for a short time (somewhere in the range of 5-500 milliseconds, perhaps also a parameter to run_server), and finally retry the next accept or accept_fork.
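To make the second option concrete, here is a minimal sketch of the guard-report-wait-retry loop using only the standard library. It is not Eio's implementation: accept_loop, fake_accept and max_iterations are hypothetical names, the accept, error callback and wait are passed in as plain functions, and the loop is bounded only so that the demo terminates.

```ocaml
(* Hypothetical sketch: none of these names are part of Eio's API. *)

(* Call [accept] repeatedly. When it raises, report the exception through
   [on_accept_error], call [wait] to back off, then try again.
   [max_iterations] exists only so the demo terminates. *)
let accept_loop ~accept ~on_accept_error ~wait ~max_iterations =
  for _i = 1 to max_iterations do
    match accept () with
    | () -> ()
    | exception ex ->
      on_accept_error ex;
      wait ()
  done

(* Counters used to observe what the loop did. *)
let calls = ref 0
let accepted = ref 0
let errors = ref 0
let waits = ref 0

(* Simulated accept: fails on the 2nd and 3rd call, as if the process had
   run out of file descriptors, and succeeds otherwise. *)
let fake_accept () =
  incr calls;
  if !calls = 2 || !calls = 3 then failwith "too many open files"
  else incr accepted

let () =
  accept_loop
    ~accept:fake_accept
    ~on_accept_error:(fun _ex -> incr errors)
    ~wait:(fun () -> incr waits)
    ~max_iterations:5
```

In a real version the wait would be an actual sleep or a yield, and cancellation exceptions would have to be re-raised rather than reported and retried, so that the server can still be stopped.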
Moreover, I noticed that:
- The semaphore is not properly released in that case. Since it is acquired outside of accept_fork, it should be released by a protection around accept_fork, not by a protection of the request handler called by accept_fork.
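The release problem can be illustrated with a toy model, again using only the standard library. Everything here is a stand-in and an assumption: free_slots is a plain counter rather than Eio.Semaphore, and fake_accept_fork is a hypothetical synchronous replacement for accept_fork that swallows handler exceptions (roughly the role on_error plays) and raises only when the accept itself fails.

```ocaml
(* A toy counting "semaphore": it only tracks free slots and never blocks. *)
let free_slots = ref 2
let acquire () = decr free_slots
let release () = incr free_slots

(* Hypothetical stand-in for accept_fork: raises when [fail] is true (the
   accept itself failed); otherwise runs [handler] and swallows any
   exception it raises. *)
let fake_accept_fork ~fail handler =
  if fail then failwith "too many open files"
  else (try handler () with _ -> ())

(* The slot is acquired before the accept, so it is released on two paths:
   by the handler's [Fun.protect] when a connection was accepted, and by
   the exception branch when the accept failed and no handler ever ran. *)
let accept_one ~fail handler =
  acquire ();
  match
    fake_accept_fork ~fail (fun () -> Fun.protect handler ~finally:release)
  with
  | () -> ()
  | exception ex ->
    release ();
    raise ex

let () =
  (* A failed accept followed by a successful one. *)
  (try accept_one ~fail:true (fun () -> ()) with Failure _ -> ());
  accept_one ~fail:false (fun () -> ())
```

With the release only inside the handler, the failed accept in this demo would leave free_slots at 1 instead of 2.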
My questions are:
- Would you accept a PR that fixes this?
- If so, which solution would you prefer?
- For the waiting time, should we adjust the interface of Eio.Net.run_server to require a clock? Or be tricky and use Eio_unix.sleep? Or just call Eio.Fiber.yield without any actual waiting time (a delay could still be provided by the user within on_accept_error)?