
Add non-blocking DNS for the async API via c-ares event-loop integration #324

Draft
bjosv wants to merge 5 commits into valkey-io:main from bjosv:use-cares-async

bjosv commented Jun 22, 2026

Collaborator

Even with c-ares enabled for timeout-bounded DNS, the async API still blocks the calling thread during DNS resolution. This happens because valkeyAsyncConnectWithOptions() resolves DNS before it returns, using the same blocking call as the sync API. For applications that use async I/O to avoid blocking (e.g. serving many connections on a single thread), a slow DNS lookup stalls the entire event loop.

This PR builds on the c-ares sync DNS support (#323) and makes DNS fully non-blocking for the async API. When the libevent adapter is used, DNS resolution is driven by the event loop, so no thread blocks during DNS.

See the commit "Add c-ares async DNS resolve for non-blocking connect"; the other commits are from #323.

How it works:

When c-ares is enabled and an async connection is made, DNS is no longer resolved before the function returns. Instead:

  1. The connect call returns immediately with no socket fd; DNS is deferred.
  2. When the application attaches an event-loop adapter (e.g. libevent), the adapter sees the pending DNS and acts:
    • Libevent adapter: starts a non-blocking DNS query via c-ares. The c-ares file descriptors and timers are registered with the event loop, so DNS completes as part of normal event processing without blocking.
    • Other adapters (libev, libuv, poll, etc.): resolve DNS synchronously with a timeout before proceeding. This still blocks briefly but is bounded by connect_timeout. These adapters can be upgraded to full async DNS by implementing the c-ares hooks.
  3. Once DNS resolves, a TCP socket is created and connect() is called (non-blocking, as usual). The event loop then handles connect completion.
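Step 3 above is the standard non-blocking connect sequence. The following is a minimal sketch using only POSIX calls, not the PR's actual code: the function names `start_nonblocking_connect`, `finish_connect`, and `demo_connect` are hypothetical, and `poll()` stands in for the event loop that would normally report the fd as writable.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Start a non-blocking connect to an already-resolved address.
 * Returns the socket fd (the connect may still be in progress), or -1. */
int start_nonblocking_connect(const struct sockaddr *addr, socklen_t len) {
    int fd = socket(addr->sa_family, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    /* EINPROGRESS is the expected result for a non-blocking connect. */
    if (connect(fd, addr, len) < 0 && errno != EINPROGRESS) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Called once the fd is reported writable: SO_ERROR holds the outcome.
 * Returns 0 on success, otherwise an errno value. */
int finish_connect(int fd) {
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) return errno;
    return err;
}

/* Demo: connect to a local listener on an ephemeral loopback port. */
int demo_connect(void) {
    struct sockaddr_in sa;
    socklen_t salen = sizeof(sa);
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = 0; /* let the kernel pick a free port */

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) return -1;
    if (bind(lfd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&sa, &salen) < 0) {
        close(lfd);
        return -1;
    }

    int fd = start_nonblocking_connect((struct sockaddr *)&sa, salen);
    if (fd < 0) {
        close(lfd);
        return -1;
    }

    /* poll() plays the role of the event loop waiting for writability. */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int rc = -1;
    if (poll(&pfd, 1, 1000) == 1) rc = finish_connect(fd);

    close(fd);
    close(lfd);
    return rc;
}
```

In an event-loop adapter, the `poll()` call would be replaced by registering the fd for write events and calling the equivalent of `finish_connect` from the write handler.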

Design decisions:

  • c-ares callbacks can fire synchronously (e.g. IP literals like 127.0.0.1). A done/failed flag pattern ensures cleanup happens safely outside the c-ares callback, preventing use-after-free.
  • A VALKEY_DNS_BLOCKING_FALLBACK macro makes it trivial for adapters to opt in to the blocking fallback, one line per adapter.
  • _EL_ADD_READ/_EL_ADD_WRITE macros suppress event registration while DNS is pending, preventing operations on an invalid fd.
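The done/failed flag pattern from the first bullet can be illustrated with a small self-contained sketch. Everything here is hypothetical and does not use c-ares: `fake_resolve` is a stand-in resolver that always invokes its callback synchronously, and `dns_request`, `on_resolved`, and `resolve_host` are illustrative names, not the PR's types or functions. The point is that the callback only records the outcome, while the caller frees the request after the resolver call has returned.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-request state; not the PR's actual struct. */
typedef struct dns_request {
    bool done;     /* set by the callback on success */
    bool failed;   /* set by the callback on failure */
    char addr[64]; /* resolved address, valid when done is true */
} dns_request;

typedef void (*resolve_cb)(void *arg, int status, const char *addr);

/* Stand-in resolver: treats strings of digits and dots as IP literals and
 * fires the callback synchronously, before returning to the caller. */
void fake_resolve(const char *host, resolve_cb cb, void *arg) {
    bool literal = host[0] != '\0' &&
                   strspn(host, "0123456789.") == strlen(host);
    cb(arg, literal ? 0 : -1, literal ? host : NULL);
    /* A real resolver may still touch its own state here, which is why
     * the callback must not free anything the resolver depends on. */
}

/* The callback only records the result; it never frees the request. */
void on_resolved(void *arg, int status, const char *addr) {
    dns_request *req = arg;
    if (status == 0) {
        snprintf(req->addr, sizeof(req->addr), "%s", addr);
        req->done = true;
    } else {
        req->failed = true;
    }
}

/* Returns 0 and fills out on success, -1 on failure. */
int resolve_host(const char *host, char *out, size_t outlen) {
    dns_request *req = calloc(1, sizeof(*req));
    if (req == NULL) return -1;

    fake_resolve(host, on_resolved, req);

    /* Back outside the callback: inspect the flags, then clean up. */
    int rc = -1;
    if (req->done) {
        snprintf(out, outlen, "%s", req->addr);
        rc = 0;
    }
    /* With this synchronous stand-in one flag is always set. In an
     * event-driven version the request would stay alive while neither
     * flag is set, and be freed from the event loop later. */
    free(req);
    return rc;
}
```

For example, `resolve_host("127.0.0.1", buf, sizeof(buf))` returns 0 with the literal copied into `buf`, while a non-literal name fails in this stand-in because it performs no real lookup.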

Custom adapters:

Third-party adapters must add VALKEY_DNS_BLOCKING_FALLBACK(ac) to their attach function, or implement the c-ares hooks for full async DNS. See the documentation in docs/standalone.md.
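The blocking fallback is described above as bounded by connect_timeout. As a rough illustration of what a deadline-bounded wait looks like, here is a sketch using plain `poll()` and a monotonic clock. It does not use c-ares or any libvalkey API; `now_ms`, `wait_readable_bounded`, and `demo_bounded_wait` are hypothetical names, and a pipe stands in for the resolver's sockets.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds on the monotonic clock, unaffected by wall-clock jumps. */
long long now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Wait for an event on fd, but never longer than timeout_ms in total.
 * Returns 1 if poll reports an event, 0 on timeout, -1 on error. */
int wait_readable_bounded(int fd, int timeout_ms) {
    long long deadline = now_ms() + timeout_ms;
    for (;;) {
        long long remaining = deadline - now_ms();
        if (remaining <= 0) return 0;
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int n = poll(&pfd, 1, (int)remaining);
        if (n > 0) return 1;
        if (n == 0) return 0;
        if (errno != EINTR) return -1;
        /* Interrupted by a signal: loop and recompute the remaining
         * time so the total wait stays within the deadline. */
    }
}

/* Demo: an idle pipe times out, a written pipe becomes readable. */
int demo_bounded_wait(void) {
    int p[2];
    if (pipe(p) < 0) return -1;

    long long start = now_ms();
    int idle = wait_readable_bounded(p[0], 50); /* nothing written yet */
    long long waited = now_ms() - start;

    int wrote = (write(p[1], "x", 1) == 1);
    int ready = wait_readable_bounded(p[0], 50);

    close(p[0]);
    close(p[1]);
    return (idle == 0 && wrote && ready == 1 && waited < 1000) ? 0 : -1;
}
```

Recomputing the remaining time on every iteration is what keeps the bound on total elapsed time rather than on each individual `poll()` call.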

Testing:

  • New ct_async_dns integration test exercising both the libevent async path and the standalone connect flow

bjosv added 4 commits June 22, 2026 14:40
Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
When built with USE_CARES=1 (make) or -DENABLE_CARES=ON (CMake),
DNS resolution in the sync API uses c-ares with a poll loop bounded
by connect_timeout (defaulting to 5s). This prevents indefinite hangs
when DNS is slow or unresponsive.

The c-ares path uses ARES_OPT_SOCK_STATE_CB for fd tracking, and a
short-lived channel per resolve call. IPv4/IPv6 fallback behavior is
preserved.

Without enabling c-ares, behavior is unchanged (plain getaddrinfo).

Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
When c-ares is enabled, the async connect path defers DNS until the
event adapter is attached. The libevent adapter initiates async DNS
on attach, with c-ares fds and timers driven by the event loop.

On DNS completion, socket() + connect() proceed as normal and the
fd is registered with the event loop for connect completion.

Adapters without c-ares hooks fall back to blocking DNS.

Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>