Add task-based async with persistent event loop #11

Closed
benoitc wants to merge 10 commits into main from feature/task-based-asgi
benoitc commented Mar 8, 2026

Summary

  • Implement task-based async architecture with a persistent event loop for ASGI
  • Use reactor.signal_write_ready() API for async completion signaling
  • Update for erlang-python API changes and minor optimizations

benoitc added 10 commits March 7, 2026 18:28
Implement socketpair-based FD reactor model as alternative to NIF-based
WSGI/ASGI marshalling. The new backend_mode option allows choosing between:
- nif: Traditional NIF-based marshalling (default, unchanged)
- fd_reactor: Socketpair + reactor model with streaming body support

New components:
- priv/hornbeam_http/: HTTP parser package extracted from gunicorn
  with PROXY protocol v1/v2 support
- priv/hornbeam_reactor_http.py: HTTP protocol for FD reactor
- src/hornbeam_proxy_protocol.erl: PROXY v2 binary encoder
- src/hornbeam_socketpair.erl: Unix socketpair creation
- src/hornbeam_reactor_pool.erl: Reactor context pool management
- src/hornbeam_proxy_bridge.erl: Request/response relay via socketpair

Modified hornbeam_handler.erl to route requests based on backend_mode.
HTTP/2 requests are translated to HTTP/1.1 for the reactor path.

Includes test suite and benchmarking framework for comparing modes.

The py:bind/1 function doesn't exist in current erlang_python.
Use py:contexts_started() and py:context() instead.

High-performance HTTP parser using picohttpparser library:
- Pure C Python extension (no Cython required)
- SIMD optimizations (SSE4.2/AVX2 on x86, NEON on ARM)
- 12.7x faster than pure Python parser
- ~1.7M requests/sec parsing throughput

Includes build_environ() and build_asgi_scope() helpers for
WSGI/ASGI integration.

Build: cd priv/hornbeam_http_fast && python setup.py build_ext --inplace

Add pico_parser_fast.c with zero-copy optimizations:
- Returns HttpRequest object with lazy attribute access
- Avoids dict/string allocations until accessed
- Achieves 8M+ req/s parsing throughput (50x faster than Python)

Update hornbeam_reactor_http.py to use fast parser:
- Prefer pico_parser_fast, fall back to pico_parser, then to the pure Python parser
- Refactor HTTPProtocol to use parsed_request dict format
- Implement fast body reading for Content-Length and chunked
- Use dict-based build_environ and build_asgi_scope
- Full WSGI cycle: 269k req/s, ASGI cycle: similar performance

Update setup.py to build pico_parser_fast extension.
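
The parser preference order described above can be sketched as an import fallback chain. This is a minimal illustration, not the code in hornbeam_reactor_http.py: the module names come from the commit message, while `load_parser` and the `"python"` sentinel are hypothetical names introduced here.

```python
import importlib

# Module names follow the commit message; everything else is illustrative.
PARSER_CANDIDATES = ("pico_parser_fast", "pico_parser")


def load_parser(candidates=PARSER_CANDIDATES):
    """Return (name, module) for the first importable parser.

    Falls back to ("python", None), meaning the caller should use the
    pure Python parser instead of a C extension.
    """
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue  # extension not built; try the next candidate
    return "python", None


parser_name, parser_module = load_parser()
```

Resolving the parser once at import time keeps the per-request hot path free of try/except import logic.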

- Reuse shared event loop instead of creating new one per request
- Add ASGIState class with __slots__ for efficient state management
- Move receive/send coroutines to module level to reduce allocations
- ASGI performance: 18k -> 27k req/s (50% improvement)
- Remaining overhead (~33µs) is inherent to run_until_complete()
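
A rough sketch of the three optimizations listed above, assuming a buffered (non-streaming) request. Only `ASGIState` and the use of `run_until_complete()` come from the commit message; the field names, `get_loop`, `run_asgi`, and `demo_app` are hypothetical.

```python
import asyncio
from functools import partial

_loop = None  # shared loop, created once and reused across requests


def get_loop():
    """Return the shared event loop, creating it on first use."""
    global _loop
    if _loop is None or _loop.is_closed():
        _loop = asyncio.new_event_loop()
    return _loop


class ASGIState:
    """Per-request state; __slots__ avoids a per-instance __dict__."""
    __slots__ = ("body", "body_sent", "status", "headers", "chunks")

    def __init__(self, body=b""):
        self.body = body
        self.body_sent = False
        self.status = None
        self.headers = ()
        self.chunks = []


_DISCONNECT = {"type": "http.disconnect"}  # allocated once, reused


# Module-level coroutines: no closures are defined per request.
async def receive(state):
    if not state.body_sent:
        state.body_sent = True
        return {"type": "http.request", "body": state.body, "more_body": False}
    return _DISCONNECT


async def send(state, message):
    if message["type"] == "http.response.start":
        state.status = message["status"]
        state.headers = message.get("headers", ())
    elif message["type"] == "http.response.body":
        state.chunks.append(message.get("body", b""))


def run_asgi(app, scope, body=b""):
    """Run one ASGI request to completion on the shared loop."""
    state = ASGIState(body)
    coro = app(scope, partial(receive, state), partial(send, state))
    get_loop().run_until_complete(coro)
    return state.status, b"".join(state.chunks)


async def demo_app(scope, receive, send):
    event = await receive()
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": event["body"].upper()})
```

Calling `run_asgi(demo_app, {"type": "http"}, b"hi")` twice reuses the same loop object; only the small `ASGIState` instance and two `partial` objects are allocated per request.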

When running inside the Erlang VM, use erlang.new_event_loop() instead
of asyncio.new_event_loop(). The Erlang-backed event loop provides:
- Sub-millisecond latency via enif_select
- Zero polling (event-driven)
- GIL release during I/O waits

Falls back to standard asyncio when running outside Erlang VM.
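
The fallback can be sketched as follows. It assumes the `erlang` module is importable only when the interpreter is embedded in the Erlang VM; `erlang.new_event_loop()` is named in the commit message, while the wrapper name `make_event_loop` is hypothetical.

```python
import asyncio


def make_event_loop():
    """Prefer the Erlang-backed loop; fall back to standard asyncio."""
    try:
        import erlang  # assumed importable only inside the Erlang VM
    except ImportError:
        return asyncio.new_event_loop()
    factory = getattr(erlang, "new_event_loop", None)
    if factory is None:
        # Older or unrelated 'erlang' module: use the standard loop.
        return asyncio.new_event_loop()
    return factory()


loop = make_event_loop()
result = loop.run_until_complete(asyncio.sleep(0, result="ready"))
loop.close()
```

Outside the Erlang VM this runs on a plain asyncio loop, so the same application code can be exercised in ordinary Python tests.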

Supports async_pending for non-blocking task submission.
Tasks signal completion via erlang.send(reactor_pid, {write_ready, fd}).

- Update hornbeam.erl to use py:contexts_started()
- Update hornbeam_handler.erl to use py:call (automatic routing)
- Update hornbeam_lifespan.erl to use py:call with timeout option
- Update hornbeam_socketpair.erl to use py_nif:fd_close
- Add local variable caching in HTTP protocol hot paths
- Use memoryview for zero-copy buffer writes
- Pre-allocate ASGI disconnect message

- Add StreamingASGISender with flow control (max pending chunks)
- Add StreamingASGIReceive for streaming request body from Erlang
- Add run_asgi_streaming() entry point for message-based streaming
- Add StreamingResponse and run_wsgi_streaming() for WSGI
- Add handle_asgi_streaming/handle_wsgi_streaming to handler
- Add stream_body_loop for forwarding chunks to client
- Add handle_streaming to proxy bridge with stream_response
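
One way to get the "max pending chunks" flow control is a bounded queue between the application and the forwarding side. The sketch below is an assumption about the general technique, not the actual `StreamingASGISender`: the class `BoundedSender`, its `max_pending` parameter, and the `None` end-of-stream sentinel are all hypothetical.

```python
import asyncio


class BoundedSender:
    """ASGI send callable that applies backpressure via a bounded queue."""

    def __init__(self, max_pending=4):
        # put() suspends the application once max_pending chunks are waiting.
        self._queue = asyncio.Queue(maxsize=max_pending)

    async def __call__(self, message):
        if message["type"] != "http.response.body":
            return  # status and headers are ignored in this sketch
        await self._queue.put(message.get("body", b""))
        if not message.get("more_body", False):
            await self._queue.put(None)  # end-of-stream sentinel

    async def chunks(self):
        """Yield body chunks in order until the stream ends."""
        while True:
            chunk = await self._queue.get()
            if chunk is None:
                return
            yield chunk


async def demo():
    sender = BoundedSender(max_pending=2)

    async def app():
        for i in range(5):
            await sender({"type": "http.response.body",
                          "body": b"%d" % i, "more_body": True})
        await sender({"type": "http.response.body",
                      "body": b"", "more_body": False})

    producer = asyncio.ensure_future(app())
    received = [chunk async for chunk in sender.chunks()]
    await producer
    return received
```

With `max_pending=2` the application can run at most two chunks ahead of the consumer, which bounds memory use when the client reads slowly.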

- Remove PROXY v2 overhead, use plain HTTP/1.1
- Optimize header map construction with maps:from_list
- Optimize write_all with non-blocking retry loop
- Fix fd_reactor_SUITE test setup and assertions
- Fix fd_reactor_benchmark application startup and formatting
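
For illustration, a non-blocking `write_all` retry loop over a socketpair might look like the sketch below. The commit does not say on which side (Erlang or Python) `write_all` lives or how it waits, so this is an assumption written in Python; it waits for writability with `select` rather than spinning, and slices a `memoryview` so partial writes do not copy the remaining data.

```python
import select
import socket


def write_all(sock, data):
    """Write all of data to a non-blocking socket, retrying partial writes."""
    view = memoryview(data)  # slicing a memoryview does not copy bytes
    total = 0
    while total < len(view):
        try:
            total += sock.send(view[total:])
        except (BlockingIOError, InterruptedError):
            # Kernel buffer is full: block until the socket is writable.
            select.select([], [sock], [])
    return total


left, right = socket.socketpair()
left.setblocking(False)
sent = write_all(left, b"hello over socketpair")
echoed = right.recv(1024)
left.close()
right.close()
```

A short payload fits in the socket buffer and is written in one call; larger payloads go through the retry branch whenever the peer drains more slowly than the writer produces.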
benoitc force-pushed the feature/task-based-asgi branch from d379f3f to 4621bfe on March 8, 2026 15:05
benoitc closed this Mar 24, 2026