Executor job add fast path #162758
Pull request overview
This PR aims to micro-optimize HomeAssistant.async_add_executor_job by adding a fast path to decide whether an executor future should be tracked as a normal task or a background task, based on a marker set on background tasks created by async_create_background_task.
Changes:
- Add a marker attribute to tasks created via async_create_background_task.
- Use that marker as a fast path in async_add_executor_job to avoid a set membership check on the common background-task path.
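The idea summarized above can be sketched in isolation. This is a minimal sketch, not the PR's code: `MiniHass`, `create_task`, `create_background_task`, `add_executor_job` and `demo` are hypothetical stand-ins, and the fallback to a membership check for unmarked tasks is an assumption made for illustration. Only the marker name `ha_background_task` is taken from the diff in this conversation.

```python
import asyncio
from collections.abc import Callable, Coroutine
from typing import Any


class MiniHass:
    """Toy stand-in for HomeAssistant; only the task bookkeeping is modeled."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self.loop = loop
        self._tasks: set[asyncio.Future] = set()
        self._background_tasks: set[asyncio.Future] = set()

    def create_task(self, target: Coroutine[Any, Any, Any], name: str) -> asyncio.Task:
        """Create a tracked (foreground) task without a marker."""
        task = self.loop.create_task(target, name=name)
        self._tasks.add(task)
        task.add_done_callback(self._tasks.remove)
        return task

    def create_background_task(
        self, target: Coroutine[Any, Any, Any], name: str
    ) -> asyncio.Task:
        """Create a background task and mark it with an attribute."""
        task = self.loop.create_task(target, name=name)
        task.ha_background_task = True  # marker read by the fast path below
        self._background_tasks.add(task)
        task.add_done_callback(self._background_tasks.remove)
        return task

    def add_executor_job(self, func: Callable[..., Any], *args: Any) -> asyncio.Future:
        """Run func in the default executor and file the future in a bucket."""
        current = asyncio.current_task()
        if getattr(current, "ha_background_task", False):
            # Fast path: attribute lookup, no hashing of the current task.
            bucket = self._background_tasks
        elif current in self._tasks:
            # Assumed fallback: the membership check for unmarked tasks.
            bucket = self._tasks
        else:
            bucket = self._background_tasks
        future = self.loop.run_in_executor(None, func, *args)
        bucket.add(future)
        future.add_done_callback(bucket.remove)
        return future


async def demo() -> tuple[bool, bool]:
    """Report whether each kind of task filed its job in the expected bucket."""
    hass = MiniHass(asyncio.get_running_loop())
    seen: dict[str, bool] = {}

    async def background_worker() -> None:
        fut = hass.add_executor_job(sum, [1, 2, 3])
        seen["background"] = fut in hass._background_tasks
        await fut

    async def foreground_worker() -> None:
        fut = hass.add_executor_job(sum, [1, 2, 3])
        seen["foreground"] = fut in hass._tasks
        await fut

    await hass.create_background_task(background_worker(), "bg")
    await hass.create_task(foreground_worker(), "fg")
    return seen["background"], seen["foreground"]
```

Running `asyncio.run(demo())` exercises both paths; whether the attribute lookup is measurably cheaper than the set lookup is exactly what the benchmarks in this thread try to establish.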
|
I'm wondering about how you profiled this. Was it done before or after the latest fix? |
|
Benchmark code added. There are two commits to check, one containing the original and one the potentially optimized code. You can run the benchmark via: Here's a copy of three runs on my machine: first the change and second the original code. The potential optimization is, as mentioned, a micro-optimization, so the gain is marginal, as expected. Curious to see what it does on your machines and whether it's worth proceeding. |
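To get a feel for the two operations in isolation, outside Home Assistant, a rough `timeit` comparison could look like the sketch below. `FakeTask` and `compare` are hypothetical names, and the plain object is only a stand-in for an asyncio task, so the numbers say nothing definitive about the real code path.

```python
import timeit


class FakeTask:
    """Hashable stand-in for an asyncio task; hypothetical, for timing only."""


def compare(number: int = 200_000) -> tuple[float, float]:
    """Return (attribute_check_seconds, set_membership_seconds)."""
    marked = FakeTask()
    marked.ha_background = True  # marker attribute, as in the proposed change
    others = {FakeTask() for _ in range(1000)}
    others.add(marked)

    attr_time = timeit.timeit(
        lambda: getattr(marked, "ha_background", False), number=number
    )
    set_time = timeit.timeit(lambda: marked in others, number=number)
    return attr_time, set_time


if __name__ == "__main__":
    attr_time, set_time = compare()
    print(f"attribute check: {attr_time:.4f}s")
    print(f"set membership:  {set_time:.4f}s")
```

Both operations are very cheap per call, so any difference is likely to be small next to the cost of handing work to the executor.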
|
I think we should remove the commented code and clean up the commits and rebase to have two clean commits, so we can test this easily. |
|
|
||
| @benchmark | ||
| async def executor_job(hass: core.HomeAssistant) -> float: | ||
| """Schedule executor jobs from multiple background tasks.""" |
I think we should add another benchmark for non-background (foreground) tasks, so we can check that those are not negatively impacted by the change.

Added. Same sequence: first the potentially optimized code, then the original. I spot a slight improvement, but it is still a micro-optimization. At this stage I think it gets lost in the noise and is barely noticeable.
vscode ➜ /workspaces/ha-core (exectur-job-fast-path) $ source /home/vscode/.local/ha-venv/bin/activate
(ha-venv) vscode ➜ /workspaces/ha-core (exectur-job-fast-path) $ python -m homeassistant --script benchmark executor_job_foreground
Using event loop: _UnixSelectorEventLoop
WARNING:asyncio:Executing <Task pending name='Task-1' coro=<run_benchmark() running at /workspaces/ha-core/homeassistant/scripts/benchmark/__init__.py:49> wait_for=<Future pending cb=[Task.task_wakeup()] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:459> cb=[_run_until_complete_cb() at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:181] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/runners.py:109> took 4.125 seconds
Benchmark executor_job_foreground done in 4.227339091000431s
WARNING:asyncio:Executing <Task pending name='Task-4' coro=<run_benchmark() running at /workspaces/ha-core/homeassistant/scripts/benchmark/__init__.py:49> wait_for=<Future pending cb=[Task.task_wakeup()] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:459> cb=[_run_until_complete_cb() at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:181] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/runners.py:109> took 4.137 seconds
Benchmark executor_job_foreground done in 4.243188408000606s
WARNING:asyncio:Executing <Task pending name='Task-7' coro=<run_benchmark() running at /workspaces/ha-core/homeassistant/scripts/benchmark/__init__.py:49> wait_for=<Future pending cb=[Task.task_wakeup()] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:459> cb=[_run_until_complete_cb() at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:181] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/runners.py:109> took 3.895 seconds
Benchmark executor_job_foreground done in 4.008277234000161s
^CWARNING:asyncio:Executing <Task cancelling name='Task-10' coro=<run_benchmark() running at /workspaces/ha-core/homeassistant/scripts/benchmark/__init__.py:49> wait_for=<Future cancelled created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:459> cb=[_run_until_complete_cb() at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:181] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/runners.py:109> took 4.203 seconds
(ha-venv) vscode ➜ /workspaces/ha-core (exectur-job-fast-path) $ python -m homeassistant --script benchmark executor_job_foreground
Using event loop: _UnixSelectorEventLoop
WARNING:asyncio:Executing <Task pending name='Task-1' coro=<run_benchmark() running at /workspaces/ha-core/homeassistant/scripts/benchmark/__init__.py:49> wait_for=<Future pending cb=[Task.task_wakeup()] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:459> cb=[_run_until_complete_cb() at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:181] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/runners.py:109> took 4.144 seconds
Benchmark executor_job_foreground done in 4.2441349760001685s
WARNING:asyncio:Executing <Task pending name='Task-4' coro=<run_benchmark() running at /workspaces/ha-core/homeassistant/scripts/benchmark/__init__.py:49> wait_for=<Future pending cb=[Task.task_wakeup()] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:459> cb=[_run_until_complete_cb() at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:181] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/runners.py:109> took 4.022 seconds
Benchmark executor_job_foreground done in 4.125795158999608s
WARNING:asyncio:Executing <Task pending name='Task-7' coro=<run_benchmark() running at /workspaces/ha-core/homeassistant/scripts/benchmark/__init__.py:49> wait_for=<Future pending cb=[Task.task_wakeup()] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:459> cb=[_run_until_complete_cb() at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:181] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/runners.py:109> took 4.087 seconds
Benchmark executor_job_foreground done in 4.192483848999473s
^CWARNING:asyncio:Executing <Task cancelling name='Task-10' coro=<run_benchmark() running at /workspaces/ha-core/homeassistant/scripts/benchmark/__init__.py:49> wait_for=<Future cancelled created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:459> cb=[_run_until_complete_cb() at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/base_events.py:181] created at /home/vscode/.local/share/uv/python/cpython-3.14.3-linux-x86_64-gnu/lib/python3.14/asyncio/runners.py:109> took 4.145 seconds
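Whether the two sets of timings above differ by more than run-to-run noise can be checked with a few lines. This is a rough sketch: the values are the "done in" figures from the log, rounded to three decimals, with the first run taken as the optimized code and the second as the original, as stated in the comment. The "within noise" rule used here is an ad hoc assumption, not a proper statistical test.

```python
from statistics import mean, stdev

# "done in" timings in seconds, copied from the two runs above (rounded).
optimized = [4.227, 4.243, 4.008]
original = [4.244, 4.126, 4.192]


def summarize(samples: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) for a list of timings."""
    return mean(samples), stdev(samples)


opt_mean, opt_sd = summarize(optimized)
orig_mean, orig_sd = summarize(original)
gap = orig_mean - opt_mean

print(f"optimized: {opt_mean:.3f}s +/- {opt_sd:.3f}s")
print(f"original:  {orig_mean:.3f}s +/- {orig_sd:.3f}s")
# Ad hoc rule: treat the gap as noise if it is smaller than the larger spread.
print(f"gap: {gap:.3f}s, within noise: {abs(gap) < max(opt_sd, orig_sd)}")
```

With only three samples per variant this is no more than a sanity check; many more runs would be needed for a meaningful comparison.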
| # Use loop.create_task | ||
| # to avoid the extra function call in asyncio.create_task. | ||
| task = self.loop.create_task(target, name=name) | ||
| setattr(task, "ha_background", True) |
I think we should investigate setting an attribute on the foreground tasks that are tracked too.

Do you want to run the benchmarks yourself first and see if it's worth proceeding? Or do you like the idea and give me the green light to keep working on it?

I brought it back to two commits and added an attribute for the foreground tasks as well. I'm very curious to see what your benchmark results look like.

I can probably also remove the current variable at this stage. But let's first see how you want to proceed.
429f496 to 7f19cae
eecacf1 to 72a0f83
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| # Use loop.create_task | ||
| # to avoid the extra function call in asyncio.create_task. | ||
| task = self.loop.create_task(target, name=name) | ||
| setattr(task, "ha_background_task", True) | ||
| self._background_tasks.add(task) | ||
| task.add_done_callback(self._background_tasks.remove) |
| current = asyncio.current_task() | ||
| # In this code I skip an extra (el)if, because we default to background_tasks | ||
| # Both tasks are now labeled and I believe this is the fastest approach | ||
| task_bucket = ( |
| def _executor_func() -> None: | ||
| """Run in executor.""" | ||
| nonlocal count | ||
| count += 1 | ||
|
|
||
| @core.callback | ||
| def _check_done(future: asyncio.Future) -> None: | ||
| """Check if all jobs are done.""" | ||
| if count == jobs_to_run: | ||
| event.set() | ||
|
|
| def _executor_func() -> None: | ||
| """Run in executor.""" | ||
| nonlocal count | ||
| count += 1 | ||
|
|
||
| @core.callback | ||
| def _check_done(future: asyncio.Future) -> None: | ||
| """Check if all jobs are done.""" | ||
| if count == jobs_to_run: | ||
| event.set() | ||
|
|
|
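The benchmark snippets under review increment `count` inside `_executor_func`, which runs on an executor thread, and compare it in `_check_done`. A self-contained sketch of the same shape is shown below, with one deliberate change: the counter is updated in the done callback, which runs on the event loop thread. The function name `run_executor_benchmark`, the `done` counter and the use of `loop.run_in_executor` in place of `hass.async_add_executor_job` are assumptions made so the example runs without Home Assistant.

```python
import asyncio
import time


async def run_executor_benchmark(jobs_to_run: int = 1000) -> float:
    """Schedule jobs on the default executor; return seconds until all finish."""
    loop = asyncio.get_running_loop()
    event = asyncio.Event()
    done = 0

    def _executor_func() -> None:
        """Trivial job body; the scheduling overhead is what gets measured."""

    def _check_done(future: asyncio.Future) -> None:
        """Count completions on the loop thread, so no lock is needed."""
        nonlocal done
        done += 1
        if done == jobs_to_run:
            event.set()

    start = time.perf_counter()
    for _ in range(jobs_to_run):
        loop.run_in_executor(None, _executor_func).add_done_callback(_check_done)
    await event.wait()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = asyncio.run(run_executor_benchmark())
    print(f"1000 executor jobs done in {elapsed:.4f}s")
```

Counting in the callback avoids mutating shared state from several worker threads, at the cost of measuring callback dispatch together with job scheduling.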
I ran 50 loops each of the two added benchmarks with PR code and the original code, and I can't find any evidence that this PR improves the runtime of the code. If anything, the I'm not sure this is a viable concept. @erwindouna I'm closing this now, please don't hesitate to open a new PR if you can reliably show the change in the PR improves performance. |
|
It was a micro-optimization, and my benchmark was already 4 months old. I'm not quite sure what changed in the meantime. |
Breaking change
Proposed change
async_add_executor_job is rightfully considered an expensive resource. While digging through the code, I believe I found a micro-optimization that introduces a fast-path/hot-path code pattern.
Add an internal ha_background flag to background tasks and use it as a fast path in async_add_executor_job to avoid a set membership check. The idea is that an attribute check is faster than computing a hash and then checking for a hit. We are talking about micro- or nanoseconds, but I thought it was worth a shot to profile this:
Type of change