test: add Dask chunk grid benchmark scaffold #2465
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##             main    #2465   +/-   ##
=======================================
  Coverage   85.60%   85.60%
=======================================
  Files          49       49
  Lines        7671     7671
=======================================
  Hits         6567     6567
  Misses       1104     1104
I updated the title to a semantic PR title. I don’t seem to have permission to add labels on this repo, but the current validation failures look like triage metadata rather than code failures. Could a maintainer please add the appropriate labels, likely
So to this end, it might make sense for you @ehsanestaji to create a separate repo with benchmarks that can be run. This repo would produce a graph/table at the end and then propose changes/reasons for those changes. Does that sound reasonable? I don't really see why Then there would be a PR to update the defaults, link to the benchmarking effort, and perhaps write a small documentation note/page explaining things here.
Summary
This adds an exploratory benchmark scaffold for #2036 so we can compare virtual Dask chunk choices against HDF5/Zarr on-disk chunk layouts before changing AnnData defaults.
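To make the comparison concrete, here is a minimal sketch of the chunk arithmetic involved. It is not code from the benchmark script; the helper names are hypothetical, and it assumes the Dask chunks are aligned multiples of the on-disk chunks. It uses the 3000x800 shape and the 250x800 on-disk chunks from the local grid in this PR.

```python
import math


def resolve_chunks(shape, chunks):
    """Replace -1 entries with the full axis length (Dask's convention for -1)."""
    return tuple(dim if c == -1 else c for dim, c in zip(shape, chunks))


def n_blocks(shape, chunks):
    """Number of blocks a regular chunking produces over an array of this shape."""
    chunks = resolve_chunks(shape, chunks)
    return math.prod(math.ceil(dim / c) for dim, c in zip(shape, chunks))


def disk_chunks_per_block(shape, disk_chunks, dask_chunks):
    """On-disk chunks touched by one Dask block.

    Assumption: each Dask chunk length is a multiple of the on-disk chunk
    length along the same axis, so blocks start on on-disk chunk boundaries.
    """
    dask_chunks = resolve_chunks(shape, dask_chunks)
    return math.prod(math.ceil(v / d) for v, d in zip(dask_chunks, disk_chunks))


shape = (3000, 800)
disk = (250, 800)

print(n_blocks(shape, (250, 800)))   # 12 blocks when Dask chunks mirror the on-disk chunks
print(n_blocks(shape, (1000, -1)))   # 3 blocks with 1000x-1
print(disk_chunks_per_block(shape, disk, (1000, -1)))  # 4 on-disk chunks per block
```

Fewer, larger blocks mean fewer tasks in the Dask graph, while the total number of on-disk chunks read stays the same (12 in both cases here).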
The benchmark runner:
- X arrays with controlled HDF5/Zarr chunks and optional Zarr v3 shards
- X lazily through anndata.experimental.read_elem_lazy
- scanpy_normalize_log1p workload
This also adds a small notebook for summarizing the generated CSV and README instructions for smoke/larger-grid runs. Generated benchmark outputs are ignored under benchmarks/results.
Local signal
A modest local grid (3000x800, HDF5/Zarr, on-disk chunks 250x800 and 1000x800, default vs 1000x-1, 1/2 workers, sum_axis0 and scanpy_normalize_log1p) produced 32 rows. For small on-disk chunks (250x800), 1000x-1 reduced task counts and improved timings in the 1-worker Scanpy-style case by about 1.16x for HDF5 and 1.26x for Zarr.
These numbers are only an initial local smoke signal; the intent is to make the benchmark/review path available before proposing default behavior changes.
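The 32-row count follows directly from the grid dimensions listed above: five parameters with two values each. The sketch below expands that grid with `itertools.product`. The dictionary keys and the `expand_grid` helper are hypothetical names chosen for illustration and may not match the columns the benchmark script actually writes.

```python
from itertools import product

# Parameter values taken from the local grid described above.
# Key names are illustrative assumptions, not the script's real column names.
GRID = {
    "store": ["hdf5", "zarr"],
    "disk_chunks": [(250, 800), (1000, 800)],
    "dask_chunks": ["default", (1000, -1)],
    "n_workers": [1, 2],
    "workload": ["sum_axis0", "scanpy_normalize_log1p"],
}


def expand_grid(grid):
    """Return one dict per combination of parameter values (full Cartesian product)."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*grid.values())]


rows = expand_grid(GRID)
print(len(rows))  # 32 = 2 * 2 * 2 * 2 * 2
print(rows[0])
```

A full Cartesian product is the simplest design for a small smoke grid; a larger run would likely need to prune combinations to keep runtime manageable.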
Checks
- ruff check benchmarks/scripts/dask_chunk_grid.py tests/test_dask_chunk_grid_script.py
- .venv/bin/python -m pytest tests/test_dask_chunk_grid_script.py -q
- python3 -m json.tool benchmarks/notebooks/dask_chunk_grid_analysis.ipynb
- git diff --check