perf(numba/sizeshape): share the foreground pass across primitives #79
Merged
…ives

The four sizeshape primitives (spatial_moments_2d, convex_area_2d, perimeter_2d, crofton_euler_2d) each recomputed labels_to_offsets over the full raster, so it ran 4x per get_sizeshape call.

Add a _Prep NamedTuple wrapping the labels_to_offsets result (lut, n, offsets), the only prep every primitive shares, plus a _foreground_prep() helper; the primitives now take an optional `prep` arg and _sizeshape_2d threads one prep into all four. Each primitive still computes its own prep when called standalone (prep=None), so the public functions and their tests are unchanged.

The full foreground pixel list (rows/cols/obj nonzero) is needed only by the moment kernel, so it stays inside spatial_moments_2d rather than the shared prep; standalone perimeter_2d/crofton_euler_2d/convex_area_2d no longer pay for a full-raster nonzero they never read. convex_area_2d does its own nonzero over the boundary mask (sparse, not redundant).

Removes 3 redundant labels_to_offsets passes: ~9 ms / ~5% on the 1080^2 / 132-object tile. 24 tests unchanged (bit-exactness, dispatch, edge cases); ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
75d10e6 to
a4974d0
Folded into #78. The shared-prep optimization only touches |
Stacked on #78. Internal optimization of the numba sizeshape lane — no behavior change.
Problem
The four primitives in `core/numba/_sizeshape.py` each recomputed `labels_to_offsets(labels)` over the full raster independently, so it ran 4× per `get_sizeshape` call (moments, convex, perimeter, crofton/euler).
Fix
- Add a `_Prep` NamedTuple wrapping the `labels_to_offsets` result (`lut`, `n`, `offsets`), the only prep every primitive shares, plus `_foreground_prep(labels)`.
- The primitives take an optional `prep` arg; `_sizeshape_2d` computes one prep and threads it into all four.
- Each primitive still computes its own prep when called standalone (`prep=None`), so the public functions and their tests are unchanged.
- The full foreground pixel list (rows/cols/obj `nonzero`) is needed only by the moment kernel, so it stays inside `spatial_moments_2d`; standalone `perimeter_2d`/`crofton_euler_2d`/`convex_area_2d` no longer pay for a full-raster `nonzero` they never read. `convex_area_2d` does its own `nonzero` over the boundary mask (sparse, not redundant).
Impact
Removes 3 redundant `labels_to_offsets` passes: ~9 ms / ~5% on the 1080² / 132-object tile.
Verification
- `ruff check` + `ruff format --check`: clean.
- `test/test_sizeshape_numba.py`: 24 passed; bit-exactness, dispatch, and edge cases unchanged.
🤖 Generated with Claude Code