feat(numba/sizeshape): unified numba sizeshape backend (~3x, bit-exact) #78
Draft
timtreis wants to merge 8 commits into
First piece of the unified numba sizeshape lane.

`regionprops_table` computes the spatial/central/normalized/Hu moments (and inertia) per-region via an einsum whose contraction path is re-derived for every object. This adds a fused numba kernel (`core/numba/_sizeshape.py::_moment_kernel`) that computes the 16 raw + 16 central moment matrices in two passes over the foreground pixels (bbox-min tracked inline), using `labels_to_offsets` for the label->row map.

The derived quantities (normalized / Hu / inertia) live in the shared `primitives/_moments.py` (refactored to expose `derive_normalized_hu` / `normalized_from_central` / `hu_from_normalized` / `inertia_2d`), so the numpy scatter and the numba kernel share one derivation.

`spatial_moments_2d` (numba) is a drop-in for the numpy accumulator: same object order, bit-identical raw/central (0.00), normalized/Hu/inertia to round-off.

Measured: spatial moments 59 -> 17 ms (3.5x), large tile. Golden tests vs the numpy accumulator AND regionprops (multi / non-contiguous / edge-touching / single-pixel / inertia / empty).

NOT yet wired: the full `get_sizeshape` numba wrapper + dispatch registration, and the convex-hull / perimeter / euler kernels (next phases).

Built on the integration numba stack; rebases onto main+#77 when those land.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
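To make the two-pass structure concrete, here is a minimal pure-Python sketch, not the actual `_moment_kernel`. It assumes pass 1 gathers the per-label count, bbox minimum and coordinate sums, and pass 2 accumulates the 4x4 raw (bbox-local) and central matrices; the function name `moments_two_pass` and the dict-based layout are hypothetical, and the real kernel is numba-compiled and uses `labels_to_offsets` for the label->row map.

```python
ORDER = 4  # moment orders 0..3, i.e. 4 x 4 = 16 entries per matrix


def moments_two_pass(labels):
    """Return {label: (raw, central)}; matrices are indexed [row power][col power]."""
    # Pass 1: per-label pixel count, bbox minimum and coordinate sums.
    stats = {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab == 0:
                continue
            s = stats.setdefault(lab, [0, r, c, 0, 0])
            s[0] += 1              # pixel count
            s[1] = min(s[1], r)    # bbox min row
            s[2] = min(s[2], c)    # bbox min col
            s[3] += r              # sum of rows (for the centroid)
            s[4] += c              # sum of cols (for the centroid)

    out = {
        lab: ([[0.0] * ORDER for _ in range(ORDER)],
              [[0.0] * ORDER for _ in range(ORDER)])
        for lab in stats
    }

    # Pass 2: raw moments in bbox-local coordinates, central moments about the centroid.
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab == 0:
                continue
            n, r0, c0, sum_r, sum_c = stats[lab]
            local_r, local_c = r - r0, c - c0
            dr, dc = r - sum_r / n, c - sum_c / n
            raw, central = out[lab]
            for p in range(ORDER):
                for q in range(ORDER):
                    raw[p][q] += local_r ** p * local_c ** q
                    central[p][q] += dr ** p * dc ** q
    return out
```

Because each pixel only touches its own label's accumulators, the per-object work needs no per-region slicing or contraction-path setup, which is the cost the commit message attributes to the einsum route.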
…~2.8x)

Adds `convex_area_2d` to the numba sizeshape lane. skimage's `area_convex` is the pixel count inside each object's convex hull (pixels offset by ±0.5 diamond points, counted by `grid_points_in_poly`). The slow part is the per-region hull construction (scipy QHull + python overhead); the rasteriser is fast and convention-specific.

So we replace ONLY the hull construction with a fused numba monotone-chain (`_hull_kernel`) over each object's BOUNDARY pixels (hull(boundary) == hull(object), ~6x fewer points), and KEEP skimage's exact `grid_points_in_poly` for the raster. Result is bit-exact (proven 142/142) without porting the risky pnpoly.

Coordinates are x2-scaled so the offset points are integers and the monotone-chain hull is exact; each hull is rasterised in its bbox-local frame. Degenerate objects (single pixel / 1-wide line) fall back to the pixel count, matching skimage.

Measured: area_convex 112 -> 40 ms (~2.8x), large tile, 142/142 bit-exact. Golden tests vs regionprops (multi / irregular / non-contiguous / edge-touching / degenerate / empty).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
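The x2 scaling and the monotone chain can be sketched in plain Python as follows. This is an illustration under assumptions, not the `_hull_kernel` itself: the helper names `diamond_points_x2` and `monotone_chain` are hypothetical, and the rasterisation step (skimage's `grid_points_in_poly`) is deliberately left out.

```python
def diamond_points_x2(pixels):
    """Map each (row, col) pixel to its four +/-0.5 diamond points, scaled by 2.

    (r +/- 0.5, c) and (r, c +/- 0.5) become (2r +/- 1, 2c) and (2r, 2c +/- 1),
    so every coordinate is an integer and the cross products below are exact.
    """
    pts = set()
    for r, c in pixels:
        pts.update({(2 * r - 1, 2 * c), (2 * r + 1, 2 * c),
                    (2 * r, 2 * c - 1), (2 * r, 2 * c + 1)})
    return sorted(pts)


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); integer arithmetic only."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def monotone_chain(points):
    """Andrew's monotone chain; returns the strict hull vertices."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Pop while the last turn is clockwise or collinear.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

For a filled 3x3 block, feeding all nine pixels or only the eight boundary pixels yields the same hull, which is the hull(boundary) == hull(object) property the commit relies on to cut the point count.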
…ct, ~5.6x)

Adds the three deterministic neighbour-pattern features. skimage runs them per-region with C convolutions; a whole-image label-aware numba pass reproduces them bit-exact (each object's pattern histogram equals running skimage on the isolated object, with other labels read as background, and skimage's per-region 1px pad is the whole-image edge pad).

- `perimeter_2d` (4-connectivity): border image (fg with a non-same-label 4-neighbour), then per border pixel value = 1 + 2*(same-object 4-conn border) + 10*(same-object diagonal border), weighted by skimage's LUT. Only border-centre (odd) values carry weight, so non-border pixels are skipped.
- `crofton_euler_2d`: shares the 2x2 binary-config histogram (skimage's XF kernel, 16 bins); perimeter_crofton = crofton_coefs @ h, euler_number = euler_8conn_coefs @ h.

Measured: perimeter 3.5 ms, crofton+euler 3.8 ms vs skimage's 41 ms for all three (~5.6x). Bit-exact vs regionprops: perimeter/crofton ~1e-13, euler exactly 0. Golden tests incl. a holed object (euler topology), non-contiguous, edge-touching, irregular.

Lane status: moments+inertia, convex hull, perimeter/crofton/euler all implemented & bit-exact. Remaining: the get_sizeshape wrapper + dispatch registration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
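As an illustration of the label-aware 2x2 histogram, the sketch below builds a 16-bin histogram per label in one whole-image pass (zero-padded at the edges, other labels read as background) and derives the 8-connectivity Euler number from it. It is a hedged stand-in: the bit ordering and the classic bit-quad formula (Q1 - Q3 - 2*QD) / 4 are used here in place of skimage's coefficient vectors, which are not reproduced, and the names `quad_histograms` and `euler_8conn` are hypothetical.

```python
def quad_histograms(labels):
    """Per-label 16-bin histogram of 2x2 configurations over the padded image."""
    h, w = len(labels), len(labels[0])

    def at(r, c):
        # Zero padding outside the image plays the role of the 1px edge pad.
        return labels[r][c] if 0 <= r < h and 0 <= c < w else 0

    hists = {}
    for r in range(-1, h):
        for c in range(-1, w):
            quad = (at(r, c), at(r, c + 1), at(r + 1, c), at(r + 1, c + 1))
            # Each label present in the window sees every other label as background.
            for lab in set(quad) - {0}:
                # Bits: 1 = top-left, 2 = top-right, 4 = bottom-left, 8 = bottom-right.
                code = sum(1 << i for i, v in enumerate(quad) if v == lab)
                hists.setdefault(lab, [0] * 16)[code] += 1
    return hists


def euler_8conn(hist):
    """Euler number (8-connectivity) from bit-quad counts."""
    q1 = sum(hist[k] for k in (1, 2, 4, 8))       # exactly one pixel set
    q3 = sum(hist[k] for k in (7, 11, 13, 14))    # exactly three pixels set
    qd = hist[6] + hist[9]                        # the two diagonal pairs
    return (q1 - q3 - 2 * qd) // 4
```

The all-background bin (code 0) is never filled in this sketch because windows are only attributed to labels they contain; it carries no weight in the Euler formula used here.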
…tion B, ~3x)
Ties the kernels into the full `get_sizeshape` feature dict and registers the numba
sizeshape backend (previously a NO-GO). Option B: axes / eccentricity / orientation are
derived from the kernel central moments (`primitives/_moments.axes_eccentricity_orientation`,
bit-exact incl. the symmetric ±pi/4 fallback), so the regionprops call is fully moment-free
(verified 0 ms einsum) — only cheap props (area/bbox/centroid/extent/area_filled/image) and
the scipy Euclidean EDT radius loop stay on host.
Sources: moments/inertia + axes/ecc/orientation -> moment kernel; area_convex/solidity ->
hull kernel; perimeter/crofton/euler -> pattern kernels; radii -> scipy EDT (kept). `to_bzyx`
2D-only; 3D volumes fall back to the numpy baseline.
Also fixes `bulk._numba_registries`, which the integration merge had left as un-parseable
garbage (interleaved docstrings / em-dashes / duplicate returns) — rewritten cleanly to
compose all numba backends (intensity, granularity, zernike, radial_zernikes,
radial_distribution, texture, feret, sizeshape, coloc+costes) and registers `sizeshape`.
Measured: get_sizeshape 305 -> 100 ms (~3x), all 78 features match the numpy backend
(raw moments / area_convex / euler exact; the rest <=4e-7). Tests: end-to-end vs numpy
get_sizeshape (multi / non-contiguous / calculate_advanced+new_features variants), 3D
fallback, and set_accelerator("numba") dispatch routing.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
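For reference, deriving the axis features from central moments can be sketched as below. This is an assumption-laden illustration rather than the code in `primitives/_moments.axes_eccentricity_orientation`: the function name `axes_from_central_moments` is hypothetical, the formulas follow my reading of skimage's regionprops conventions (inertia tensor built from second-order central moments indexed by row/col power, axis length = 4 * sqrt(eigenvalue), ±pi/4 fallback when the tensor diagonal is symmetric), and eigenvalues are clipped at zero as a later commit in this PR describes.

```python
import math


def axes_from_central_moments(mu00, mu20, mu11, mu02):
    """Return (axis_major, axis_minor, eccentricity, orientation).

    mu_pq uses (row power, col power) indexing; all inputs are central moments.
    """
    # Inertia tensor [[a, b], [b, c]], normalised by the area mu00.
    a = mu02 / mu00
    b = -mu11 / mu00
    c = mu20 / mu00

    # Closed-form eigenvalues of a symmetric 2x2 matrix, clipped to >= 0 so
    # round-off on thin objects cannot produce sqrt of a negative number.
    half_trace = (a + c) / 2
    root = math.sqrt(((a - c) / 2) ** 2 + b * b)
    l1 = max(half_trace + root, 0.0)
    l2 = max(half_trace - root, 0.0)

    axis_major = 4 * math.sqrt(l1)
    axis_minor = 4 * math.sqrt(l2)
    eccentricity = math.sqrt(1 - l2 / l1) if l1 > 0 else 0.0

    if a - c == 0:
        # Symmetric case: the angle is ambiguous, so fall back to +/- pi/4.
        orientation = math.pi / 4 if b < 0 else -math.pi / 4
    else:
        orientation = 0.5 * math.atan2(-2 * b, c - a)
    return axis_major, axis_minor, eccentricity, orientation
```

A 2x4 pixel rectangle has mu00 = 8, mu20 = 2, mu02 = 10 and mu11 = 0, which gives eigenvalues 1.25 and 0.25, so the minor axis is 2.0 and the eccentricity is sqrt(0.8).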
6bacbef to
79d7336
Compare
…ives

The four sizeshape primitives (spatial_moments_2d, convex_area_2d, perimeter_2d, crofton_euler_2d) each recomputed labels_to_offsets over the full raster, so it ran 4x per get_sizeshape call.

Add a _Prep NamedTuple wrapping the labels_to_offsets result (lut, n, offsets), the only prep every primitive shares, plus a _foreground_prep() helper; the primitives now take an optional `prep` arg and _sizeshape_2d threads one prep into all four. Each primitive still computes its own prep when called standalone (prep=None), so the public functions and their tests are unchanged.

The full foreground pixel list (rows/cols/obj nonzero) is needed only by the moment kernel, so it stays inside spatial_moments_2d rather than the shared prep; standalone perimeter_2d/crofton_euler_2d/convex_area_2d no longer pay for a full-raster nonzero they never read. convex_area_2d does its own nonzero over the boundary mask (sparse, not redundant).

Removes 3 redundant labels_to_offsets passes: ~9 ms / ~5% on the 1080^2 / 132-object tile. 24 tests unchanged (bit-exactness, dispatch, edge cases); ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
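The optional-`prep` pattern described above might look roughly like this. Everything here is a hypothetical stand-in: `labels_to_offsets_stub` only imitates the (lut, n, offsets) shape because the real `labels_to_offsets` signature is not shown in this PR, and `area_2d` / `first_row_2d` are toy primitives chosen to keep the sketch short.

```python
from typing import NamedTuple


class Prep(NamedTuple):
    lut: dict      # label -> output row index
    n: int         # number of objects
    offsets: list  # assumed: cumulative pixel counts, length n + 1


def labels_to_offsets_stub(labels):
    """Hypothetical stand-in for the real helper: one pass over the raster."""
    counts = {}
    for row in labels:
        for lab in row:
            if lab:
                counts[lab] = counts.get(lab, 0) + 1
    ordered = sorted(counts)
    lut = {lab: i for i, lab in enumerate(ordered)}
    offsets = [0]
    for lab in ordered:
        offsets.append(offsets[-1] + counts[lab])
    return lut, len(ordered), offsets


def foreground_prep(labels):
    return Prep(*labels_to_offsets_stub(labels))


def area_2d(labels, prep=None):
    # Standalone callers pass nothing and pay for their own prep.
    prep = foreground_prep(labels) if prep is None else prep
    return [prep.offsets[i + 1] - prep.offsets[i] for i in range(prep.n)]


def first_row_2d(labels, prep=None):
    prep = foreground_prep(labels) if prep is None else prep
    first = [None] * prep.n
    for r, row in enumerate(labels):
        for lab in row:
            if lab and first[prep.lut[lab]] is None:
                first[prep.lut[lab]] = r
    return first


def sizeshape_2d(labels):
    # The wrapper computes the prep once and threads it into every primitive.
    prep = foreground_prep(labels)
    return {"area": area_2d(labels, prep), "first_row": first_row_2d(labels, prep)}
```

Keeping `prep=None` as the default means the public signatures stay backward compatible, which matches the commit's note that existing tests did not need to change.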
…apper

Address /code-review notes on the sizeshape lane:

- Extract `moment_feature_dict` into `primitives/_moments.py` as the single source of truth for the 53 `calculate_advanced` moment/inertia feature names and the (p,q) orders exposed. The numba `_sizeshape_2d` now calls it instead of building those keys with inline f-string loops, removing the duplication of the numpy assembly and the f-string/constant drift risk. The numpy `get_sizeshape` adopts the same helper when #77's moment rewrite rebases onto this lane (the sizeshape golden test cross-checks the keys vs the numpy F_* constants meanwhile).
- Clarify the wrapper: `pixels` is accepted only for dispatch-signature parity and is unused (sizeshape is purely geometric, like the numpy backend); pass `masks` to `to_bzyx` for axis normalisation and drop the computed-then-discarded `_pixels_zyx`.
- Add an end-to-end empty-mask test exercising the full dict assembly with zero objects (the kernels had empty tests; the wrapper assembly did not).

25 tests pass (incl. new empty case); ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…_dict order

Mirror the numpy-side fixes so the shared _moments.py converges:

- inertia_2d clips eigenvalues to >=0 like skimage, so thin/oblique objects get axis_minor=0 / eccentricity=1 instead of NaN (axes_eccentricity_orientation reuses inertia_2d, so the fix propagates to the axis features too).
- moment_feature_dict emits the moment keys in the grouped order of the PyPI 0.1.19 release (all Spatial, then Central, then Normalized, ...) instead of interleaving Spatial/Central by (p, q); the numba get_sizeshape output column order now matches the numpy backend and the release.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ssertion

A missing ')' in test_set_accelerator_numba_composes_with_numpy made the whole test_backend_correctness.py file uncollectable, which masked a stale assertion: it expected sizeshape to stay on the numpy backend, but this branch adds a numba sizeshape backend, so under the numba accelerator it composes cp_measure.core.numba._sizeshape. Close the paren and assert the numba module.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
`sizeshape` was the one numba NO-GO and the largest remaining component of the numba pipeline. This adds a full numba backend by reimplementing every numba-addressable primitive bit-exact and keeping only the genuine import-boundary work (scipy Euclidean EDT) on host.

- moments / inertia + axes / ecc / orientation -> `_moment_kernel` + shared derivation
- `area_convex` / `solidity` -> hull kernel, rasterised with skimage's `grid_points_in_poly`
- perimeter / crofton / euler -> pattern kernels
- radii -> scipy EDT (kept on host)

Option B (deriving axes/ecc/orientation from the central moments) makes the regionprops call fully moment-free (verified 0 ms einsum).
Performance
`get_sizeshape`: 305 → 100 ms (~3×) on the large tile (1080²/142 obj). All 78 features match the numpy backend (raw moments / `area_convex` / euler exact; everything else ≤ 4e-7).

Design
- Shared prep (`labels_to_offsets`) feeds the kernels; `to_bzyx` is 2D-only, and 3D volumes fall back to the numpy baseline.
- The convex-hull path keeps skimage's exact `grid_points_in_poly` rasteriser (no risky pnpoly port) and only replaces the slow QHull construction with a numba monotone-chain over boundary pixels (hull(boundary) == hull(object)).
- `axes_eccentricity_orientation` added to `primitives/_moments.py` so the numpy lane (perf(sizeshape): scatter-based spatial moments + inertia, replacing regionprops einsum (~1.6x) #77) can later adopt the same einsum elimination.
- Fixes `bulk._numba_registries` (the integration merge had left it un-parseable) to compose all numba backends and register `sizeshape`; `set_accelerator("numba")` now routes it.

Tests
24 golden tests: each kernel vs `regionprops` (multi / non-contiguous / edge-touching / degenerate / holed-object euler / empty), end-to-end `get_sizeshape` vs the numpy backend (incl. `calculate_advanced` / `new_features` variants), 3D fallback, and dispatch routing. ruff clean.