[2/4] Add calculate_image_features with skimage + squidpy-native features#1190
Conversation
Introduces sq.experimental.im.calculate_image_features for per-cell feature extraction from segmentation masks. This PR ships the orchestration scaffold and the two non-cp_measure feature paths: - skimage: regionprops (mask + label+image, group-level and fine-grained) - squidpy: summary statistics, GLCM texture, colour histograms cp_measure integration (`cpmeasure:*` feature flags) and the `align_mode="rasterize"` branch land in follow-up PRs. Both are recognised here so users get a clear `NotImplementedError` rather than "unknown feature". `features=None` likewise raises -- the cp_measure-as-default behaviour from PR scverse#982 lands with cp_measure. `_classify_dropped_cells` (the per-cell drop-reason diagnostic) is deferred to a follow-up; `DropReport` still tracks `empty_tiles`. No new hard dependencies. Test plan: 29 tests in tests/experimental/test_calculate_image_features.py cover orchestration, skimage features, squidpy-native features, align_mode='strict' (happy path + raises-on-mismatch), cpmeasure:* rejection, align_mode='rasterize' rejection, channel selection, tiled-vs-single equivalence.
- Trim DropReport to `empty_tiles` (the only field this code ever increments) - Narrow `align_mode` Literal to "strict" and add a runtime guard that catches dynamic callers passing other values - Exclude `cpmeasure:*` flag names from the "available" list in the unknown-feature error (they always raise NotImplementedError; listing them as available is misleading) - Raise on ambiguous mixes of `skimage:label` with `skimage:label:<prop>` (and same for `skimage:label+image`); the previous order-dependent behaviour silently expanded the narrowing form - Collapse `pd.concat -> replace([inf,-inf],0) -> fillna(0) -> .values.astype(float32)` into a single numpy pass via `np.nan_to_num`; saves two full-table copies - Use `pd.Categorical.from_codes` for the region column to avoid allocating an N-element Python list for a one-level categorical - Hoist the labels_key/shapes_key XOR pick to one local - Add `experimental.im.calculate_image_features` to docs/api.md Also: remove all references to the multi-PR split (follow-up PRs, PR-2, PR-4, "in this PR", "cp_measure-as-default behaviour from PR scverse#982", TestPR982Concerns, "Concern N" markers) from code and tests.
| labels_key: str | None = None, | ||
| shapes_key: str | None = None, | ||
| scale: str | None = None, | ||
| channels: list[str] | list[int] | None = None, |
There was a problem hiding this comment.
list[int] errors I think, is this defered to the next PR's?
There was a problem hiding this comment.
Yeah exactly. Splitting the functionality up in some many PRs massively bloats the PR quite a bit so I didn't want to add code just to remove it tomorrow
| features | ||
| Which features to compute. Required (``None`` is rejected). | ||
| Accepts a list of strings: | ||
|
|
||
| - ``"skimage:label"`` (all mask props), ``"skimage:label:area"`` | ||
| (single prop), ``"skimage:label+image"`` (all intensity props), | ||
| ``"skimage:label+image:intensity_mean"`` (single prop) | ||
| - ``"squidpy:summary"``, ``"squidpy:texture"``, | ||
| ``"squidpy:color_hist"`` | ||
|
|
||
| ``cpmeasure:*`` flag names are recognised but currently raise |
There was a problem hiding this comment.
There should be ideally an explanation on each one here.
There was a problem hiding this comment.
Will defer this to the notebook that showcases the functionality. The names are fairly descriptive, these are distinct feature groups within the packages
selmanozleyen
left a comment
There was a problem hiding this comment.
label + image naming
Names skimage:label and skimage:label+image really don't make sense to me. If we are always going to need a label and an image why would we need to name it label+image? Why not call it morphology for "label" and "intensity" for "label+image" for example? Adding the + sign makes the syntax more confusing and is named based on implementation specific jargon.
mandatory image_key
I can see users getting annoyed when they just want "skimage:label" but don't want to provide image_key. Why is it not optional? I am not sure why the implementation here always needs an image for features that don't need an image. This needs to be solved somehow. Would this make sense calculate_image_features(sdata, labels_key="seg", features=["skimage:morphology"]) ?
|
Another comment I'd like to make is the folder and file structure. At this point I would rather have smaller but more files and folder nesting. I am saying this in general but also for this function I am not sure why it's not called |
Responds to @selmanozleyen's review of calculate_image_features. - Perf (L322): rewrite _compute_skimage_features to skimage.measure.regionprops_table - one vectorised Cython pass per branch instead of a per-region Python getattr loop; intensity props use a single multichannel call (moveaxis -> (y,x,c)) computing all channels at once, then remapped to <prop>_<channel>. Morphology props now use skimage's native flattened names (centroid-0, inertia_tensor-0-0). Verified value parity against the old path on an adversarial fixture (degenerate 1-2px cells, empty tile, multi-dim props, multi/single channel); regionprops_table is robust on degenerate cells (no raise/NaN), so no special-casing needed. Deletes _regionprops_to_row (+ its now-unused Any import). - API (L587): narrow `channels: list[str] | list[int]` -> `list[str]` (the body already rejects ints and the docstring already says names-only; the hint was lying). - Docs (L632): rewrite the `features` docstring into a brief grouping paragraph saying what each flag computes; add a short Examples block; drop the `cpmeasure:*` mention (the code still recognizes them and raises NotImplementedError - just undocumented until the cp-measure PR lands). Deferred (discussed separately with reviewer): numba-ifying _compute_squidpy_per_cell (L337) and the compute_cell_info_tiled centroid path (L573, in already-merged _tiling.py). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Both branches of _compute_skimage_features repeated the same
table.pop('label') + pd.DataFrame(table, index=index) idiom (the intensity
branch with an extra column rename). Fold it into a small helper with an
optional rename callable. Behaviour-preserving (parity unchanged).
…s-scaffold # Conflicts: # src/squidpy/experimental/im/__init__.py
|
Hi one other comment I can make here is the codecov has many missing lines. Maybe we can add tests for those branches |
Add tests for previously-untested user-facing surface: the shapes_key rasterize path (with index<->cell correspondence), multiscale DataTree input, fine-grained feature-parse error messages, all-zero labels, the rasterize-failure wrapper, and the degenerate-GLCM branch. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Increased coverage in this PR to 87 % 👌 |
…s.py Name the module after its public function, consistent with the sibling _detect_tissue.py / _make_tiles.py / _qc_image.py modules in experimental/im. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Rename skimage:label -> skimage:morphology and skimage:label+image -> skimage:intensity (fine-grained forms retained); the old names were skimage API jargon. Make image_key optional: it is required only for intensity / squidpy features and for shapes_key, so morphology features run from the labels alone, e.g. calculate_image_features(sdata, labels_key="seg", features=["skimage:morphology"]). Addresses review feedback (Selman) on the confusing flag names and the mandatory image. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Collapse _image_requiring_features to a comprehension, dedup the pixel-grid check, fold the rasterize and summary-stats branches into expressions, drop comments that only restated the code, and convert remaining unicode (em-dash, arrows, x) to ASCII. Consolidate the optional-image error tests into one parametrized case and drop two redundant morphology tests. No behavior change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Also renamed |
selmanozleyen
left a comment
There was a problem hiding this comment.
looks good, thanks for incorperating the changes
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params - purge non-ASCII em/en-dashes from text files repo-wide (ASCII-only policy)
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params - skimage backend: single regionprops_table pass for morphology+intensity (one find_objects instead of two) - purge non-ASCII em/en-dashes from text files repo-wide (ASCII-only policy)
Final piece of scverse#982 (split from scverse#1189, scverse#1190): wires cp_measure into calculate_image_features and adds image/labels coordinate-system alignment (align_mode strict/rasterize), with boundary-cell drop accounting. - deps: cp-measure>=0.1.19,<0.2, centrosome>=1.2.3 (hard) - features=None now runs all features across all backends - _tiling: bbox_y0/bbox_x0 on CellInfo; _Accum accumulator; shared yx_size - key_added (was adata_key_added); keyword-only params - skimage backend: single regionprops_table pass for morphology+intensity - dedupe skimage morphology vs cp_measure:sizeshape when both requested (cp wins, computed identically; skimage-only props kept) - purge non-ASCII em/en-dashes from text files repo-wide (ASCII-only policy)
PR 2/4 of the #982 split (plan: PR-1 tests, PR-2 skimage+squidpy features, PR-3 alignment, PR-4 cp_measure).
Introduces
squidpy.experimental.im.calculate_image_featuresfor per-cell feature extraction from segmentation masks. This is the foundation that PR-3 (alignment) and PR-4 (cp_measure) build on; cp_measure integration itself lands in PR-4 so the new hard-dependency conversation (cp-measure,centrosome) is isolated.What's in
regionpropsfeatures (skimage:label,skimage:label+image, fine-grained per-property).squidpy:summary,squidpy:texture(GLCM),squidpy:color_hist.DropReport, validation, lazy_prepare_lazy, centroid computation, parallel tile dispatch viajoblib.calculate_image_features(sdata, image_key, labels_key=None, shapes_key=None, features=..., align_mode="strict", ...).What's deferred to follow-up PRs
cpmeasure:*feature flags (PR-4). Flag names are recognised here and raiseNotImplementedErrorpointing to the follow-up, not "unknown feature".align_mode="rasterize"(PR-3). RaisesNotImplementedError. PR-2 supports onlyalign_mode="strict"(image and labels must share the same pixel grid)._classify_dropped_cells(per-cell drop-reason diagnostic). Deferred;DropReportstill tracksempty_tiles.features=Nonenow raises. The cp_measure-as-default behaviour lands with PR-4.No new hard dependencies
PR-2 adds zero new
pyproject.tomldeps.cp-measureandcentrosomeland with PR-4.