fix: X shape check slices to first 2 shape values#2432
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2432 +/- ##
==========================================
- Coverage 87.62% 85.70% -1.92%
==========================================
Files 49 49
Lines 7690 7729 +39
==========================================
- Hits 6738 6624 -114
- Misses 952 1105 +153
|
| shape = getattr(value, "shape", None) | ||
| if shape is None: | ||
| return None | ||
| ndim = len(shape) |
There was a problem hiding this comment.
this also means that an array of shape (m, n, 1) can't be written to disk anymore, before this was ok.
Do you want to allow this or not?
|
Seems I can't request reviews, but I'd be interested in your comments at this stage :) |
| # Older / non-conforming files may contain higher-dimensional `X` or | ||
| # `layers`. The on-disk spec forbids that; surface it as a warning so | ||
| # the user knows, but still construct the AnnData with what's there. | ||
| _warn_if_x_or_layers_3d_kwargs(d) |
There was a problem hiding this comment.
I would move this warning into the construction of the AnnData object instead of reading (also in the other spots) so that people who do this in-memory will know their data is technically unwritable
There was a problem hiding this comment.
Adjusted to this, I updated the first PR comment accordingly to this behaviour. Assignment to .X and .layers[<key>] raise this warning now, too.
flying-sheep
left a comment
There was a problem hiding this comment.
Not much to add to what Ilan said, thank you this looks clean!
ilan-gold
left a comment
There was a problem hiding this comment.
Please add a release note and semantic commit title (see failing CI) :)
|
OK, this doesn’t actually close #2430 yet, so I changed that in the description. I also adapted the title. |
|
OK @ilan-gold in 3ce3ead, I made it so validation only happens when setting aligned mappings, with the goal of allowing to raise warnings in there without the warnings being displayed on each access, and the side-benefit of increased performance by not re-validating everything on access. This caused a very small breaking change illustrated by the change in 0e13d18: since we don’t go through all aligned mappings and re-validate everything whenever accessing one, we don’t aggressively re-sync |
ilan-gold
left a comment
There was a problem hiding this comment.
Nice, just the one thing!
ClosesOne step towardslayerscan be 3D on-disk #2430Before
higher-than-2D
Xwas rejected at construction; you could not hold a higher-dimensionalXin memory:higher-than-2D
layersslipped through the in-memory shape check (only axes 0 and 1 are validated) and the writer happily wrote them to disk, silently violating the spec.After
In-memory: higher-than-2D
Xandlayerswarns but still succeeds (the shape check usesX.shape[:2]).Writing a non-2D
X/ layer now hard-fails.Reading a non-conforming file warns but still succeeds:
Same applies to
layers["L"](error/warning message saysLayer 'L'instead ofX), and to thezarrIO path.