Fix: seed sequence permutation tests and make results independent of n_jobs #1234
selmanozleyen wants to merge 7 commits into
The seeds for permutation/simulation tests are now spawned per permutation from a numpy.random.SeedSequence. This makes results independent of `n_jobs`/`backend`, but changes the results obtained with a given `seed` relative to earlier squidpy versions. Document this with a `.. versionchanged:: 1.8.4` note on the affected public functions (`ligrec`, `nhood_enrichment`, `spatial_autocorr`, `ripley`). The shared note for the permutation-based functions lives in a single docrep template (`seed_versionchanged`) to avoid duplication; `ripley` keeps a tailored note as it concerns simulations. Also add a release-notes entry. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
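A minimal sketch of why per-permutation seeds decouple the result from `n_jobs`. This is not squidpy's implementation: `_perm_stat` and `permutation_stats` are hypothetical names, the statistic is a toy, and a thread pool stands in for whatever backend is actually used.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def _perm_stat(values, child_seed):
    # Each permutation gets its own Generator, built from its own child seed.
    rng = np.random.default_rng(child_seed)
    shuffled = rng.permutation(values)
    # Toy statistic: sum of products of neighbouring entries.
    return float(np.dot(shuffled[:-1], shuffled[1:]))


def permutation_stats(values, n_perms, seed, n_jobs):
    # One child SeedSequence per permutation, derived from the root seed.
    children = np.random.SeedSequence(seed).spawn(n_perms)
    # How the permutation indices are chunked depends on n_jobs ...
    chunks = np.array_split(np.arange(n_perms), n_jobs)

    def run_chunk(indices):
        # ... but permutation i always uses children[i], whichever chunk it is in.
        return [(i, _perm_stat(values, children[i])) for i in indices]

    out = np.empty(n_perms)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        for chunk_result in pool.map(run_chunk, chunks):
            for i, stat in chunk_result:
                out[i] = stat
    return out


values = np.arange(10, dtype=float)
serial = permutation_stats(values, n_perms=20, seed=0, n_jobs=1)
parallel = permutation_stats(values, n_perms=20, seed=0, n_jobs=4)
```

Because the seed is tied to the permutation index rather than to the worker, `serial` and `parallel` hold the same values in the same order.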
9a6008e to dbc04f7
please explain: why does the seed have to go inside of numba? Numba supports
Oof, I didn't know about this. Thank you! Then we can fully modernize as well
first check if the methods you need work, I tried before and it failed since whatever distribution I tried didn’t work |
Now I remember why I discarded generators: it was documented that `Generator` is not thread-safe. But I guess it should be fine if we have a generator for each permutation that's used only once... But another point is
Fixes: #1233 and fixes #1232 as a side effect.
For each permutation we have a separate seed. The seeds are generated by the `SeedSequence(root_seed).spawn(x)` routine. This prevents the correlated results from sequential seeds mentioned in the numpy docs. I tried to use the best practice, but since we need to use numba in the future, our seeds can only be `Sequence[int]` instead of `Sequence[np.random.SeedSequence]`. Hence:

Once you confirm, I will write warnings in docstrings and in notebooks about this behavioral change for users.
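For illustration, one way to turn spawned `SeedSequence` children into plain integers is to draw a single 32-bit word from each child with `generate_state`. This is a sketch under that assumption; `spawn_int_seeds` is a hypothetical helper and not necessarily what this PR does.

```python
import numpy as np


def spawn_int_seeds(root_seed, n):
    """Return n integer seeds derived from root_seed (hypothetical helper)."""
    # Spawn n statistically independent children from the root sequence.
    children = np.random.SeedSequence(root_seed).spawn(n)
    # Collapse each child to one uint32 word, then to a plain Python int.
    return [int(child.generate_state(1, dtype=np.uint32)[0]) for child in children]


seeds = spawn_int_seeds(42, 5)
```

The same root seed always yields the same list, and asking for fewer seeds returns a prefix of the longer list, so changing the number of permutations does not reshuffle the seeds of the earlier ones. Reducing each child to one 32-bit word discards entropy compared with passing the `SeedSequence` itself, which is the trade-off of needing integer seeds.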