Fix integer-overflow in decluster type dispatch (#546)#620
Open
gaoflow wants to merge 1 commit into
Open
Conversation
decluster and decluster_distance_time chose the ctypes integer type and matching C routine by looping over [index.max(), trig_int] and letting the last value win. A large index.max() (e.g. many days of samples) that needs c_longlong was therefore silently downgraded back to c_long whenever the trailing trig_int fit in c_long. On platforms where c_long is 32-bit (e.g. Windows) this truncated the index array, producing corrupt declustering and the reported 'index 0 is out of bounds' IndexError downstream. Select the narrowest type wide enough for *all* values via a new _get_func_and_type helper, and add a platform-independent regression test that injects a 32-bit c_long / 64-bit c_longlong layout.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
Fixes the unreported integer-overflow in
decluster/decluster_distance_timereported in #546.Both functions selected the ctypes integer type and the matching C routine with this loop:
The loop has no
break, so the last value (trig_int) decides the type. Whenindex.max()is large enough to needc_longlong(e.g. multiple days of samples) but the trailingtrig_intfits inc_long, the choice made forindex.max()is overwritten and the narrowc_longroutine is used.On platforms where
c_longis 32-bit (e.g. Windows, which is where #546 was reported), the index array is then cast to 32-bit and truncated, corrupting the declustering and producing the reportedIndexError: index 0 is out of bounds for axis 0 with size 0downstream.Reproduction
Modelling the Windows layout (
c_long= 32-bit,c_longlong= 64-bit), the old dispatch on a large index with a small trigger interval:The existing
*_longlongtests don't catch this because they scale bothindexandtrig_intby1e10, so the trailingtrig_intis also large and the last iteration happens to picklonglongtoo.Fix
Select the narrowest integer type wide enough for every value, via a small
_get_func_and_typehelper used by both functions. Behaviour is unchanged on 64-bit-longplatforms and in the both-values-large case; only the previously-broken large-index/small-trig case now correctly dispatches to the_llroutine.Tests
Added
test_dispatch_picks_widest_type, a platform-independent regression test that injects a 32-bitc_long/ 64-bitc_longlonglayout so the regression is exercised even wherec_longis natively 64-bit. It fails against the old last-iteration-wins logic and passes with the fix. (The compiledlibutilsis not required for this test.)