Fix integer-overflow in decluster type dispatch (#546) #620

Open
gaoflow wants to merge 1 commit into eqcorrscan:master from gaoflow:fix/decluster-index-overflow-546

Conversation

gaoflow commented Jun 1, 2026

What does this PR do?

Fixes the integer overflow in decluster / decluster_distance_time reported in #546.

Both functions selected the ctypes integer type and the matching C routine with this loop:

for var in [index.max(), trig_int]:
    if var == ctypes.c_long(var).value:
        long_type = ctypes.c_long;     func = utilslib.decluster
    elif var == ctypes.c_longlong(var).value:
        long_type = ctypes.c_longlong; func = utilslib.decluster_ll
    else:
        raise OverflowError(...)

The loop has no break, so the last value (trig_int) decides the type. When index.max() is large enough to need c_longlong (e.g. multiple days of samples) but the trailing trig_int fits in c_long, the choice made for index.max() is overwritten and the narrow c_long routine is used.

On platforms where c_long is 32-bit (e.g. Windows, which is where #546 was reported), the index array is then cast to 32-bit and truncated, corrupting the declustering and producing the reported IndexError: index 0 is out of bounds for axis 0 with size 0 downstream.

Reproduction

Modelling the Windows layout (c_long = 32-bit, c_longlong = 64-bit), the old dispatch on a large index with a small trigger interval:

index.max() = 5e13, trig_int = 100
OLD picks: c_long / decluster     <-- wrong: 5e13 overflows a 32-bit long
index.max() truncated to 32-bit: -2009260032  (should be 50000000000000)
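
For anyone who wants to reproduce the numbers above without a Windows machine, here is a minimal sketch. It assumes the Windows layout can be modelled with the fixed-width `ctypes.c_int32` and `ctypes.c_int64`, and `old_dispatch` is a hypothetical stand-in that returns names rather than the real `utilslib` routines, so the compiled library is not needed.

```python
import ctypes

# Stand-ins for the Windows layout: c_long is 32-bit, c_longlong is 64-bit.
NARROW, WIDE = ctypes.c_int32, ctypes.c_int64


def old_dispatch(index_max, trig_int):
    """Mimic the old loop: there is no break, so the last value decides."""
    chosen = None
    for var in (index_max, trig_int):
        # ctypes integers do no overflow checking; they silently truncate,
        # so a round-trip mismatch means the value does not fit.
        if var == NARROW(var).value:
            chosen = ("c_long", "decluster")
        elif var == WIDE(var).value:
            chosen = ("c_longlong", "decluster_ll")
        else:
            raise OverflowError("value does not fit in 64 bits")
    return chosen


index_max, trig_int = 50_000_000_000_000, 100
print(old_dispatch(index_max, trig_int))  # ('c_long', 'decluster')
print(NARROW(index_max).value)            # -2009260032
```

Swapping in a large `trig_int` (for example `10**12`) makes the same function return the longlong pair, which matches the behaviour the existing tests happen to exercise.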

The existing *_longlong tests don't catch this because they scale both index and trig_int by 1e10, so the trailing trig_int is also large and the last iteration happens to pick longlong too.

Fix

Select the narrowest integer type wide enough for every value, via a small _get_func_and_type helper used by both functions. Behaviour is unchanged on 64-bit-long platforms and in the both-values-large case; only the previously-broken large-index/small-trig case now correctly dispatches to the _ll routine.
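
To illustrate the idea (not the exact code in this PR), a sketch of "narrowest type that fits every value" could look like the following. The name `get_func_and_type`, its parameters, and the string routine names are assumptions made for illustration; the real `_get_func_and_type` may have a different signature and returns the `utilslib` routine itself.

```python
import ctypes


def get_func_and_type(values, narrow=ctypes.c_long, wide=ctypes.c_longlong):
    """Return (ctype, routine_name) for the narrowest type holding all values.

    Hypothetical sketch: routine names are plain strings so that the
    compiled library is not required to run this.
    """
    values = [int(v) for v in values]
    # Every value must survive a round trip through the narrow type.
    if all(v == narrow(v).value for v in values):
        return narrow, "decluster"
    # Otherwise fall back to the wide type, again checking every value.
    if all(v == wide(v).value for v in values):
        return wide, "decluster_ll"
    raise OverflowError("value too large for the wide integer type")


# Model the 32-bit-long layout and check the previously broken case.
ctype, name = get_func_and_type(
    [50_000_000_000_000, 100], narrow=ctypes.c_int32, wide=ctypes.c_int64
)
print(ctype is ctypes.c_int64, name)  # True decluster_ll
```

Because the check uses `all(...)` over the values instead of overwriting the choice on each iteration, the result no longer depends on the order of the inputs.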

Tests

Added test_dispatch_picks_widest_type, a platform-independent regression test that injects a 32-bit c_long / 64-bit c_longlong layout so the regression is exercised even where c_long is natively 64-bit. It fails against the old last-iteration-wins logic and passes with the fix. (The compiled libutils is not required for this test.)
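
As a rough picture of what "injecting a layout" means here, the sketch below passes fixed-width types into a stand-in selector instead of relying on the platform's `c_long`. Both `pick_type` and the way the types are injected are assumptions for illustration; the actual test in this PR may inject the layout differently (for example by patching).

```python
import ctypes


def pick_type(values, narrow, wide):
    """Hypothetical stand-in for the helper under test."""
    if all(v == narrow(v).value for v in values):
        return narrow
    if all(v == wide(v).value for v in values):
        return wide
    raise OverflowError("value too large for the wide integer type")


def test_dispatch_picks_widest_type():
    # Injected Windows-like layout, independent of the host platform.
    narrow, wide = ctypes.c_int32, ctypes.c_int64
    # Large index with a small trigger interval: the case the old loop got wrong.
    assert pick_type([5 * 10**13, 100], narrow, wide) is wide
    # The result must not depend on argument order.
    assert pick_type([100, 5 * 10**13], narrow, wide) is wide
    # Both values small: the narrow type is still selected.
    assert pick_type([1000, 100], narrow, wide) is narrow


test_dispatch_picks_widest_type()
```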

decluster and decluster_distance_time chose the ctypes integer type and
matching C routine by looping over [index.max(), trig_int] and letting the
last value win. A large index.max() (e.g. many days of samples) that needs
c_longlong was therefore silently downgraded back to c_long whenever the
trailing trig_int fit in c_long. On platforms where c_long is 32-bit (e.g.
Windows) this truncated the index array, producing corrupt declustering and
the reported 'index 0 is out of bounds' IndexError downstream.

Select the narrowest type wide enough for *all* values via a new
_get_func_and_type helper, and add a platform-independent regression test
that injects a 32-bit c_long / 64-bit c_longlong layout.