Fix EliasFanoVec::rank returning out-of-range index for last element of large bucket#41

Merged
Cydhra merged 4 commits into Cydhra:master from rustamch:fix-elias-fano-rank-oob-bin-search-fallback
Apr 17, 2026

Conversation

@rustamch
Contributor

@rustamch rustamch commented Apr 16, 2026

Summary

EliasFanoVec::rank(v) can return a value >= len() for a value v that is actually present in the vector, when v is the last element of an EF upper-bucket whose size exceeds BIN_SEARCH_THRESHOLD (= 4).

The bug is a single-line off-by-start_index_lower in the binary-search fallback of search_element_in_block::<INDEX=true, UPWARD=true> at elias_fano/mod.rs:386:

if INDEX {
    cursor = final_bound as isize + direction;  // ABSOLUTE
}
break;
// ... falls through to:
return if INDEX {
    start_index_lower as u64 + cursor as u64    // treats cursor as RELATIVE
}

final_bound is an absolute index into lower_vec, but the fallthrough return adds start_index_lower to cursor, double-counting it.

Discovery

Found via property testing in a downstream crate that stores two parallel EliasFanoVecs (term IDs and a per-term auxiliary value) and looks up the auxiliary via vec_b.get_unchecked(vec_a.rank(found)). For certain inputs, the bogus rank exceeds vec_b.len() and causes an out-of-bounds panic at bit_vec/mod.rs:1187 (get_bits_unchecked's data[pos / WORD_SIZE]).
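The parallel-lookup pattern described above can be sketched with plain sorted slices standing in for the two EliasFanoVecs. This is a minimal illustration, not the downstream crate's code: `rank` and `lookup_aux` are hypothetical helpers, the data is made up, and `rank` is assumed to count the elements strictly smaller than the query. The sketch uses a checked `get` so that an out-of-range index yields `None` rather than a panic.

```rust
/// Stand-in for `rank`: number of elements strictly smaller than `value`
/// in a sorted slice (assumed semantics, implemented with the standard library).
fn rank(sorted: &[u64], value: u64) -> usize {
    sorted.partition_point(|&x| x < value)
}

/// Looks up the auxiliary value stored at the same position as `term`.
/// Returns `None` if the term is absent or the index is out of range,
/// instead of indexing without a bounds check.
fn lookup_aux(terms: &[u64], aux: &[u64], term: u64) -> Option<u64> {
    let idx = rank(terms, term);
    // Only treat `idx` as a position if the term is actually stored there.
    if terms.get(idx) != Some(&term) {
        return None;
    }
    // Checked access: an index >= aux.len() yields None rather than a panic.
    aux.get(idx).copied()
}

fn main() {
    // Made-up parallel data: aux[i] belongs to terms[i].
    let terms = [3u64, 7, 11, 20];
    let aux = [30u64, 70, 110, 200];

    assert_eq!(lookup_aux(&terms, &aux, 11), Some(110));
    assert_eq!(lookup_aux(&terms, &aux, 12), None);
    println!("aux for term 11: {:?}", lookup_aux(&terms, &aux, 11));
}
```

A checked lookup like this would have turned the bogus rank into a `None` instead of a panic, at the cost of a bounds check per access.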

A minimal counterexample (shrunk from 368 → 14 elements) is pinned as test_rank_last_element_of_large_bucket. For the 14-element vector, the elements at indices 5 through 9 all fall into upper-bucket 4; rank(term_at_idx_9) returned 14 (= len) before the fix and returns 9 after it.

Fix

Subtract start_index_lower from cursor when reassigning it in the fallthrough path so the final return remains correct:

if INDEX {
    cursor = final_bound as isize + direction - start_index_lower as isize;
}
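The index arithmetic can be modelled in isolation. The sketch below is not the library code: `index_before_fix` and `index_after_fix` are hypothetical helpers that only mirror the two assignments shown in this description, and the concrete values (`start_index_lower = 5`, `final_bound = 8`, `direction = 1`) are assumptions chosen to be consistent with the 14-element counterexample.

```rust
/// Models the fallthrough before the fix: `cursor` receives an ABSOLUTE
/// index, and the shared return then adds `start_index_lower` again.
fn index_before_fix(start_index_lower: usize, final_bound: usize, direction: isize) -> u64 {
    let cursor = final_bound as isize + direction; // absolute position
    start_index_lower as u64 + cursor as u64 // double-counts start_index_lower
}

/// Models the fallthrough after the fix: `cursor` is made RELATIVE to
/// `start_index_lower`, so the shared return yields the absolute index.
fn index_after_fix(start_index_lower: usize, final_bound: usize, direction: isize) -> u64 {
    let cursor = final_bound as isize + direction - start_index_lower as isize; // relative
    start_index_lower as u64 + cursor as u64
}

fn main() {
    // Assumed values: the bucket starts at index 5, the binary search ends
    // at absolute index 8, and the upward direction is +1.
    let (start_index_lower, final_bound, direction) = (5usize, 8usize, 1isize);
    let len = 14u64;

    let before = index_before_fix(start_index_lower, final_bound, direction);
    let after = index_after_fix(start_index_lower, final_bound, direction);

    assert!(before >= len); // out of range, as reported
    assert!(after < len); // valid index
    println!("before fix: {before}, after fix: {after}"); // before fix: 14, after fix: 9
}
```

Under these assumed values the excess equals `start_index_lower`, which matches the reported results of 14 before the fix and 9 after.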

Verification

  • New regression test test_rank_last_element_of_large_bucket (fails before, passes after).
  • Full test suite: 199 unit + 72 doc tests pass (cargo test and cargo test --release).
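The property that exposed the bug can be stated without the library: for every element stored in a sorted vector, its rank must be a valid index and must agree with a reference count. The sketch below is an assumption-laden illustration rather than the actual regression test; `reference_rank` and `check_rank_invariant` are hypothetical helpers, `rank` is assumed to count strictly smaller elements, and the 14-element data and the deliberately broken closure are made up to mimic the reported symptom.

```rust
/// Reference rank: number of elements strictly smaller than `value`
/// in a sorted slice (assumed semantics).
fn reference_rank(sorted: &[u64], value: u64) -> usize {
    sorted.partition_point(|&x| x < value)
}

/// Checks, for every stored element, that `rank` returns an in-range index
/// that agrees with the reference implementation.
fn check_rank_invariant<F: Fn(u64) -> usize>(sorted: &[u64], rank: F) -> Result<(), String> {
    for &v in sorted {
        let r = rank(v);
        if r >= sorted.len() {
            return Err(format!("rank({v}) = {r} is not below len = {}", sorted.len()));
        }
        let expected = reference_rank(sorted, v);
        if r != expected {
            return Err(format!("rank({v}) = {r}, expected {expected}"));
        }
    }
    Ok(())
}

fn main() {
    // Made-up strictly increasing data with 14 elements.
    let data: Vec<u64> = (0..14u64).map(|i| i * 10).collect();

    // A correct rank satisfies the invariant.
    assert!(check_rank_invariant(&data, |v| reference_rank(&data, v)).is_ok());

    // A rank that adds a bucket start offset twice for one element
    // (here: index 9 becomes 9 + 5 = 14) violates it.
    let broken = |v: u64| {
        let r = reference_rank(&data, v);
        if r == 9 { r + 5 } else { r }
    };
    let result = check_rank_invariant(&data, broken);
    assert!(result.is_err());
    println!("{result:?}");
}
```

Running such a check over randomly generated sorted inputs is one way a property test could surface large-bucket cases like the one pinned here.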

Notes

Only search_element_in_block's INDEX=true fallthrough branch is affected. successor, predecessor, and get_unchecked — and rank for inputs whose bucket size is ≤ BIN_SEARCH_THRESHOLD — are unaffected. No public API changes.

rustamch and others added 4 commits April 16, 2026 22:01
`search_element_in_block::<INDEX=true, UPWARD=true>`'s binary-search
fallback sets `cursor = final_bound + direction`, but `final_bound` is
an ABSOLUTE position in `lower_vec` while the subsequent fallthrough
return `start_index_lower + cursor` treats `cursor` as RELATIVE. That
double-counts `start_index_lower`, so `rank(v)` can return a value
`>= len()` for a `v` that is actually in the vec.

Trigger: the query equals the last element of an EF upper-bucket of
size > BIN_SEARCH_THRESHOLD (= 4). Found via property testing; a
minimal 14-element case is pinned as `test_rank_last_element_of_large_bucket`.

Downstream, this bogus rank fed into a parallel EF vec's `get_unchecked`
caused an OOB panic at `bit_vec/mod.rs:1187` in production.

Fix: subtract `start_index_lower` when reassigning `cursor`, so the
fallthrough return still yields the correct absolute index.
@Cydhra Cydhra added the bug label Apr 17, 2026
@Cydhra Cydhra force-pushed the fix-elias-fano-rank-oob-bin-search-fallback branch from 85d0f38 to 6fea488 April 17, 2026 09:01
@Cydhra Cydhra merged commit 91b91ea into Cydhra:master Apr 17, 2026
3 checks passed
@rustamch
Contributor Author

Thanks for merging the fix so quickly!
