@theirix theirix commented Jan 29, 2026

Which issue does this PR close?

Rationale for this change

Similar to issue #19749 and the optimisation of left in #19980, it's worth doing the same for right.

What changes are included in this PR?

  • Improve efficiency of the function by making fewer memory allocations and slicing the underlying bytes directly at char boundaries

  • Provide a specialisation for StringView with buffer zero-copy

  • Use arrow_array::builder::make_view for low-level view manipulation (we still need to know about the magic constant 12 from the view layout: values of up to 12 bytes are stored inline in the view)

  • Benchmark - run time reduced by roughly 64% to 89%, depending on the array type and the sign of n

right size=1024/string_array positive n/1024
                        time:   [24.286 µs 24.658 µs 25.087 µs]
                        change: [−86.881% −86.662% −86.424%] (p = 0.00 < 0.05)
                        Performance has improved.
right size=1024/string_array negative n/1024
                        time:   [29.996 µs 30.737 µs 31.511 µs]
                        change: [−89.442% −89.229% −89.003%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

right size=4096/string_array positive n/4096
                        time:   [105.58 µs 109.39 µs 113.51 µs]
                        change: [−86.119% −85.788% −85.497%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe
right size=4096/string_array negative n/4096
                        time:   [136.48 µs 138.34 µs 140.36 µs]
                        change: [−88.007% −87.848% −87.692%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

right size=1024/string_view_array positive n/1024
                        time:   [25.054 µs 25.500 µs 26.033 µs]
                        change: [−82.569% −82.285% −81.891%] (p = 0.00 < 0.05)
                        Performance has improved.
right size=1024/string_view_array negative n/1024
                        time:   [41.281 µs 42.730 µs 44.432 µs]
                        change: [−73.832% −73.288% −72.716%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

right size=4096/string_view_array positive n/4096
                        time:   [129.38 µs 133.69 µs 137.61 µs]
                        change: [−79.497% −78.998% −78.581%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
right size=4096/string_view_array negative n/4096
                        time:   [218.16 µs 229.41 µs 243.30 µs]
                        change: [−65.405% −63.622% −61.515%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe
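
For readers unfamiliar with the view layout behind the "magic constant 12", here is a minimal std-only sketch of how a 16-byte string view can be built and how a suffix can reuse the original data buffer. It follows the Arrow columnar format as I understand it (4 bytes of length, then either up to 12 inline bytes, or a 4-byte prefix plus buffer index and offset). The names `make_view_sketch` and `suffix_view_sketch` are hypothetical and are not the code in this PR or the arrow crate's `make_view`.

```rust
/// Values of at most this many bytes are stored inline in the view itself.
const MAX_INLINE_LEN: usize = 12;

/// Hypothetical re-implementation of the 16-byte view encoding.
/// `buffer_index` and `offset` are only used when the value is not inlined.
fn make_view_sketch(data: &[u8], buffer_index: u32, offset: u32) -> u128 {
    let mut bytes = [0u8; 16];
    // Bytes 0..4: length of the value, little-endian.
    bytes[0..4].copy_from_slice(&(data.len() as u32).to_le_bytes());
    if data.len() <= MAX_INLINE_LEN {
        // Short value: the bytes live in the view, zero-padded.
        bytes[4..4 + data.len()].copy_from_slice(data);
    } else {
        // Long value: 4-byte prefix, then where to find the data.
        bytes[4..8].copy_from_slice(&data[..4]);
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}

/// View for the suffix `value[start..]` of a value stored at `offset` in
/// buffer `buffer_index`. No string data is copied: a long suffix points
/// into the same buffer at a shifted offset, a short one is inlined.
fn suffix_view_sketch(value: &[u8], start: usize, buffer_index: u32, offset: u32) -> u128 {
    make_view_sketch(&value[start..], buffer_index, offset + start as u32)
}
```

With this layout, right only has to compute the byte position where the result starts; the data buffers of the input array can be shared with the output as they are.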

Are these changes tested?

  • Existing unit tests for right

  • Added more unit tests

  • Added bench similar to left.rs

  • Existing SLTs pass

Are there any user-facing changes?

No

@github-actions github-actions bot added the functions Changes to functions implementation label Jan 29, 2026
@theirix theirix marked this pull request as ready for review January 29, 2026 23:17
theirix commented Jan 29, 2026

cc @Jefffrey


/// Calculate the byte length of the substring of last `n` chars from string `string`
/// (or all but first `|n|` chars if n is negative)
fn right_byte_length(string: &str, n: i64) -> usize {

I haven't looked too closely, but I feel we can deduplicate right + left implementation code as the main difference is this byte length function? In that it flips which side it looks from?
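
To make the deduplication idea concrete, here is a rough std-only sketch of how left and right could share two offset helpers and differ only in which side they slice from. All names here (`offset_from_front`, `offset_from_back`, `left_sketch`, `right_sketch`) are hypothetical, and it works on plain `&str` rather than Arrow arrays, so it is not the code in this PR. It assumes the semantics from the doc comment above: positive n keeps n chars, negative n drops |n| chars from the opposite side.

```rust
use std::cmp::Ordering;

/// Byte offset where the `k`-th char (0-based) starts,
/// or `s.len()` if the string has `k` or fewer chars.
fn offset_from_front(s: &str, k: usize) -> usize {
    s.char_indices().nth(k).map_or(s.len(), |(i, _)| i)
}

/// Byte offset where the last `k` chars start,
/// or 0 if the string has fewer than `k` chars.
fn offset_from_back(s: &str, k: usize) -> usize {
    if k == 0 {
        return s.len();
    }
    s.char_indices().rev().nth(k - 1).map_or(0, |(i, _)| i)
}

/// |n| as usize, saturating (also covers i64::MIN).
fn char_count(n: i64) -> usize {
    usize::try_from(n.unsigned_abs()).unwrap_or(usize::MAX)
}

/// First `n` chars, or all but the last `|n|` chars if `n` is negative.
fn left_sketch(s: &str, n: i64) -> &str {
    match n.cmp(&0) {
        Ordering::Greater => &s[..offset_from_front(s, char_count(n))],
        Ordering::Less => &s[..offset_from_back(s, char_count(n))],
        Ordering::Equal => "",
    }
}

/// Last `n` chars, or all but the first `|n|` chars if `n` is negative.
fn right_sketch(s: &str, n: i64) -> &str {
    match n.cmp(&0) {
        Ordering::Greater => &s[offset_from_back(s, char_count(n))..],
        Ordering::Less => &s[offset_from_front(s, char_count(n))..],
        Ordering::Equal => "",
    }
}
```

For example, `right_sketch("héllo", 4)` slices at the byte where `é` starts, so multi-byte chars are never split. Whether sharing helpers like this is worth it in the real kernels depends on how the array-level code is structured.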

},
(Some(string), Some(n)) => {
let byte_length = right_byte_length(string, n);
// println!(

Commented code accidentally added here

Successfully merging this pull request may close these issues.

perf: optimise right for byte access and StringView