Faster split_at_semicolon by b-i-z · Pull Request #17 · jonhoo/brrr

b-i-z · 2025-12-08T02:40:35Z

The semicolon is found at either buffer[buffer.len()-4] or buffer[buffer.len()-5]; otherwise it is assumed to be at buffer[buffer.len()-6]. About 5% faster overall, with no looping.

The buffer always has a length of at least 5 bytes, because the station name has at least 1 byte, the semicolon takes 1, and the temperature takes at least 3, so it is safe to check at buffer.len()-5.
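As a rough illustration of the idea described above (not the PR's actual code), the sketch below locates the semicolon by inspecting only the bytes 4 and 5 positions from the end. The function name and signature are assumptions, it uses bounds-checked indexing instead of get_unchecked, and it assumes every line is `<name>;<temp>` with the temperature in one of the layouts `d.d`, `dd.d`, `-d.d`, or `-dd.d`.

```rust
/// Hypothetical stand-in for `split_at_semicolon`; the real signature may differ.
/// Assumes `line` is `<name>;<temp>`, the name is at least one byte, and the
/// temperature is one of `d.d`, `dd.d`, `-d.d`, `-dd.d`.
fn split_at_semicolon(line: &[u8]) -> (&[u8], &[u8]) {
    let len = line.len();
    assert!(len >= 5, "shortest valid line looks like `a;0.0`");

    // `;d.d` puts the semicolon 4 bytes from the end.
    let at4 = line[len - 4] == b';';
    // `;dd.d` and `;-d.d` put it 5 bytes from the end.
    let at5 = line[len - 5] == b';';
    // The only remaining layout is `;-dd.d`, 6 bytes from the end,
    // so that case is assumed rather than checked.
    let back = if at4 {
        4
    } else if at5 {
        5
    } else {
        6
    };

    let pos = len - back;
    (&line[..pos], &line[pos + 1..])
}
```

Calling `split_at_semicolon(b"Oslo;-12.3")` under these assumptions yields the name `Oslo` and the temperature text `-12.3`. Whether the `if`/`else` chain compiles to branches or conditional moves depends on the compiler and target.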

ruanLN · 2025-12-11T11:48:58Z

+        if !found {
+            total += 1
+        };
+        found |= *buffer.get_unchecked(pos - 1) == b';';
+        if !found {
+            total += 1
+        };
+        pos = pos - total;


How does changing pos directly compare? It should save a couple of operations.

Suggested change

-        if !found {
-            total += 1
-        };
-        found |= *buffer.get_unchecked(pos - 1) == b';';
-        if !found {
-            total += 1
-        };
-        pos = pos - total;
+        if !found {
+            pos -= 1;
+        };
+        found |= *buffer.get_unchecked(pos) == b';';
+        if !found {
+            pos -= 1
+        };

Another idea, using some bool arithmetic to stay branchless:

Suggested change

-        if !found {
-            total += 1
-        };
-        found |= *buffer.get_unchecked(pos - 1) == b';';
-        if !found {
-            total += 1
-        };
-        pos = pos - total;
+        pos -= 1 - found as usize;
+        found |= *buffer.get_unchecked(pos) == b';';
+        pos -= 1 - found as usize;
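To make the bool-arithmetic idea concrete, here is a minimal self-contained sketch. The function name is made up for illustration, it uses bounds-checked indexing rather than get_unchecked, and it assumes pos starts at buffer.len() - 4 with found initialised from that byte, which the diff context does not show.

```rust
/// Illustrative only: `semicolon_pos_chained` is a made-up name.
/// Assumes the line is `<name>;<temp>` with a name of at least one byte and a
/// temperature of the form `d.d`, `dd.d`, `-d.d`, or `-dd.d`.
fn semicolon_pos_chained(buffer: &[u8]) -> usize {
    // Assumed starting state: look 4 bytes from the end first.
    let mut pos = buffer.len() - 4;
    let mut found = buffer[pos] == b';';

    // `found as usize` is 1 or 0, so this steps back only while still searching.
    // `as` binds tighter than `-`, so this is `1 - (found as usize)`.
    pos -= 1 - found as usize;

    // `|=` always evaluates its right-hand side, so this byte is read either way.
    found |= buffer[pos] == b';';
    pos -= 1 - found as usize;

    pos
}
```

Note that the second read uses a pos value computed from the first read, so the two loads form a dependency chain.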

b-i-z · 2025-12-12

I think that changing pos for the next read based on the previous read could cause a pipeline stall, or at least a delay of a few cycles, until the CPU knows what value was read. (Note: found |= ... does not short-circuit, so it will still read from the location. found = found || ... would short-circuit, but would introduce branching.) Perhaps the compiler could mask the delay by scheduling other instructions while waiting for the result, but perhaps not. Reading just the 4th and 5th bytes from the end and basing all calculations on those bytes would probably be better.
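The short-circuiting point can be checked in isolation. The sketch below is purely illustrative: the function name is hypothetical and a counting closure stands in for the byte read. It counts how often the right-hand side runs for `|=` versus `||` when the left-hand side is already true.

```rust
use std::cell::Cell;

/// Returns how many times the right-hand side was evaluated for `|=` and for
/// `||`, in that order, when the left-hand side is already `true`.
fn rhs_evaluations() -> (u32, u32) {
    let calls = Cell::new(0u32);
    // Stand-in for a byte read; counts how often it runs.
    let probe = || {
        calls.set(calls.get() + 1);
        false
    };

    let mut found = true;
    found |= probe(); // right-hand side is always evaluated
    let with_or_assign = calls.get();

    calls.set(0);
    found = found || probe(); // skipped because `found` is already true
    let with_short_circuit = calls.get();

    assert!(found);
    (with_or_assign, with_short_circuit)
}
```

This only shows the language semantics. Whether the short-circuit form actually becomes a branch in machine code is up to the compiler.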

It could be written a bit faster/clearer perhaps:

Suggested change

-        if !found {
-            total += 1
-        };
-        found |= *buffer.get_unchecked(pos - 1) == b';';
-        if !found {
-            total += 1
-        };
-        pos = pos - total;
+        let not_found1 = (*buffer.get_unchecked(pos) != b';') as usize;
+        let not_found2 = (*buffer.get_unchecked(pos - 1) != b';') as usize;
+        pos -= (not_found1 + (not_found1 & not_found2));
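A self-contained sketch of this variant follows. The function name is hypothetical, pos is assumed to start at buffer.len() - 4, and bounds-checked indexing replaces get_unchecked. Both reads use addresses that are known up front, so neither has to wait for the other's result.

```rust
/// Illustrative only: `semicolon_pos_independent` is a made-up name.
/// Assumes the line is `<name>;<temp>` with a name of at least one byte and a
/// temperature of the form `d.d`, `dd.d`, `-d.d`, or `-dd.d`.
fn semicolon_pos_independent(buffer: &[u8]) -> usize {
    let pos = buffer.len() - 4;

    // Two loads from fixed offsets; neither address depends on the other's value.
    let not_found1 = (buffer[pos] != b';') as usize;
    let not_found2 = (buffer[pos - 1] != b';') as usize;

    // Step back 0 if the first byte matched, 1 if only the second matched,
    // and 2 if neither did. The `&` keeps `not_found2` from counting when
    // the first byte was already the semicolon.
    pos - (not_found1 + (not_found1 & not_found2))
}
```

For valid lines the result should agree with a plain backwards scan such as `buffer.iter().rposition(|&b| b == b';')`.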

Aside: Parsing the temperature would probably be faster if it were based solely on the last 5 bytes of the line, rather than being split into a variable length slice (but semicolon location is still useful for the string length).
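One way the aside could look in practice is sketched below. This is an assumption-laden illustration, not code from the PR: the helper name is invented, the result is returned as tenths of a degree in an i16, and the input is assumed to end in one of the four temperature layouts. It uses a match, so it is not branchless, and no claim is made here about its speed.

```rust
/// Hypothetical helper: parses the temperature, in tenths of a degree, from
/// the last 5 bytes of a `<name>;<temp>` line. Assumes valid input where the
/// temperature is `d.d`, `dd.d`, `-d.d`, or `-dd.d`.
fn parse_temp_tenths(line: &[u8]) -> i16 {
    let t = &line[line.len() - 5..];
    let digit = |b: u8| (b - b'0') as i16;

    // The units digit and the fractional digit sit at fixed offsets.
    let low = digit(t[2]) * 10 + digit(t[4]);

    match t[1] {
        b';' => low,  // `;d.d` (t[0] belongs to the name and is ignored)
        b'-' => -low, // `;-d.d`
        tens => {
            // `;dd.d` or `-dd.d`: t[1] is the tens digit, t[0] is `;` or `-`.
            let magnitude = digit(tens) * 100 + low;
            if t[0] == b'-' {
                -magnitude
            } else {
                magnitude
            }
        }
    }
}
```

For example, `parse_temp_tenths(b"Oslo;-12.3")` gives -123 under these assumptions.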

ruanLN · 2025-12-12

I think you're right about the stall, since the CPU won't be able to predict the position.

Faster split_at_semicolon, only checking 2 bytes and no looping.

142a851

ruanLN reviewed Dec 11, 2025

b-i-z added 2 commits December 16, 2025 18:42

Update main.rs

2b1d1c6

Merge branch 'jonhoo:main' into faster_split_at_semicolon

2b95f3e

b-i-z wants to merge 3 commits into jonhoo:main from b-i-z:faster_split_at_semicolon

b-i-z commented Dec 8, 2025

ruanLN Dec 11, 2025

b-i-z Dec 12, 2025

ruanLN Dec 12, 2025

2 participants
