Faster parsing #9

Open
drystone wants to merge 2 commits into jonhoo:main from drystone:parsing

drystone commented Dec 3, 2025

Search only for semicolons.
Use temperature parsing to determine where newlines are.
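
A minimal sketch of the idea in these two commits, not the code from the PR itself: the scanner searches only for `;`, and the position of the next line is derived from how many bytes the temperature occupies. The function names `parse_temperature` and `records` and the arithmetic parser are assumptions made for illustration; the sketch also assumes well-formed input in which every temperature has exactly one decimal digit and a magnitude below 100.

```rust
/// Parses a temperature such as `-12.3` or `4.5` at the start of `t`.
/// Returns the value in tenths of a degree and the number of bytes consumed
/// (sign and digits, not counting the newline).
fn parse_temperature(t: &[u8]) -> (i16, usize) {
    let neg = (t[0] == b'-') as usize;
    let t = &t[neg..];
    // After the optional sign the text is either "d.d" (3 bytes) or "dd.d" (4 bytes).
    let (value, len) = if t[1] == b'.' {
        ((t[0] - b'0') as i16 * 10 + (t[2] - b'0') as i16, 3)
    } else {
        (
            (t[0] - b'0') as i16 * 100 + (t[1] - b'0') as i16 * 10 + (t[3] - b'0') as i16,
            4,
        )
    };
    (if neg == 1 { -value } else { value }, len + neg)
}

/// Walks `input` searching only for ';'. The start of the next line comes from
/// the length of the parsed temperature rather than from a second search for '\n'.
fn records(input: &[u8]) -> Vec<(&[u8], i16)> {
    let mut out = Vec::new();
    let mut rest = input;
    while let Some(semi) = rest.iter().position(|&b| b == b';') {
        let station = &rest[..semi];
        let (temp, len) = parse_temperature(&rest[semi + 1..]);
        out.push((station, temp));
        // Skip the temperature and its trailing '\n' (if there is one).
        let next = (semi + 1 + len + 1).min(rest.len());
        rest = &rest[next..];
    }
    out
}
```

Calling `records(b"Hamburg;12.0\nBulawayo;-8.9\n")` yields one `(station, tenths)` pair per line. A real implementation would presumably replace the byte-by-byte `position` call with a faster search.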

Owner

jonhoo commented Dec 6, 2025

Oh, that's an interesting idea. Is it actually faster, and if so, by how much?

Author

drystone commented Dec 6, 2025

Yes, faster here by 20-30% on 8 cores (very erratic so it's hard to say exactly!):

john@ember:~/projects/brrr$ cargo build --release
   Compiling brrr v0.1.0 (/home/john/projects/brrr)
    Finished `release` profile [optimized + debuginfo] target(s) in 4.89s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m23.368s
user    2m17.429s
sys     0m9.485s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m23.860s
user    2m20.970s
sys     0m9.552s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m23.678s
user    2m19.277s
sys     0m9.611s
john@ember:~/projects/brrr$ git checkout -
HEAD is now at 0234dab Faster parsing
john@ember:~/projects/brrr$ cargo build --release
   Compiling brrr v0.1.0 (/home/john/projects/brrr)
    Finished `release` profile [optimized + debuginfo] target(s) in 4.75s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m12.057s
user    0m55.299s
sys     0m6.803s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m15.717s
user    1m17.742s
sys     0m9.271s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m15.207s
user    1m16.319s
sys     0m8.978s
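
Because the timings are erratic, comparing a robust summary such as the median of each set of runs is one way to put a number on the change. The helpers below are a hypothetical sketch (the names `parse_real`, `median`, and `reduction` are not from the PR); they assume the `0m23.368s` format that bash's `time` prints for `real`.

```rust
/// Parses the `real` field printed by bash's `time`, e.g. "0m23.368s", into seconds.
fn parse_real(s: &str) -> Option<f64> {
    let (mins, secs) = s.trim().strip_suffix('s')?.split_once('m')?;
    Some(mins.parse::<f64>().ok()? * 60.0 + secs.parse::<f64>().ok()?)
}

/// Median of a non-empty list of timings.
fn median(mut xs: Vec<f64>) -> f64 {
    xs.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let mid = xs.len() / 2;
    if xs.len() % 2 == 1 {
        xs[mid]
    } else {
        (xs[mid - 1] + xs[mid]) / 2.0
    }
}

/// Fractional reduction in wall-clock time going from `before` to `after`.
fn reduction(before: f64, after: f64) -> f64 {
    1.0 - after / before
}
```

Feeding the three `real` values from each build through `parse_real` and `median`, then passing the two medians to `reduction`, gives the relative change for that pair of runs. With only three samples per build the result is still a rough estimate.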

Author

drystone commented Dec 6, 2025

Out of curiosity, I also tried a very sparse temperature lookup based on the last 4 bytes of the temperature string. This makes little difference (I was surprised), but it might be worth a try on a more powerful (more RAM) PC ...

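    let mut stats = BTreeMap::new();
    // Two tables of 1e9 i16 entries (about 2 GB each): [0] for positive, [1] for negative values.
    let mut temps = vec![vec![0; 1_000_000_000], vec![0; 1_000_000_000]];
    for temp in 0..=999 {
        // "d.d\n" or "dd.d\n"; only the first 4 bytes are used as the key.
        let strtemp = format!("{:.1}\n", temp as f32 / 10.);
        // Read those 4 bytes as a native-endian u32; the largest key ("99.9") stays below 1e9.
        let slot = unsafe { u32::from_ne_bytes(*(strtemp.as_ptr() as *const [u8; 4])) };
        temps[0][slot as usize] = temp;
        temps[1][slot as usize] = -temp;
    }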

and

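// Returns the temperature in tenths of a degree and its length in bytes (sign and digits, without the newline).
fn parse_temperature(t: &[u8], temps: &[Vec<i16>]) -> (i16, usize) {
    let has_sign = (t[0] == b'-') as usize;
    // "dd.d" has its '.' two bytes after the optional sign; "d.d" has it one byte after.
    let has_tens = (t[has_sign + 2] == b'.') as usize;
    // Assumes 4 readable bytes after the sign; for "d.d" the fourth byte is the trailing '\n'.
    let slot = unsafe { u32::from_ne_bytes(*(t.get_unchecked(has_sign..).as_ptr() as *const [u8; 4])) };
    (temps[has_sign][slot as usize], 3 + has_sign + has_tens)
}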
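
To try the same keying scheme without allocating about 4 GB, here is a small safe sketch that swaps the two giant `Vec`s for a `HashMap<u32, i16>`. This is a different data structure from the one in the comment above and will not have the same performance characteristics; the names `key`, `build_table`, and `lookup_temperature` are made up for illustration.

```rust
use std::collections::HashMap;

/// Key: the first four bytes of `t` read as a native-endian u32 (panics if `t` is shorter).
fn key(t: &[u8]) -> u32 {
    u32::from_ne_bytes([t[0], t[1], t[2], t[3]])
}

/// Maps the first four bytes of "d.d\n" / "dd.d\n" to the value in tenths, for 0.0 to 99.9.
fn build_table() -> HashMap<u32, i16> {
    let mut table = HashMap::new();
    for temp in 0..=999i16 {
        let s = format!("{}.{}\n", temp / 10, temp % 10);
        table.insert(key(s.as_bytes()), temp);
    }
    table
}

/// Looks up the temperature at the start of `t`; returns (tenths, bytes used without '\n').
/// Assumes well-formed input with at least four bytes after the optional sign.
fn lookup_temperature(t: &[u8], table: &HashMap<u32, i16>) -> (i16, usize) {
    let has_sign = (t[0] == b'-') as usize;
    let has_tens = (t[has_sign + 2] == b'.') as usize;
    let magnitude = table[&key(&t[has_sign..])];
    let value = if has_sign == 1 { -magnitude } else { magnitude };
    (value, 3 + has_sign + has_tens)
}
```

The keys are unique because the second byte is `.` for one-digit temperatures and a digit for two-digit ones, so the table holds exactly 1000 entries. A single table with the sign applied afterwards is a simplification; the original keeps one table per sign so the lookup needs no negation.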

Owner

jonhoo commented Dec 6, 2025

I'm curious how it does against what we have on main, which got quite a bit faster with #2 🤔

Author

drystone commented Dec 6, 2025

I rebased, so perhaps give it a try. Sadly new main now takes longer to run here ...

EDIT: Maybe 10% faster than main, but the branch itself seems to run slower now that it is combined with the new main:

HEAD is now at 8cc3e81 Make OS check run 10% of 1brc (#14)
john@ember:~/projects/brrr$ cargo build --release
   Compiling brrr v0.1.0 (/home/john/projects/brrr)
    Finished `release` profile [optimized + debuginfo] target(s) in 4.59s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m16.465s
user    1m29.360s
sys     0m7.845s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m19.280s
user    1m48.560s
sys     0m10.004s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m19.461s
user    1m48.576s
sys     0m10.058s
john@ember:~/projects/brrr$ git checkout -
Previous HEAD position was 8cc3e81 Make OS check run 10% of 1brc (#14)
Switched to branch 'parsing'
Your branch is up to date with 'origin/parsing'.
john@ember:~/projects/brrr$ cargo build --release
   Compiling brrr v0.1.0 (/home/john/projects/brrr)
    Finished `release` profile [optimized + debuginfo] target(s) in 4.86s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m15.159s
user    1m18.878s
sys     0m8.280s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m17.801s
user    1m37.112s
sys     0m9.765s
john@ember:~/projects/brrr$ time ./target/release/brrr >/dev/null
 
real    0m17.983s
user    1m37.932s
sys     0m9.777s

Author

drystone commented Dec 6, 2025

So on a better laptop, I'm seeing just shy of 10% improvement (1.2 seconds vs 1.3 seconds). Significantly though, this is now the fastest 1brc solution I have run (can we ignore the shell script wrapper?). This makes me breathe a huge sigh of relief 😅

john@ZA25080006-L:~/tmp/1brc$ time ./calculate_average_thomaswue.sh >/dev/null
Chosing to run the app in JVM mode as no native image was found, use prepare_thomaswue.sh to generate.

real    0m1.314s
user    0m23.644s
sys     0m0.690s
john@ZA25080006-L:~/tmp/1brc$ cd -
/home/john/tmp/brrr
john-hedges@ZA25080006-L:~/tmp/brrr$ cargo build --release
    Finished `release` profile [optimized + debuginfo] target(s) in 0.02s
john@ZA25080006-L:~/tmp/brrr$ time ./target/release/brrr >/dev/null

real    0m1.182s
user    0m19.102s
sys     0m0.561s
