Parser architecture rework #4

Draft
xnacly wants to merge 8 commits into master from parser-architecture-rework

@xnacly xnacly commented Feb 18, 2026

This PR reworks the parser to remove intermediate values and avoidable allocations, and to make it faster overall through less recursion and less GC pressure.

Goals:

  • Replace recursion in the parser with an explicit stack for intermediate containers, jumping around inside Parser.parse based on the token type. This should also cover "feat: recursive json object test #2", since the stack overflow on deeply nested input is now replaced by an OOM 😹.
  • Replace inline allocations with a preallocated sync.Pool of possible JSON values.
  • Replace io.ReadAll with syscall.Mmap in a new libjson.FromFile function.
  • Handle escapes in strings, which is somehow still missing.
  • Add t_string_escapes so that only strings containing escapes pay the cost of calling unescapeInPlace.
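
The first goal can be illustrated with a minimal sketch. It is not the libjson implementation: the names `frame` and `parse` are hypothetical, the input is read byte by byte instead of through the lexer, and only a lenient subset is handled (arrays, objects, escape-free strings and numbers; commas and colons are skipped without validation). It only shows how open containers can live on a heap-allocated slice, so nesting depth is bounded by memory rather than by the goroutine stack.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// frame is one open container on the explicit stack (hypothetical layout).
type frame struct {
	arr    []any          // elements, if the container is an array
	obj    map[string]any // members, if the container is an object
	key    string         // pending key waiting for its value
	hasKey bool
	isObj  bool
}

// parse decodes a lenient JSON subset without recursion.
func parse(in []byte) (any, error) {
	var stack []*frame
	i := 0
	for i < len(in) {
		var val any
		switch c := in[i]; {
		case c == ' ' || c == '\n' || c == '\t' || c == '\r' || c == ',' || c == ':':
			i++ // separators are skipped, not validated (simplification)
			continue
		case c == '[':
			stack = append(stack, &frame{arr: []any{}})
			i++
			continue
		case c == '{':
			stack = append(stack, &frame{isObj: true, obj: map[string]any{}})
			i++
			continue
		case c == ']' || c == '}':
			if len(stack) == 0 {
				return nil, errors.New("unbalanced closing bracket")
			}
			top := stack[len(stack)-1]
			if top.isObj != (c == '}') {
				return nil, errors.New("mismatched closing bracket")
			}
			stack = stack[:len(stack)-1] // pop: the finished container becomes a value
			if top.isObj {
				val = top.obj
			} else {
				val = top.arr
			}
			i++
		case c == '"':
			j := i + 1
			for j < len(in) && in[j] != '"' { // escapes are not handled here
				j++
			}
			if j == len(in) {
				return nil, errors.New("unterminated string")
			}
			val = string(in[i+1 : j])
			i = j + 1
		default: // anything else is treated as a number
			j := i
			for j < len(in) && (in[j] == '-' || in[j] == '.' || (in[j] >= '0' && in[j] <= '9')) {
				j++
			}
			f, err := strconv.ParseFloat(string(in[i:j]), 64)
			if err != nil {
				return nil, fmt.Errorf("unexpected token at offset %d", i)
			}
			val = f
			i = j
		}
		if len(stack) == 0 {
			return val, nil // top-level value is complete
		}
		top := stack[len(stack)-1] // attach val to the innermost open container
		switch {
		case !top.isObj:
			top.arr = append(top.arr, val)
		case !top.hasKey:
			s, ok := val.(string)
			if !ok {
				return nil, errors.New("object key must be a string")
			}
			top.key, top.hasKey = s, true
		default:
			top.obj[top.key] = val
			top.hasKey = false
		}
	}
	return nil, errors.New("unexpected end of input")
}

func main() {
	v, err := parse([]byte(`[1,[2,3],{"a":[4]}]`))
	fmt.Println(v, err) // [1 [2 3] map[a:[4]]] <nil>
}
```

Because the stack is an ordinary slice, an input nested 100,000 levels deep only grows that slice. Extremely deep input would eventually exhaust memory instead of overflowing the call stack, which matches the trade-off described in the first goal.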

Pre-PR benchmarks:

Test input is generated with test/gen.py.

Input size  Library        Time     Faster
1MB         libjson        9.6ms    1.57x
            encoding/json  15.0ms
2MB         libjson        36.2ms   1.82x
            encoding/json  65.9ms
5MB         libjson        74.1ms   1.71x
            encoding/json  126.6ms

encoding/json

nogc
$ go run cmd/lj.go -libjson=false -s -nogc -pprof test/10MB.json
$ go tool pprof 10MB.json.pprof
(pprof) top
Showing nodes accounting for 120ms, 100% of 120ms total
Showing top 10 nodes out of 36
      flat  flat%   sum%        cum   cum%
      30ms 25.00% 25.00%       40ms 33.33%  encoding/json.(*Decoder).readValue
      10ms  8.33% 33.33%       10ms  8.33%  encoding/json.(*decodeState).convertNumber
      10ms  8.33% 41.67%       20ms 16.67%  encoding/json.(*decodeState).literalInterface
      10ms  8.33% 50.00%       10ms  8.33%  encoding/json.stateEndValue
      10ms  8.33% 58.33%       10ms  8.33%  internal/chacha8rand.(*State).Next
      10ms  8.33% 66.67%       10ms  8.33%  internal/runtime/maps.(*ctrlGroup).setEmpty
      10ms  8.33% 75.00%       10ms  8.33%  runtime.convTslice
      10ms  8.33% 83.33%       10ms  8.33%  runtime.getMCache
      10ms  8.33% 91.67%       10ms  8.33%  runtime.memclrNoHeapPointers
      10ms  8.33%   100%       10ms  8.33%  runtime.memmove
$ hyperfine "go run cmd/lj.go -libjson=false -s -nogc -pprof test/10MB.json"
Benchmark 1: go run cmd/lj.go -libjson=false -s -nogc -pprof test/10MB.json
  Time (mean ± σ):     163.9 ms ±   3.4 ms    [User: 130.0 ms, System: 88.7 ms]
  Range (min … max):   159.0 ms … 171.9 ms    17 runs
gc
$ go run cmd/lj.go -libjson=false -s -pprof test/10MB.json
$ go tool pprof 10MB.json.pprof
(pprof) top
Showing nodes accounting for 150ms, 83.33% of 180ms total
Showing top 10 nodes out of 55
      flat  flat%   sum%        cum   cum%
      20ms 11.11% 11.11%       40ms 22.22%  encoding/json.(*Decoder).readValue
      20ms 11.11% 22.22%       20ms 11.11%  internal/runtime/gc/scan.scanSpanPackedAVX512
      20ms 11.11% 33.33%       20ms 11.11%  runtime.memclrNoHeapPointers
      20ms 11.11% 44.44%       20ms 11.11%  runtime.memmove
      20ms 11.11% 55.56%       20ms 11.11%  runtime.suspendG
      10ms  5.56% 61.11%       30ms 16.67%  encoding/json.(*decodeState).scanWhile
      10ms  5.56% 66.67%       10ms  5.56%  encoding/json.(*scanner).pushParseState
      10ms  5.56% 72.22%       20ms 11.11%  encoding/json.stateBeginValue
      10ms  5.56% 77.78%       10ms  5.56%  encoding/json.stateEndValue
      10ms  5.56% 83.33%       10ms  5.56%  internal/runtime/maps.(*ctrlGroup).setEmpty
$ hyperfine "go run cmd/lj.go -libjson=false -s -pprof test/10MB.json"
Benchmark 1: go run cmd/lj.go -libjson=false -s -pprof test/10MB.json
  Time (mean ± σ):     160.5 ms ±   3.0 ms    [User: 186.3 ms, System: 79.4 ms]
  Range (min … max):   156.3 ms … 167.9 ms    18 runs

libjson

nogc
$ go run cmd/lj.go -nogc -s -pprof test/10MB.json
$ go tool pprof 10MB.json.pprof
(pprof) top
Showing nodes accounting for 60ms, 100% of 60ms total
Showing top 10 nodes out of 31
      flat  flat%   sum%        cum   cum%
      20ms 33.33% 33.33%       20ms 33.33%  runtime.memclrNoHeapPointers
      10ms 16.67% 50.00%       10ms 16.67%  github.com/xnacly/libjson.(*lexer).next
      10ms 16.67% 66.67%       10ms 16.67%  github.com/xnacly/libjson.pow10 (inline)
      10ms 16.67% 83.33%       10ms 16.67%  internal/runtime/maps.(*ctrlGroup).setEmpty (inline)
      10ms 16.67%   100%       10ms 16.67%  runtime.rand
         0     0%   100%       10ms 16.67%  github.com/xnacly/libjson.(*parser).advance
         0     0%   100%       50ms 83.33%  github.com/xnacly/libjson.(*parser).array
         0     0%   100%       20ms 33.33%  github.com/xnacly/libjson.(*parser).atom
         0     0%   100%       50ms 83.33%  github.com/xnacly/libjson.(*parser).expression
         0     0%   100%       50ms 83.33%  github.com/xnacly/libjson.(*parser).object
$ hyperfine "go run cmd/lj.go -nogc -s -pprof test/10MB.json"
Benchmark 1: go run cmd/lj.go -nogc -s -pprof test/10MB.json
  Time (mean ± σ):     106.6 ms ±   2.1 ms    [User: 84.9 ms, System: 76.6 ms]
  Range (min … max):   103.8 ms … 110.7 ms    28 runs
gc
$ go run cmd/lj.go -s -pprof test/10MB.json
$ go tool pprof 10MB.json.pprof
(pprof) top
Showing nodes accounting for 100ms, 100% of 100ms total
Showing top 10 nodes out of 39
      flat  flat%   sum%        cum   cum%
      10ms 10.00% 10.00%       10ms 10.00%  github.com/xnacly/libjson.pow10
      10ms 10.00% 20.00%       10ms 10.00%  internal/runtime/gc/scan.scanSpanPackedAVX512
      10ms 10.00% 30.00%       10ms 10.00%  internal/runtime/maps.(*ctrlGroup).setEmpty
      10ms 10.00% 40.00%       10ms 10.00%  runtime.acquirem (inline)
      10ms 10.00% 50.00%       10ms 10.00%  runtime.findObject
      10ms 10.00% 60.00%       10ms 10.00%  runtime.heapArenaOf
      10ms 10.00% 70.00%       30ms 30.00%  runtime.mallocgcSmallScanNoHeader
      10ms 10.00% 80.00%       10ms 10.00%  runtime.memclrNoHeapPointers
      10ms 10.00% 90.00%       10ms 10.00%  runtime.nextFreeFast (inline)
      10ms 10.00%   100%       10ms 10.00%  runtime.typePointers.next
$ hyperfine "go run cmd/lj.go -s -pprof test/10MB.json"
Benchmark 1: go run cmd/lj.go -s -pprof test/10MB.json
  Time (mean ± σ):     105.5 ms ±   2.5 ms    [User: 116.6 ms, System: 77.9 ms]
  Range (min … max):   101.5 ms … 110.0 ms    27 runs
