Improve pure-Ruby parser caching and literal alternatives #22

Open
suleman-uzair wants to merge 20 commits into main from parser-performance-and-dynamic-callbacks

@suleman-uzair suleman-uzair commented May 21, 2026

This PR:

  • improves pure-Ruby parser-class cache defaults so recursive grammars can memoize immediately without changing direct atom-level defaults
  • scopes parser cache entries by consume_all mode and preserves the Parslet-compatible prefix-success reuse for ordered choice
  • never memoizes results that evaluate dynamic {} blocks or write captures, so cache thresholds stay performance-only
  • adds internal literal-prefix indexing for large Alternative choices while preserving ordered-choice semantics and complete failure reporting
  • keeps unsafe/dynamic branches on the normal scan path and freezes alternatives so the lazily built index cannot go stale
  • fixes repetition interval/tree memoization replay, consume-all rechecks, and stale-entry eviction
  • improves interval-tree same-start lookup used by interval-cache prefix-success checks
  • fixes correctness issues in Slice#hash, Buffer#clear!, and Source::LineCache
  • repairs benchmark/benchmark_suite.rb and removes the legacy ffi-hash/ffi-json benchmark plumbing superseded by the unified parsanol-rs parse() API
  • updates the benchmark docs and HISTORY.txt to match the current backends and the behavior changes above
  • adds cache-threshold, literal-choice, interval-cache, cache-safety, and parser regression coverage

This affects the pure-Ruby parser path only. Native parsing continues through the Rust-backed parser path.
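
The literal-prefix indexing idea from the list above can be pictured with a minimal sketch. This is not the Parsanol implementation: `LiteralChoice`, `match`, and `linear_match` are hypothetical names, and the sketch assumes every alternative is a non-empty plain string (the PR keeps unsafe/dynamic branches on the normal scan path, which is not modelled here). It only shows how a first-character index can skip branches while still returning the first matching alternative in declaration order.

```ruby
# Hypothetical illustration only; none of these names come from Parsanol.
class LiteralChoice
  NO_CANDIDATES = [].freeze

  def initialize(literals)
    raise ArgumentError, "literals must be non-empty strings" if literals.any?(&:empty?)

    # Freeze the alternatives so the lazily built index cannot go stale.
    @literals = literals.map { |lit| lit.dup.freeze }.freeze
    @index = nil
  end

  # Indexed lookup: only try alternatives whose first character matches.
  def match(input, pos = 0)
    first = input[pos]
    return nil if first.nil?

    index.fetch(first, NO_CANDIDATES).each do |i|
      lit = @literals[i]
      return lit if input[pos, lit.length] == lit
    end
    nil
  end

  # Reference behaviour: plain ordered scan over every alternative.
  def linear_match(input, pos = 0)
    @literals.find { |lit| input[pos, lit.length] == lit }
  end

  private

  # Built on first use; candidate lists keep declaration order,
  # so ordered-choice semantics are unchanged.
  def index
    @index ||= build_index
  end

  def build_index
    idx = {}
    @literals.each_with_index { |lit, i| (idx[lit[0]] ||= []) << i }
    idx.each_value(&:freeze)
    idx.freeze
  end
end

choice = LiteralChoice.new(%w[if in int interface else])
choice.match("interface Foo") # => "in" (first declared alternative that matches)
choice.match("xyz")           # => nil
```

Because each bucket stores positions in declaration order, the indexed path and the linear scan agree on which alternative wins, which is the property the PR describes as preserving ordered-choice semantics.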

suleman-uzair changed the title from "[WIP] Speed up parser workloads and fix dynamic callbacks" to "[WIP] Improve parser-class cache threshold defaults" on May 22, 2026

Copilot AI left a comment

Pull request overview

This PR refines memoization behavior by introducing a separate adaptive cache threshold fallback for parser-class parsing contexts (defaulting to immediate caching), while keeping the existing atom-level default threshold. It also adds an optimization to Alternative to reduce work for large ordered choices via literal-prefix indexing, plus accompanying specs and a benchmark case to compare cache-threshold behaviors.

Changes:

  • Add a :parser_default adaptive cache threshold for unnamed/unnamed-override parser classes (set to 0), while keeping the atom-level default at 1000.
  • Adjust cache keying to better separate strict (consume-all) vs prefix behavior while ensuring successful prefix parses are cached consistently.
  • Add literal-prefix indexing for large Alternative choices, plus focused specs and a benchmark input/parser to compare cache-threshold approaches.
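
The threshold and cache-keying changes above can be illustrated with a small sketch. It is written under loud assumptions: `ThresholdCache` and `fetch` are hypothetical names, and the sketch assumes the threshold counts uncached evaluations before memoization switches on, which may not be how Parsanol defines it. Only the two default values (0 and 1000) and the idea of scoping entries by consume-all mode come from the review summary; the prefix-success reuse mentioned in the PR is not modelled.

```ruby
# Hypothetical illustration only; not the Parsanol cache.
class ThresholdCache
  # Values taken from the review summary; the key names are assumptions.
  DEFAULT_THRESHOLDS = { atom_default: 1000, parser_default: 0 }.freeze

  def initialize(kind)
    @threshold = DEFAULT_THRESHOLDS.fetch(kind)
    @attempts = 0
    @store = {}
  end

  # Returns a cached result when present; otherwise evaluates the block
  # and stores the result once the attempt count exceeds the threshold.
  def fetch(rule, pos, consume_all:)
    # Strict (consume-all) and prefix parses get separate entries.
    key = [rule, pos, consume_all]
    return @store[key] if @store.key?(key)

    @attempts += 1
    result = yield
    @store[key] = result if @attempts > @threshold
    result
  end
end

evaluations = 0
parser_cache = ThresholdCache.new(:parser_default)
2.times { parser_cache.fetch(:expr, 0, consume_all: false) { evaluations += 1 } }
evaluations # => 1, the second call is served from the cache

parser_cache.fetch(:expr, 0, consume_all: true) { evaluations += 1 }
evaluations # => 2, the strict parse has its own entry
```

With the `:atom_default` threshold of 1000, the same two calls would both evaluate the block, since memoization would not yet be active in this model.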

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| spec/parsanol/source_spec.rb | Adds coverage for Source#remaining. |
| spec/parsanol/atoms/context_spec.rb | Adds specs for atom vs parser-class cache thresholds and consume-all boundary behavior. |
| spec/parsanol/atoms/alternative_spec.rb | Adds specs validating literal-prefix indexing behavior and error-detail preservation. |
| spec/parsanol/atom_results_spec.rb | Adds coverage for cached prefix successes across strict named subexpressions. |
| lib/parsanol/source.rb | Adds Source#remaining for non-advancing access to the unconsumed input. |
| lib/parsanol/atoms/context.rb | Introduces :parser_default threshold and refines memoization keying/lookup behavior. |
| lib/parsanol/atoms/alternative.rb | Adds large-choice literal indexing and lazy choice-error generation. |
| benchmark/run_all.rb | Adds cache-threshold comparison parsers and input-type selection plumbing. |
| benchmark/README.md | Documents the new cache-threshold benchmark input type and CLI flag. |
| benchmark/parsers/cache_threshold_parsanol.rb | Adds a synthetic recursive grammar to compare cache-threshold defaults. |
| benchmark/inputs/tiny/cache_threshold.txt | Adds tiny cache-threshold benchmark input. |
| benchmark/inputs/small/cache_threshold.txt | Adds small cache-threshold benchmark input. |
| benchmark/inputs/medium/cache_threshold.txt | Adds medium cache-threshold benchmark input. |
| benchmark/inputs/large/cache_threshold.txt | Adds large cache-threshold benchmark input. |

Comment thread lib/parsanol/atoms/alternative.rb Outdated
Comment thread benchmark/run_all.rb Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.

Comment thread lib/parsanol/atoms/context.rb
Comment thread lib/parsanol/interval_tree.rb Outdated
Comment thread lib/parsanol/atoms/alternative.rb
Comment thread lib/parsanol/atoms/alternative.rb
suleman-uzair changed the title from "[WIP] Improve parser-class cache threshold defaults" to "[WIP] Improve pure-Ruby parser caching and literal alternatives" on Jun 9, 2026

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

Comment thread lib/parsanol/source.rb
Comment thread spec/parsanol/atoms/alternative_spec.rb

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

Comment thread benchmark/run_all.rb
Comment thread lib/parsanol/atoms/alternative.rb

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.

Comment thread lib/parsanol/atoms/context.rb

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.

Comment thread lib/parsanol/atoms/context.rb Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 2 comments.

Comment thread lib/parsanol/atoms/context.rb
Comment thread benchmark/run_all.rb

Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated no new comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated no new comments.

suleman-uzair changed the title from "[WIP] Improve pure-Ruby parser caching and literal alternatives" to "Improve pure-Ruby parser caching and literal alternatives" on Jun 10, 2026

Copilot AI left a comment

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 4 comments.

Comment thread spec/benchmark/benchmark_suite_spec.rb
Comment thread spec/benchmark/benchmark_suite_spec.rb
Comment thread spec/benchmark/benchmark_runner_spec.rb
Comment thread spec/benchmark/benchmark_runner_spec.rb
suleman-uzair marked this pull request as ready for review on June 11, 2026, 15:05
suleman-uzair requested a review from ronaldtse on June 11, 2026, 15:05

3 participants