rmatch

Current Performance Comparison

Metric	rmatch	Java Regex	rmatch relative to Java Regex
Run time (5000 patterns)	2.4s	4.1s	1.7x faster
Peak Memory	39MB	19MB	2.1x more memory
Pattern Loading	20MB	1MB	20.0x more memory
Matching Phase	5MB	4MB	1.2x more memory

Latest benchmark comparison between rmatch and native Java regex (java.util.regex.Pattern) on 5000 regex patterns against Wuthering Heights corpus. Updated: 2025-12-18 21:42 UTC
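For orientation, the Java regex side of this comparison can be pictured roughly as below: compile every expression once, then scan the text once per pattern. This is a minimal sketch, not the project's benchmark harness; the class name, method names, and the three sample expressions are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative baseline only; not the benchmark code used by rmatch. */
class JavaRegexBaseline {

    /** Compiles every expression once, up front (the "pattern loading" phase). */
    static List<Pattern> compileAll(List<String> expressions) {
        List<Pattern> compiled = new ArrayList<>(expressions.size());
        for (String expression : expressions) {
            compiled.add(Pattern.compile(expression));
        }
        return compiled;
    }

    /** Scans the text once per pattern and returns the total number of matches. */
    static long countMatches(List<Pattern> patterns, CharSequence text) {
        long total = 0;
        for (Pattern pattern : patterns) {
            Matcher matcher = pattern.matcher(text);
            while (matcher.find()) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical sample expressions; the real benchmark uses 5000 of them.
        List<Pattern> patterns = compileAll(List.of("Heath[a-z]+", "moor", "w[aeiou]nd"));
        String text = "Heathcliff crossed the moor against the wind.";
        System.out.println(countMatches(patterns, text)); // prints 3
    }
}
```

With this approach the text is traversed once per pattern, so the work grows with the number of patterns. That is the cost a multi-pattern matcher tries to avoid.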

Performance Timeline Charts

rmatch Performance History

Java Regex Performance History

Performance Comparison (rmatch vs Java Regex)

Live performance tracking from macro benchmarks. Individual charts show execution time and memory usage patterns over time, while the comparison chart shows rmatch performance ratios relative to Java regex (values > 1.0 mean rmatch is slower/uses more memory).
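The ratio convention used by the comparison chart can be checked against the numbers in the table at the top. The sketch below is only arithmetic on those published figures; the class and method names are illustrative.

```java
/** Worked example of the rmatch/java ratio convention, using the table's figures. */
class PerformanceRatio {

    /** Ratio as plotted in the comparison chart: rmatch value divided by Java regex value. */
    static double ratio(double rmatchValue, double javaRegexValue) {
        return rmatchValue / javaRegexValue;
    }

    public static void main(String[] args) {
        // Run time: 2.4s vs 4.1s. A ratio below 1.0 means rmatch is faster.
        double time = ratio(2.4, 4.1);
        // Peak memory: 39MB vs 19MB. A ratio above 1.0 means rmatch uses more memory.
        double memory = ratio(39.0, 19.0);
        System.out.println("time ratio below 1.0: " + (time < 1.0));
        System.out.println("memory ratio above 1.0: " + (memory > 1.0));
    }
}
```

Note that the table states the run-time result as "1.7x faster", which is the inverse ratio (4.1 / 2.4).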

The project is getting closer to a state where it may be useful to people other than me, but it's not quite there yet. Be patient ;)

Key Performance Metrics

Benchmark Data Sources: All performance data is sourced from benchmarks/results/
JMH Microbenchmarks: Precise timing measurements with statistical confidence intervals
Macro Benchmarks: End-to-end performance testing with real workloads
Automated Tracking: Performance evolution tracked continuously via GitHub Actions

🚨 CRITICAL: Performance Validation Rule

"I will not merge anything to main that does not provably improve performance."

Development Guidelines for Performance Changes

ALL performance optimizations MUST:

✅ Use Production-Scale Testing: Test with 5000+ regex workloads against real text corpora
✅ Show Measurable Improvement: Demonstrate clear performance gains in comprehensive benchmarks
❌ Never Trust Micro-Benchmarks: Small-scale synthetic tests are insufficient and often misleading
❌ Never Trust Theoretical Improvements: Code that "should be faster" must prove it IS faster

Lessons Learned

Enum switching: theoretical 13.6% improvement → actually slower in production
Pattern matching instanceof → 1.7% performance regression
Character classification optimizations → 2-9% performance regression

The Rule Exists Because: Brilliant optimization ideas are necessary, but they must be proven in our specific use case with realistic workloads before adoption.
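One way to make "provably improve" concrete is a gate over repeated macro-benchmark timings. The sketch below is a deliberately conservative interpretation, not the project's actual merge check: the class name, the minimum-gain parameter, and the "no overlapping runs" criterion are all assumptions.

```java
import java.util.Arrays;

/** Hypothetical merge gate; the criteria here are assumptions, not the project's rule. */
class PerformanceGate {

    static double mean(double[] timesMs) {
        return Arrays.stream(timesMs).average().orElseThrow();
    }

    /**
     * Accepts a candidate only if its mean run time improves on the baseline by at
     * least minGainFraction AND its slowest run still beats the baseline's fastest run.
     */
    static boolean provablyFaster(double[] baselineMs, double[] candidateMs, double minGainFraction) {
        double baselineMean = mean(baselineMs);
        double gain = (baselineMean - mean(candidateMs)) / baselineMean;
        double slowestCandidate = Arrays.stream(candidateMs).max().orElseThrow();
        double fastestBaseline = Arrays.stream(baselineMs).min().orElseThrow();
        return gain >= minGainFraction && slowestCandidate < fastestBaseline;
    }

    public static void main(String[] args) {
        double[] baseline = {100, 102, 98};   // made-up timings in ms
        double[] candidate = {90, 91, 89};
        System.out.println(provablyFaster(baseline, candidate, 0.05)); // prints true
    }
}
```

The second condition rejects changes whose gain is smaller than the run-to-run spread, the situation in which a change that "should be faster" cannot yet be shown to be faster.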

GC Optimization for Java 25

rmatch includes tools to experiment with different Garbage Collector (GC) configurations on Java 25 to optimize memory usage and performance.

Quick Start

Validate GC configurations:

scripts/validate_gc_configs.sh

Run benchmarks with all GC variants:

make bench-gc-experiments

Available GC Configurations

G1 (default): General-purpose GC with good throughput
ZGC Generational: Low-latency GC for large heaps
Shenandoah: Concurrent GC with predictable pause times
Compact Object Headers: Reduces memory footprint (Java 25 feature)
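When switching between these configurations it helps to confirm which collector a JVM is actually running. The sketch below uses the standard `java.lang.management` API to list the active collectors. The JVM flags in the comments are the standard HotSpot options for each configuration; the project's scripts may use additional or different flags, and the class name is illustrative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/**
 * Prints the garbage collectors active in the current JVM.
 *
 * Standard HotSpot flags for the configurations listed above:
 *   G1:                     -XX:+UseG1GC
 *   ZGC (generational):     -XX:+UseZGC
 *   Shenandoah:             -XX:+UseShenandoahGC   (only in JDK builds that include it)
 *   Compact object headers: -XX:+UseCompactObjectHeaders
 */
class ActiveGcReport {

    /** Returns the names reported by the JVM's garbage collector MXBeans. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running this class with each flag set in turn shows a different set of collector names, which is a quick sanity check before comparing benchmark numbers.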

Documentation

See GC_EXPERIMENTS.md for:

Detailed usage instructions
How to analyze results
Applying findings to improve performance
References to Java 25 GC improvements

The experiments help identify optimal GC settings for regex engines with high object churn, following recommendations from the JDK 25 Performance Improvements article.

⚡ Java 25 JIT Optimization (Production Ready)

rmatch includes proven JIT optimization techniques for Java 25 that provide significant performance improvements in production workloads.

Recommended Production Configuration

For optimal performance across all workload sizes:

export JAVA_OPTS="-Drmatch.engine=fastpath -Drmatch.prefilter=aho -XX:+TieredCompilation -XX:CompileThreshold=500"

Performance Results

Test Environment: Apple M2 Max (aarch64), macOS 26.0.1, OpenJDK 25 (Temurin-25+36-LTS)

Configuration	5K Patterns	10K Patterns
Baseline	10,230ms	21,656ms
Optimized	9,895ms (3.3% faster)	19,297ms (10.9% faster)

Performance characteristics may vary across different architectures. See FASTPATH_PERFORMANCE_ANALYSIS.md for detailed architecture specifications and cross-platform considerations.
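The percentages in the results table are run-time reductions relative to the baseline. The short sketch below reproduces them from the published millisecond figures; the class and method names are illustrative.

```java
import java.util.Locale;

/** Reproduces the improvement percentages from the published run times. */
class ImprovementPercent {

    /** Run-time reduction relative to the baseline, in percent. */
    static double improvementPercent(double baselineMs, double optimizedMs) {
        return (baselineMs - optimizedMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT, "5K patterns: %.1f%%",
                improvementPercent(10_230, 9_895)));   // prints "5K patterns: 3.3%"
        System.out.println(String.format(Locale.ROOT, "10K patterns: %.1f%%",
                improvementPercent(21_656, 19_297)));  // prints "10K patterns: 10.9%"
    }
}
```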

Key Benefits

✅ Scales with workload size: 3.3% improvement at 5K, 10.9% at 10K patterns
✅ Consistent performance: 8% coefficient of variation across runs
✅ Production validated: Tested with comprehensive benchmarks using real text corpora
✅ Easy to apply: Simple environment variable configuration
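The coefficient of variation quoted above is the standard deviation of the run times divided by their mean. A minimal sketch of that calculation follows. It uses the sample standard deviation (n - 1 denominator), which is an assumption, since the README does not say which variant the project uses, and the sample timings are made up.

```java
/** Coefficient of variation (sample standard deviation / mean) for a set of run times. */
class CoefficientOfVariation {

    static double cv(double[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        double mean = sum / samples.length;

        double squaredDeviations = 0.0;
        for (double s : samples) {
            squaredDeviations += (s - mean) * (s - mean);
        }
        double sampleVariance = squaredDeviations / (samples.length - 1);
        return Math.sqrt(sampleVariance) / mean;
    }

    public static void main(String[] args) {
        double[] runTimesMs = {90, 100, 110}; // made-up timings
        System.out.println(cv(runTimesMs));
    }
}
```

For the made-up timings above the mean is 100 and the sample standard deviation is 10, so the coefficient of variation is 10%. An observed improvement is only convincing when it is clearly larger than this run-to-run spread.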

Advanced Configuration

For specific workload tuning:

# Small workloads (≤5K patterns): Pure ASCII optimization
export JAVA_OPTS="-Drmatch.engine=fastpath -Drmatch.prefilter.threshold=99999 -XX:+TieredCompilation -XX:CompileThreshold=500"

# Large workloads (≥10K patterns): Maximum prefilter benefits
export JAVA_OPTS="-Drmatch.engine=fastpath -Drmatch.prefilter.threshold=5000 -XX:+TieredCompilation -XX:CompileThreshold=500"
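The `-D` options above are ordinary JVM system properties, so Java code can read them with `System.getProperty` and `Integer.getInteger`. The sketch below shows that mechanism only. The property names come from the lines above, but the class, the default values, and the rule that the prefilter engages once the pattern count reaches the threshold are assumptions inferred from the comments, not rmatch's documented behaviour.

```java
/** Hypothetical option reader; rmatch's real configuration code is not shown in this README. */
class RmatchOptionsSketch {

    /** Reads -Drmatch.engine=...; "default" is a placeholder fallback, not a real engine name. */
    static String engine() {
        return System.getProperty("rmatch.engine", "default");
    }

    /** Reads -Drmatch.prefilter.threshold=...; the fallback of 5000 is an assumption. */
    static int prefilterThreshold() {
        return Integer.getInteger("rmatch.prefilter.threshold", 5000);
    }

    /** Assumed semantics: use the prefilter once the pattern count reaches the threshold. */
    static boolean prefilterEnabled(int patternCount) {
        return patternCount >= prefilterThreshold();
    }

    public static void main(String[] args) {
        System.out.println("engine=" + engine() + ", threshold=" + prefilterThreshold());
    }
}
```

Under that assumed rule, a threshold of 99999 keeps the prefilter off for a 5K-pattern workload, while a threshold of 5000 turns it on for a 10K-pattern workload, which matches the intent stated in the two comments.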

Documentation

See FASTPATH_PERFORMANCE_ANALYSIS.md for:

Complete performance analysis and validation methodology
JIT technique explanations and benchmark results
Component-by-component optimization breakdown

Dispatch Optimization Experiments

rmatch includes benchmarks to test modern Java language features for dispatch pattern optimization on Java 25.

Quick Start

Run dispatch optimization benchmarks:

make bench-dispatch

Or run the script directly:

scripts/run_dispatch_benchmarks.sh

What's Tested

The benchmarks evaluate three optimization strategies:

Pattern Matching for instanceof - Java 16+ pattern matching vs traditional cast
Switch Expressions - Enhanced switch with arrow syntax vs if-else chains
Enum Dispatch - Switch expressions vs if-else for enum handling
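To make the three strategies concrete, the sketch below shows each pair of equivalent forms side by side. It illustrates the language features only (it needs Java 16 or later); the enum, the method names, and the returned values are invented for the example and are not taken from the rmatch code base or its benchmarks.

```java
/** Side-by-side forms of the dispatch styles being compared; all names are illustrative. */
class DispatchStyles {

    enum NodeKind { LITERAL, RANGE, ANY } // hypothetical enum

    // 1a. Traditional instanceof followed by an explicit cast.
    static int lengthTraditional(Object value) {
        if (value instanceof String) {
            String s = (String) value;
            return s.length();
        }
        return -1;
    }

    // 1b. Pattern matching for instanceof: the test and the binding happen together.
    static int lengthPattern(Object value) {
        if (value instanceof String s) {
            return s.length();
        }
        return -1;
    }

    // 2a. If-else chain over an enum.
    static int costIfElse(NodeKind kind) {
        if (kind == NodeKind.LITERAL) {
            return 1;
        } else if (kind == NodeKind.RANGE) {
            return 2;
        } else {
            return 3;
        }
    }

    // 2b. Switch expression with arrow syntax over the same enum.
    static int costSwitch(NodeKind kind) {
        return switch (kind) {
            case LITERAL -> 1;
            case RANGE -> 2;
            case ANY -> 3;
        };
    }

    // 3. If-else style range checks for character classification.
    static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

Each pair computes the same result, so any measured difference comes from how the JIT compiles the two forms rather than from the logic itself.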

Key Findings

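Based on the dispatch microbenchmarks (the Lessons Learned section above records how these changes behaved under production-scale workloads):

✅ Enum switch expressions: 13.6% faster than if-else chains in the microbenchmark and recommended on that basis, but slower in production
❌ Pattern matching instanceof: No measurable benefit in the microbenchmark (0.08%); 1.7% regression in production
✅ If-else for char ranges: 19.5% faster than switch - keep current approach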

Documentation

See DISPATCH_OPTIMIZATION_RESULTS.md for:

Detailed benchmark results
Performance analysis
Specific recommendations for code changes
Examples of patterns to refactor

These experiments follow the same methodology as GC experiments to provide data-driven guidance on whether modern language features improve performance.
