coregex FindAllStringIndex slower than stdlib #153

@tjbrains

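To find variables in a string, we match the pattern \${[@\w.|-]+} with a regular expression. The benchmarks below run the same FindAllStringIndex call against stdlib regexp and coregex, once on an input with four matches and once on an input with none: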

func BenchmarkFindStringIndex_Found_regexp(b *testing.B) {
	var reg = regexp.MustCompile(`\${[@\w.|-]+}`)

	b.ReportAllocs()
	b.ResetTimer()

	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			var indexes = reg.FindAllStringIndex("hello, ${name}, ${age}, ${gender}, ${home}, world", -1)
			if len(indexes) != 4 {
				b.Fatal(len(indexes))
			}
		}
	})
}

func BenchmarkFindStringIndex_Found_coregex(b *testing.B) {
	var reg = coregex.MustCompile(`\${[@\w.|-]+}`)

	b.ReportAllocs()
	b.ResetTimer()

	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			var indexes = reg.FindAllStringIndex("hello, ${name}, ${age}, ${gender}, ${home}, world", -1)
			if len(indexes) != 4 {
				b.Fatal(len(indexes))
			}
		}
	})
}

func BenchmarkFindStringIndex_NotFound_regexp(b *testing.B) {
	var reg = regexp.MustCompile(`\${[@\w.|-]+}`)

	b.ReportAllocs()
	b.ResetTimer()

	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			var indexes = reg.FindAllStringIndex("hello, world", -1)
			if len(indexes) != 0 {
				b.Fatal(len(indexes))
			}
		}
	})
}

func BenchmarkFindStringIndex_NotFound_coregex(b *testing.B) {
	var reg = coregex.MustCompile(`\${[@\w.|-]+}`)

	b.ReportAllocs()
	b.ResetTimer()

	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			var indexes = reg.FindAllStringIndex("hello, world", -1)
			if len(indexes) != 0 {
				b.Fatal(len(indexes))
			}
		}
	})
}
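
For reference, a minimal sketch of what the pattern is expected to match, using only stdlib regexp. The helper name variableSpans is hypothetical and not part of either library; coregex is left out here on purpose, so this only illustrates the baseline behavior the benchmarks assert on (four matches for the first input, none for the second).

```go
package main

import (
	"fmt"
	"regexp"
)

// varPattern is the pattern from the issue: "${", then one or more
// characters from the class [@\w.|-], then "}".
var varPattern = regexp.MustCompile(`\${[@\w.|-]+}`)

// variableSpans (hypothetical helper) returns the [start, end) byte
// offsets of every ${...} placeholder in s, or nil if there are none.
func variableSpans(s string) [][]int {
	return varPattern.FindAllStringIndex(s, -1)
}

func main() {
	inputs := []string{
		"hello, ${name}, ${age}, ${gender}, ${home}, world",
		"hello, world",
	}
	for _, in := range inputs {
		spans := variableSpans(in)
		fmt.Printf("%q: %d match(es)\n", in, len(spans))
		for _, sp := range spans {
			// Slice the input with the reported offsets to show the matched text.
			fmt.Printf("  %v %q\n", sp, in[sp[0]:sp[1]])
		}
	}
}
```

Running the same inputs through the coregex-compiled pattern and comparing the returned offsets would be one way to confirm that both engines agree before comparing their speed.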

Benchmark results:

goos: darwin
goarch: arm64
cpu: Apple M2 Max
BenchmarkFindStringIndex_Found_regexp-12                 7014567               171.8 ns/op           324 B/op          5 allocs/op
BenchmarkFindStringIndex_Found_coregex-12                1586851               757.5 ns/op           161 B/op          5 allocs/op
BenchmarkFindStringIndex_NotFound_regexp-12             124449542                9.214 ns/op           0 B/op          0 allocs/op
BenchmarkFindStringIndex_NotFound_coregex-12            14020056                84.74 ns/op            0 B/op          0 allocs/op
PASS
ok 

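As the numbers show, coregex is about 4.4x slower than stdlib when matches are found (757.5 vs 171.8 ns/op) and about 9.2x slower when nothing matches (84.74 vs 9.214 ns/op). It does allocate fewer bytes per operation in the found case (161 vs 324 B/op), with the same number of allocations.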
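
The slowdown factors can be derived directly from the ns/op column of the results above. A minimal sketch of that arithmetic follows; the function name slowdown is hypothetical, and the constants are copied from the benchmark output in this issue.

```go
package main

import "fmt"

// slowdown (hypothetical helper) reports how many times slower the
// candidate is than the baseline, given their ns/op figures.
func slowdown(candidateNs, baselineNs float64) float64 {
	return candidateNs / baselineNs
}

func main() {
	// ns/op values taken from the benchmark output in this issue.
	fmt.Printf("found:     %.1fx\n", slowdown(757.5, 171.8))
	fmt.Printf("not found: %.1fx\n", slowdown(84.74, 9.214))
}
```

These figures come from a single run on one machine, so they are best read as an order-of-magnitude indication rather than a precise ratio.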

Labels: type: performance (Speed/memory improvement or regression)
