match: Use an aggregate equality comparison for constant array/slice patterns when possible by jakubadamw · Pull Request #155216 · rust-lang/rust

jakubadamw · 2026-04-12T21:59:41Z

When every element in an array or slice pattern is a constant and there is no .. subpattern, the match builder will now emit a single call to PartialEq::eq instead of comparing each element from the value one by one against the respective constant in the pattern.

This drastically reduces the number of MIR basic blocks for large constant-array matches – e.g. a 64-element [u8; 64] match previously generated 64 separate comparison blocks and now generates just one PartialEq::eq call that LLVM can lower to a memcmp(). The optimisation is gated on having at least two constant elements, meaning single-element arrays will still use a plain scalar comparison.

Example:

const FOO: [u8; 64] = *b"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef";

pub fn foo(x: &[u8; 64]) -> bool {
    // Before: 64 basic blocks, one per byte.
    // After:  a single `PartialEq::eq()` call.
    matches!(x, &FOO)
}

Closes #103073.
Closes #110870.
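
For readers who want the before/after at source level, here is a minimal sketch of what the two lowerings roughly correspond to. The function names and the shortened 8-byte constant are illustrative assumptions, not code from this PR; the actual change operates on MIR, so this only mirrors the shape of the generated comparisons.

```rust
// Hypothetical, shortened stand-in for the 64-byte constant in the example above.
const FOO: [u8; 8] = *b"01234567";

/// Roughly the old shape: one comparison (and one branch) per element.
fn matches_elementwise(x: &[u8; 8]) -> bool {
    for (a, b) in x.iter().zip(FOO.iter()) {
        if a != b {
            return false;
        }
    }
    true
}

/// Roughly the new shape: a single `PartialEq::eq` call on the whole array.
fn matches_aggregate(x: &[u8; 8]) -> bool {
    PartialEq::eq(x, &FOO)
}
```

Both functions return the same answer for every input; the difference is only in how many comparisons and branches the compiler has to build.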

rustbot · 2026-04-12T21:59:50Z

Some changes occurred in match lowering

cc @Nadrieril

rustbot · 2026-04-12T21:59:52Z

r? @JonathanBrouwer

rustbot has assigned @JonathanBrouwer.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

Owners of files modified in this PR: compiler, mir
compiler, mir expanded to 69 candidates
Random selection from 12 candidates

JonathanBrouwer · 2026-04-13T18:15:32Z

@bors try @rust-timer queue


rust-bors · 2026-04-13T20:24:30Z

☀️ Try build successful (CI)
Build commit: 5a6dba6 (5a6dba60b1150a8b57fc739b0829fa4a65c5b8b3, parent: 14196dbfa3eb7c30195251eac092b1b86c8a2d84)

rust-timer · 2026-04-13T21:13:34Z

Finished benchmarking commit (5a6dba6): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but discouraged: it risks changing compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.2%	[0.2%, 0.2%]	1
Improvements ✅ (primary)	-0.4%	[-0.4%, -0.4%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.4%	[-0.4%, -0.4%]	1

Max RSS (memory usage)

Results (primary -0.0%, secondary 0.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	1.9%	[1.0%, 2.8%]	2
Regressions ❌ (secondary)	2.7%	[1.2%, 5.0%]	3
Improvements ✅ (primary)	-3.9%	[-3.9%, -3.9%]	1
Improvements ✅ (secondary)	-1.8%	[-2.2%, -1.6%]	3
All ❌✅ (primary)	-0.0%	[-3.9%, 2.8%]	3

Cycles

Results (primary -2.3%, secondary 14.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	14.4%	[2.4%, 26.4%]	2
Improvements ✅ (primary)	-2.3%	[-3.1%, -1.7%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-2.3%	[-3.1%, -1.7%]	4

Binary size

Results (primary -0.1%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.1%	[0.1%, 0.1%]	1
Regressions ❌ (secondary)	0.1%	[0.1%, 0.1%]	2
Improvements ✅ (primary)	-0.2%	[-0.2%, -0.1%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.1%	[-0.2%, 0.1%]	5

Bootstrap: 491.114s -> 490.988s (-0.03%)
Artifact size: 394.23 MiB -> 394.33 MiB (0.03%)

JonathanBrouwer · 2026-04-15T19:44:15Z

@rustbot reroll
Not familiar enough with this code to comfortably review this, but from a quick glance this looks great, thanks <3

dianne · 2026-04-17T21:38:39Z

+            if let PatKind::Constant { value } = pat.kind {
+                Some(ty::Const::new_value(tcx, value.valtree, value.ty))
+            } else {
+                None
+            }


It might also be worth reconstructing aggregate constants for arrays/slices of arrays of constants, etc.? I'm not a specialization expert, but it looks like arrays of bytewise-comparable things are also bytewise-comparable, at least for common array lengths¹. Since array and slice equality are specialized based on their element types' bytewise-comparability, we should be able to get better codegen for nested array patterns too (as long as the inner arrays are of one of those common lengths), I think?
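
As an illustration of the nested case, the sketch below compares a hand-written nested array pattern with a single equality test against a reconstructed aggregate constant. `KEY` and both function names are hypothetical, and whether the aggregate form actually lowers to a bytewise comparison depends on the specialization referenced in the footnote.

```rust
// Hypothetical aggregate constant reconstructed from the nested pattern below.
const KEY: [[u8; 4]; 2] = [*b"abcd", *b"efgh"];

/// Hand-written nested pattern: every leaf is a constant.
fn nested_by_pattern(x: &[[u8; 4]; 2]) -> bool {
    matches!(
        x,
        [[b'a', b'b', b'c', b'd'], [b'e', b'f', b'g', b'h']]
    )
}

/// The same test expressed as one equality on the outer array.
fn nested_by_aggregate(x: &[[u8; 4]; 2]) -> bool {
    *x == KEY
}
```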


Footnotes

¹ https://github.com/rust-lang/rust/blob/f29256dd1420dc681bf4956e3012ffe9eccdc7e7/library/core/src/cmp/bytewise.rs#L74-L85

@dianne, interesting. I’ll look into this next! 🙂

dianne · 2026-04-17T21:49:51Z

+                // When there is no `..`, all elements are constants, and
+                // there are at least two of them, collapse the individual
+                // element subpairs into a single aggregate comparison that
+                // is performed after the length check.
+                if slice.is_none()


An additional possibility: even if there is a .., the comparisons for the sub-slices before and after the .. could be done via aggregate equality when applicable. Credit to #121540, which I think did this?

Edit: assuming prefixes and suffixes are typically small and hand-written, it's probably not worth the trouble to use aggregate equality for them.

Even if only handling the case with no .., it might be worth moving the special-casing into prefix_slice_suffix to share it between PatKind::Slice and PatKind::Array, since that's where the commonalities live.

Edit: after prefix_slice_suffix's cleanup in #154943, I don't think it makes much sense to put this in there. I still think the logic for deciding whether to use aggregate equality is complex enough that it could be worth factoring out, but that's probably not the way to do it.
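
To picture the `..` idea discussed above, here is a rough source-level sketch: a slice pattern with a constant prefix and suffix, next to the equivalent length check plus two sub-slice equalities. The byte strings and function names are made up for illustration and do not come from this PR or from #121540.

```rust
/// Slice pattern with a constant prefix, a `..` rest, and a constant suffix.
fn by_pattern(s: &[u8]) -> bool {
    matches!(s, [b'G', b'E', b'T', .., b'\r', b'\n'])
}

/// The same test as a length check followed by two aggregate comparisons.
fn by_subslice_eq(s: &[u8]) -> bool {
    const PREFIX: &[u8] = b"GET";
    const SUFFIX: &[u8] = b"\r\n";
    // The pattern needs room for both parts without overlap.
    s.len() >= PREFIX.len() + SUFFIX.len()
        && s[..PREFIX.len()] == *PREFIX
        && s[s.len() - SUFFIX.len()..] == *SUFFIX
}
```

With a three-byte prefix and a two-byte suffix there is little to gain, which is consistent with the edit above concluding that it is probably not worth the trouble for small hand-written patterns.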


Nadrieril · 2026-04-22T07:47:42Z

The general approach feels unfortunate: we transformed constants into patterns in const_to_pat, and now we try to reverse that transformation. Have we tried keeping the original constant around in the output of const_to_pat and using it at runtime? Though this has the same issue of const-dependent MIR lowering that @dianne pointed out.


If we do that, maybe we could also synthesize a constant during THIR building for hand-written array/slice patterns, like this PR currently does in MIR building? That way, we wouldn't end up with worse codegen for hand-written array patterns than for const array items used as patterns.

I dunno, small handwritten arrays behave pretty much like tuples; it can sometimes be better for codegen not to turn them into constants:

match foo {
    // compiles to three nested `if`s today, would become 8 sequential `if`s if turned into constants
    [false, false, false] => ...,
    [false, false, true] => ...,
    [false, true, false] => ...,
    ...
}

It's admittedly a stretch but I'm tempted to err on the side of respecting user intent especially before the MIR boundary.
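
A runnable version of that concern, as a sketch: the first function is the ordinary match on a `[bool; 3]`, and the second mimics what eight sequential whole-array comparisons would look like at source level. `ARMS` and both function names are hypothetical; the real lowering happens in MIR building, not in user code.

```rust
/// Ordinary match: the compiler can branch on one element at a time.
fn index_by_pattern(foo: [bool; 3]) -> u8 {
    match foo {
        [false, false, false] => 0,
        [false, false, true] => 1,
        [false, true, false] => 2,
        [false, true, true] => 3,
        [true, false, false] => 4,
        [true, false, true] => 5,
        [true, true, false] => 6,
        [true, true, true] => 7,
    }
}

/// Source-level imitation of eight sequential aggregate-equality tests.
fn index_by_sequential_eq(foo: [bool; 3]) -> u8 {
    const ARMS: [[bool; 3]; 8] = [
        [false, false, false],
        [false, false, true],
        [false, true, false],
        [false, true, true],
        [true, false, false],
        [true, false, true],
        [true, true, false],
        [true, true, true],
    ];
    // Try each arm in order until one compares equal as a whole array.
    ARMS.iter()
        .position(|arm| *arm == foo)
        .expect("all eight combinations are listed") as u8
}
```

The two agree on every input; the first needs at most three element tests per lookup, while the second may need up to eight whole-array comparisons.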

Oh, right. Maybe we should have a test for that in mir-opt/building/match/sort_candidates.rs or such if we start optimizing array/slice comparisons? I think as-is this PR may also turn that into 8 sequential tests, each a TestKind::AggregateEq for a different constant.

@Nadrieril, as an alternative to being guided solely by user “intent”, would it be sensible to raise the threshold on the number of elements in an array pattern at which we would use the aggregate equality? Right now it’s >= 2. Perhaps 4 would work better? I suppose a quantitative comparison with benchmarks could be of use here, but sadly I can’t commit to that with my present schedule.
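
A minimal sketch of the gating condition with a tunable threshold, assuming a simplified model of element subpatterns. `ElemPat`, `use_aggregate_eq`, and `min_len` are hypothetical names for illustration only; the compiler's real types and checks differ.

```rust
/// Hypothetical, simplified stand-in for an element subpattern.
#[derive(Clone, Copy)]
enum ElemPat {
    Constant,
    Wildcard,
}

/// Aggregate equality applies only when there is no `..`, every element is a
/// constant, and the pattern has at least `min_len` elements.
fn use_aggregate_eq(elems: &[ElemPat], has_rest: bool, min_len: usize) -> bool {
    !has_rest
        && elems.len() >= min_len
        && elems.iter().all(|e| matches!(e, ElemPat::Constant))
}
```

With `min_len = 2` this mirrors the rule in the PR description; raising it to 4 would leave short hand-written patterns on the per-element path.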

jackh726 · 2026-04-23T18:43:22Z

@dianne do you want to take over review here?

dianne

I don't have much context on the ctfe or const traits side of things to evaluate the approach. cc @oli-obk maybe? It feels like we have to make some sort of compromise here to avoid calling anything const-unstable. Possibly we could block on const_cmp stabilizing, or possibly we could have some workaround until then if we want to land this first.

I think I can handle the technical review, at least. Tentatively, r? me (though feel free to steal the assignment if that'd be easier ^^)


dianne · 2026-04-23T23:26:24Z

If we do that, maybe we could also synthesize a constant during THIR building for hand-written array/slice patterns, like this PR currently does in MIR building? That way, we wouldn't end up with worse codegen for hand-written array patterns than for const array items used as patterns.

jakubadamw · 2026-04-26T21:20:08Z

@dianne, @Nadrieril, thank you very much for your review and feedback, I really appreciate it. I’ve rebased and addressed the comments. The refactor in which we would preserve a trace of the original constant in const_to_pat(), and thus better honour the user’s intent when deciding what MIR to produce, is something I’ll try to do next week, time permitting, if it’s really what we’d like to see here.

…s when possible

When every element in an array or slice pattern is a constant and there is no `..` subpattern, the match builder now emits a single call to `PartialEq::eq` instead of comparing each element one by one.

This drastically reduces the number of MIR basic blocks for large constant-array matches – e.g. a 64-element `[u8; 64]` match previously generated 64 separate comparison blocks and now generates just one `PartialEq::eq` call that LLVM can lower to a `memcmp()`.

The optimisation is gated on having at least two constant elements. Single-element arrays still use a plain scalar comparison.

Example:

```rust
const FOO: [u8; 64] = *b"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef";

pub fn foo(x: &[u8; 64]) -> bool {
    // Before: 64 basic blocks, one per byte.
    // After:  a single `PartialEq::eq()` call.
    matches!(x, &FOO)
}
```

…on a large fixed-length array

…t contexts This is unless the `const_cmp` feature is enabled, in which case `PartialEq` becomes available in said contexts.

…n't get captured by this logic

…` features

rustbot · 2026-04-26T21:35:43Z

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties) and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue) labels Apr 12, 2026

rustbot assigned JonathanBrouwer Apr 12, 2026