match: Use an aggregate equality comparison for constant array/slice patterns when possible#155216

Open
jakubadamw wants to merge 10 commits into rust-lang:main from jakubadamw:issue-110870-103073

@jakubadamw
Contributor

@jakubadamw jakubadamw commented Apr 12, 2026

When every element in an array or slice pattern is a constant and there is no `..` subpattern, the match builder now emits a single call to `PartialEq::eq` instead of comparing each element of the value, one by one, against the corresponding constant in the pattern.

This drastically reduces the number of MIR basic blocks for large constant-array matches. For example, a match on a 64-element `[u8; 64]` previously generated 64 separate comparison blocks and now generates a single `PartialEq::eq` call that LLVM can lower to a `memcmp()`. The optimisation is gated on having at least two constant elements, so single-element arrays still use a plain scalar comparison.

Example:

```rust
const FOO: [u8; 64] = *b"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef";

pub fn foo(x: &[u8; 64]) -> bool {
    // Before: 64 basic blocks, one per byte.
    // After:  a single `PartialEq::eq()` call.
    matches!(x, &FOO)
}
```
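
As a rough illustration of the before/after behaviour described above (a sketch only, not the MIR the compiler actually emits): `eq_elementwise`, `eq_aggregate` and `eq_pattern` are hypothetical helper names, and the loop merely stands in for the 64 per-byte comparison blocks.

```rust
const FOO: [u8; 64] = *b"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef";

/// Element-by-element comparison: one test per byte, similar in spirit
/// to the per-element comparison blocks generated before this PR.
pub fn eq_elementwise(x: &[u8; 64]) -> bool {
    for i in 0..FOO.len() {
        if x[i] != FOO[i] {
            return false;
        }
    }
    true
}

/// Aggregate comparison: `*x == FOO` desugars to a single
/// `PartialEq::eq(&*x, &FOO)` call on the whole array.
pub fn eq_aggregate(x: &[u8; 64]) -> bool {
    *x == FOO
}

/// The pattern match from the PR description, for comparison.
pub fn eq_pattern(x: &[u8; 64]) -> bool {
    matches!(x, &FOO)
}
```

All three functions should agree on every input; the PR only changes which of the first two shapes the pattern match is lowered to.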

Closes #103073.
Closes #110870.

@rustbot
Collaborator

rustbot commented Apr 12, 2026

Some changes occurred in match lowering

cc @Nadrieril

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 12, 2026
@rustbot
Collaborator

rustbot commented Apr 12, 2026

r? @JonathanBrouwer

rustbot has assigned @JonathanBrouwer.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: compiler, mir
  • compiler, mir expanded to 69 candidates
  • Random selection from 12 candidates

@jakubadamw jakubadamw marked this pull request as draft April 12, 2026 23:12
@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 12, 2026
@jakubadamw jakubadamw marked this pull request as ready for review April 12, 2026 23:50
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 12, 2026
@JonathanBrouwer
Contributor

@bors try @rust-timer queue

rust-bors Bot pushed a commit that referenced this pull request Apr 13, 2026
match: Use an aggregate equality comparison for constant array/slice patterns when possible
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 13, 2026
@rust-bors
Contributor

rust-bors Bot commented Apr 13, 2026

☀️ Try build successful (CI)
Build commit: 5a6dba6 (5a6dba60b1150a8b57fc739b0829fa4a65c5b8b3, parent: 14196dbfa3eb7c30195251eac092b1b86c8a2d84)

@rust-timer
Collaborator

Finished benchmarking commit (5a6dba6): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.2% | [0.2%, 0.2%] | 1 |
| Improvements ✅ (primary) | -0.4% | [-0.4%, -0.4%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.4% | [-0.4%, -0.4%] | 1 |

Max RSS (memory usage)

Results (primary -0.0%, secondary 0.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.9% | [1.0%, 2.8%] | 2 |
| Regressions ❌ (secondary) | 2.7% | [1.2%, 5.0%] | 3 |
| Improvements ✅ (primary) | -3.9% | [-3.9%, -3.9%] | 1 |
| Improvements ✅ (secondary) | -1.8% | [-2.2%, -1.6%] | 3 |
| All ❌✅ (primary) | -0.0% | [-3.9%, 2.8%] | 3 |

Cycles

Results (primary -2.3%, secondary 14.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 14.4% | [2.4%, 26.4%] | 2 |
| Improvements ✅ (primary) | -2.3% | [-3.1%, -1.7%] | 4 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -2.3% | [-3.1%, -1.7%] | 4 |

Binary size

Results (primary -0.1%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.1%, 0.1%] | 1 |
| Regressions ❌ (secondary) | 0.1% | [0.1%, 0.1%] | 2 |
| Improvements ✅ (primary) | -0.2% | [-0.2%, -0.1%] | 4 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.1% | [-0.2%, 0.1%] | 5 |

Bootstrap: 491.114s -> 490.988s (-0.03%)
Artifact size: 394.23 MiB -> 394.33 MiB (0.03%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 13, 2026
@JonathanBrouwer
Contributor

@rustbot reroll
Not familiar enough with this code to comfortably review this, but from a quick glance this looks great, thanks <3

@rustbot rustbot assigned jackh726 and unassigned JonathanBrouwer Apr 15, 2026
Comment thread compiler/rustc_mir_build/src/builder/matches/match_pair.rs Outdated
Comment thread compiler/rustc_mir_build/src/builder/matches/match_pair.rs
Comment on lines +32 to +36
```rust
if let PatKind::Constant { value } = pat.kind {
    Some(ty::Const::new_value(tcx, value.valtree, value.ty))
} else {
    None
}
```
Contributor

@dianne dianne Apr 17, 2026

It might also be worth reconstructing aggregate constants for arrays/slices of arrays of constants, etc.? I'm not a specialization expert, but it looks like arrays of bytewise-comparable things are also bytewise-comparable, at least for common array lengths [1]. Since array and slice equality are specialized based on their element types' bytewise-comparability, we should be able to get better codegen for nested array patterns too (as long as the inner arrays are of one of those common lengths), I think?

Footnotes

  1. https://github.com/rust-lang/rust/blob/f29256dd1420dc681bf4956e3012ffe9eccdc7e7/library/core/src/cmp/bytewise.rs#L74-L85
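
To make the nested case concrete, here is a small sketch. `GRID`, `is_grid_pattern` and `is_grid_aggregate` are hypothetical names chosen for illustration, and whether the comparison is actually lowered to a bytewise compare depends on the library specializations linked in the footnote.

```rust
// A constant array of arrays: every leaf element is a constant byte.
const GRID: [[u8; 4]; 2] = [*b"abcd", *b"wxyz"];

/// Nested constant array used as a pattern.
pub fn is_grid_pattern(x: &[[u8; 4]; 2]) -> bool {
    matches!(x, &GRID)
}

/// The same check written as one aggregate comparison over the outer
/// array, which is what a reconstructed nested constant would allow.
pub fn is_grid_aggregate(x: &[[u8; 4]; 2]) -> bool {
    *x == GRID
}
```

Both functions return the same result for any input; the suggestion above is about letting the first be lowered like the second.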

Contributor Author

@jakubadamw jakubadamw Apr 26, 2026

@dianne, interesting. I’ll look into this next! 🙂

Comment on lines +322 to +326
```rust
// When there is no `..`, all elements are constants, and
// there are at least two of them, collapse the individual
// element subpairs into a single aggregate comparison that
// is performed after the length check.
if slice.is_none()
```
Contributor

@dianne dianne Apr 17, 2026

An additional possibility: even if there is a `..`, the comparisons for the sub-slices before and after the `..` could be done via aggregate equality when applicable. Credit to #121540, which I think did this?

Edit: assuming prefixes and suffixes are typically small and hand-written, it's probably not worth the trouble to use aggregate equality for them.

Even if only handling the case with no `..`, it might be worth moving the special-casing into `prefix_slice_suffix` to share it between `PatKind::Slice` and `PatKind::Array`, since that's where the commonalities live.

Edit: after `prefix_slice_suffix`'s cleanup in #154943, I don't think it makes much sense to put this in there. I still think the logic for deciding whether to use aggregate equality is complex enough that it could be worth factoring out, but that's probably not the way to do it.
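
For reference, the prefix/suffix idea could look roughly like this at the source level. This is a hand-written sketch under the assumption of a two-byte prefix and suffix; `has_ends_pattern`, `has_ends_aggregate`, `PREFIX` and `SUFFIX` are hypothetical names, not anything in the PR.

```rust
/// Slice pattern with a `..` between a constant prefix and suffix.
pub fn has_ends_pattern(s: &[u8]) -> bool {
    matches!(s, [b'<', b'!', .., b'-', b'>'])
}

/// Hand-written equivalent: a length check followed by one aggregate
/// comparison for the prefix and one for the suffix.
pub fn has_ends_aggregate(s: &[u8]) -> bool {
    const PREFIX: [u8; 2] = *b"<!";
    const SUFFIX: [u8; 2] = *b"->";
    s.len() >= PREFIX.len() + SUFFIX.len()
        && s[..PREFIX.len()] == PREFIX
        && s[s.len() - SUFFIX.len()..] == SUFFIX
}
```

With only two elements on each side, the aggregate form saves very little, which is consistent with the first edit above.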

Member

@Nadrieril Nadrieril Apr 22, 2026

The general approach feels unfortunate: we transformed constants into patterns in `const_to_pat`, and now we try to reverse that transformation. Have we tried keeping the original constant around in the output of `const_to_pat` and using it at runtime? Though this has the same issue of const-dependent MIR lowering that @dianne pointed out.

Contributor

If we do that, maybe we could also synthesize a constant during THIR building for hand-written array/slice patterns, like this PR currently does in MIR building? That way, we wouldn't end up with worse codegen for hand-written array patterns than for const array items used as patterns.

Member

I dunno, small handwritten arrays behave pretty much like tuples, it could be better codegen sometimes not to make them into constants:

```rust
match foo {
  // compiles to three nested `if`s today, would become 8 sequential `if`s if turned into constants
  [false, false, false] => ...,
  [false, false, true] => ...,
  [false, true, false] => ...,
  ...
}
```

It's admittedly a stretch but I'm tempted to err on the side of respecting user intent especially before the MIR boundary.
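
Spelled out as a sketch, the two shapes being contrasted are roughly the following. `index_by_match`, `index_by_sequential_eq` and `CANDIDATES` are hypothetical names, and the second function only models the "one whole-array test per arm" behaviour at the source level, not the actual MIR.

```rust
/// Hand-written match over every `[bool; 3]` value. Per the comment
/// above, this is tested element by element (a small decision tree).
pub fn index_by_match(bits: [bool; 3]) -> u8 {
    match bits {
        [false, false, false] => 0,
        [false, false, true] => 1,
        [false, true, false] => 2,
        [false, true, true] => 3,
        [true, false, false] => 4,
        [true, false, true] => 5,
        [true, true, false] => 6,
        [true, true, true] => 7,
    }
}

// Each arm's pattern treated as one aggregate constant.
const CANDIDATES: [[bool; 3]; 8] = [
    [false, false, false],
    [false, false, true],
    [false, true, false],
    [false, true, true],
    [true, false, false],
    [true, false, true],
    [true, true, false],
    [true, true, true],
];

/// Sequential whole-array comparisons: up to eight tests in the worst
/// case, instead of at most three element tests.
pub fn index_by_sequential_eq(bits: [bool; 3]) -> u8 {
    for (i, candidate) in CANDIDATES.iter().enumerate() {
        if bits == *candidate {
            return i as u8;
        }
    }
    unreachable!("CANDIDATES lists every [bool; 3] value")
}
```

The results are identical; the concern is only about how many tests are executed to reach an arm.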

Contributor

Oh, right. Maybe we should have a test for that in mir-opt/building/match/sort_candidates.rs or such if we start optimizing array/slice comparisons? I think as-is this PR may also turn that into 8 sequential tests, each a TestKind::AggregateEq for a different constant.

Contributor Author

@jakubadamw jakubadamw Apr 26, 2026

@Nadrieril, as an alternative to being guided solely by user “intent”, would it be sensible to raise the threshold on the number of elements in an array pattern at which we use aggregate equality? Right now it’s >= 2. Perhaps 4 would work better? I suppose a quantitative comparison with benchmarks could be of use here, but sadly I can’t commit to that with my present schedule.
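
To make the proposal concrete, the gating condition could be modelled as below. This is purely a hypothetical sketch: `use_aggregate_eq` and `MIN_AGGREGATE_EQ_LEN` are invented names, the value 4 is just the number floated above, and the real check in the PR operates on compiler pattern data rather than plain booleans.

```rust
/// Hypothetical minimum number of constant elements before aggregate
/// equality is used (the value suggested in the comment above).
const MIN_AGGREGATE_EQ_LEN: usize = 4;

/// Decide whether an array/slice pattern should be compared as a whole.
/// `has_rest` is true when the pattern contains a `..` subpattern.
pub fn use_aggregate_eq(len: usize, has_rest: bool, all_constant: bool) -> bool {
    !has_rest && all_constant && len >= MIN_AGGREGATE_EQ_LEN
}
```

Under this cut-off, the three-element `[bool; 3]` patterns discussed earlier would keep their element-wise lowering, while the 64-byte example would still use the aggregate comparison.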

@jackh726
Member

@dianne do you want to take over review here?

Contributor

@dianne dianne left a comment

I don't have much context on the ctfe or const traits side of things to evaluate the approach. cc @oli-obk maybe? It feels like we have to make some sort of compromise here to avoid calling anything const-unstable. Possibly we could block on const_cmp stabilizing, or possibly we could have some workaround until then if we want to land this first.

I think I can handle the technical review, at least. Tentatively, r? me (though feel free to steal the assignment if that'd be easier ^^)

Comment thread compiler/rustc_mir_build/src/builder/matches/test.rs Outdated

@jakubadamw
Contributor Author

jakubadamw commented Apr 26, 2026

@dianne, @Nadrieril, thank you very much for your review and feedback, I really appreciate it. I’ve rebased and addressed the comments. The refactor in which we would preserve a trace of the original constant in `const_to_pat()`, and thus better honour the user’s intent when deciding what MIR to produce, is something I’ll try to do next week, time permitting, if it’s really what we’d like to see here.

…s when possible

When every element in an array or slice pattern is a constant and there
is no `..` subpattern, the match builder now emits a single call to
`PartialEq::eq` instead of comparing each element one by one.

This drastically reduces the number of MIR basic blocks for large
constant-array matches – e.g. a 64-element `[u8; 64]` match previously
generated 64 separate comparison blocks and now generates just one
`PartialEq::eq` call that LLVM can lower to a `memcmp()`.

The optimisation is gated on having at least two constant elements.
Single-element arrays still use a plain scalar comparison.

Example:

```rust
const FOO: [u8; 64] = *b"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef";

pub fn foo(x: &[u8; 64]) -> bool {
    // Before: 64 basic blocks, one per byte.
    // After:  a single `PartialEq::eq()` call.
    matches!(x, &FOO)
}
```
…t contexts

This is unless the `const_cmp` feature is enabled, in which case `PartialEq`
becomes available in said contexts.
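
A small sketch of why const contexts matter here, as far as I understand it: matching on a constant array pattern is accepted in a `const fn` on stable Rust, whereas calling `PartialEq::eq` there relies on the unstable const-trait machinery mentioned above. `MAGIC`, `is_magic`, `IS_ELF` and `IS_NOT_ELF` are hypothetical names used only for this example.

```rust
const MAGIC: [u8; 4] = *b"\x7fELF";

/// Pattern matching on a constant array inside a `const fn`.
/// This must keep working even if the lowering elsewhere switches
/// to an aggregate `PartialEq::eq` call.
pub const fn is_magic(header: &[u8; 4]) -> bool {
    matches!(header, &MAGIC)
}

// Both calls are evaluated at compile time.
pub const IS_ELF: bool = is_magic(b"\x7fELF");
pub const IS_NOT_ELF: bool = is_magic(b"\x7fELG");
```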
@jakubadamw jakubadamw force-pushed the issue-110870-103073 branch from 6de0afc to a287bc9 Compare April 26, 2026 21:35
@rustbot
Collaborator

rustbot commented Apr 26, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Development

Successfully merging this pull request may close these issues.

  • Large amount of generated code for match statements with large arrays
  • Weird Match Statement Codegen With Byte Strings

8 participants