From f98ec6e94492155c40809ac9357454de98edf798 Mon Sep 17 00:00:00 2001
From: Tshepang Mbambo
Date: Thu, 23 Apr 2026 19:07:34 +0200
Subject: [PATCH 1/9] sembr src/thir.md

---
 src/thir.md | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/thir.md b/src/thir.md
index 07eba3eec..802df2286 100644
--- a/src/thir.md
+++ b/src/thir.md
@@ -15,19 +15,23 @@ the types have been filled in, which is possible after type checking has complet
 But it has some other interesting features that distinguish it from the HIR:

 - Like the MIR, the THIR only represents bodies, i.e. "executable code"; this includes
-  function bodies, but also `const` initializers, for example. Specifically, all [body owners] have
-  THIR created. Consequently, the THIR has no representation for items like `struct`s or `trait`s.
+  function bodies, but also `const` initializers, for example.
+  Specifically, all [body owners] have THIR created.
+  Consequently, the THIR has no representation for items like `struct`s or `trait`s.

 - Each body of THIR is only stored temporarily and is dropped as soon as it's no longer
   needed, as opposed to being stored until the end of the compilation process (which
   is what is done with the HIR).

 - Besides making the types of all nodes available, the THIR also has additional
-  desugaring compared to the HIR. For example, automatic references and dereferences
+  desugaring compared to the HIR.
+  For example, automatic references and dereferences
   are made explicit, and method calls and overloaded operators are converted into
-  plain function calls. Destruction scopes are also made explicit.
+  plain function calls.
+  Destruction scopes are also made explicit.

-- Statements, expressions, match arms, blocks, and parameters are stored separately. For example,
+- Statements, expressions, match arms, blocks, and parameters are stored separately.
+  For example,
   statements in the `stmts` array reference expressions by their index (represented as a
   [`ExprId`]) in the `exprs` array.

@@ -35,10 +39,13 @@ But it has some other interesting features that distinguish it from the HIR:
 [`ExprId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/thir/struct.ExprId.html
 [body owners]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.BodyOwnerKind.html

-The THIR lives in [`rustc_mir_build::thir`][thir-docs]. To construct a [`thir::Expr`],
+The THIR lives in [`rustc_mir_build::thir`][thir-docs].
+To construct a [`thir::Expr`],
 you can use the [`thir_body`] function, passing in the memory arena where the THIR
-will be allocated. Dropping this arena will result in the THIR being destroyed,
-which is useful to keep peak memory in check. Having a THIR representation of
+will be allocated.
+Dropping this arena will result in the THIR being destroyed,
+which is useful to keep peak memory in check.
+Having a THIR representation of
 all bodies of a crate in memory at the same time would be very heavy.

 You can get a debug representation of the THIR by passing the `-Zunpretty=thir-tree` flag

From 5aa5ac5e1dd8f9a2c07664cf23bd078be7e86dcf Mon Sep 17 00:00:00 2001
From: Tshepang Mbambo
Date: Thu, 23 Apr 2026 19:17:38 +0200
Subject: [PATCH 2/9] sembr src/name-resolution.md

---
 src/name-resolution.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/name-resolution.md b/src/name-resolution.md
index 79d0897b4..2723207a9 100644
--- a/src/name-resolution.md
+++ b/src/name-resolution.md
@@ -62,8 +62,7 @@ files and expanding `macros`.

 This phase produces links from all the names in the source to relevant places
 where the name was introduced. It also generates helpful error messages,
-like typo suggestions, traits to import or lints about
-unused items.
+like typo suggestions, traits to import or lints about unused items.

 A successful run of the second phase ([`Resolver::resolve_crate`]) creates kind
 of an index the rest of the compilation may use to ask about the present names

From 8065d270d96003bfa17dcdbb75f9fed7dfc03f0b Mon Sep 17 00:00:00 2001
From: Tshepang Mbambo
Date: Thu, 23 Apr 2026 19:18:17 +0200
Subject: [PATCH 3/9] missing pause

---
 src/name-resolution.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/name-resolution.md b/src/name-resolution.md
index 2723207a9..83d299642 100644
--- a/src/name-resolution.md
+++ b/src/name-resolution.md
@@ -62,7 +62,7 @@ files and expanding `macros`.

 This phase produces links from all the names in the source to relevant places
 where the name was introduced. It also generates helpful error messages,
-like typo suggestions, traits to import or lints about unused items.
+like typo suggestions, traits to import, or lints about unused items.

 A successful run of the second phase ([`Resolver::resolve_crate`]) creates kind
 of an index the rest of the compilation may use to ask about the present names

From 7243e02ffd1d5c70b27339ad0355937f59d1a2a1 Mon Sep 17 00:00:00 2001
From: Tshepang Mbambo
Date: Thu, 23 Apr 2026 19:19:16 +0200
Subject: [PATCH 4/9] sembr src/building/suggested.md

---
 src/building/suggested.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/building/suggested.md b/src/building/suggested.md
index d9921cbb1..c8b25edc1 100644
--- a/src/building/suggested.md
+++ b/src/building/suggested.md
@@ -154,9 +154,10 @@ For Neovim users, there are a few options:
 #### neoconf.nvim

 [neoconf.nvim][neoconf.nvim] allows for project-local configuration
-files with the native LSP. The steps for how to use it are below. Note that they require
-rust-analyzer to already be configured with Neovim. Steps for this can be
-[found here][r-a nvim lsp].
+files with the native LSP.
+The steps for how to use it are below.
+Note that they require rust-analyzer to already be configured with Neovim.
+Steps for this can be [found here][r-a nvim lsp].

 1. First install the plugin. This can be done by following the steps in the
    README.

From 8bc9afe3f48dabd6c324493fb855fe077c208c4a Mon Sep 17 00:00:00 2001
From: Tshepang Mbambo
Date: Thu, 23 Apr 2026 19:20:02 +0200
Subject: [PATCH 5/9] sembr src/building/prerequisites.md

---
 src/building/prerequisites.md | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/src/building/prerequisites.md b/src/building/prerequisites.md
index d984cc4e3..74e93d78c 100644
--- a/src/building/prerequisites.md
+++ b/src/building/prerequisites.md
@@ -6,35 +6,44 @@ See [the `rust-lang/rust` INSTALL](https://github.com/rust-lang/rust/blob/HEAD/I

 ## Hardware

-You will need an internet connection to build. The bootstrapping process
-involves updating git submodules and downloading a beta compiler. It doesn't
-need to be super fast, but that can help.
+You will need an internet connection to build.
+The bootstrapping process
+involves updating git submodules and downloading a beta compiler.
+It doesn't need to be super fast, but that can help.

 There are no strict hardware requirements, but building the compiler is
 computationally expensive, so a beefier machine will help, and I wouldn't
-recommend trying to build on a Raspberry Pi! We recommend the following.
-* 30GB+ of free disk space. Otherwise, you will have to keep
-  clearing incremental caches. More space is better, the compiler is a bit of a
+recommend trying to build on a Raspberry Pi!
+We recommend the following.
+* 30GB+ of free disk space.
+  Otherwise, you will have to keep clearing incremental caches.
+  More space is better, the compiler is a bit of a
   hog; it's a problem we are aware of.
 * 8GB+ RAM
-* 2+ cores. Having more cores really helps. 10 or 20 or more is not too many!
+* 2+ cores.
+  Having more cores really helps.
+  10 or 20 or more is not too many!

-Beefier machines will lead to much faster builds. If your machine is not very
+Beefier machines will lead to much faster builds.
+If your machine is not very
 powerful, a common strategy is to only use `./x check` on your local machine
 and let the CI build test your changes when you push to a PR branch.

 Building the compiler takes more than half an hour on my moderately powerful
-laptop. We suggest downloading LLVM from CI so you don't have to build it from source
+laptop.
+We suggest downloading LLVM from CI so you don't have to build it from source
 ([see here][config]).

-Like `cargo`, the build system will use as many cores as possible. Sometimes
-this can cause you to run low on memory. You can use `-j` to adjust the number
-of concurrent jobs. If a full build takes more than ~45 minutes to an hour, you
+Like `cargo`, the build system will use as many cores as possible.
+Sometimes this can cause you to run low on memory.
+You can use `-j` to adjust the number of concurrent jobs.
+If a full build takes more than ~45 minutes to an hour, you
 are probably spending most of the time swapping memory in and out; try using
 `-j1`.

 If you don't have too much free disk space, you may want to turn off
-incremental compilation ([see here][config]). This will make compilation take
+incremental compilation ([see here][config]).
+This will make compilation take
 longer (especially after a rebase), but will save a ton of space from the
 incremental caches.

From 9ff80d70ad4ed4f38969663fe1e53253aa5831a6 Mon Sep 17 00:00:00 2001
From: Tshepang Mbambo
Date: Thu, 23 Apr 2026 19:29:21 +0200
Subject: [PATCH 6/9] reflow

---
 src/building/prerequisites.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/building/prerequisites.md b/src/building/prerequisites.md
index 74e93d78c..0613a931b 100644
--- a/src/building/prerequisites.md
+++ b/src/building/prerequisites.md
@@ -25,8 +25,8 @@ We recommend the following.
   10 or 20 or more is not too many!

 Beefier machines will lead to much faster builds.
-If your machine is not very
-powerful, a common strategy is to only use `./x check` on your local machine
+If your machine is not very powerful,
+a common strategy is to only use `./x check` on your local machine
 and let the CI build test your changes when you push to a PR branch.

 Building the compiler takes more than half an hour on my moderately powerful

From 0fd4f0994072c154a65a7aff2bca52aa67d3a906 Mon Sep 17 00:00:00 2001
From: Tshepang Mbambo
Date: Thu, 23 Apr 2026 19:30:57 +0200
Subject: [PATCH 7/9] whose laptop

---
 src/building/prerequisites.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/building/prerequisites.md b/src/building/prerequisites.md
index 0613a931b..4833d3441 100644
--- a/src/building/prerequisites.md
+++ b/src/building/prerequisites.md
@@ -29,7 +29,7 @@ If your machine is not very powerful,
 a common strategy is to only use `./x check` on your local machine
 and let the CI build test your changes when you push to a PR branch.

-Building the compiler takes more than half an hour on my moderately powerful
+Building the compiler takes more than half an hour on a moderately powerful
 laptop.
 We suggest downloading LLVM from CI so you don't have to build it from source
 ([see here][config]).

From 57e49f17fa6b427ab83f9b6d4f9bf7be05d222c6 Mon Sep 17 00:00:00 2001
From: Tshepang Mbambo
Date: Thu, 23 Apr 2026 20:19:00 +0200
Subject: [PATCH 8/9] sembr src/mir/construction.md

---
 src/mir/construction.md | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/src/mir/construction.md b/src/mir/construction.md
index 8360d9ff1..c85de0240 100644
--- a/src/mir/construction.md
+++ b/src/mir/construction.md
@@ -11,13 +11,15 @@ list of items:
 * Drop code (the `Drop::drop` function is not called directly)
 * Drop implementations of types without an explicit `Drop` implementation

-The lowering is triggered by calling the [`mir_built`] query. The MIR builder does
+The lowering is triggered by calling the [`mir_built`] query.
+The MIR builder does
 not actually use the HIR but operates on the [THIR] instead, processing THIR
 expressions recursively.

 The lowering creates local variables for every argument as specified in the signature.
 Next, it creates local variables for every binding specified (e.g. `(a, b): (i32, String)`)
-produces 3 bindings, one for the argument, and two for the bindings. Next, it generates
+produces 3 bindings, one for the argument, and two for the bindings.
+Next, it generates
 field accesses that read the fields from the argument and writes the value to the binding
 variable.

@@ -52,7 +54,8 @@ fn generate_more_mir(&mut self, block: BasicBlock) -> BlockAnd {
 ```

 When you invoke these functions, it is common to have a local variable `block`
-that is effectively a "cursor". It represents the point at which we are adding new MIR.
+that is effectively a "cursor".
+It represents the point at which we are adding new MIR.
 When you invoke `generate_more_mir`, you want to update this cursor.
 You can do this manually, but it's tedious:

@@ -89,10 +92,13 @@ representations:

 We start out with lowering the function body to an `Rvalue` so we can create an
 assignment to `RETURN_PLACE`, This `Rvalue` lowering will in turn trigger lowering to
-`Operand` for its arguments (if any). `Operand` lowering either produces a `const`
-operand, or moves/copies out of a `Place`, thus triggering a `Place` lowering. An
+`Operand` for its arguments (if any).
+`Operand` lowering either produces a `const`
+operand, or moves/copies out of a `Place`, thus triggering a `Place` lowering.
+An
 expression being lowered to a `Place` can in turn trigger a temporary to be created
-if the expression being lowered contains operations. This is where the snake bites its
+if the expression being lowered contains operations.
+This is where the snake bites its
 own tail and we need to trigger an `Rvalue` lowering for the expression to be written
 into the local.

@@ -100,7 +106,8 @@ into the local.

 Operators on builtin types are not lowered to function calls (which would end up being
 infinite recursion calls, because the trait impls just contain the operation itself
-again). Instead there are `Rvalue`s for binary and unary operators and index operations.
+again).
+Instead there are `Rvalue`s for binary and unary operators and index operations.
 These `Rvalue`s later get codegened to llvm primitive operations or llvm intrinsics.

 Operators on all other types get lowered to a function call to their `impl` of the
@@ -118,7 +125,8 @@ In [MIR] there is no difference between method calls and function calls anymore.
 ## Conditions

 `if` conditions and `match` statements for `enum`s with variants that have no fields are
-lowered to `TerminatorKind::SwitchInt`. Each possible value (so `0` and `1` for `if`
+lowered to `TerminatorKind::SwitchInt`.
+Each possible value (so `0` and `1` for `if`
 conditions) has a corresponding `BasicBlock` to which the code continues.
 The argument being branched on is (again) an `Operand` representing the value of
 the if condition.
@@ -127,14 +135,14 @@ the if condition.

 `match` statements for `enum`s with variants that have fields are lowered to
 `TerminatorKind::SwitchInt`, too, but the `Operand` refers to a `Place` where the
-discriminant of the value can be found. This often involves reading the discriminant
-to a new temporary variable.
+discriminant of the value can be found.
+This often involves reading the discriminant to a new temporary variable.

 ## Aggregate construction

 Aggregate values of any kind (e.g. structs or tuples) are built via `Rvalue::Aggregate`.
-All fields are
-lowered to `Operator`s. This is essentially equivalent to one assignment
+All fields are lowered to `Operator`s.
+This is essentially equivalent to one assignment
 statement per aggregate field plus an assignment to the discriminant in the case of
 `enum`s.

From ea6d6462c10d12686c1e84c5125edab72c05f699 Mon Sep 17 00:00:00 2001
From: Tshepang Mbambo
Date: Thu, 23 Apr 2026 20:22:46 +0200
Subject: [PATCH 9/9] reflow

---
 src/mir/construction.md | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/src/mir/construction.md b/src/mir/construction.md
index c85de0240..b9b5f0a34 100644
--- a/src/mir/construction.md
+++ b/src/mir/construction.md
@@ -12,16 +12,16 @@ list of items:
 * Drop implementations of types without an explicit `Drop` implementation

 The lowering is triggered by calling the [`mir_built`] query.
-The MIR builder does
-not actually use the HIR but operates on the [THIR] instead, processing THIR
-expressions recursively.
+The MIR builder does not actually use the HIR,
+but operates on the [THIR] instead,
+processing THIR expressions recursively.

 The lowering creates local variables for every argument as specified in the signature.
 Next, it creates local variables for every binding specified (e.g. `(a, b): (i32, String)`)
 produces 3 bindings, one for the argument, and two for the bindings.
-Next, it generates
-field accesses that read the fields from the argument and writes the value to the binding
-variable.
+Next,
+it generates field accesses that read the fields from the argument,
+and writes the value to the binding variable.

 With this initialization out of the way, the lowering triggers a recursive call
 to a function that generates the MIR for the body (a `Block` expression) and
@@ -93,10 +93,9 @@ representations:
 We start out with lowering the function body to an `Rvalue` so we can create an
 assignment to `RETURN_PLACE`, This `Rvalue` lowering will in turn trigger lowering to
 `Operand` for its arguments (if any).
-`Operand` lowering either produces a `const`
-operand, or moves/copies out of a `Place`, thus triggering a `Place` lowering.
-An
-expression being lowered to a `Place` can in turn trigger a temporary to be created
+`Operand` lowering either produces a `const` operand,
+or moves/copies out of a `Place`, thus triggering a `Place` lowering.
+An expression being lowered to a `Place` can in turn trigger a temporary to be created
 if the expression being lowered contains operations.
 This is where the snake bites its
 own tail and we need to trigger an `Rvalue` lowering for the expression to be written