From 851817340b281e649d160e96e7250cbb90cd8543 Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Thu, 24 Oct 2024 14:18:24 -0700 Subject: [PATCH 01/10] Initial writeup --- text/0000-layout-packed-aligned.md | 176 +++++++++++++++++++++++++++++ 1 file changed, 176 insertions(+) create mode 100644 text/0000-layout-packed-aligned.md diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md new file mode 100644 index 00000000000..7274da7fa32 --- /dev/null +++ b/text/0000-layout-packed-aligned.md @@ -0,0 +1,176 @@ +- Feature Name: `c_layout_packed_aligned` +- Start Date: 2024-10-24 s +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/100743) + +# Summary +[summary]: #summary + +This RFC makes it legal to have `#[repr(C)]` structs that are: +- Both packed and aligned. +- Packed, and transitively contains`#[repr(align)]` types. + +It also introduces `#[repr(system)]` which is designed for interoperability with operating system APIs. +It has the same behavior as `#[repr(C)]` except on `*-pc-windows-gnu` targets where it uses the msvc layout +rules instead. + +# Motivation +[motivation]: #motivation + +This RFC enables the following struct definitions: + +```rs +#[repr(C, packed(2), align(4))] +struct Foo { // Alignment = 4, Size = 8 + a: u8, // Offset = 0 + b: u32, // Offset = 2 +} +``` + +This is commonly needed when Rust is being used to interop with existing C and C++ code bases, which may contain +unaligned types. For example in `clang` it is possible to create the following type definition, and there is +currently no easy way to create a matching Rust type: + +```cpp +struct __attribute__((packed, aligned(4))) MyStruct { + uint8_t a; + uint32_t b; +}; +``` + +Currently `#[repr(packed(_))]` structs cannot transitively contain #[repr(align(_))] structs due to differing behavior between msvc and gcc/clang. 
+However, in most cases, the user would expect `#[repr(C)]` to produce a struct layout matching the same type as defined by the current target. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +## `#[repr(C)]` +When `align` and `packed` attributes exist on the same type, or when `packed` structs transitively contains `align` types, +the resulting layout matches the current compilation target. + +For example, given: +```c +#[repr(C, align(4))] +struct Foo(u8); +#[repr(C, packed(1))] +struct Bar(Foo); +``` +`align_of::<Bar>()` would be 4 for `*-pc-windows-msvc` and 1 for everything else. + + +## `#[repr(system)]` +When `align` and `packed` attributes exist on the same type, or when `packed` structs transitively contains `align` types, +the resulting layout matches the current compilation target. + +For example, given: +```c +#[repr(C, align(4))] +struct Foo(u8); +#[repr(C, packed(1))] +struct Bar(Foo); +``` +`align_of::<Bar>()` would be 4 for `*-pc-windows-msvc` and `*-pc-windows-gnu`. It would be 1 for everything else. + + + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +In the following paragraphs, "Decreasing M to N" means: +``` +if M > N { + M = N +} +``` + +"Increasing M to N" means: +``` +if M < N { + M = N +} +``` + + +`#[repr(align(N))]` increases the base alignment of a type to be N. + +`#[repr(packed(M))]` decreases the alignment of the struct fields to be M. Because the base alignment of the type +is defined as the maximum of the alignment for any fields, this also has the indirect result of decreasing the base +alignment of the type to be M. + +When the align and packed modifiers are applied on the same type as `#[repr(align(N), packed(M))]`, +the alignment of the struct fields are decreased to be M. Then, the base alignment of the type is +increased to be N. 
+ +When a `#[repr(packed(M))]` struct transitively contains a field with `#[repr(align(N))]` type, +- The field is first `pad_to_align`. Then, the field is added to the struct with alignment decreased to M. The packing requirement overrides the alignment requirement. (GCC, `#[repr(Rust)]`, `#[repr(C)]` on gnu targets, `#[repr(system)]` on non-windows targets) +- The field is added to the struct with alignment increased to N. The alignment requirement overrides the packing requirement. (MSVC, `#[repr(C)]` on msvc targets, `#[repr(system)]` on windows targets) + +# Drawbacks +[drawbacks]: #drawbacks + +Historically the meaning of `#[repr(C)]` has been somewhat ambiguous. When someone puts `#[repr(C)]` on their struct, their intention could be one of three things: +1. Having a target-independent and stable representation of the data structure for storage or transmission. +2. FFI with C and C++ libraries compiled for the same target. +3. Interoperability with operating system APIs. + +Today, `#[repr(C)]` is being used for all 3 scenarios because the user cannot create a `#[repr(C)]` struct with ambiguous layout between targets. However, this also means +that there exists some C layouts that cannot be specified using `#[repr(C)]`. + +This RFC addresses use case 2 with `#[repr(C)]` and use case 3 with `#[repr(system)]`. For use case 1, people will have to seek alternative solutions such as `crABI` or +protobuf. However, it could be a footgun if people continue to use `#[repr(C)]` for use case 1. + + + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +This RFC clarifies that: +- `repr(C)` must interoperate with the C compiler for the target. +- `repr(system)` must interoperate with the operating system APIs for the target. +- Similiar to Clang, `repr(C)` does not guarantee consistent layout between targets. 
+ +Alternatively, we can also create syntax that allows the user to specify exactly which semantic to use when packed structs transitively contains aligned fields. +For example, a new attribute: #[repr(align_override_packed(N))] that can be used when the behavior of the child overriding the parent alignment is desired. + +#[repr(align(N))] #[repr(packed)] can be used together to get the opposite behavior, parent/outer alignment wins. + +Explicitly specifying the pack/align semantic has the drawback of complicating FFI. For example, you might need two different definition files depending on the target. + +Therefore, a stable layout across compilation target should be relegated as future work. + + + + +# Prior art +[prior-art]: #prior-art + +Clang matches the Windows ABI for `x86_64-pc-windows-msvc` and matches the GCC ABI for `x86_64-pc-windows-gnu`. + +MinGW always uses the GCC ABI. + +We already have both `C` and `system` [calling conventions](https://doc.rust-lang.org/beta/nomicon/ffi.html#foreign-calling-conventions) +to support differing behavior on `x86_windows` and `x86_64_windows`. 
+ + +This issue was introduced in the [original implementation](https://github.com/rust-lang/rust/issues/33158) of `#[repr(packed(N))]` and have since underwent extensive community discussions: +- [#[repr(align(N))] fields not allowed in #[repr(packed(M>=N))] structs](https://github.com/rust-lang/rust/issues/100743) +- [repr(C) does not always match the current target's C toolchain (when that target is windows-msvc)](https://github.com/rust-lang/unsafe-code-guidelines/issues/521) +- [repr(C) is unsound on MSVC targets](https://github.com/rust-lang/rust/issues/81996) +- [E0587 error on packed and aligned structures from C](https://github.com/rust-lang/rust/issues/59154) +- [E0587 error on packed and aligned structures from C (bindgen)](https://github.com/rust-lang/rust-bindgen/issues/1538) +- [Support for both packed and aligned (in repr(C)](https://github.com/rust-lang/rust/issues/118018) +- [bindgen wanted features & bugfixes (Rust-for-Linux)](https://github.com/Rust-for-Linux/linux/issues/353) +- [packed type cannot transitively contain a #[repr(align)] type](https://github.com/rust-lang/rust-bindgen/issues/2179) +- [structure layout using __aligned__ attribute is incorrect](https://github.com/rust-lang/rust-bindgen/issues/867) + + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +None for now. + + +# Future possibilities +[future-possibilities]: #future-possibilities + +People intending for a stable struct layout consistent across targets would be directed to use `crABI`. 
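The layout rules from the reference-level explanation in this patch can be sketched as a small model. This is only an illustration under stated assumptions, not the compiler's implementation: the names `decrease_to`, `increase_to`, `round_up`, `Layout`, and `layout` are hypothetical, fields are described by `(size, natural alignment)` pairs, declaration order is kept, and the target-dependent case of a packed struct containing an `align(N)` field is not modeled.

```rust
/// "Decreasing M to N": lower `m` to `n` if it is larger.
fn decrease_to(m: usize, n: usize) -> usize {
    if m > n { n } else { m }
}

/// "Increasing M to N": raise `m` to `n` if it is smaller.
fn increase_to(m: usize, n: usize) -> usize {
    if m < n { n } else { m }
}

/// Round `offset` up to the next multiple of `align` (`align` must be non-zero).
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) / align * align
}

/// Result of the model: per-field offsets plus the type's alignment and size.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq)]
struct Layout {
    offsets: Vec<usize>,
    align: usize,
    size: usize,
}

/// Lay out fields given as `(size, natural_alignment)` in declaration order,
/// with optional `packed(M)` and `align(N)` modifiers on the same type.
fn layout(fields: &[(usize, usize)], packed: Option<usize>, align: Option<usize>) -> Layout {
    let mut offsets = Vec::with_capacity(fields.len());
    let mut end = 0; // first byte past the last field placed so far
    let mut base_align = 1; // maximum of the (possibly decreased) field alignments

    for &(size, natural) in fields {
        // packed(M) decreases each field's alignment to M.
        let field_align = match packed {
            Some(m) => decrease_to(natural, m),
            None => natural,
        };
        let offset = round_up(end, field_align);
        offsets.push(offset);
        end = offset + size;
        base_align = increase_to(base_align, field_align);
    }

    // align(N) then increases the base alignment of the whole type to N.
    let type_align = match align {
        Some(n) => increase_to(base_align, n),
        None => base_align,
    };

    Layout { offsets, align: type_align, size: round_up(end, type_align) }
}
```

For the `Foo` example in the Motivation section (`u8` then `u32`, with `packed(2)` and `align(4)`), `layout(&[(1, 1), (4, 4)], Some(2), Some(4))` gives offsets 0 and 2, alignment 4, and size 8, which agrees with the comments on that struct.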
From 3d4419820a3a2bd1f88b8855f8f2ab9415b56bb9 Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Thu, 24 Oct 2024 14:27:33 -0700 Subject: [PATCH 02/10] Update PR numbers --- text/0000-layout-packed-aligned.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md index 7274da7fa32..693af102253 100644 --- a/text/0000-layout-packed-aligned.md +++ b/text/0000-layout-packed-aligned.md @@ -1,7 +1,7 @@ -- Feature Name: `c_layout_packed_aligned` -- Start Date: 2024-10-24 s -- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) -- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/100743) +- Feature Name: `layout_packed_aligned` +- Start Date: 2024-10-24 +- RFC PR: [rust-lang/rfcs#3718](https://github.com/rust-lang/rfcs/pull/3718) +- Rust Issue: [rust-lang/rust#100743](https://github.com/rust-lang/rust/issues/100743) # Summary [summary]: #summary From e77c6be23d218a0202b156acfbc9f02cc206e86a Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Thu, 24 Oct 2024 14:30:02 -0700 Subject: [PATCH 03/10] update --- text/0000-layout-packed-aligned.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md index 693af102253..9c88b726f62 100644 --- a/text/0000-layout-packed-aligned.md +++ b/text/0000-layout-packed-aligned.md @@ -46,7 +46,7 @@ However, in most cases, the user would expect `#[repr(C)]` to produce a struct l ## `#[repr(C)]` When `align` and `packed` attributes exist on the same type, or when `packed` structs transitively contains `align` types, -the resulting layout matches the current compilation target. +the resulting layout matches the target toolchain ABI. 
For example, given: ```c @@ -60,7 +60,7 @@ struct Bar(Foo); ## `#[repr(system)]` When `align` and `packed` attributes exist on the same type, or when `packed` structs transitively contains `align` types, -the resulting layout matches the current compilation target. +the resulting layout matches the target OS ABI. For example, given: ```c From af3cc3067948ff3de32a80fe49cffb17caf75fc6 Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Fri, 25 Oct 2024 09:00:03 -0700 Subject: [PATCH 04/10] Update --- text/0000-layout-packed-aligned.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md index 9c88b726f62..6c1e28d7fae 100644 --- a/text/0000-layout-packed-aligned.md +++ b/text/0000-layout-packed-aligned.md @@ -64,9 +64,9 @@ the resulting layout matches the target OS ABI. For example, given: ```c -#[repr(C, align(4))] +#[repr(system, align(4))] struct Foo(u8); -#[repr(C, packed(1))] +#[repr(system, packed(1))] struct Bar(Foo); ``` `align_of::()` would be 4 for `*-pc-windows-msvc` and `*-pc-windows-gnu`. It would be 1 for everything else. From 175d29fabbe696ab2a54a8c6ef9e23542da766fb Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Tue, 29 Oct 2024 09:05:11 -0700 Subject: [PATCH 05/10] Updates --- text/0000-layout-packed-aligned.md | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md index 6c1e28d7fae..8cb8be18c59 100644 --- a/text/0000-layout-packed-aligned.md +++ b/text/0000-layout-packed-aligned.md @@ -38,8 +38,9 @@ struct __attribute__((packed, aligned(4))) MyStruct { }; ``` -Currently `#[repr(packed(_))]` structs cannot transitively contain #[repr(align(_))] structs due to differing behavior between msvc and gcc/clang. -However, in most cases, the user would expect `#[repr(C)]` to produce a struct layout matching the same type as defined by the current target. 
+Currently, `#[repr(packed(_))]` structs cannot be `#[repr(align(_))]` or transitively contain `#[repr(align(_))]` types. Attempting to do so results in a [hard error](https://doc.rust-lang.org/nightly/error_codes/E0588.html). + +This behavior was added in the [original implementation](https://github.com/rust-lang/rust/issues/33158) of `#[repr(packed)]` due to concerns over differing behavior between msvc and gcc/clang. This makes it cumbersome or even impossible to produce C-compatible struct layouts in Rust when the corresponding C types were annotated with both `packed` and `aligned`. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation @@ -71,7 +72,11 @@ struct Bar(Foo); ``` `align_of::()` would be 4 for `*-pc-windows-msvc` and `*-pc-windows-gnu`. It would be 1 for everything else. - +## `#[repr(Rust)]` +When `align(N)` and `packed(M)` attributes exist on the same type, or when `packed` structs contain `aligned` fields, +the type will have a base alignment of `N`, while the struct fields will be laid out as if their alignment was +decreased to `M`. However, in general Rust is free to reorder +these fields for optimization purposes, and the only guarantee is that the fields will maintain a minimum alignment of `M`. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -101,8 +106,9 @@ When the align and packed modifiers are applied on the same type as `#[repr(alig the alignment of the struct fields are decreased to be M. Then, the base alignment of the type is increased to be N. -When a `#[repr(packed(M))]` struct transitively contains a field with `#[repr(align(N))]` type, -- The field is first `pad_to_align`. Then, the field is added to the struct with alignment decreased to M. The packing requirement overrides the alignment requirement. 
(GCC, `#[repr(Rust)]`, `#[repr(C)]` on gnu targets, `#[repr(system)]` on non-windows targets) +When a `#[repr(packed(M))]` struct transitively contains a field with `#[repr(align(N))]` type, depending on the +target triplet, either: +- The field is first `pad_to_align`. Then, the field is added to the struct with alignment decreased to M. The packing requirement overrides the alignment requirement. (GCC, `#[repr(Rust)]`, `#[repr(C)]` on gnu targets, `#[repr(system)]` on non-windows targets), or - The field is added to the struct with alignment increased to N. The alignment requirement overrides the packing requirement. (MSVC, `#[repr(C)]` on msvc targets, `#[repr(system)]` on windows targets) # Drawbacks From a53aab1607fefe1399622670e1e3034b4ee7e0bb Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Tue, 29 Oct 2024 09:08:33 -0700 Subject: [PATCH 06/10] Updates --- text/0000-layout-packed-aligned.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md index 8cb8be18c59..831a7315759 100644 --- a/text/0000-layout-packed-aligned.md +++ b/text/0000-layout-packed-aligned.md @@ -74,8 +74,8 @@ struct Bar(Foo); ## `#[repr(Rust)]` When `align(N)` and `packed(M)` attributes exist on the same type, or when `packed` structs contain `aligned` fields, -the type will have a base alignment of `N`, while the struct fields will be laid out as if their alignment was -decreased to `M`. However, in general Rust is free to reorder +the type will have their base alignment increased to `N`, while the struct fields will be laid out as if their +alignments were decreased to `M`. However, in general Rust is free to reorder these fields for optimization purposes, and the only guarantee is that the fields will maintain a minimum alignment of `M`. 
# Reference-level explanation From 5190399bddb30e734096d79be0696457cc0bda35 Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Mon, 4 Nov 2024 14:26:23 -0800 Subject: [PATCH 07/10] Update text/0000-layout-packed-aligned.md Co-authored-by: Christopher Durham --- text/0000-layout-packed-aligned.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md index 831a7315759..8ec7edf2a75 100644 --- a/text/0000-layout-packed-aligned.md +++ b/text/0000-layout-packed-aligned.md @@ -108,8 +108,8 @@ increased to be N. When a `#[repr(packed(M))]` struct transitively contains a field with `#[repr(align(N))]` type, depending on the target triplet, either: -- The field is first `pad_to_align`. Then, the field is added to the struct with alignment decreased to M. The packing requirement overrides the alignment requirement. (GCC, `#[repr(Rust)]`, `#[repr(C)]` on gnu targets, `#[repr(system)]` on non-windows targets), or -- The field is added to the struct with alignment increased to N. The alignment requirement overrides the packing requirement. (MSVC, `#[repr(C)]` on msvc targets, `#[repr(system)]` on windows targets) +- The field is added to the struct with alignment decreased to M. The packing requirement overrides the alignment requirement. (This is the case for GCC, `#[repr(Rust)]`, `#[repr(C)]` on gnu targets, and `#[repr(system)]` on non-windows targets.) +- The field is added to the struct with alignment decreased to M and then increased to N. The alignment requirement overrides the packing requirement. (This is the case for MSVC, `#[repr(C)]` on msvc targets, `#[repr(system)]` on windows targets.) 
# Drawbacks [drawbacks]: #drawbacks From a12370d073b82d77d9eabcc74704935bad3bae8e Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Tue, 12 Nov 2024 09:44:08 -0800 Subject: [PATCH 08/10] Update drawbacks --- text/0000-layout-packed-aligned.md | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md index 8ec7edf2a75..aa5f0524b25 100644 --- a/text/0000-layout-packed-aligned.md +++ b/text/0000-layout-packed-aligned.md @@ -114,7 +114,9 @@ target triplet, either: # Drawbacks [drawbacks]: #drawbacks -Historically the meaning of `#[repr(C)]` has been somewhat ambiguous. When someone puts `#[repr(C)]` on their struct, their intention could be one of three things: +Although [https://doc.rust-lang.org/reference/type-layout.html#the-c-representation](the Rust reference) documents the meaning +of repr(C) quite clearly (types are laid out linearly, according to a fixed algorithm.), when you see `#[repr(C)]` in code, +its meaning can be somewhat ambiguous. When someone puts `#[repr(C)]` on their struct, their intention could be one of three things: 1. Having a target-independent and stable representation of the data structure for storage or transmission. 2. FFI with C and C++ libraries compiled for the same target. 3. Interoperability with operating system APIs. @@ -125,6 +127,19 @@ that there exists some C layouts that cannot be specified using `#[repr(C)]`. This RFC addresses use case 2 with `#[repr(C)]` and use case 3 with `#[repr(system)]`. For use case 1, people will have to seek alternative solutions such as `crABI` or protobuf. However, it could be a footgun if people continue to use `#[repr(C)]` for use case 1. 
+It's worthy to note that while this RFC does require people to stop treating `repr(C)` as a linear layout but rather as an +ABI compatiblity layout, our intention is not proposing a breaking change: `packed` structs are previously banned from +transitively containing `aligned` fields, so in most cases existing `repr(C)` structs will be laid out in exactly the same +way as it did before. However, due to an oversight in the current implementation of the Rust compiler, the restriction +can actuall be +[circumvented](https://github.com/rust-lang/rust/issues/100743#issuecomment-1229343705) using generics. Applications +using this pattern to circumvent the restriction will see a change in the struct layout on MSVC targets. + +This RFC alone still doesn't make `repr(C)` fully match the target (MSVC) toolchain in all cases; the known other +divergences are enums with overflowing discriminant and how a field of type [T; 0] is handled. So while this does +improve parity, the reality is that there are still edge cases to keep track of for now. These cases shall be addressed +in future RFCs. 
+ # Rationale and alternatives From 2ed92ef75ad416edd5f950d3742ef7944542675f Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Thu, 21 Nov 2024 11:29:13 -0800 Subject: [PATCH 09/10] Update 0000-layout-packed-aligned.md --- text/0000-layout-packed-aligned.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md index aa5f0524b25..3cf6bb0074b 100644 --- a/text/0000-layout-packed-aligned.md +++ b/text/0000-layout-packed-aligned.md @@ -114,8 +114,8 @@ target triplet, either: # Drawbacks [drawbacks]: #drawbacks -Although [https://doc.rust-lang.org/reference/type-layout.html#the-c-representation](the Rust reference) documents the meaning -of repr(C) quite clearly (types are laid out linearly, according to a fixed algorithm.), when you see `#[repr(C)]` in code, +Although [The Rust reference](https://doc.rust-lang.org/reference/type-layout.html#the-c-representation) documents the meaning +of repr(C) quite clearly (types are laid out linearly, according to a fixed algorithm), when you see `#[repr(C)]` in code, its meaning can be somewhat ambiguous. When someone puts `#[repr(C)]` on their struct, their intention could be one of three things: 1. Having a target-independent and stable representation of the data structure for storage or transmission. 2. FFI with C and C++ libraries compiled for the same target. 
From ee3cd02eb29f50b6bda95380e0e815abf4c67c7d Mon Sep 17 00:00:00 2001 From: Zhixing Zhang Date: Thu, 8 May 2025 00:18:28 -0700 Subject: [PATCH 10/10] Update based on feedback --- text/0000-layout-packed-aligned.md | 78 ++++++++++++++---------------- 1 file changed, 37 insertions(+), 41 deletions(-) diff --git a/text/0000-layout-packed-aligned.md b/text/0000-layout-packed-aligned.md index 3cf6bb0074b..9d1780e99b6 100644 --- a/text/0000-layout-packed-aligned.md +++ b/text/0000-layout-packed-aligned.md @@ -6,13 +6,17 @@ # Summary [summary]: #summary -This RFC makes it legal to have `#[repr(C)]` structs that are: +This RFC deprecates the existing `#[repr(C)]` attribute and introduces two new variants of this attribute: + +- `#[repr(C(target))]`, for structs intended for interoperability with libraries compiled for the current target +- `#[repr(C(system))]`, for structs intended for interoperability with operating system APIs + +Compared to `#[repr(C)]`, these new attributes require the user to clarify their usage intent. This allows us to have nested structs that are: - Both packed and aligned. - Packed, and transitively contains`#[repr(align)]` types. +These usages were previously prohibited under [E0587](https://doc.rust-lang.org/nightly/error_codes/E0587.html) and [E0588](https://doc.rust-lang.org/nightly/error_codes/E0588.html). -It also introduces `#[repr(system)]` which is designed for interoperability with operating system APIs. -It has the same behavior as `#[repr(C)]` except on `*-pc-windows-gnu` targets where it uses the msvc layout -rules instead. +Existing `#[repr(C)]` usages will emit a warning and default to `#[repr(C(target))]`. # Motivation [motivation]: #motivation @@ -20,7 +24,7 @@ rules instead. 
This RFC enables the following struct definitions: ```rs -#[repr(C, packed(2), align(4))] +#[repr(C(target), packed(2), align(4))] struct Foo { // Alignment = 4, Size = 8 a: u8, // Offset = 0 b: u32, // Offset = 2 @@ -40,14 +44,25 @@ struct __attribute__((packed, aligned(4))) MyStruct { Currently, `#[repr(packed(_))]` structs cannot be `#[repr(align(_))]` or transitively contain `#[repr(align(_))]` types. Attempting to do so results in a [hard error](https://doc.rust-lang.org/nightly/error_codes/E0588.html). -This behavior was added in the [original implementation](https://github.com/rust-lang/rust/issues/33158) of `#[repr(packed)]` due to concerns over differing behavior between msvc and gcc/clang. This makes it cumbersome or even impossible to produce C-compatible struct layouts in Rust when the corresponding C types were annotated with both `packed` and `aligned`. +This behavior was added in the [original implementation](https://github.com/rust-lang/rust/issues/33158) of `#[repr(packed)]` due to concerns over differing behavior between MSVC and gcc/clang. This makes it cumbersome or even impossible to produce C-compatible struct layouts in Rust when the corresponding C types were annotated with both `packed` and `aligned`. + +Although [The Rust reference](https://doc.rust-lang.org/reference/type-layout.html#the-c-representation) documents the meaning +of `repr(C)` quite clearly (types are laid out linearly, according to a fixed algorithm.), when you see `#[repr(C)]` in code, +its meaning can be somewhat ambiguous. Their intention could be one of three things: +1. Having a target-independent and stable representation of the data structure for storage or transmission. +2. FFI with C and C++ libraries compiled for the same target. +3. Interoperability with operating system APIs. 
+ +Previously, `#[repr(C)]` was used for all 3 scenarios because [E0588](https://doc.rust-lang.org/nightly/error_codes/E0588.html) prohibited the user from creating a `#[repr(C)]` struct with ambiguous layout between targets. +This RFC seeks to differentiate between 2 and 3, leaving 1 for a Rust-defined linear layout to be addressed in a separate RFC. + # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -## `#[repr(C)]` -When `align` and `packed` attributes exist on the same type, or when `packed` structs transitively contains `align` types, -the resulting layout matches the target toolchain ABI. +## `#[repr(C(target))]` +Structs annotated with this attribute are guaranteed to have the same layout as a struct produced by the C compiler for the current target toolchain. +This is useful for interfacing with libraries compiled for the current target. For example, given: ```c @@ -56,12 +71,12 @@ struct Foo(u8); #[repr(C, packed(1))] struct Bar(Foo); ``` -`align_of::<Bar>()` would be 4 for `*-pc-windows-msvc` and 1 for everything else. +`align_of::<Bar>()` would be 4 for `*-pc-windows-msvc`, matching MSVC, and 1 for everything else. -## `#[repr(system)]` -When `align` and `packed` attributes exist on the same type, or when `packed` structs transitively contains `align` types, -the resulting layout matches the target OS ABI. +## `#[repr(C(system))]` +Structs annotated with this attribute are guaranteed to have the same layout as a struct defined by the target OS ABI. +This is useful for interfacing with operating system APIs. For example, given: ```c @@ -70,13 +85,7 @@ struct Foo(u8); #[repr(system, packed(1))] struct Bar(Foo); ``` -`align_of::<Bar>()` would be 4 for `*-pc-windows-msvc` and `*-pc-windows-gnu`. It would be 1 for everything else. 
- -## `#[repr(Rust)]` -When `align(N)` and `packed(M)` attributes exist on the same type, or when `packed` structs contain `aligned` fields, -the type will have their base alignment increased to `N`, while the struct fields will be laid out as if their -alignments were decreased to `M`. However, in general Rust is free to reorder -these fields for optimization purposes, and the only guarantee is that the fields will maintain a minimum alignment of `M`. +`align_of::<Bar>()` would be 4 for `*-pc-windows-msvc` and `*-pc-windows-gnu`, matching the Windows ABI. It would be 1 for everything else. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -108,34 +117,21 @@ increased to be N. When a `#[repr(packed(M))]` struct transitively contains a field with `#[repr(align(N))]` type, depending on the target triplet, either: -- The field is added to the struct with alignment decreased to M. The packing requirement overrides the alignment requirement. (This is the case for GCC, `#[repr(Rust)]`, `#[repr(C)]` on gnu targets, and `#[repr(system)]` on non-windows targets.) -- The field is added to the struct with alignment decreased to M and then increased to N. The alignment requirement overrides the packing requirement. (This is the case for MSVC, `#[repr(C)]` on msvc targets, `#[repr(system)]` on windows targets.) +- The field is added to the struct with alignment decreased to M. The packing requirement overrides the alignment requirement. (This is the case for GCC, `#[repr(C(target))]` on gnu targets, and `#[repr(C(system))]` on non-windows targets.) +- The field is added to the struct with alignment decreased to M and then increased to N. The alignment requirement overrides the packing requirement. (This is the case for MSVC, `#[repr(C(target))]` on msvc targets, and `#[repr(C(system))]` on windows targets.) 
# Drawbacks [drawbacks]: #drawbacks -Although [The Rust reference](https://doc.rust-lang.org/reference/type-layout.html#the-c-representation) documents the meaning -of repr(C) quite clearly (types are laid out linearly, according to a fixed algorithm), when you see `#[repr(C)]` in code, -its meaning can be somewhat ambiguous. When someone puts `#[repr(C)]` on their struct, their intention could be one of three things: -1. Having a target-independent and stable representation of the data structure for storage or transmission. -2. FFI with C and C++ libraries compiled for the same target. -3. Interoperability with operating system APIs. - -Today, `#[repr(C)]` is being used for all 3 scenarios because the user cannot create a `#[repr(C)]` struct with ambiguous layout between targets. However, this also means -that there exists some C layouts that cannot be specified using `#[repr(C)]`. - -This RFC addresses use case 2 with `#[repr(C)]` and use case 3 with `#[repr(system)]`. For use case 1, people will have to seek alternative solutions such as `crABI` or -protobuf. However, it could be a footgun if people continue to use `#[repr(C)]` for use case 1. - It's worthy to note that while this RFC does require people to stop treating `repr(C)` as a linear layout but rather as an -ABI compatiblity layout, our intention is not proposing a breaking change: `packed` structs are previously banned from -transitively containing `aligned` fields, so in most cases existing `repr(C)` structs will be laid out in exactly the same +ABI compatibility layout, it is not our intention to propose a breaking change: `packed` structs were previously banned from +transitively containing `aligned` fields, so the proposed default `repr(C(target))` will have structs laid out in exactly the same way as it did before. 
However, due to an oversight in the current implementation of the Rust compiler, the restriction can actuall be [circumvented](https://github.com/rust-lang/rust/issues/100743#issuecomment-1229343705) using generics. Applications -using this pattern to circumvent the restriction will see a change in the struct layout on MSVC targets. +using this pattern to circumvent the restriction may see a change in the struct layout on MSVC targets. -This RFC alone still doesn't make `repr(C)` fully match the target (MSVC) toolchain in all cases; the known other +This RFC alone still doesn't make `repr(C(target))` fully match the target (MSVC) toolchain in all cases; the known other divergences are enums with overflowing discriminant and how a field of type [T; 0] is handled. So while this does improve parity, the reality is that there are still edge cases to keep track of for now. These cases shall be addressed in future RFCs. @@ -146,8 +142,8 @@ in future RFCs. [rationale-and-alternatives]: #rationale-and-alternatives This RFC clarifies that: -- `repr(C)` must interoperate with the C compiler for the target. -- `repr(system)` must interoperate with the operating system APIs for the target. +- `repr(C(target))` must interoperate with the C compiler for the target. +- `repr(C(system))` must interoperate with the operating system APIs for the target. - Similiar to Clang, `repr(C)` does not guarantee consistent layout between targets. Alternatively, we can also create syntax that allows the user to specify exactly which semantic to use when packed structs transitively contains aligned fields.