bazel: make build byte-for-byte reproducible #30187
travisdowns merged 2 commits into redpanda-data:dev
Pull request overview
This PR aims to make Bazel builds of //:redpanda byte-for-byte reproducible on the same host by removing sandbox-specific timestamps and absolute paths from third-party build outputs.
Changes:
- Pin/normalize build metadata for OpenSSL (timestamp + build-root path stripping) and apply the patch via `http_archive`.
- Normalize embedded paths for libxml2 and hwloc via configure options and compiler prefix-map flags.
- Patch cranelift build scripts to emit deterministic, relative paths; wire these patches through `MODULE.bazel` / lockfile updates.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| bazel/thirdparty/openssl.BUILD | Sets SOURCE_DATE_EPOCH to stabilize OpenSSL build metadata. |
| bazel/thirdparty/openssl-reproducible-buildinf.patch | Strips sandbox/build-root paths from OpenSSL mkbuildinf.pl output. |
| bazel/thirdparty/libxml2.BUILD | Attempts to prevent sandbox-derived paths via fixed sysconfdir and prefix-map flags. |
| bazel/thirdparty/hwloc.BUILD | Attempts to prevent sandbox-derived paths via fixed runstatedir and prefix-map flags. |
| bazel/thirdparty/cranelift-codegen-reproducible.patch | Adjusts cranelift codegen build script to compute relative paths when sandboxed. |
| bazel/thirdparty/cranelift-assembler-x64-reproducible.patch | Adjusts assembler build script to write relative paths into generated Rust code. |
| bazel/repositories.bzl | Applies the new OpenSSL patch when fetching the dependency. |
| MODULE.bazel | Adds crate annotations to apply cranelift patches under bzlmod. |
| MODULE.bazel.lock | Records repository rule attribute changes (patches/args) for reproducibility. |
Force-pushed from f8add27 to 03d2d01
CI test results: test results on build #83219
do we need to repeat this for arm64 assembler?
    # CORE-16110: make cranelift build scripts produce deterministic output.
    # The cranelift-codegen patch can be removed after upgrading to a wasmtime
    # version that includes 0694bba38 ("Strip prefixes in file names").
this was merged a while ago and released. Can we just upgrade wasmtime?
Oh, yes we can. Claude led me astray telling me it wasn't released yet, but I see that is just wrong.
@rockwotj looks like it requires a bump from v32 to v42 at a minimum to get fixes for both issues. I'm trying that out now, but is that the kind of jump you'd be OK with if all existing tests pass?
yes, should be - I can double check the release notes and new config knobs to make sure.
Here's an effort at this:
https://github.com/travisdowns/redpanda/tree/td-hermetic-wasmtime42
It was a bit convoluted due to this allocator mangling change. So there's this workaround to link in this empty file. There is an upstream fix in rules_rust but it requires switching to unstable rustc.
I'm wondering if you think any of that is worth it, or whether this patch on v32 now looks better as a low-resistance, less-churn path.
That change looks good, I wonder if there is a version there that doesn't require the allocator hack but does have the fix applied. I personally prefer the allocator hack with a big todo over this level of hacks in code I really have no clue about bc the hack is fairly clear as to what is going on.
Actually I was able to remove the allocator hack with a newer rules_rust + a mangling option.
    +# across different Bazel output bases and sandbox instances.
    +my $ebr = $ENV{'EXT_BUILD_ROOT'} // '';
    +if ($ebr ne '') {
    +    $cflags =~ s/\Q$ebr\E\/?/./g;
this feels sort of sketchy 😀
I suppose the alternative, using the Bazel build for OpenSSL, is undesirable for other reasons? I am not sure if there is a way to achieve this without the regex replace... Do you know which cflags in practice have the build root embedded?
Just to be clear, since this also sketched me out, I checked and this "cflags" isn't used for compilation (after this point); it's only used to embed a copy of cflags in the binary, probably for diagnostic output (e.g. so --version can tell you what flags you compiled with).
Ah a comment to that effect would be helpful TBH
Added a comment in the patch clarifying that the modified cflags are only embedded as a diagnostic string (shown by openssl version -a), not used for compilation.
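To make the effect of the regex in this thread easier to review, here is a minimal Python sketch of the same substitution. It is not the patch itself (which is Perl inside `mkbuildinf.pl`); the function name `strip_build_root` and the sandbox path are hypothetical, chosen only for illustration.

```python
import re


def strip_build_root(flags: str, build_root: str) -> str:
    """Replace each occurrence of build_root, plus one optional '/', with '.'."""
    if not build_root:
        # Mirrors the patch's guard: do nothing when the variable is unset or empty.
        return flags
    pattern = re.escape(build_root) + r"/?"
    return re.sub(pattern, ".", flags)


# Hypothetical sandbox root and flags, for illustration only.
root = "/tmp/sandbox/17/execroot/_main"
flags = f"--sysroot={root}/external/sysroot -O2"
print(strip_build_root(flags, root))  # --sysroot=.external/sysroot -O2
```

As written, the optional trailing slash is consumed together with the root, so `<root>/external/sysroot` becomes `.external/sysroot`. Since the string is only embedded for diagnostics, that affects readability of the output rather than compilation.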
Can we push on #28975? That gets rid of libxml.
I don't understand this pattern. Why is the sandbox not identical everywhere?
    @@ -0,0 +1,46 @@
    --- a/build.rs 2026-04-15 18:44:30.930898497 -0400
    +++ b/build.rs 2026-04-15 18:44:46.447314373 -0400
Can't we just pin to a commit instead of a released version?
In order to pick up the upstream fix? Probably.
This is already in a released version upstream, see also my thread with Tyler.
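For readers unfamiliar with what the `build.rs` patches do, the idea summarized in the review table is to emit paths relative to the working directory instead of sandbox-absolute ones. A rough Python sketch of that idea follows; the real patches are Rust, and the function name `relativize` and the sandbox layout below are assumptions made up for the example.

```python
import posixpath


def relativize(out_dir: str, cwd: str) -> str:
    """Express out_dir relative to cwd, walking up through their common ancestor."""
    return posixpath.relpath(out_dir, start=cwd)


# Hypothetical layouts of two sandbox instances; only the instance number differs.
for instance in ("17", "42"):
    base = f"/tmp/sandbox/{instance}/execroot/_main"
    cwd = f"{base}/external/crate"
    out_dir = f"{base}/bazel-out/bin/crate/out"
    print(relativize(out_dir, cwd))  # ../../bazel-out/bin/crate/out
```

Because both paths live under the same sandbox root, the sandbox-specific prefix cancels out and the relative path is the same in every instance.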
Apparently "no": the sandboxes have paths like |
Force-pushed from 03d2d01 to 4b13657
You wanted to push something here?
The hwloc configure_make build embeds sandbox-absolute paths into compiled objects, making the output non-deterministic across builds with different --output_base directories.

Two sources of path leakage:

1. Inlined assert() macros in hwloc headers (helper.h, plugins.h) expand __FILE__ to the sandbox-absolute include path, which ends up in .rodata of 7 object files and propagates into libhwloc.a.
   Fix: add -ffile-prefix-map=$EXT_BUILD_ROOT=. to CFLAGS/CXXFLAGS via the env dict, remapping the sandbox root to "." in all __FILE__ expansions.

2. Autoconf derives runstatedir from --prefix, which points into the sandbox install directory. This path gets compiled into topology-linux.o as a string literal.
   Fix: pass --runstatedir=/var/run/hwloc to configure, overriding the prefix-derived default with a fixed path.

With both fixes, libhwloc.a and all object files are bit-for-bit identical across independent builds.
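One way to check for this kind of leakage is to scan an artifact's bytes for the sandbox root. The sketch below is an illustrative helper, not part of the PR; `find_path_leaks` and the sample bytes are hypothetical.

```python
def find_path_leaks(data: bytes, build_root: str) -> list[int]:
    """Return the byte offsets at which build_root appears in data."""
    needle = build_root.encode()
    if not needle:
        return []  # an empty root would match everywhere, so treat it as "no leaks"
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


# Made-up stand-ins for a .rodata section before and after prefix remapping.
root = "/tmp/sandbox/17"
before = b"\x00assert\x00/tmp/sandbox/17/include/hwloc/helper.h\x00"
after = b"\x00assert\x00./include/hwloc/helper.h\x00"
print(find_path_leaks(before, root))  # [8]
print(find_path_leaks(after, root))   # []
```

In practice the input would be the contents of an object file or archive read in binary mode, with the root taken from the build environment.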
Force-pushed from 4b13657 to a0d5b2a
Rebased onto latest
What remains here is the hwloc + openssl reproducibility fixes. The openssl commit picked up a small conflict resolution against the 3.5.6 CVE bump — patch still applies cleanly since it only touches |
Two sources of non-determinism in the openssl configure_make build:

1. util/mkbuildinf.pl captures the full CC and CFLAGS strings and embeds them into crypto/buildinf.h as a compiled-in constant (the compiler_flags[] array). Since CC and CFLAGS contain $EXT_BUILD_ROOT-relative paths (compiler wrapper, --sysroot), the sandbox-absolute path leaks into libcrypto. This string is only used by the informational OpenSSL_version(OPENSSL_CFLAGS) API (i.e., `openssl version -f`).
   Fix: patch mkbuildinf.pl to strip $EXT_BUILD_ROOT prefixes from the compiler string before embedding it, using the EXT_BUILD_ROOT environment variable already exported by rules_foreign_cc.

2. mkbuildinf.pl also embeds a DATE macro using the current wall-clock time via time(), causing the binary to differ between runs even with identical inputs. OpenSSL already supports the SOURCE_DATE_EPOCH convention for reproducible builds.
   Fix: set SOURCE_DATE_EPOCH=0 in the env dict, pinning the embedded timestamp to the Unix epoch.

With both fixes, libcrypto.so.3, libssl.so.3, libcrypto.a, and the openssl binary are all bit-for-bit identical across independent builds. The only remaining diff is Configure.log (a text build log).

Refs:
https://reproducible-builds.org/docs/source-date-epoch/
https://wiki.openssl.org/index.php/Compilation_and_Installation
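The SOURCE_DATE_EPOCH convention referenced above can be illustrated with a short Python sketch: a build step prefers the variable over the wall clock when stamping its output. This is not OpenSSL's implementation, and the function name and timestamp format are assumptions for the example.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ) -> str:
    """Return a UTC build timestamp, preferring SOURCE_DATE_EPOCH over the wall clock."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    seconds = int(raw) if raw is not None else int(time.time())
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")


# With the variable pinned to 0, every run yields the same string.
print(build_timestamp({"SOURCE_DATE_EPOCH": "0"}))  # 1970-01-01T00:00:00Z
```

Pinning the value to 0 trades a meaningful build date for stability, which matches the commit's stated goal of identical output across runs.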
Force-pushed from a0d5b2a to 5f19753
No, I needed to write something because I made comments on this PR in "review mode", which requires an overall comment to submit them (or something), not sure.
Make the Bazel build of `//:redpanda` byte-for-byte reproducible across builds on the same host (with different output bases or sandbox instances).

Five third-party dependencies were embedding sandbox-specific absolute paths or timestamps into compiled artifacts. Each commit fixes one dependency:

- `-ffile-prefix-map` to normalize `__FILE__` in inlined asserts, `--runstatedir` to avoid baking the sandbox install prefix into `HWLOC_RUN_DIR`
- `SOURCE_DATE_EPOCH=0` to pin the build timestamp, patch `mkbuildinf.pl` to strip `$EXT_BUILD_ROOT` from the compiler info string
- `--sysconfdir=/etc` to fix the XML catalog path to the standard Linux location (instead of a sandbox-derived prefix), `-ffile-prefix-map` for `__FILE__` normalization
- `make_isle_source_path_relative()` to compute relative paths via common ancestor when `strip_prefix` fails (i.e., when `OUT_DIR` is not under `CWD`, as in sandboxed builds)
- `build.rs` to relativize `OUT_DIR` paths in `generated-files.rs` before they get compiled into the library

Verified: building `//:redpanda` twice with separate `--output_base` dirs produces bit-for-bit identical binaries.
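The verification step can be approximated by hashing the two output binaries and comparing digests. The following is a minimal sketch under that assumption; the helper names are hypothetical and the demo uses temporary files rather than real build outputs.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """True when both files have identical contents (by SHA-256)."""
    return sha256_of(first) == sha256_of(second)


if __name__ == "__main__":
    # Stand-in files; a real check would point at the binary from each output base.
    with tempfile.TemporaryDirectory() as tmp:
        a, b = Path(tmp, "build1.bin"), Path(tmp, "build2.bin")
        a.write_bytes(b"same bytes")
        b.write_bytes(b"same bytes")
        print(builds_match(a, b))  # True
```

A digest comparison only says whether the files differ; locating the differing bytes would need a separate diffing tool.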
Backports Required
Release Notes