auditwheel: copy unpatched cargo output back before in-place patching#3159
Merged
auditwheel: copy unpatched cargo output back before in-place patching#3159
Conversation
`fs::rename` in `finalize_staged_artifacts` to skip the full byte copy on ext4. The trade-off was that whenever auditwheel / delocate / delvewheel actually rewrote bytes (DT_NEEDED, Mach-O load commands, PE imports, RPATH), the cargo output path was left empty (same-fs) or holding the unpatched original (cross-device), and the staged copy was never restored — see PyO3#2969 for why patched bytes must not flow back to cargo's incremental cache. Tools and CI scripts that read `target/<target>/release/<bin>` after a patched build saw a missing or stale file. Restore the invariant that the cargo output path always holds the clean unpatched bytes after a successful build, while keeping the rename-back fast-path for the unpatched case. The repair pipeline now splits into a detect step and a patch step: - New `PatchKind<'a>` enum carries the per-call context (`Repair { grafted, libs_dir, artifact_dir }` or `Editable`) and is used by both prediction and execution. - New `WheelRepairer::patch_required(audited, kind) -> Vec<bool>` predicts, before any bytes are rewritten, which audited artifacts the upcoming `patch` call will modify in place. Conservative defaults: all-true on `Repair`, all-false on `Editable`. `ElfRepairer` overrides the editable arm to track `linked_paths.is_empty()` per-artifact so multi-bin editable builds where only some bins carry cargo-target search paths still get the cheap rename-back for the rest. - `WheelRepairer::patch_editable` is folded into `patch` via `PatchKind::Editable`, removing the parallel API surface. - New `copy_back_cargo_outputs` helper in `build_context/repair.rs` reflinks/copies the still-clean staged artifact back to its cargo output path for every `will_patch[i] == true` entry, then `take()`s `cargo_output_path` so finalize won't overwrite the unpatched bytes with the patched ones. Cross-device staging is a no-op (the original is already at the cargo path); same-filesystem staging pays one reflink per genuinely-patched artifact (O(1) on apfs/btrfs/xfs/refs, a real copy on ext4) — same cost as the pre-PyO3#3155 worst case, but scoped only to artifacts that actually need patching. - `add_external_libs` calls `patch_required` then `copy_back_cargo_outputs` then `patch` on both the editable and non-editable branches. The unconditional `cargo_output_path = None` clearing loop is gone — clearing is now per-artifact and happens in the helper. Net effect: - Unpatched artifacts: identical to today — single rename-back in `finalize_staged_artifacts`. - Patched artifacts (same-fs): one reflink/copy at patch time instead of the rename-back; cargo output ends up with unpatched bytes. - Patched artifacts (cross-device): zero extra I/O — `stage_file` already left the unpatched original at cargo output. Public auditwheel types (`AuditedArtifact`, `GraftedLib`, `AuditResult`, `PatchKind`) now derive `Debug` for consistency with the rest of the public surface.
Tightens the per-platform `patch` impls so artifacts whose `external_libs` is empty are left alone, and loosens the default `WheelRepairer::patch_required` to `!aa.external_libs.is_empty()` for `Repair`. Together this saves one reflink in `copy_back_cargo_outputs` plus one `patchelf::set_rpath` / `patch_macho` + `ad_hoc_sign` / `pe_patch::replace_needed` invocation per "audited but actually unaffected" artifact in mixed multi-bin wheels. Before this change every artifact in the audited slice was rewritten: - ELF: the final RPATH-set loop ran unconditionally on every artifact. - Mach-O: `install_name_changes` was built from the full grafted name map for every artifact, then `patch_macho` + `ad_hoc_sign` were invoked even when the artifact's load commands didn't reference any grafted install name. - PE: `pe_patch::replace_needed` was invoked with the full grafted replacement list for every artifact, regardless of whether its import table actually referenced any of those names. The new behaviour, scoped per-artifact: - ELF: skip the RPATH-set when `external_libs.is_empty()`. DT_NEEDED rewriting was already gated on per-artifact deps. - Mach-O: filter `install_name_changes` to entries the artifact actually imports (mirroring the existing ELF DT_NEEDED logic), and only call `patch_macho` + `ad_hoc_sign` when the filtered list is non-empty. - PE: filter `replacements` per-artifact case-insensitively against `aa.external_libs[*].name` and skip `pe_patch::replace_needed` when empty. The new default `patch_required` for `Repair` relies on the invariant that `prepare_grafted_libs` builds `grafted` from `audited.iter().flat_map(|a| &a.external_libs)`, so an artifact with empty `external_libs` is guaranteed to share no install names with the grafted set and is therefore left untouched by every current `patch` implementation. `ElfRepairer::patch_required` keeps the override (duplicating the default Repair arm because Rust traits can't override a single match arm) only because its `Editable` arm still needs the `linked_paths`-based prediction. The two changes ship together: tightening only the default would regress PyO3#2969 (finalize would rename patched bytes back to the cargo output path); tightening only the impls without the prediction would just be dead-code elimination with `copy_back_cargo_outputs` still paying an unnecessary reflink for the now-unaffected artifacts. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…r-artifact filter The new per-artifact `install_name_changes` filter in `MacOSRepairer::patch` relies on `lddtree`'s `Library::name` being byte-exact equal to the install-name string in the binary's `LC_LOAD_DYLIB` (no leaf-vs-`@rpath/` normalization). That contract is currently held by `lddtree-0.5.x` (see `find_macho_library` -> `try_single_candidate`, which sets `name: lib_name.to_string()` from the unmodified load-command bytes), but nothing in this crate enforces it. Document the dependency so a future `lddtree` bump that normalizes the form trips on the comment instead of silently making the filter narrower than `patch_macho`'s `change_install_name` lookup. No behavior change. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…putState` enum `stage_file`, `copy_back_cargo_outputs`, and `finalize_staged_artifacts` were communicating through an implicit invariant: cross-device staging leaves the original at `cargo_output`, same-fs staging leaves it empty, and post-patch artifacts have `cargo_output_path` cleared via `take()`. `copy_back_cargo_outputs` and `finalize_staged_artifacts` then re-derived the staging mode at runtime by stat-ing `cargo_output.exists()`. That worked but was fragile: any future refactor of `stage_file` that broke the "cross-device leaves the original" contract would silently break the copy-back path (cargo path ends up holding stale or zero bytes because finalize no-ops on a `take()`'d `cargo_output_path` while `copy_back_cargo_outputs` skipped the reflink based on a wrong `exists()` reading). Spotted in code review of PyO3#3159. Replace `Option<PathBuf>` with a four-variant `CargoOutputState` enum that captures the staging mode explicitly: ```rust pub enum CargoOutputState { NotStaged, Renamed { cargo_output: PathBuf }, Mirrored { cargo_output: PathBuf }, Patched, } ``` - `stage_file` now returns `(PathBuf, CargoOutputState)` and emits `Renamed` on the `fs::rename` path / `Mirrored` on the cross-device reflink-or-copy path, eliminating the `exists()`-based inference. - `copy_back_cargo_outputs` matches on the state and transitions `Renamed` -> `Patched` (with the reflink) or `Mirrored` -> `Patched` (no I/O); `NotStaged` / `Patched` are no-ops. The runtime `cargo_output.exists()` check is gone. - `finalize_staged_artifacts` dispatches on the same state: `Renamed` goes through `restore_renamed` (rename, fallback to reflink+remove if `target/` straddled a mount point added after staging), `Mirrored` drops the staged duplicate, `Patched` / `NotStaged` no-op. The `exists()` early-return is gone. Future refactors of `stage_file` must now update the enum variants explicitly, making the contract impossible to violate silently. The behavior at every call site is identical to the previous logic for every reachable state today. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
fs::renameinfinalize_staged_artifactsto skip the full byte copy on ext4. The trade-off was that whenever auditwheel / delocate / delvewheel actually rewrote bytes (DT_NEEDED, Mach-O load commands, PE imports, RPATH), the cargo output path was left empty (same-fs) or holding the unpatched original (cross-device), and the staged copy was never restored — see #2969 for why patched bytes must not flow back to cargo's incremental cache. Tools and CI scripts that readtarget/<target>/release/<bin>after a patched build saw a missing or stale file.Restore the invariant that the cargo output path always holds the clean unpatched bytes after a successful build, while keeping the rename-back fast-path for the unpatched case.
The repair pipeline now splits into a detect step and a patch step:
PatchKind<'a>enum carries the per-call context (Repair { grafted, libs_dir, artifact_dir }orEditable) and is used by both prediction and execution.WheelRepairer::patch_required(audited, kind) -> Vec<bool>predicts, before any bytes are rewritten, which audited artifacts the upcomingpatchcall will modify in place. Conservative defaults: all-true onRepair, all-false onEditable.ElfRepaireroverrides the editable arm to tracklinked_paths.is_empty()per-artifact so multi-bin editable builds where only some bins carry cargo-target search paths still get the cheap rename-back for the rest.WheelRepairer::patch_editableis folded intopatchviaPatchKind::Editable, removing the parallel API surface.copy_back_cargo_outputshelper inbuild_context/repair.rsreflinks/copies the still-clean staged artifact back to its cargo output path for everywill_patch[i] == trueentry, thentake()scargo_output_pathso finalize won't overwrite the unpatched bytes with the patched ones. Cross-device staging is a no-op (the original is already at the cargo path); same-filesystem staging pays one reflink per genuinely-patched artifact (O(1) on apfs/btrfs/xfs/refs, a real copy on ext4) — same cost as the pre-perf: defer stage_artifact copy-back, finalize via rename when unpatched #3155 worst case, but scoped only to artifacts that actually need patching.add_external_libscallspatch_requiredthencopy_back_cargo_outputsthenpatchon both the editable and non-editable branches. The unconditionalcargo_output_path = Noneclearing loop is gone — clearing is now per-artifact and happens in the helper.Net effect:
finalize_staged_artifacts.stage_filealready left the unpatched original at cargo output.Public auditwheel types (
AuditedArtifact,GraftedLib,AuditResult,PatchKind) now deriveDebugfor consistency with the rest of the public surface.The follow-up commit tightens each platform's
patchimpl to skip artifacts whoseexternal_libsis empty (ELF: skipset_rpath; Mach-O: filterinstall_name_changesper-artifact and skippatch_macho+ad_hoc_signwhen the filter is empty; PE: filterreplacementsper-artifact case-insensitively againstaa.external_libs[*].nameand skippe_patch::replace_neededwhen empty), and loosens the defaultpatch_requiredforRepairto!aa.external_libs.is_empty(). The two changes ship together — tightening only the default would regress #2969, while tightening only the impls would just be dead-code elimination withcopy_back_cargo_outputsstill paying an unnecessary reflink for the now-unaffected artifacts.Known limitation: dlopen-only dependencies
external_libsis populated by static dependency analysis (lddtree on the artifact's LC_LOAD_DYLIB / DT_NEEDED / PE import table). A library reached only via runtime loading —dlopen("libfoo.so", ...),dlsym,LoadLibrary— won't appear inexternal_libs, so its host artifact getswill_patch = falseand is skipped by the tightenedpatchimpls.For Mach-O install-name rewriting and PE import-table rewriting this is benign: a
dlopendoesn't go through the load command / import table, so rewriting them was a no-op anyway. ELF DT_NEEDED rewriting was already gated onaa.external_libspre-PR for the same reason.The new behavior in this PR widens the gating to ELF RPATH append: pre-PR, every audited artifact got
$ORIGIN/<dist>.libsappended unconditionally; post-PR, only those with non-emptyexternal_libsdo. Practically: a Linux extension that statically depends on nothing butdlopens a library bundled by some other artifact in the same wheel will now lack the RPATH and fail at runtime. Workaround: declare a static dependency on at least one bundled lib (so the artifact gets the auto-RPATH), or set the RPATH manually viacargo:rustc-link-arg=-Wl,-rpath,$ORIGIN/<dist>.libs.