Integrate penguin-tools: ship guest debug tools #823
Open
lacraig2 wants to merge 16 commits into
Download penguin-tools.tar.gz (gdbserver, strace, ltrace, python cross-compiled to musl for the full guest arch set, plus their dylibs) in the downloader stage and extract it into /igloo_static. The tarball is rooted at igloo_static/, so it populates /igloo_static/<arch>/ and merges /igloo_static/dylibs/<arch>/; the existing per-arch symlink pass then exposes each binary as /igloo_static/utils.bin/<tool>.<arch>, which is what the guest paths (/igloo/utils/strace, /igloo/utils/ltrace, ...) and the STRACE / IGLOO_LTRACE env-flag machinery already expect. Also wire a local_packages override so a locally-built penguin-tools.tar.gz takes precedence, matching the other artifacts.
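The layout described above can be illustrated with a small Python sketch. This is not the Dockerfile's actual symlink pass; the function name `link_tools` and the set of skipped directory names are assumptions made for illustration. It only shows how a per-arch tree maps onto `utils.bin/<tool>.<arch>` names.

```python
import os


def link_tools(static_root, skip=("dylibs", "utils.bin")):
    """Expose <root>/<arch>/<tool> as <root>/utils.bin/<tool>.<arch>.

    Hypothetical helper: the real pass lives in the Dockerfile and may
    differ in which directories it visits.
    """
    bin_dir = os.path.join(static_root, "utils.bin")
    os.makedirs(bin_dir, exist_ok=True)
    created = []
    for arch in sorted(os.listdir(static_root)):
        arch_dir = os.path.join(static_root, arch)
        # Skip non-arch directories and stray files.
        if arch in skip or not os.path.isdir(arch_dir):
            continue
        for tool in sorted(os.listdir(arch_dir)):
            link = os.path.join(bin_dir, f"{tool}.{arch}")
            # Leave any existing entry alone.
            if not os.path.lexists(link):
                # Relative target keeps the tree relocatable.
                os.symlink(os.path.join("..", arch, tool), link)
                created.append(os.path.basename(link))
    return created
```

With a tree containing `armel/strace`, the sketch creates `utils.bin/strace.armel` pointing at `../armel/strace`.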
v0.0.2 carries the package-correctness fixes: strace is now present on every arch, every tool's interpreter (ld-musl-<arch>.so.1) and NEEDED sonames resolve within dylibs/<arch>, and the legacy arch/dylib dir names (arm64, ppc64, ppc64el, loongarch) are emitted as compat aliases. v0.0.1 shipped non-runnable binaries (missing interpreter) and no strace.
v0.0.2 shipped legacy compat dir symlinks (arm64/ppc64/ppc64el/loongarch) that collide with the directories hyperfs still ships under those names, making tar fail when extracting over them. v0.0.3 (rehosting/penguin-tools#4) reverts to canonical names only, which merge cleanly. Blocked on that release.
penguin-tools and hyperfs both ship dylibs/<arch> for the arch names they share (armel, mipsel/mipseb, mips64*, riscv64, x86_64). Extracting penguin-tools over them replaced hyperfs's libc.so/libgcc_s.so.1 -- the ones the per-arch sysroots link init.d/*.c drop-ins against -- with penguin-tools' musl, which is inconsistent with the cross-toolchain crt objects, so the compiled drop-in failed to run (dropin_c test failed on exactly those arches; arches where the dir names differ -- aarch64/arm64, powerpc64/ppc64, loongarch64/loongarch -- passed). Extract with --skip-old-files so hyperfs's system dylibs are preserved while penguin-tools' tools and extra libs (libstdc++, libpython, ld-musl interpreter alias) are still layered in. When hyperfs is removed, penguin-tools becomes the sole provider and the names no longer pre-exist.
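The effect of extracting with --skip-old-files can be sketched in Python. This is an illustration of the semantics only, not what the Dockerfile runs (it invokes tar directly); the helper name `extract_skip_existing` is hypothetical.

```python
import os
import tarfile


def extract_skip_existing(tar_path, dest):
    """Extract tar_path under dest without replacing existing files.

    Directories merge; any non-directory entry whose target already
    exists is left untouched and reported in the returned list.
    """
    skipped = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            target = os.path.join(dest, member.name)
            if not member.isdir() and os.path.lexists(target):
                skipped.append(member.name)
                continue
            tar.extract(member, dest)
    return skipped
```

In the scenario above, a pre-existing `dylibs/<arch>/libc.so` would be reported as skipped, while a library that exists only in the tarball (for example libstdc++) would be written.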
penguin-tools (v0.0.5) is now the sole provider of the guest dynamic libraries, the per-arch drop-in sysroot (crt + libc.so/libgcc_s.so.1), and the debug tools, so hyperfs is removed entirely. - Dockerfile: pin penguin-tools 0.0.5; drop the hyperfs download; extract penguin-tools plainly (no collision now); remove the embedded-toolchains crt sysroot build and instead layer headers + the libgcc_s.so linker alias onto penguin-tools' sysroots/<arch>; install a /igloo/utils/python3 wrapper that execs the bundled CPython; make the legacy arch-dir normalization tolerant (those dirs came from hyperfs). - Canonical dylib dir names everywhere (config_patchers, nvram2, dropin_compile): penguin-tools ships dylibs under the canonical per-arch name, so the old arm64/ppc64/loongarch remapping is gone. - live_image: stage directories (copytree), needed to mount the CPython tree to /igloo/utils/python. - Add tests/.../penguin_tools.yaml actuating gdbserver/strace/ltrace/python3 and wire it into base_config. Note: the ~9 tests that use /igloo/utils/micropython (provided by hyperfs) still need migrating to CPython -- that's the remaining work, done while iterating on the per-arch Test Container CI.
hyperfs (which provided /igloo/utils/micropython) is gone, so the 9 tests
that used it now run on penguin-tools' CPython at /igloo/utils/python3:
- uos -> os; usocket -> socket
- ffilib/uctypes mmap -> the CPython `mmap` module (native_mmap, devfs
mmap_custom, pseudofile_mmap_{shared,private,rw})
- ffilib open/ioctl -> os.open + fcntl.ioctl (pseudofiles_comprehensive)
- ffi libc/lib_inject calls -> ctypes (uprobes, proc_mtd_dynamic)
Requires the full CPython (ctypes/mmap/fcntl/socket), so bump
PENGUIN_TOOLS_VERSION to 0.0.6 (rehosting/penguin-tools#7).
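As a rough illustration of the ffilib open/ioctl -> os.open + fcntl.ioctl mapping listed above, the following sketch issues an ioctl on a raw file descriptor from CPython. It uses a pipe and FIONREAD as stand-ins because they work on any Linux host; the actual tests target pseudofiles with their own request numbers, and the helper name `bytes_pending` is hypothetical.

```python
import fcntl
import os
import struct
import termios


def bytes_pending(fd):
    """Return the number of unread bytes on fd via the FIONREAD ioctl."""
    # Passing immutable bytes makes fcntl.ioctl return the filled buffer.
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


# Usage: a pipe stands in for the pseudofile a test would os.open().
read_fd, write_fd = os.pipe()
os.write(write_fd, b"abc")
pending = bytes_pending(read_fd)
os.close(read_fd)
os.close(write_fd)
```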
- dropin_compile: resolve the sysroot (and ld-musl loader) via get_arch_subdir() so intel64->x86_64 and powerpc64el->powerpc64 match the canonical directory names penguin-tools exports (and that the runtime dylib mount already uses). Fixes 'missing sysroots/intel64'.
- anonfs stat test: use raw os.open() instead of the buffered open() builtin; CPython's io stack issues an isatty()/ioctl() that hits the sockfs mock socket's NULL ->ioctl and oopses the guest.
- mmap tests (native_mmap, pseudofile_mmap_shared/private/rw): map via raw libc mmap through ctypes; CPython's mmap module rejects a length larger than the (tiny) pseudofile size, but the test maps a full page.
- seek tests (pseudofile_explicit_lseek, dev_compat_unknown_seek): use raw os.lseek/os.read so the kernel actually sees the seek instead of CPython's buffer resolving it locally.
- CI: also upload basic_target/results on failure (a basic_target failure aborts before test_target, so its console was never captured).
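A minimal sketch of the raw-descriptor pattern the anonfs and seek fixes rely on: os.open, os.lseek and os.read each map to a single system call, so no buffering layer can answer the seek locally or issue extra ioctl() probes. The helper name `read_at` is an illustration and does not come from the test suite.

```python
import os


def read_at(path, offset, length):
    """Read up to `length` bytes at `offset` using unbuffered syscalls."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # lseek(2) goes straight to the kernel (and thus to the
        # pseudofile's seek handler), unlike a buffered file's seek().
        os.lseek(fd, offset, os.SEEK_SET)
        return os.read(fd, length)
    finally:
        os.close(fd)
```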
- Dockerfile: stage the x86_64 drop-in sysroot under its canonical name (stage x86_64, not intel64) so the musl headers land in sysroots/x86_64 where dropin_compile (via get_arch_subdir) now looks. Without this the intel64 drop-in failed with 'fcntl.h not found'.
- uprobes test: the uprobes are filtered by task comm 'uprobes_test.sh', but the /igloo/utils/python3 wrapper adds a second shebang hop so comm becomes 'python3'. Reset comm via prctl(PR_SET_NAME) before the probed libc calls so the entry/return probes fire again.
- CI: point the basic_target half of the failure artifact at its real results path (projects/empty_fs/results/latest).
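The comm reset can be sketched from CPython with ctypes as follows. This is an assumption about the general shape, not the test's actual code: the helper names are hypothetical, and the constants are the PR_SET_NAME / PR_GET_NAME values from <linux/prctl.h>.

```python
import ctypes
import os

PR_SET_NAME = 15  # <linux/prctl.h>
PR_GET_NAME = 16  # <linux/prctl.h>

_libc = ctypes.CDLL(None, use_errno=True)


def set_comm(name):
    """Set the calling thread's comm (the kernel keeps at most 15 bytes)."""
    buf = ctypes.create_string_buffer(name.encode()[:15])
    if _libc.prctl(PR_SET_NAME, buf, 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def get_comm():
    """Read back the calling thread's comm."""
    buf = ctypes.create_string_buffer(16)
    if _libc.prctl(PR_GET_NAME, buf, 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return buf.value.decode()
```

Calling `set_comm("uprobes_test.sh")` before the probed libc calls makes the comm filter match again; that name is exactly 15 bytes, so it fits without truncation.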
find_lib() realpath()'d the matched ld-musl-*.so.1 down to libc.so, but penguin-tools ships that loader as a symlink to libc.so and musl's loader IS libc, so the guest maps the libc code under the ld-musl name, not libc.so. Use the glob-matched (loader) basename for the guest path so the uprobes land on the mapping the process actually executes.
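The difference between the two lookups can be shown with a short sketch. It is not the plugin's find_lib(); the function name `guest_loader_name` and the directory argument are hypothetical.

```python
import glob
import os


def guest_loader_name(dylib_dir):
    """Return the basename of the glob-matched musl loader.

    The match is deliberately NOT passed through os.path.realpath():
    resolving the symlink would yield libc.so, whereas the process maps
    the code under the ld-musl-*.so.1 name.
    """
    matches = sorted(glob.glob(os.path.join(dylib_dir, "ld-musl-*.so.1")))
    if not matches:
        raise FileNotFoundError(f"no ld-musl loader in {dylib_dir}")
    return os.path.basename(matches[0])
```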
musl's off_t is 64-bit on every arch (including 32-bit ones). The ctypes mmap argtypes used c_long for the offset, which is 32-bit on armel/mipsel/mipseb, so the final off_t argument was malformed and mmap returned EINVAL on 32-bit guests (64-bit arches matched by luck). Use c_longlong.
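A minimal sketch of a raw libc mmap call through ctypes with a 64-bit offset type, under the assumption that the libc's mmap takes a 64-bit off_t (true for musl as stated above, and for 64-bit hosts in general). The helper name `map_first_bytes` is hypothetical and the test code may be structured differently.

```python
import ctypes
import mmap
import os

libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [
    ctypes.c_void_p,   # addr
    ctypes.c_size_t,   # length
    ctypes.c_int,      # prot
    ctypes.c_int,      # flags
    ctypes.c_int,      # fd
    ctypes.c_longlong, # offset: 64-bit even where c_long is 32-bit
]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value


def map_first_bytes(path, length, count):
    """Map `length` bytes of `path` read-only and return the first `count`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        addr = libc.mmap(None, length, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0)
        if addr == MAP_FAILED:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        try:
            return ctypes.string_at(addr, count)
        finally:
            libc.munmap(addr, length)
    finally:
        os.close(fd)
```

Unlike CPython's mmap module, this call allows mapping a full page of a file that is smaller than a page, which is the case the tests need.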
The micropython->CPython migration makes each guest python test heavier; on slowly-emulated arches (powerpc64, loongarch64) the full test_target suite no longer finishes within the old 10m budget and was killed before writing results. Give it 20m.
The per-arch test patches override /igloo/dylibs/* with a hardcoded host_path that still used the hyperfs-era directory names (arm64, loongarch, ppc64). penguin-tools ships dylibs under the canonical aarch64/loongarch64/powerpc64, so those three pointed at nonexistent directories: the dylibs never mounted and the uprobes_test plugin's ld-musl lookup raised, aborting the entire run (no .ran, every test failing). Point them at the real penguin-tools dylib directories.
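The renaming involved can be summarized as a lookup table. The sketch below is illustrative only: the table holds just the three legacy names mentioned above, and the function name `canonical_dylib_dir` is hypothetical (the repository's own helper is get_arch_subdir(), whose exact behavior is not reproduced here).

```python
# Legacy (hyperfs-era) dylib directory names and the canonical names
# penguin-tools ships, as listed in the commit message above.
LEGACY_TO_CANONICAL = {
    "arm64": "aarch64",
    "loongarch": "loongarch64",
    "ppc64": "powerpc64",
}


def canonical_dylib_dir(name):
    """Map a legacy directory name to its canonical one; pass others through."""
    return LEGACY_TO_CANONICAL.get(name, name)
```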
Instrument the actuation test so the per-test stdout/stderr capture localizes which step aborts on mips64el/eb (tool --version vs CPython).
…tion test
v0.0.8 drops the broken 32-bit ltrace fallback on mips64, so /igloo/utils/ltrace is absent there and penguin_tools.sh skips it. Revert the temporary set -x diagnostics back to set -eu (keeping the per-step checkpoints) and note mips64 in the ltrace-omitted comment.
The UDP echo server (migrated micropython->CPython) failed on all mips arches: the guest never registered a UDP 4444 bind (vpn_test's probe thread never started, so udp_echo reported failure), correlated with CPython taking a wild-jump SIGSEGV under MIPS emulation. Drop the getaddrinfo dance (a leftover micropython workaround and a heavier code path) for a direct AF_INET/SOCK_DGRAM bind, and run the tiny echo server in a shell restart loop with SO_REUSEADDR so a transient interpreter crash re-binds (re-triggering bind detection) while the host retries.
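A minimal sketch of such an echo server in CPython: a direct AF_INET/SOCK_DGRAM bind with SO_REUSEADDR and no getaddrinfo call. The function names and the 2048-byte receive size are assumptions; port 4444 is the one named above.

```python
import socket


def make_echo_socket(host="0.0.0.0", port=4444):
    """Create a UDP socket bound directly, without getaddrinfo."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Allow an immediate re-bind after the interpreter is restarted.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    return sock


def echo_once(sock):
    """Receive one datagram and send it back to its sender."""
    data, peer = sock.recvfrom(2048)
    sock.sendto(data, peer)
    return data


def serve_forever(sock):
    """Echo datagrams until the process is killed or crashes."""
    while True:
        echo_once(sock)
```

In the guest, a script that calls `serve_forever(make_echo_socket())` would be what the shell restart loop re-launches, so each restart performs a fresh bind.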
Ships the new penguin-tools artifact (gdbserver, strace, ltrace, python, cross-compiled to musl for the full guest arch set, plus their dylibs).
- Downloads penguin-tools.tar.gz in the downloader stage and extracts it into /igloo_static (the tarball is rooted at igloo_static/, so -C /).
- The existing per-arch symlink pass exposes each binary as /igloo_static/utils.bin/<tool>.<arch>, which is exactly what the guest paths (/igloo/utils/strace, /igloo/utils/ltrace, …) and the STRACE / IGLOO_LTRACE env-flag machinery already expect.
- Adds a local_packages/penguin-tools.tar.gz override.

Pinned to PENGUIN_TOOLS_VERSION=0.0.1.

Follow-up (separate): removing the hyperfs download once dylib arch-name reconciliation lands.