Integrate penguin-tools: ship guest debug tools #823

Open
lacraig2 wants to merge 16 commits into main from feature/integrate-penguin-tools

lacraig2 (Collaborator) commented Jun 7, 2026

Ships the new penguin-tools artifact (gdbserver, strace, ltrace, python — cross-compiled to musl for the full guest arch set, plus their dylibs).

  • Downloads penguin-tools.tar.gz in the downloader stage and extracts into /igloo_static (tarball is rooted at igloo_static/, so -C /).
  • The existing per-arch symlink pass exposes each binary as /igloo_static/utils.bin/<tool>.<arch>, which is exactly what the guest paths (/igloo/utils/strace, /igloo/utils/ltrace, …) and the STRACE / IGLOO_LTRACE env-flag machinery already expect.
  • Adds a local_packages/penguin-tools.tar.gz override.
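The per-arch symlink pass mentioned in the list above can be pictured roughly as follows. This is a minimal sketch, not the Dockerfile's actual logic: the helper name `expose_tools`, the set of skipped directories, and the use of relative link targets are all assumptions made for illustration.

```python
import os


def expose_tools(static_root):
    """Link <root>/<arch>/<tool> as <root>/utils.bin/<tool>.<arch> (hypothetical helper)."""
    bin_dir = os.path.join(static_root, "utils.bin")
    os.makedirs(bin_dir, exist_ok=True)
    skip = {"utils.bin", "dylibs"}  # assumed non-arch directories
    links = []
    for arch in sorted(os.listdir(static_root)):
        arch_dir = os.path.join(static_root, arch)
        if arch in skip or not os.path.isdir(arch_dir):
            continue
        for tool in sorted(os.listdir(arch_dir)):
            if not os.path.isfile(os.path.join(arch_dir, tool)):
                continue
            link = os.path.join(bin_dir, f"{tool}.{arch}")
            if not os.path.lexists(link):
                # Relative target so the tree can be relocated as a whole.
                os.symlink(os.path.join("..", arch, tool), link)
            links.append(os.path.basename(link))
    return links
```

Calling `expose_tools("/igloo_static")` after extraction would then yield names such as `strace.armel`, which is the shape the guest paths expect.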

Pinned to PENGUIN_TOOLS_VERSION=0.0.1.

Note: v0.0.1 is missing strace on every arch due to a manifest bug in penguin-tools (rehosting/penguin-tools#3). Once that fix releases (v0.0.2), bump PENGUIN_TOOLS_VERSION here.

Follow-up (separate): removing the hyperfs download once dylib arch-name reconciliation lands.

lacraig2 (Collaborator, Author) commented Jun 8, 2026

Bumped PENGUIN_TOOLS_VERSION to 0.0.2 (b2c225b), now that rehosting/penguin-tools#3 is merged and released. v0.0.2 is the first correct release: strace present on every arch, all tools' interpreter + sonames resolve within dylibs/<arch>, and legacy arch/dylib names (arm64/ppc64/ppc64el/loongarch) emitted as compat aliases. (v0.0.1 shipped non-runnable binaries.)

lacraig2 added 10 commits June 9, 2026 10:51
Download penguin-tools.tar.gz (gdbserver, strace, ltrace, python
cross-compiled to musl for the full guest arch set, plus their dylibs)
in the downloader stage and extract it into /igloo_static. The tarball
is rooted at igloo_static/, so it populates /igloo_static/<arch>/ and
merges /igloo_static/dylibs/<arch>/; the existing per-arch symlink pass
then exposes each binary as /igloo_static/utils.bin/<tool>.<arch>, which
is what the guest paths (/igloo/utils/strace, /igloo/utils/ltrace, ...)
and the STRACE / IGLOO_LTRACE env-flag machinery already expect.

Also wire a local_packages override so a locally-built
penguin-tools.tar.gz takes precedence, matching the other artifacts.

v0.0.2 carries the package-correctness fixes: strace is now present on
every arch, every tool's interpreter (ld-musl-<arch>.so.1) and NEEDED
sonames resolve within dylibs/<arch>, and the legacy arch/dylib dir
names (arm64, ppc64, ppc64el, loongarch) are emitted as compat aliases.
v0.0.1 shipped non-runnable binaries (missing interpreter) and no strace.

v0.0.2 shipped legacy compat dir symlinks (arm64/ppc64/ppc64el/loongarch)
that collide with the directories hyperfs still ships under those names,
making tar fail when extracting over them. v0.0.3 (rehosting/penguin-tools#4)
reverts to canonical names only, which merge cleanly. Blocked on that
release.

penguin-tools and hyperfs both ship dylibs/<arch> for the arch names they
share (armel, mipsel/mipseb, mips64*, riscv64, x86_64). Extracting
penguin-tools over them replaced hyperfs's libc.so/libgcc_s.so.1 (the
ones the per-arch sysroots link init.d/*.c drop-ins against) with
penguin-tools' musl builds. Those are inconsistent with the cross-toolchain
crt objects, so the compiled drop-in failed to run. The dropin_c test
failed on exactly those arches; arches where the dir names differ
(aarch64/arm64, powerpc64/ppc64, loongarch64/loongarch) passed.

Extract with --skip-old-files so hyperfs's system dylibs are preserved
while penguin-tools' tools and extra libs (libstdc++, libpython, ld-musl
interpreter alias) are still layered in. When hyperfs is removed,
penguin-tools becomes the sole provider and the names no longer pre-exist.
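The layering behaviour of `--skip-old-files` can be mirrored in Python to make the semantics concrete. The sketch below is only an illustration (the PR uses the tar flag itself); the helper name `extract_skip_old` is hypothetical, and it assumes a trusted archive.

```python
import os
import tarfile


def extract_skip_old(tar_path, dest):
    """Extract tar_path under dest, leaving any pre-existing file untouched."""
    skipped = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            target = os.path.join(dest, member.name)
            # Directories merge; files that already exist are kept as they are.
            if not member.isdir() and os.path.lexists(target):
                skipped.append(member.name)
                continue
            tar.extract(member, dest)
    return skipped
```

With hyperfs extracted first, its `libc.so` would show up in the returned `skipped` list, while libraries that only penguin-tools provides are written normally.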

penguin-tools (v0.0.5) is now the sole provider of the guest dynamic
libraries, the per-arch drop-in sysroot (crt + libc.so/libgcc_s.so.1),
and the debug tools, so hyperfs is removed entirely.

- Dockerfile: pin penguin-tools 0.0.5; drop the hyperfs download; extract
  penguin-tools plainly (no collision now); remove the embedded-toolchains
  crt sysroot build and instead layer headers + the libgcc_s.so linker
  alias onto penguin-tools' sysroots/<arch>; install a /igloo/utils/python3
  wrapper that execs the bundled CPython; make the legacy arch-dir
  normalization tolerant (those dirs came from hyperfs).
- Canonical dylib dir names everywhere (config_patchers, nvram2,
  dropin_compile): penguin-tools ships dylibs under the canonical per-arch
  name, so the old arm64/ppc64/loongarch remapping is gone.
- live_image: stage directories (copytree), needed to mount the CPython
  tree to /igloo/utils/python.
- Add tests/.../penguin_tools.yaml actuating gdbserver/strace/ltrace/python3
  and wire it into base_config.

Note: the ~9 tests that use /igloo/utils/micropython (provided by hyperfs)
still need migrating to CPython -- that's the remaining work, done while
iterating on the per-arch Test Container CI.

hyperfs (which provided /igloo/utils/micropython) is gone, so the 9 tests
that used it now run on penguin-tools' CPython at /igloo/utils/python3:

- uos -> os; usocket -> socket
- ffilib/uctypes mmap -> the CPython `mmap` module (native_mmap, devfs
  mmap_custom, pseudofile_mmap_{shared,private,rw})
- ffilib open/ioctl -> os.open + fcntl.ioctl (pseudofiles_comprehensive)
- ffi libc/lib_inject calls -> ctypes (uprobes, proc_mtd_dynamic)

Requires the full CPython (ctypes/mmap/fcntl/socket), so bump
PENGUIN_TOOLS_VERSION to 0.0.6 (rehosting/penguin-tools#7).
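As a rough illustration of the "ffi libc calls -> ctypes" item, a CPython test can reach libc through `ctypes` along these lines. This is a sketch assuming a POSIX host where `ctypes.CDLL(None)` exposes the already-loaded libc; `libc_getpid` is a hypothetical name, and `getpid` is used only because it is easy to check.

```python
import ctypes
import os

# Handle to the symbols already loaded into this process (libc included).
libc = ctypes.CDLL(None, use_errno=True)
libc.getpid.restype = ctypes.c_int
libc.getpid.argtypes = []


def libc_getpid():
    """Call getpid() through ctypes instead of micropython's ffi module."""
    return libc.getpid()
```

The result should match `os.getpid()`, which makes this a convenient smoke test that `ctypes` is present in the bundled interpreter.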

- dropin_compile: resolve the sysroot (and ld-musl loader) via
  get_arch_subdir() so intel64->x86_64 and powerpc64el->powerpc64 match
  the canonical directory names penguin-tools exports (and that the
  runtime dylib mount already uses). Fixes 'missing sysroots/intel64'.
- anonfs stat test: use raw os.open() instead of the buffered open()
  builtin; CPython's io stack issues an isatty()/ioctl() that hits the
  sockfs mock socket's NULL ->ioctl and oopses the guest.
- mmap tests (native_mmap, pseudofile_mmap_shared/private/rw): map via
  raw libc mmap through ctypes; CPython's mmap module rejects a length
  larger than the (tiny) pseudofile size, but the test maps a full page.
- seek tests (pseudofile_explicit_lseek, dev_compat_unknown_seek): use
  raw os.lseek/os.read so the kernel actually sees the seek instead of
  CPython's buffer resolving it locally.
- CI: also upload basic_target/results on failure (a basic_target
  failure aborts before test_target, so its console was never captured).
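The seek and stat fixes in the list above share one idea: bypass CPython's buffered `io` stack so each call maps to a single syscall the kernel actually sees. A minimal sketch of that pattern follows; `read_at` is a hypothetical helper, not code from the tests.

```python
import os


def read_at(path, offset, length):
    """Read `length` bytes at `offset` using unbuffered fd-level calls."""
    fd = os.open(path, os.O_RDONLY)  # raw open, no io stack on top
    try:
        os.lseek(fd, offset, os.SEEK_SET)  # issues a real lseek(2)
        return os.read(fd, length)         # issues a real read(2)
    finally:
        os.close(fd)
```

With the buffered `open()` builtin, a seek that falls inside data already read can be resolved from the buffer without reaching the kernel, which is why the tests moved to `os.lseek`/`os.read`.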

- Dockerfile: stage the x86_64 drop-in sysroot under its canonical name
  (stage x86_64, not intel64) so the musl headers land in
  sysroots/x86_64 where dropin_compile (via get_arch_subdir) now looks.
  Without this the intel64 drop-in failed with 'fcntl.h not found'.
- uprobes test: the uprobes are filtered by task comm 'uprobes_test.sh',
  but the /igloo/utils/python3 wrapper adds a second shebang hop so comm
  becomes 'python3'. Reset comm via prctl(PR_SET_NAME) before the probed
  libc calls so the entry/return probes fire again.
- CI: point the basic_target half of the failure artifact at its real
  results path (projects/empty_fs/results/latest).
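The canonical-name lookup that these commits rely on can be sketched as below. The real `get_arch_subdir()` is not shown in this conversation, so `canonical_arch_dir` and `sysroot_path` are hypothetical stand-ins, and the alias table contains only the two mappings the commit messages name.

```python
import os

# Only the aliases named in the commit messages; the real table may hold more.
ARCH_DIR_ALIASES = {"intel64": "x86_64", "powerpc64el": "powerpc64"}


def canonical_arch_dir(arch):
    """Map a config arch name to the directory name penguin-tools exports."""
    return ARCH_DIR_ALIASES.get(arch, arch)


def sysroot_path(static_root, arch):
    """Location of the drop-in sysroot for `arch` (assumed layout)."""
    return os.path.join(static_root, "sysroots", canonical_arch_dir(arch))
```

Under this scheme both the Dockerfile staging step and the compile step must agree on the canonical name, otherwise the headers land in one directory while the compiler searches another.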

find_lib() realpath()'d the matched ld-musl-*.so.1 down to libc.so, but
penguin-tools ships that loader as a symlink to libc.so and musl's loader
IS libc, so the guest maps the libc code under the ld-musl name, not
libc.so. Use the glob-matched (loader) basename for the guest path so the
uprobes land on the mapping the process actually executes.
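The difference between the two names can be seen in a small sketch. This is not the plugin's `find_lib()`; `loader_guest_name` is a hypothetical helper that only contrasts the glob-matched basename with the `realpath()` basename.

```python
import glob
import os


def loader_guest_name(dylib_dir):
    """Return (glob-matched basename, realpath basename) for the musl loader."""
    matches = sorted(glob.glob(os.path.join(dylib_dir, "ld-musl-*.so.1")))
    if not matches:
        raise FileNotFoundError(f"no ld-musl loader under {dylib_dir}")
    loader = matches[0]
    # The first name is what the guest maps; the second is the symlink target.
    return os.path.basename(loader), os.path.basename(os.path.realpath(loader))
```

When the loader is a symlink to `libc.so`, the two values differ, and the first one is the name that appears in the guest's mappings.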

lacraig2 force-pushed the feature/integrate-penguin-tools branch from c631c60 to eae8da6 on June 9, 2026 14:52

lacraig2 added 6 commits June 9, 2026 11:20
musl's off_t is 64-bit on every arch (including 32-bit ones). The ctypes
mmap argtypes used c_long for the offset, which is 32-bit on armel/mipsel/
mipseb, so the final off_t argument was malformed and mmap returned EINVAL
on 32-bit guests (64-bit arches matched by luck). Use c_longlong.
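A prototype along the following lines keeps the offset 64 bits wide regardless of the guest's `long` size. It is a sketch of the idea rather than the test's actual code; `MMAP_ARGTYPES` and `bind_mmap` are hypothetical names.

```python
import ctypes

# void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
MMAP_ARGTYPES = (
    ctypes.c_void_p,    # addr
    ctypes.c_size_t,    # length
    ctypes.c_int,       # prot
    ctypes.c_int,       # flags
    ctypes.c_int,       # fd
    ctypes.c_longlong,  # offset: 64-bit off_t, as on musl
)


def bind_mmap(libc):
    """Attach the prototype to a ctypes library handle (hypothetical helper)."""
    libc.mmap.restype = ctypes.c_void_p  # avoid truncating the pointer to int
    libc.mmap.argtypes = list(MMAP_ARGTYPES)
    return libc.mmap
```

On a 32-bit ABI `ctypes.sizeof(ctypes.c_long)` is 4 while `ctypes.sizeof(ctypes.c_longlong)` is 8, so only the latter fills the whole `off_t` argument.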

The micropython->CPython migration makes each guest python test heavier;
on slowly-emulated arches (powerpc64, loongarch64) the full test_target
suite no longer finishes within the old 10m budget and was killed before
writing results. Give it 20m.

The per-arch test patches override /igloo/dylibs/* with a hardcoded
host_path that still used the hyperfs-era directory names (arm64,
loongarch, ppc64). penguin-tools ships dylibs under the canonical
aarch64/loongarch64/powerpc64, so those three pointed at nonexistent
directories: the dylibs never mounted and the uprobes_test plugin's
ld-musl lookup raised, aborting the entire run (no .ran, every test
failing). Point them at the real penguin-tools dylib directories.

Instrument the actuation test so the per-test stdout/stderr capture
localizes which step aborts on mips64el/eb (tool --version vs CPython).

…tion test

v0.0.8 drops the broken 32-bit ltrace fallback on mips64, so /igloo/utils/ltrace
is absent there and penguin_tools.sh skips it. Revert the temporary set -x
diagnostics back to set -eu (keeping the per-step checkpoints) and note mips64
in the ltrace-omitted comment.

The UDP echo server (migrated micropython->CPython) failed on all mips
arches: the guest never registered a UDP 4444 bind (vpn_test's probe
thread never started, so udp_echo reported failure), correlated with
CPython taking a wild-jump SIGSEGV under MIPS emulation. Drop the
getaddrinfo dance (a leftover micropython workaround and a heavier code
path) for a direct AF_INET/SOCK_DGRAM bind, and run the tiny echo server
in a shell restart loop with SO_REUSEADDR so a transient interpreter
crash re-binds (re-triggering bind detection) while the host retries.
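The simplified server described above might look roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, the 2048-byte receive size is arbitrary, and the shell restart loop that wraps the script is not shown.

```python
import socket


def make_echo_socket(port, host="0.0.0.0"):
    """Bind a UDP socket directly, with no getaddrinfo() lookup."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Allow an immediate re-bind when the interpreter is restarted.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    return sock


def echo_once(sock):
    """Receive one datagram and send it back to its sender."""
    data, peer = sock.recvfrom(2048)
    sock.sendto(data, peer)
    return data


def serve_forever(port=4444):
    """Echo datagrams until the process exits."""
    sock = make_echo_socket(port)
    while True:
        echo_once(sock)
```

Because the bind happens at startup, each restart of the script produces a fresh bind event for the host side to detect.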