Summary
fw2tar unconditionally strips all character and block device nodes from every
extracted filesystem, and the only record of what was removed is opt-in. For firmware
that ships a static /dev (no devtmpfs/mdev at runtime), this silently removes
nodes the firmware needs, which can surface downstream as a hard-to-diagnose daemon crash
rather than an obvious "missing device" error.
Current behavior
- src/archive.rs excludes any node where is_block_device() || is_char_device() from the
output tar (the entry is simply not added).
- --log-devices (default off, see src/args.rs) is the only way to find out what was
dropped; it writes a *.devices.log listing the paths, but nothing recreates them and the
major/minor/type aren't captured.
This is reasonable as a default: a non-root user handling the tar can't mknod the entries,
and most modern targets recreate /dev at runtime. It is still lossy and silent.
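To make the loss concrete, here is a minimal Python sketch that mirrors the behavior described above. It is an analogue for illustration only, not the Rust code in src/archive.rs; the function and helper names are hypothetical.

```python
import io
import tarfile


def strip_device_nodes(src: tarfile.TarFile, dst: tarfile.TarFile) -> list[str]:
    """Copy members from src to dst, skipping char/block device entries.

    Returns only the dropped paths, which is all a path-only log can record.
    """
    dropped = []
    for member in src:
        if member.ischr() or member.isblk():
            # type, major, minor and mode are discarded at this point
            dropped.append(member.name)
            continue
        # Only regular files carry a data stream to copy.
        data = src.extractfile(member) if member.isreg() else None
        dst.addfile(member, data)
    return dropped


def build_demo_tar() -> bytes:
    """Demo input: one char device (console, 5:1) and one regular file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        console = tarfile.TarInfo("dev/console")
        console.type = tarfile.CHRTYPE
        console.devmajor, console.devminor = 5, 1
        console.mode = 0o600
        tar.addfile(console)

        payload = b"root:x:0:0::/root:/bin/sh\n"
        passwd = tarfile.TarInfo("etc/passwd")
        passwd.size = len(payload)
        tar.addfile(passwd, io.BytesIO(payload))
    return buf.getvalue()
```

After the copy, the output archive has no trace of dev/console, and the returned list holds the path but nothing that would let a consumer recreate the node.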
Where it's fine vs. where it bites
- devtmpfs/mdev firmware (most modern Linux targets): harmless. The image ships an
essentially empty /dev and the kernel/userland repopulate it at boot. Stripping removes
almost nothing (often just /dev/console).
- Static-/dev firmware (older/simpler embedded SDK images): the device nodes live in
the rootfs itself. Stripping them means a daemon doing open("/dev/<x>") gets ENOENT;
if that failure isn't checked (common in vendor C code), the daemon carries on with an
invalid handle (e.g. a NULL FILE * from fopen()), and the result is typically a NULL-deref
segfault at startup that looks like a generic crash, not a missing-file error. Because
--log-devices is off by default, there's no breadcrumb pointing at extraction.
A quick way to tell which class you're in: inspect the extracted rootfs /dev — empty
directory => runtime-populated (strip is harmless); pre-populated with nodes => static
/dev (strip is potentially destructive).
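The check above can be scripted. The sketch below assumes it is pointed at a rootfs tree that still contains its device nodes (for example, one unpacked as root by another tool), since a tree produced from fw2tar output has already had them stripped. The function names are hypothetical.

```python
import os
import stat


def find_device_nodes(rootfs: str) -> list[str]:
    """Return rootfs-relative paths of char/block device nodes under rootfs/dev."""
    dev_dir = os.path.join(rootfs, "dev")
    found = []
    # os.walk yields nothing if dev_dir does not exist.
    for dirpath, _dirnames, filenames in os.walk(dev_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
            if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
                found.append(os.path.relpath(path, rootfs))
    return sorted(found)


def classify_dev(rootfs: str) -> str:
    """'static' if the image ships device nodes, else 'runtime-populated'."""
    return "static" if find_device_nodes(rootfs) else "runtime-populated"
```

Regular files, directories, and symlinks under /dev are deliberately ignored, so an image that ships only something like a /dev/log symlink still counts as runtime-populated.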
Why this matters for rehosting
In a recent rehost, a web daemon bound its port and then crashed on startup. The
investigation could have been short-circuited if the extraction step had surfaced "these N
device nodes were removed" by default. (In that particular case the firmware was
devtmpfs-based, so the strip turned out not to be the cause — but precisely because
there was no default manifest, ruling fw2tar in/out took manual digging. The signal is
cheap; its absence is what costs time.)
Suggested improvements (in rough order of value)
- Always emit the device manifest (not gated behind --log-devices), or at minimum
emit it whenever any node was dropped. Cheap, and it turns a silent loss into a visible
one.
- Record type, major, minor, and mode in that manifest, not just the path, so a
downstream consumer can faithfully recreate the nodes (e.g. a sidecar *.devices.tsv/JSON).
Today only the path is logged.
- Optionally re-materialize the nodes for consumers that can act on it — via a
fakeroot-style path that preserves them in the archive, or by leaving the manifest for
the orchestrator (e.g. Penguin) to recreate as static device files. The manifest from the
previous item is the enabling piece.
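As a rough illustration of the richer manifest and how a consumer might use it, the Python sketch below reads device entries from a tar that still contains them, records path/type/major/minor/mode, and renders mknod(1) commands a privileged consumer could run. The JSON field names, the function names, and the demo archive are assumptions made for this example; none of this is an existing fw2tar or Penguin format.

```python
import io
import json
import shlex
import tarfile


def device_manifest(tar: tarfile.TarFile) -> list[dict]:
    """Collect path, type, major, minor and mode for every device member."""
    entries = []
    for member in tar:
        if member.ischr():
            kind = "c"
        elif member.isblk():
            kind = "b"
        else:
            continue
        entries.append({
            "path": member.name,
            "type": kind,
            "major": member.devmajor,
            "minor": member.devminor,
            "mode": format(member.mode & 0o7777, "04o"),  # octal string
        })
    return entries


def mknod_commands(entries: list[dict], root: str = ".") -> list[str]:
    """Render one mknod(1) invocation per manifest entry (nothing is executed)."""
    cmds = []
    for e in entries:
        target = shlex.quote(f"{root}/{e['path']}")
        cmds.append(
            f"mknod -m {e['mode']} {target} {e['type']} {e['major']} {e['minor']}"
        )
    return cmds


def demo_archive() -> bytes:
    """Build a tiny in-memory tar holding two device entries (demo data only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, kind, major, minor, mode in [
            ("dev/console", tarfile.CHRTYPE, 5, 1, 0o600),
            ("dev/mtdblock0", tarfile.BLKTYPE, 31, 0, 0o660),
        ]:
            info = tarfile.TarInfo(name)
            info.type, info.devmajor, info.devminor, info.mode = kind, major, minor, mode
            tar.addfile(info)
    return buf.getvalue()


if __name__ == "__main__":
    with tarfile.open(fileobj=io.BytesIO(demo_archive())) as tar:
        entries = device_manifest(tar)
    print(json.dumps(entries, indent=2))  # candidate sidecar contents
    print("\n".join(mknod_commands(entries, root="rootfs")))
```

Rendering commands instead of calling mknod directly keeps the sketch runnable without root and leaves the decision about how to recreate nodes to the consumer.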
Scope note
This is specifically about static-/dev firmware. For devtmpfs targets the missing
runtime state is driver/sysfs-created and is not addressed by preserving image device
nodes — that's a separate, orchestrator-side modeling concern and shouldn't be conflated
with this.