From e481c8eae1281cfdccb45440f6f1b515f6fed136 Mon Sep 17 00:00:00 2001 From: ErenAri Date: Mon, 29 Jun 2026 18:11:33 +0300 Subject: [PATCH] feat(test): command/binary validation mode + known-tricky kernel library MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add `bpfcompat test --command` so an artifact can be validated through its own loader binary/command run inside each matrix kernel VM (verdict = exit code), instead of only handing a .bpf.o to the bundled validator. This exercises the real userspace loader path and needs no manifest kept in sync with that loader — directly answering reviewer feedback on falcosecurity/libs PR #3024 (Andrea Terzolo) and from the ebpf-go/vimto maintainer (Lorenz Bauer). - `--command `: shell command run as root in each guest; pass == exit 0 (override with `--command-expect-exit N`). Exposes $BPFCOMPAT_BIN, $BPFCOMPAT_ARTIFACT, $BPFCOMPAT_REMOTE_ROOT. The command is shell-quoted as a single `bash -lc` operand (cannot break out to inject host syntax). - `--command-binary `: local executable shipped into the guest. - `--artifact` becomes optional in command mode; if given it is staged and exposed as $BPFCOMPAT_ARTIFACT. No-artifact runs get a content-addressed `command://` identity so history/compare still work. - Reuses the result -> report -> registry pipeline: load_status="skipped" and the outcome lands in the report `functional` section as a synthetic `command` test; failure classifies as COMMAND_VALIDATION_FAILURE. VM runner only for now. Also add a curated library of known-tricky vendor kernels (`matrices/quirk-library.yaml` + `docs/kernel-quirk-library.md`) — the kernels where "version != eBPF feature support" bites (ring-buffer boundary, enterprise backports, no-BTF, vendor rebases, variant bands). A fresh ring-buffer run across it showed ubuntu-20.04-5.4 fail vs almalinux-8-4.18 pass (RHEL backport), and amazon-linux-2-4.14 fail (Amazon did not backport ring buffer) — i.e. 
enterprise backports are per-vendor, not a blanket guarantee. Tests: config validation, runner command-mode verdicts (pass/fail/expected-exit/ infra), and the guest command-line assembly + shell-quoting. Co-Authored-By: Claude Opus 4.8 --- README.md | 38 +++++ cmd/bpfcompat/main.go | 6 +- docs/command-validation.md | 99 ++++++++++++ docs/kernel-quirk-library.md | 99 ++++++++++++ internal/runner/command_mode_test.go | 197 ++++++++++++++++++++++++ internal/runner/config.go | 37 ++++- internal/runner/config_test.go | 54 +++++++ internal/runner/runner.go | 216 ++++++++++++++++++++++----- internal/vm/qemu.go | 114 +++++++++++++- internal/vm/qemu_test.go | 35 +++++ internal/vm/result.go | 14 ++ matrices/quirk-library.yaml | 55 +++++++ 12 files changed, 919 insertions(+), 45 deletions(-) create mode 100644 docs/command-validation.md create mode 100644 docs/kernel-quirk-library.md create mode 100644 internal/runner/command_mode_test.go create mode 100644 matrices/quirk-library.yaml diff --git a/README.md b/README.md index 0b1a6a4..4e453e4 100644 --- a/README.md +++ b/README.md @@ -62,6 +62,42 @@ CO-RE relocation, map/program/attach support), not a heuristic. Each run leaves per-target evidence — `serial.log` (the guest kernel boot), `qemu.log`, and `validator-result.json`. +### Validate via your own loader (command mode) + +The bundled validator answers "does this `.bpf.o` load/attach?" Sometimes you +want to answer "does **my project's actual loader** come up on this kernel?" — +which also exercises your userspace path and needs no manifest kept in sync with +that loader. Command mode does exactly that: it runs a command (optionally a +binary you ship in) **inside each matrix kernel VM**, and the per-kernel verdict +is its exit code. + +```bash +# Run your statically-built loader across the matrix; pass == exit 0 per kernel. 
+bpfcompat test --command '$BPFCOMPAT_BIN --self-test' \ + --command-binary ./build/myloader --matrix matrices/mvp.yaml --out report.json + +# Or drive an already-installed tool against a shipped .bpf.o. +bpfcompat test --command '$BPFCOMPAT_BIN --obj $BPFCOMPAT_ARTIFACT' \ + --command-binary ./build/loader --artifact ./build/probe.bpf.o \ + --matrix matrices/mvp.yaml --out report.json +``` + +The command runs as root in the disposable guest with `$BPFCOMPAT_BIN` (your +shipped binary), `$BPFCOMPAT_ARTIFACT` (a staged `.bpf.o`, if given), and +`$BPFCOMPAT_REMOTE_ROOT` exported. See +[docs/command-validation.md](docs/command-validation.md). + +Point either flow at the **library of known-tricky vendor kernels** — the ones +where "version ≠ feature support" bites (ring-buffer boundary, enterprise +backports, no-BTF, vendor rebases, variant bands): + +```bash +bpfcompat test --command '$BPFCOMPAT_BIN --self-test' --command-binary ./build/loader \ + --matrix matrices/quirk-library.yaml --out report.json +``` + +See [docs/kernel-quirk-library.md](docs/kernel-quirk-library.md). 
+ ### Distributions covered A curated, multi-distro, multi-architecture matrix of the kernels enterprises @@ -604,6 +640,8 @@ User guide — start here: - [`docs/architecture.md`](docs/architecture.md) - [`docs/project-compatibility-suite.md`](docs/project-compatibility-suite.md) — suites and collection matrices - [`docs/validator.md`](docs/validator.md) — what the in-guest validator checks +- [`docs/command-validation.md`](docs/command-validation.md) — validate via your own loader binary/command (exit-code verdict) +- [`docs/kernel-quirk-library.md`](docs/kernel-quirk-library.md) — curated library of known-tricky vendor kernels (version ≠ feature support) - [`docs/profile-catalog.md`](docs/profile-catalog.md) — kernel/distro profiles and image maintenance - [`docs/image-pipeline.md`](docs/image-pipeline.md) — where images come from, integrity, adding profiles - [`docs/upstream-kernel-virtme-ng.md`](docs/upstream-kernel-virtme-ng.md) diff --git a/cmd/bpfcompat/main.go b/cmd/bpfcompat/main.go index 42cf3b1..6db7f40 100644 --- a/cmd/bpfcompat/main.go +++ b/cmd/bpfcompat/main.go @@ -164,6 +164,9 @@ func runTest(args []string) int { var cfg runner.Config fs.StringVar(&cfg.ArtifactPath, "artifact", "", "Compiled .bpf.o artifact: a local ELF file, an OCI gadget (registry ref e.g. ghcr.io/org/gadget:tag), or an OCI image archive/layout") + fs.StringVar(&cfg.Command, "command", "", "Command-mode validation: shell command run inside each kernel VM (verdict = exit code). Exposes $BPFCOMPAT_BIN/$BPFCOMPAT_ARTIFACT. 
Use instead of --artifact to validate via a real loader binary/command.") + fs.StringVar(&cfg.CommandBinary, "command-binary", "", "Command mode: local executable shipped into each guest and exposed to --command as $BPFCOMPAT_BIN") + fs.IntVar(&cfg.CommandExpectExit, "command-expect-exit", 0, "Command mode: exit code that counts as a pass (default 0)") fs.StringVar(&cfg.ArtifactURI, "artifact-uri", "", "Optional remote URI for artifact retrieval metadata (http|https|file)") fs.StringVar(&cfg.ArtifactName, "artifact-name", "", "Logical artifact family name for version history (optional)") fs.StringVar(&cfg.ArtifactVersion, "artifact-version", "", "Artifact version label for version history (optional)") @@ -182,7 +185,7 @@ func runTest(args []string) int { keepVMOnFailure := fs.Bool("keep-vm-on-failure", false, "Keep VM overlays/logs on failure") fs.Usage = func() { - fmt.Fprintf(fs.Output(), "Usage:\n bpfcompat test --artifact --matrix --out [flags]\n\n") + fmt.Fprintf(fs.Output(), "Usage:\n bpfcompat test --artifact --matrix --out [flags]\n bpfcompat test --command --matrix --out [--command-binary ] [flags]\n\n") fs.PrintDefaults() } @@ -1353,6 +1356,7 @@ func printRootUsage() { fmt.Println() fmt.Println("Usage:") fmt.Println(" bpfcompat test --artifact --matrix --out [flags]") + fmt.Println(" bpfcompat test --command --matrix --out [--command-binary ] [flags]") fmt.Println(" bpfcompat suite --suite --out [flags]") fmt.Println(" bpfcompat profile list --matrix ") fmt.Println(" bpfcompat history list [flags]") diff --git a/docs/command-validation.md b/docs/command-validation.md new file mode 100644 index 0000000..b81469d --- /dev/null +++ b/docs/command-validation.md @@ -0,0 +1,99 @@ +# Command-mode validation (validate via a binary/command) + +The default `bpfcompat test` flow ships a `.bpf.o` plus the bundled C/libbpf +validator into each kernel VM and answers *"does this object load and attach?"*. 
+ +**Command mode** answers a different question: *"does my project's own loader — +the real userspace path — come up on this kernel?"* Instead of the bundled +validator, it runs a command (optionally a binary you ship into the guest) +inside each matrix-kernel VM, and the per-kernel verdict is the command's **exit +code**. + +This is useful when: + +- you want to exercise the **userspace loader path**, not just the kernel's + acceptance of the object; +- you'd rather **not maintain a manifest** (map fixups, program-variant groups) + that has to stay in sync with how your loader configures the object — your + loader already encodes that; +- your artifact isn't a single `.bpf.o` (multiple objects, skeletons, a CLI that + loads several programs). + +It is the analog of running your binary under a per-kernel VM harness (e.g. +`vimto exec`), wired into the same multi-distro matrix, evidence, and history +that the `.bpf.o` flow uses. + +## Usage + +```bash +# Ship a statically-linked loader and run it on every matrix kernel. +# Pass == exit 0 (override with --command-expect-exit N). +bpfcompat test \ + --command '$BPFCOMPAT_BIN --self-test' \ + --command-binary ./build/myloader \ + --matrix matrices/mvp.yaml \ + --out report.json +``` + +```bash +# Drive a loader against a shipped .bpf.o (both are staged into the guest). +bpfcompat test \ + --command '$BPFCOMPAT_BIN --obj $BPFCOMPAT_ARTIFACT' \ + --command-binary ./build/loader \ + --artifact ./build/probe.bpf.o \ + --matrix matrices/mvp.yaml \ + --out report.json +``` + +```bash +# No shipped binary — use a tool already present in the guest image. +bpfcompat test \ + --command 'bpftool prog load "$BPFCOMPAT_ARTIFACT" /sys/fs/bpf/x' \ + --artifact ./build/x.bpf.o ... # staged and exposed as $BPFCOMPAT_ARTIFACT +``` + +### Flags + +| Flag | Meaning | +|---|---| +| `--command ` | Shell command run inside each kernel VM. Required to enter command mode. 
| +| `--command-binary ` | Local executable shipped into each guest, `chmod +x`, exposed as `$BPFCOMPAT_BIN`. | +| `--command-expect-exit ` | Exit code that counts as a pass (default `0`). | +| `--artifact ` | Optional in command mode; when given it is staged and exposed as `$BPFCOMPAT_ARTIFACT`. | + +### Environment available to the command + +The command runs **as root** inside the disposable guest with: + +- `BPFCOMPAT_BIN` — absolute path to the `--command-binary` you shipped (empty if none); +- `BPFCOMPAT_ARTIFACT` — absolute path to the staged `--artifact` (empty if none); +- `BPFCOMPAT_REMOTE_ROOT` — the per-run scratch root inside the guest. + +The command string is executed as a single `bash -lc` operand (it is +shell-quoted, so it cannot break out to inject host-side syntax). Use real shell +inside it freely: pipes, `&&`, redirects. + +## Verdict and report + +- The kernel **passes** iff the command exits with `--command-expect-exit` + (default `0`); otherwise it **fails** with classification + `COMMAND_VALIDATION_FAILURE`. +- The libbpf load/attach phase is **skipped** (`validation.load_status: + "skipped"`); the outcome is recorded in the report's `functional` section as a + single synthetic `command` test carrying the exit code and bounded + stdout/stderr tails. +- A command that *fails to execute at all* (VM didn't boot, SSH failed) is an + **infra error**, not a compatibility failure — exactly as in the `.bpf.o` + flow. +- The run is still recorded in artifact version history; with no `.bpf.o` the + artifact identity is content-addressed from the command string + (`command://`), so `compare`/history still work. + +## Scope / limitations (first cut) + +- Command mode currently supports the **`vm`** runner only (the default). It is + rejected for `virtme-ng`/`firecracker`. +- The verdict is the **exit code**. 
Richer assertions (stdout/stderr matchers, + per-program expectations) remain available through the manifest + `functional_tests` + `--validation-mode behavior` path, which layers commands + *on top of* a `.bpf.o` load. diff --git a/docs/kernel-quirk-library.md b/docs/kernel-quirk-library.md new file mode 100644 index 0000000..be5987e --- /dev/null +++ b/docs/kernel-quirk-library.md @@ -0,0 +1,99 @@ +# Library of known-tricky vendor kernels + +A curated catalog of real distro kernels where **"kernel version ≠ eBPF feature +support."** These are the kernels that surprise you in production: upstream +feature boundaries, enterprise backports that carry new features onto old bases, +no-BTF kernels, vendor rebases, and program-variant fallback bands. + +Every entry is a kernel bpfcompat **actually boots** (real vendor cloud image in +a disposable VM) and has evidence for — not a version string we inferred from. +Run the whole library against a `.bpf.o` *or* your own loader (command mode): + +```bash +# A compiled object, the default validator: +bpfcompat test --artifact build/probe.bpf.o \ + --matrix matrices/quirk-library.yaml --out report.json --markdown report.md + +# Your own loader binary/command (see docs/command-validation.md): +bpfcompat test --command '$BPFCOMPAT_BIN --self-test' --command-binary ./build/loader \ + --matrix matrices/quirk-library.yaml --out report.json +``` + +The matrix is [`matrices/quirk-library.yaml`](../matrices/quirk-library.yaml). +Profiles whose pass/fail is artifact-dependent (genuine feature boundaries) are +`required: false`, so the library never forces a verdict the kernel itself +decides. + +## The catalog + +The "verdict" column below is the **observed** result of running a ring-buffer +probe (`examples/ringbuf-modern/ringbuf_modern.bpf.o`) across the library — see +[Fresh evidence](#fresh-evidence-2026-06-29). 
+ +| Profile | Real kernel | The quirk | Ring-buffer probe | +|---|---|---|---| +| `ubuntu-20.04-5.4` | 5.4.0-216 | Ring-buffer maps land upstream in **5.8** — not present here | ❌ `UNSUPPORTED_MAP_TYPE` (high) — *correct*, not a bug | +| `ubuntu-20.10-5.8` | 5.8.0-63 | First upstream kernel **with** ring buffer | ✅ pass — the other side of the line | +| `almalinux-8-4.18` | 4.18.0-553.el8 | **Version lies:** RHEL backports ring buffer onto 4.18, so it works on a kernel numbered *older* than the 5.4 that failed | ✅ **pass despite 4.18** | +| `rocky-8-4.18` | 4.18.0-…el8 | Same RHEL-8 backport base (ABI-compatible rebuild) | ✅ pass | +| `centos-stream-9-5.14` | 5.14.0-706.el9 | RHEL-9 base: 5.14 carrying many 6.x BPF features | ✅ pass | +| `amazon-linux-2-4.14` | 4.14.26-…amzn2 | **Backports are not uniform:** Amazon's 4.14 (no embedded BTF) does **not** carry the ring-buffer backport that RHEL's 4.18 does | ❌ `UNSUPPORTED_MAP_TYPE` — a *simple* program loads here, ring buffer does not | +| `amazon-linux-2-5.10` | 5.10.247-…amzn2 | Amazon backport tier | ✅ pass | +| `oracle-linux-9-uek7-5.15` | **6.12.0**-…el9uek | **Version-string trap:** the `uek7-5.15` profile actually boots a **6.12** UEK kernel on an EL9 userspace | ✅ pass (test, don't assume from the name) | +| `opensuse-leap-15.6-6.4` | 6.4.0-…default | SUSE backport tier | ✅ pass | +| `ubuntu-22.04-5.15` | 5.15.0-173 | **Program-variant fallback band:** a loader must select the `*_old_x` syscall variants; `dump_task` is unsupported | ✅ pass *with the right variant selection* | +| `debian-12-6.1` | 6.1.0-47 | Newer band: `bpf_loop` variants + both BPF iterators available | ✅ pass with the modern variants | + +The two ❌ rows are `required: false`: a ring-buffer probe *should* be rejected +there, and that rejection is the evidence, not a failure of the run. + +## Fresh evidence (2026-06-29) + +Run of `ringbuf_modern.bpf.o` across the library (`load_attach`, real QEMU/KVM +VMs, run `20260629T145413Z-5b261e`). 
The standout results: + +- **`ubuntu-20.04-5.4` ❌ vs `almalinux-8-4.18` ✅** — the *higher* version number + (5.4) fails ring buffer while the *lower* one (RHEL 4.18) passes it, because + RHEL backported the feature. Version number predicts the wrong answer. +- **`almalinux-8-4.18` ✅ vs `amazon-linux-2-4.14` ❌** — two old-numbered + enterprise kernels disagree on the *same* probe: RHEL backported ring buffer to + 4.18, Amazon did not backport it to 4.14. **"Enterprise backport" is not a + blanket guarantee — it's per-vendor, per-feature, and has to be tested.** +- `oracle-linux-9-uek7-5.15` actually booted `6.12.0-…el9uek` — the UEK rebase + ran a far newer kernel than the profile name implies, and still passed. + +All other kernels (5.8, 5.10, 5.14, 5.15, 6.1, 6.4) loaded and attached the +ring-buffer probe. The failures carry `classification_code: +UNSUPPORTED_MAP_TYPE` with the map-creation error detail (`map "events" failed to create — +Invalid argument (-22)`), so a CI gate or human gets the *why*, not just a ❌. + +## Why these specific kernels + +The catalog is assembled from kernels with documented, reproduced behavior: + +- **Ring-buffer boundary + the backport that breaks the rule** — the 5.4 fail vs + 5.8 pass vs AlmaLinux-8 *4.18* pass is the canonical "version ≠ support" + demonstration. Evidence: + [case-study-falco-modern-bpf.md](case-study-falco-modern-bpf.md), + [case-study-enterprise-kernels.md](case-study-enterprise-kernels.md). +- **No-BTF backport (Amazon Linux 2 / 4.14)** — boots and validates + `load_attach` on a real `4.14.26-54.32.amzn2`; see the enterprise case study. +- **Program-variant bands (5.15 vs 6.1+)** — which syscall variants and + iterators each kernel accepts, recorded per profile in the Falco modern_bpf + study. +- **Enterprise/rebase tier (RHEL-family, Oracle UEK, SUSE)** — the 14/14 + backported-tier run in the enterprise case study. 
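The gap between version number and feature support can be made concrete with a small Go sketch. It compares a naive "ring buffer needs kernel >= 5.8" rule against the observed ring-buffer column of the catalog table. The sketch is illustrative only and not part of bpfcompat: `observation`, `naiveRingbufSupport`, and `mispredicted` are hypothetical names, and the data is copied by hand from the table (using the kernel each profile actually booted).

```go
package main

import "fmt"

// observation is one catalog row: the booted kernel's base version and the
// observed result of the ring-buffer probe.
type observation struct {
	profile      string
	major, minor int
	passed       bool
}

// observed mirrors the ring-buffer column of the catalog table.
var observed = []observation{
	{"ubuntu-20.04-5.4", 5, 4, false},
	{"ubuntu-20.10-5.8", 5, 8, true},
	{"almalinux-8-4.18", 4, 18, true},
	{"rocky-8-4.18", 4, 18, true},
	{"centos-stream-9-5.14", 5, 14, true},
	{"amazon-linux-2-4.14", 4, 14, false},
	{"amazon-linux-2-5.10", 5, 10, true},
	{"oracle-linux-9-uek7-5.15", 6, 12, true}, // booted 6.12, not 5.15
	{"opensuse-leap-15.6-6.4", 6, 4, true},
	{"ubuntu-22.04-5.15", 5, 15, true},
	{"debian-12-6.1", 6, 1, true},
}

// naiveRingbufSupport is the version-only rule: ring buffer landed upstream
// in 5.8, so assume any kernel numbered 5.8 or later has it.
func naiveRingbufSupport(major, minor int) bool {
	return major > 5 || (major == 5 && minor >= 8)
}

// mispredicted returns the profiles where the version-only rule disagrees
// with the observed probe result.
func mispredicted(rows []observation) []string {
	var out []string
	for _, r := range rows {
		if naiveRingbufSupport(r.major, r.minor) != r.passed {
			out = append(out, r.profile)
		}
	}
	return out
}

func main() {
	for _, p := range mispredicted(observed) {
		fmt.Println("version rule mispredicts:", p)
	}
}
```

With the table data above, the version rule gets the two RHEL-8 rebuilds wrong (`almalinux-8-4.18` and `rocky-8-4.18`), which is the backport case the catalog is built around.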
+ +## Scope and honesty + +- These are **independent tests of public distro images**; not affiliated with or + endorsed by Red Hat, AlmaLinux, Rocky, Oracle, Amazon, or SUSE. +- "Tricky" means *behavior is not predictable from the version number*, not that + the kernel is defective. A ❌ on `ubuntu-20.04-5.4` for a ring-buffer probe is + the kernel correctly rejecting an unsupported feature. +- RHEL itself is subscription-walled; AlmaLinux/Rocky/CentOS Stream are the free, + ABI-compatible rebuilds used as the reproducible RHEL stand-in. RHCOS/OpenShift + is a separate opt-in, operator-supplied path + ([rhcos-openshift.md](rhcos-openshift.md)). +- The list grows as new quirks are reproduced with evidence. It is deliberately + small and verified rather than large and inferred. diff --git a/internal/runner/command_mode_test.go b/internal/runner/command_mode_test.go new file mode 100644 index 0000000..84cbfce --- /dev/null +++ b/internal/runner/command_mode_test.go @@ -0,0 +1,197 @@ +package runner + +import ( + "context" + "testing" + "time" + + "github.com/kernel-guard/bpfcompat/internal/matrix" + "github.com/kernel-guard/bpfcompat/internal/vm" +) + +func commandModeProfile(string) (vm.Profile, error) { + return vm.Profile{ + ID: "ubuntu-test", + Distro: "ubuntu", + Version: "22.04", + KernelFamily: "5.15", + Arch: "x86_64", + }, nil +} + +func stubCommandExecutor(t *testing.T, exitCode int) func(context.Context, vm.ExecutionRequest) vm.ExecutionResult { + t.Helper() + return func(_ context.Context, req vm.ExecutionRequest) vm.ExecutionResult { + if req.Command == "" { + t.Fatalf("expected command on execution request") + } + if req.ValidatorBinary != "" { + t.Fatalf("command mode must not run the validator (got binary %q)", req.ValidatorBinary) + } + now := time.Now().UTC() + return vm.ExecutionResult{ + ProfileID: req.Profile.ID, + Status: "pass", + CommandMode: true, + CommandExitCode: exitCode, + CommandStdoutTail: "loaded 3 programs", + HostRelease: 
"5.15.0-test", + HostMachine: "x86_64", + StartedAt: now.Add(-time.Second), + FinishedAt: now, + } + } +} + +func TestExecuteTargetCommandModePass(t *testing.T) { + origLoad, origExec := loadProfileFn, executeProfileFn + t.Cleanup(func() { loadProfileFn, executeProfileFn = origLoad, origExec }) + loadProfileFn = commandModeProfile + executeProfileFn = stubCommandExecutor(t, 0) + + target, infraErr, requiredFail := executeTarget( + context.Background(), + Config{Timeout: 2 * time.Second, Command: "$BPFCOMPAT_BIN --self-test"}, + matrix.MatrixProfile{ID: "ubuntu-test", Required: boolPtr(true)}, + t.TempDir(), + "", // no staged artifact in command mode + "", + "", + validatorTuning{}, + "", // no validator binary + "best-effort", + ) + + if infraErr || requiredFail { + t.Fatalf("expected clean pass, got infraErr=%v requiredFail=%v target=%+v", infraErr, requiredFail, target) + } + if target.Status != "pass" { + t.Fatalf("expected pass status, got %q", target.Status) + } + if target.Validation == nil || target.Validation.LoadStatus != "skipped" { + t.Fatalf("expected load skipped in command mode, got %+v", target.Validation) + } + if target.Functional == nil || target.Functional.Status != "pass" || len(target.Functional.Tests) != 1 { + t.Fatalf("expected single passing functional test, got %+v", target.Functional) + } + if target.Host == nil || target.Host.Kernel != "5.15.0-test" { + t.Fatalf("expected host kernel recorded, got %+v", target.Host) + } + if target.ValidatorExit != 0 { + t.Fatalf("expected validator_exit 0, got %d", target.ValidatorExit) + } +} + +func TestExecuteTargetCommandModeFailRequired(t *testing.T) { + origLoad, origExec := loadProfileFn, executeProfileFn + t.Cleanup(func() { loadProfileFn, executeProfileFn = origLoad, origExec }) + loadProfileFn = commandModeProfile + executeProfileFn = stubCommandExecutor(t, 1) + + target, infraErr, requiredFail := executeTarget( + context.Background(), + Config{Timeout: 2 * time.Second, Command: 
"$BPFCOMPAT_BIN --self-test"}, + matrix.MatrixProfile{ID: "ubuntu-test", Required: boolPtr(true)}, + t.TempDir(), + "", "", "", + validatorTuning{}, + "", + "best-effort", + ) + + if infraErr { + t.Fatalf("did not expect infra error, got %+v", target) + } + if !requiredFail { + t.Fatalf("expected required compatibility failure") + } + if target.Status != "fail" || target.FailedStage != "command" { + t.Fatalf("unexpected status/stage: %q/%q", target.Status, target.FailedStage) + } + if target.ClassificationCode != "COMMAND_VALIDATION_FAILURE" { + t.Fatalf("expected COMMAND_VALIDATION_FAILURE, got %q", target.ClassificationCode) + } +} + +// A non-zero exit is a pass when --command-expect-exit matches it. +func TestExecuteTargetCommandModeExpectedNonZero(t *testing.T) { + origLoad, origExec := loadProfileFn, executeProfileFn + t.Cleanup(func() { loadProfileFn, executeProfileFn = origLoad, origExec }) + loadProfileFn = commandModeProfile + executeProfileFn = stubCommandExecutor(t, 42) + + target, infraErr, requiredFail := executeTarget( + context.Background(), + Config{Timeout: 2 * time.Second, Command: "expect-42", CommandExpectExit: 42}, + matrix.MatrixProfile{ID: "ubuntu-test", Required: boolPtr(true)}, + t.TempDir(), + "", "", "", + validatorTuning{}, + "", + "best-effort", + ) + + if infraErr || requiredFail { + t.Fatalf("expected pass with expected-exit 42, got infraErr=%v requiredFail=%v", infraErr, requiredFail) + } + if target.Status != "pass" { + t.Fatalf("expected pass, got %q", target.Status) + } +} + +func TestExecuteTargetCommandModeInfraError(t *testing.T) { + origLoad, origExec := loadProfileFn, executeProfileFn + t.Cleanup(func() { loadProfileFn, executeProfileFn = origLoad, origExec }) + loadProfileFn = commandModeProfile + executeProfileFn = func(_ context.Context, req vm.ExecutionRequest) vm.ExecutionResult { + now := time.Now().UTC() + return vm.ExecutionResult{ + ProfileID: req.Profile.ID, + Status: "infra_error", + InfraError: "boot failed", + 
CommandMode: true, + StartedAt: now, + FinishedAt: now, + } + } + + target, infraErr, requiredFail := executeTarget( + context.Background(), + Config{Timeout: 2 * time.Second, Command: "x"}, + matrix.MatrixProfile{ID: "ubuntu-test", Required: boolPtr(true)}, + t.TempDir(), + "", "", "", + validatorTuning{}, + "", + "best-effort", + ) + + if !infraErr { + t.Fatalf("expected infra error") + } + if requiredFail { + t.Fatalf("infra error must not count as compatibility failure") + } + if target.Status != "infra_error" || target.InfraError != "boot failed" { + t.Fatalf("unexpected target: %+v", target) + } +} + +func TestCommandArtifactMetadataStableAndDistinct(t *testing.T) { + a := commandArtifactMetadata(Config{Command: "loader --run"}) + b := commandArtifactMetadata(Config{Command: "loader --run"}) + c := commandArtifactMetadata(Config{Command: "loader --other"}) + if a.SHA256 == "" || a.SHA256 != b.SHA256 { + t.Fatalf("expected stable sha for identical command: %q vs %q", a.SHA256, b.SHA256) + } + if a.SHA256 == c.SHA256 { + t.Fatalf("expected distinct sha for different command") + } + if a.BaseName != "command" { + t.Fatalf("expected default basename 'command', got %q", a.BaseName) + } + withBin := commandArtifactMetadata(Config{Command: "x", CommandBinary: "/path/to/myloader"}) + if withBin.BaseName != "myloader" { + t.Fatalf("expected basename from command binary, got %q", withBin.BaseName) + } +} diff --git a/internal/runner/config.go b/internal/runner/config.go index 79c1c33..1a3dfce 100644 --- a/internal/runner/config.go +++ b/internal/runner/config.go @@ -40,7 +40,18 @@ type Config struct { ArtifactVersion string ArtifactVariant string ValidationMode string - MatrixPath string + // Command, when set, switches the run to command/binary validation mode: + // instead of loading a .bpf.o with the bundled validator, the command is + // executed inside each matrix kernel VM and the per-kernel verdict is its + // exit code. 
This exercises the artifact's real userspace loader path and + // needs no manifest kept in sync with that loader. + Command string + // CommandBinary is an optional local executable shipped into each guest and + // exposed to Command as $BPFCOMPAT_BIN (chmod +x in the guest). + CommandBinary string + // CommandExpectExit is the exit code that counts as a pass in command mode. + CommandExpectExit int + MatrixPath string // Quick selects the built-in quick-check kernel set (matrix.Quick) when no // MatrixPath is given — a fast local "does it load?" check. Quick bool @@ -85,8 +96,28 @@ func NormalizeValidationMode(mode string) string { } func (c Config) Validate() error { - if c.ArtifactPath == "" { - return errors.New("--artifact is required") + commandMode := strings.TrimSpace(c.Command) != "" + if !commandMode { + if c.ArtifactPath == "" { + return errors.New("--artifact is required (or pass --command to validate via a binary/command)") + } + if strings.TrimSpace(c.CommandBinary) != "" { + return errors.New("--command-binary requires --command") + } + if c.CommandExpectExit != 0 { + return errors.New("--command-expect-exit requires --command") + } + } else { + if c.CommandExpectExit < 0 || c.CommandExpectExit > 255 { + return fmt.Errorf("--command-expect-exit must be in [0,255] (got %d)", c.CommandExpectExit) + } + runner := c.Runner + if runner == "" { + runner = RunnerVM + } + if runner != RunnerVM { + return fmt.Errorf("--command validation currently supports --runner %q only (got %q)", RunnerVM, c.Runner) + } } if c.MatrixPath == "" && !c.Quick { return errors.New("--matrix is required (or pass --quick for the default kernel set)") diff --git a/internal/runner/config_test.go b/internal/runner/config_test.go index 8338c8c..010d985 100644 --- a/internal/runner/config_test.go +++ b/internal/runner/config_test.go @@ -79,3 +79,57 @@ func TestConfigValidateRejectsUnknownRunner(t *testing.T) { t.Fatalf("unexpected error: %v", err) } } + +func 
TestConfigValidateRequiresArtifactOrCommand(t *testing.T) { + cfg := validConfigForTest() + cfg.ArtifactPath = "" + + err := cfg.Validate() + if err == nil || !strings.Contains(err.Error(), "--command") { + t.Fatalf("expected error hinting at --command, got %v", err) + } +} + +func TestConfigValidateAllowsCommandWithoutArtifact(t *testing.T) { + cfg := validConfigForTest() + cfg.ArtifactPath = "" + cfg.Command = "$BPFCOMPAT_BIN --self-test" + + if err := cfg.Validate(); err != nil { + t.Fatalf("expected command mode without artifact to pass, got %v", err) + } +} + +func TestConfigValidateCommandRejectsNonVMRunner(t *testing.T) { + cfg := validConfigForTest() + cfg.ArtifactPath = "" + cfg.Command = "loader" + cfg.Runner = RunnerVirtmeNG + + err := cfg.Validate() + if err == nil || !strings.Contains(err.Error(), "--runner") { + t.Fatalf("expected command mode to reject non-vm runner, got %v", err) + } +} + +func TestConfigValidateCommandExpectExitRange(t *testing.T) { + cfg := validConfigForTest() + cfg.ArtifactPath = "" + cfg.Command = "loader" + cfg.CommandExpectExit = 300 + + err := cfg.Validate() + if err == nil || !strings.Contains(err.Error(), "command-expect-exit") { + t.Fatalf("expected out-of-range exit code rejection, got %v", err) + } +} + +func TestConfigValidateCommandBinaryRequiresCommand(t *testing.T) { + cfg := validConfigForTest() + cfg.CommandBinary = "/tmp/loader" + + err := cfg.Validate() + if err == nil || !strings.Contains(err.Error(), "--command-binary requires --command") { + t.Fatalf("expected --command-binary to require --command, got %v", err) + } +} diff --git a/internal/runner/runner.go b/internal/runner/runner.go index 363dd7a..5a4f294 100644 --- a/internal/runner/runner.go +++ b/internal/runner/runner.go @@ -2,6 +2,8 @@ package runner import ( "context" + "crypto/sha256" + "encoding/hex" "fmt" "os" "path/filepath" @@ -83,36 +85,46 @@ func ExecuteBootstrap(ctx context.Context, cfg Config) (RunResult, error) { Message: "Inspecting 
artifact", }) - artifactPath := cfg.ArtifactPath - if artifact.IsOCISource(artifactPath) { - emitProgress(cfg.Progress, ProgressUpdate{ - Stage: ProgressStageInspectArtifact, - Message: "Extracting eBPF object from OCI source", - }) - ociDir, err := os.MkdirTemp("", "bpfcompat-oci-") - if err != nil { - return RunResult{}, fmt.Errorf("create OCI extract dir: %w", err) + commandMode := strings.TrimSpace(cfg.Command) != "" + hasArtifact := strings.TrimSpace(cfg.ArtifactPath) != "" + + var meta artifact.Metadata + if hasArtifact { + artifactPath := cfg.ArtifactPath + if artifact.IsOCISource(artifactPath) { + emitProgress(cfg.Progress, ProgressUpdate{ + Stage: ProgressStageInspectArtifact, + Message: "Extracting eBPF object from OCI source", + }) + ociDir, err := os.MkdirTemp("", "bpfcompat-oci-") + if err != nil { + return RunResult{}, fmt.Errorf("create OCI extract dir: %w", err) + } + defer os.RemoveAll(ociDir) + extracted, err := artifact.ExtractEBPFFromOCI(artifactPath, ociDir) + if err != nil { + return RunResult{}, fmt.Errorf("load OCI gadget %q: %w", artifactPath, err) + } + artifactPath = extracted } - defer os.RemoveAll(ociDir) - extracted, err := artifact.ExtractEBPFFromOCI(artifactPath, ociDir) + + meta, err = artifact.Inspect(artifactPath) if err != nil { - return RunResult{}, fmt.Errorf("load OCI gadget %q: %w", artifactPath, err) + return RunResult{}, err } - artifactPath = extracted - } - meta, err := artifact.Inspect(artifactPath) - if err != nil { - return RunResult{}, err - } - - emitProgress(cfg.Progress, ProgressUpdate{ - Stage: ProgressStageStageArtifact, - Message: "Staging artifact", - }) + emitProgress(cfg.Progress, ProgressUpdate{ + Stage: ProgressStageStageArtifact, + Message: "Staging artifact", + }) - if _, err := artifact.Stage(meta.AbsolutePath, runPaths.InputDir); err != nil { - return RunResult{}, err + if _, err := artifact.Stage(meta.AbsolutePath, runPaths.InputDir); err != nil { + return RunResult{}, err + } + } else { + // Command 
mode with no .bpf.o: synthesize artifact identity from the
+		// command so reports and version history still have a stable key.
+		meta = commandArtifactMetadata(cfg)
 	}
 
 	var stagedManifest string
@@ -181,21 +193,28 @@ func ExecuteBootstrap(ctx context.Context, cfg Config) (RunResult, error) {
 		attachMode = "disabled"
 	}
 
-	stagedArtifact := filepath.Join(runPaths.InputDir, filepath.Base(meta.AbsolutePath))
-	validatorBinPath, err := filepath.Abs("validator/c-libbpf/bin/bpfcompat-validator")
-	if err != nil {
-		return RunResult{}, fmt.Errorf("resolve validator path: %w", err)
-	}
-	if _, err := os.Stat(validatorBinPath); err != nil {
-		return RunResult{}, fmt.Errorf("validator binary not found at %s; run `make validator-static` first", validatorBinPath)
+	stagedArtifact := ""
+	if hasArtifact {
+		stagedArtifact = filepath.Join(runPaths.InputDir, filepath.Base(meta.AbsolutePath))
 	}
-	if runner == RunnerVM {
-		dynamic, err := validatorIsDynamicallyLinked(validatorBinPath)
+
+	validatorBinPath := ""
+	if !commandMode {
+		validatorBinPath, err = filepath.Abs("validator/c-libbpf/bin/bpfcompat-validator")
 		if err != nil {
-			return RunResult{}, fmt.Errorf("inspect validator binary: %w", err)
+			return RunResult{}, fmt.Errorf("resolve validator path: %w", err)
+		}
+		if _, err := os.Stat(validatorBinPath); err != nil {
+			return RunResult{}, fmt.Errorf("validator binary not found at %s; run `make validator-static` first", validatorBinPath)
 		}
-		if dynamic {
-			return RunResult{}, fmt.Errorf("validator binary at %s is dynamically linked; VM-backed runs require a static build (run `make validator-static`)", validatorBinPath)
+		if runner == RunnerVM {
+			dynamic, err := validatorIsDynamicallyLinked(validatorBinPath)
+			if err != nil {
+				return RunResult{}, fmt.Errorf("inspect validator binary: %w", err)
+			}
+			if dynamic {
+				return RunResult{}, fmt.Errorf("validator binary at %s is dynamically linked; VM-backed runs require a static build (run `make validator-static`)", validatorBinPath)
+			}
 		}
 	}
 
@@ -212,7 +231,11 @@ func ExecuteBootstrap(ctx context.Context, cfg Config) (RunResult, error) {
 		attachMode,
 		cfg.Progress,
 	)
-	notes = append(notes, validationModeNotes(validationMode)...)
+	if commandMode {
+		notes = append(notes, commandModeNote(cfg))
+	} else {
+		notes = append(notes, validationModeNotes(validationMode)...)
+	}
 	notes = append(notes, targetNotes...)
 
 	status := "pass"
@@ -525,6 +548,15 @@ func executeTarget(
 		return target, false, matrixProfile.RequiredBool()
 	}
 
+	commandBinaryAbs := ""
+	if strings.TrimSpace(cfg.Command) != "" && strings.TrimSpace(cfg.CommandBinary) != "" {
+		if abs, absErr := filepath.Abs(cfg.CommandBinary); absErr == nil {
+			commandBinaryAbs = abs
+		} else {
+			commandBinaryAbs = cfg.CommandBinary
+		}
+	}
+
 	targetCtx, cancel := context.WithTimeout(ctx, cfg.Timeout)
 	execResult := executor(targetCtx, vm.ExecutionRequest{
 		Profile: profile,
@@ -540,9 +572,15 @@ func executeTarget(
 		AttachMode: attachMode,
 		Timeout: cfg.Timeout,
 		KeepVMOnFailure: cfg.KeepVMOnFailure,
+		Command: cfg.Command,
+		CommandBinary: commandBinaryAbs,
 	})
 	cancel()
 
+	if strings.TrimSpace(cfg.Command) != "" {
+		return evaluateCommandTarget(target, execResult, matrixProfile, cfg, profile)
+	}
+
 	target.Status = execResult.Status
 	target.StartedAt = execResult.StartedAt.Format(time.RFC3339)
 	target.FinishedAt = execResult.FinishedAt.Format(time.RFC3339)
@@ -1103,6 +1141,108 @@ func shouldRunFunctionalTests(validationMode string, mf manifest.Manifest) bool
 	}
 }
 
+// commandArtifactMetadata synthesizes an artifact identity for command-mode
+// runs that have no .bpf.o, so reports and version history still carry a stable
+// key. The SHA256 is computed over the command string (and binary basename when
+// present) — a content-addressed handle for "this validation command".
+func commandArtifactMetadata(cfg Config) artifact.Metadata {
+	base := "command"
+	seed := strings.TrimSpace(cfg.Command)
+	if b := strings.TrimSpace(cfg.CommandBinary); b != "" {
+		base = filepath.Base(b)
+		seed = base + "\x00" + seed
+	}
+	sum := sha256.Sum256([]byte(seed))
+	return artifact.Metadata{
+		AbsolutePath: "command://" + base,
+		BaseName: base,
+		SHA256: hex.EncodeToString(sum[:]),
+		SizeBytes: 0,
+	}
+}
+
+func commandModeNote(cfg Config) string {
+	return fmt.Sprintf(
+		"validation mode: command (per-kernel verdict = command exit code == %d; libbpf load/attach skipped)",
+		cfg.CommandExpectExit,
+	)
+}
+
+// evaluateCommandTarget turns a command-mode VM execution into a target verdict:
+// the kernel passes iff the command exited with the expected code. The result is
+// recorded in the Functional section as a single synthetic "command" test so the
+// existing report/markdown surfaces render it without special-casing.
+func evaluateCommandTarget(
+	target schema.Target,
+	execResult vm.ExecutionResult,
+	matrixProfile matrix.MatrixProfile,
+	cfg Config,
+	profile vm.Profile,
+) (schema.Target, bool, bool) {
+	target.StartedAt = execResult.StartedAt.Format(time.RFC3339)
+	target.FinishedAt = execResult.FinishedAt.Format(time.RFC3339)
+	target.DurationMs = execResult.FinishedAt.Sub(execResult.StartedAt).Milliseconds()
+	target.VMRunDir = execResult.VMRunDir
+	target.QEMUCommand = execResult.QEMUCommand
+	target.SerialLog = execResult.SerialLogPath
+	target.Notes = append(target.Notes, execResult.Notes...)
+
+	if execResult.Status == "infra_error" {
+		target.Status = "infra_error"
+		target.FailedStage = "infra"
+		target.InfraError = execResult.InfraError
+		return target, true, false
+	}
+
+	target.Host = &schema.TargetEnv{
+		Distro: profile.Distro,
+		Version: profile.Version,
+		KernelFamily: profile.KernelFamily,
+		Kernel: execResult.HostRelease,
+		Arch: execResult.HostMachine,
+	}
+	target.ValidatorExit = execResult.CommandExitCode
+	target.Validation = &schema.Validation{LoadStatus: "skipped"}
+
+	expected := cfg.CommandExpectExit
+	pass := execResult.CommandExitCode == expected
+	status := "fail"
+	if pass {
+		status = "pass"
+	}
+	target.Functional = &schema.Functional{
+		Status: status,
+		Tests: []schema.FunctionalTest{{
+			Name: "command",
+			Required: true,
+			Status: status,
+			Command: cfg.Command,
+			ExpectedExitCode: expected,
+			ExitCode: execResult.CommandExitCode,
+			StdoutTail: execResult.CommandStdoutTail,
+			StderrTail: execResult.CommandStderrTail,
+		}},
+	}
+
+	if pass {
+		target.Status = "pass"
+		target.FailedStage = ""
+		target.Notes = append(target.Notes, fmt.Sprintf("command validation passed (exit code %d)", execResult.CommandExitCode))
+		return target, false, false
+	}
+
+	target.Status = "fail"
+	target.FailedStage = "command"
+	target.ClassificationCode = "COMMAND_VALIDATION_FAILURE"
+	target.ClassificationConfidence = "high"
+	target.ClassificationReason = fmt.Sprintf("Command exited %d (expected %d) on this kernel.", execResult.CommandExitCode, expected)
+	target.Notes = append(target.Notes,
+		fmt.Sprintf("classification: %s (%s)", target.ClassificationCode, target.ClassificationConfidence),
+		"remediation: inspect the command stdout/stderr tails; the artifact's loader/command failed on this kernel.",
+	)
+	return target, false, matrixProfile.RequiredBool()
+}
+
 func validationModeNotes(validationMode string) []string {
 	switch validationMode {
 	case ValidationModeLoadOnly:
diff --git a/internal/vm/qemu.go b/internal/vm/qemu.go
index 0b8bae7..0d6291d 100644
--- a/internal/vm/qemu.go
+++ b/internal/vm/qemu.go
@@ -28,6 +28,14 @@ type ExecutionRequest struct {
 	AttachMode string
 	Timeout time.Duration
 	KeepVMOnFailure bool
+
+	// Command-mode validation. When Command is non-empty the validator is not
+	// run; instead Command is executed inside the guest (as root) and the
+	// per-kernel verdict is its exit code. CommandBinary, when set, is a local
+	// executable shipped into the guest and exposed to Command as $BPFCOMPAT_BIN;
+	// ArtifactPath, when set, is staged and exposed as $BPFCOMPAT_ARTIFACT.
+	Command string
+	CommandBinary string
 }
 
 // ProgVariantGroup mirrors a manifest program-variant group for the
@@ -135,6 +143,14 @@ type ExecutionResult struct {
 	Notes []string
 	StartedAt time.Time
 	FinishedAt time.Time
+
+	// Command-mode outputs (populated only when ExecutionRequest.Command is set).
+	CommandMode bool
+	CommandExitCode int
+	CommandStdoutTail string
+	CommandStderrTail string
+	HostRelease string
+	HostMachine string
 }
 
 const (
@@ -341,9 +357,17 @@ func ExecuteProfile(ctx context.Context, req ExecutionRequest) (result Execution
 		return
 	}
 
-	artifactRemotePath := filepath.ToSlash(filepath.Join(remoteRoot, "input", filepath.Base(req.ArtifactPath)))
-	if err := scpToGuest(ctx, target, req.ArtifactPath, artifactRemotePath); err != nil {
-		result.InfraError = err.Error()
+	artifactRemotePath := ""
+	if strings.TrimSpace(req.ArtifactPath) != "" {
+		artifactRemotePath = filepath.ToSlash(filepath.Join(remoteRoot, "input", filepath.Base(req.ArtifactPath)))
+		if err := scpToGuest(ctx, target, req.ArtifactPath, artifactRemotePath); err != nil {
+			result.InfraError = err.Error()
+			return
+		}
+	}
+
+	if strings.TrimSpace(req.Command) != "" {
+		runGuestCommand(ctx, req, target, vmRunDir, remoteRoot, artifactRemotePath, &result)
 		return
 	}
 
@@ -433,6 +457,90 @@ func ExecuteProfile(ctx context.Context, req ExecutionRequest) (result Execution
 	return
 }
 
+const commandTailBytes = 4096
+
+// guestCommandLine builds the in-guest shell line for command-mode validation.
+// The user command is arbitrary shell (the artifact's own loader) run as root;
+// it is single-quoted as one `bash -lc` operand and the artifact/bin/root paths
+// are passed as quoted env assignments. The wrapper always exits 0 so a non-zero
+// command exit is captured in the exit file rather than read as an SSH error.
+func guestCommandLine(command, artifactRemotePath, binRemotePath, remoteRoot, stdoutPath, stderrPath, exitPath string) string {
+	return fmt.Sprintf(
+		"sudo BPFCOMPAT_ARTIFACT=%s BPFCOMPAT_BIN=%s BPFCOMPAT_REMOTE_ROOT=%s bash -lc %s >%s 2>%s; code=$?; echo \"$code\" > %s; exit 0",
+		shellQuote(artifactRemotePath),
+		shellQuote(binRemotePath),
+		shellQuote(remoteRoot),
+		shellQuote(command),
+		shellQuote(stdoutPath),
+		shellQuote(stderrPath),
+		shellQuote(exitPath),
+	)
+}
+
+// runGuestCommand performs command-mode validation inside an already-booted
+// guest: it ships the optional command binary, runs req.Command as root with
+// $BPFCOMPAT_ARTIFACT/$BPFCOMPAT_BIN/$BPFCOMPAT_REMOTE_ROOT exported, and
+// records the command's exit code plus bounded stdout/stderr tails. A command
+// that *executes* — whatever its exit code — is an infra success (Status=pass);
+// the runner turns the exit code into the per-kernel compatibility verdict.
+func runGuestCommand(ctx context.Context, req ExecutionRequest, target sshTarget, vmRunDir, remoteRoot, artifactRemotePath string, result *ExecutionResult) {
+	result.CommandMode = true
+
+	binRemotePath := ""
+	if strings.TrimSpace(req.CommandBinary) != "" {
+		binRemotePath = filepath.ToSlash(filepath.Join(remoteRoot, "bin", filepath.Base(req.CommandBinary)))
+		if err := scpToGuest(ctx, target, req.CommandBinary, binRemotePath); err != nil {
+			result.InfraError = err.Error()
+			return
+		}
+		if err := sshRun(ctx, target, fmt.Sprintf("chmod +x %s", shellQuote(binRemotePath))); err != nil {
+			result.InfraError = err.Error()
+			return
+		}
+	}
+
+	// Record the kernel the command ran against (best-effort; not fatal).
+	if release, err := sshOutput(ctx, target, "uname -r"); err == nil {
+		result.HostRelease = release
+	}
+	if machine, err := sshOutput(ctx, target, "uname -m"); err == nil {
+		result.HostMachine = machine
+	}
+
+	remoteStdoutPath := filepath.ToSlash(filepath.Join(remoteRoot, "out", "command.stdout"))
+	remoteStderrPath := filepath.ToSlash(filepath.Join(remoteRoot, "out", "command.stderr"))
+	remoteExitPath := filepath.ToSlash(filepath.Join(remoteRoot, "out", "command-exit-code"))
+
+	runCmd := guestCommandLine(req.Command, artifactRemotePath, binRemotePath, remoteRoot, remoteStdoutPath, remoteStderrPath, remoteExitPath)
+	if err := sshRun(ctx, target, runCmd); err != nil {
+		result.InfraError = fmt.Sprintf("run command: %v", err)
+		return
+	}
+
+	localStdoutPath := filepath.Join(vmRunDir, "command.stdout")
+	localStderrPath := filepath.Join(vmRunDir, "command.stderr")
+	localExitPath := filepath.Join(vmRunDir, "command-exit-code")
+
+	if err := scpFromGuest(ctx, target, remoteExitPath, localExitPath); err != nil {
+		result.InfraError = fmt.Sprintf("retrieve command exit code: %v", err)
+		return
+	}
+	exitCode, err := parseExitCodeFile(localExitPath)
+	if err != nil {
+		result.InfraError = fmt.Sprintf("parse command exit code: %v", err)
+		return
+	}
+	result.CommandExitCode = exitCode
+
+	_ = scpFromGuest(ctx, target, remoteStdoutPath, localStdoutPath)
+	_ = scpFromGuest(ctx, target, remoteStderrPath, localStderrPath)
+	result.CommandStdoutTail = readFileTail(localStdoutPath, commandTailBytes)
+	result.CommandStderrTail = readFileTail(localStderrPath, commandTailBytes)
+
+	result.Status = "pass"
+	result.Notes = append(result.Notes, fmt.Sprintf("command executed inside VM (exit code %d)", exitCode))
+}
+
 // installGuestKernelAndReboot installs profile.install_kernel inside the
 // guest via apt, pins it as the grub default, reboots, and relaunches QEMU
 // on the same overlay. QEMU runs with -no-reboot, so the guest reboot exits
diff --git a/internal/vm/qemu_test.go b/internal/vm/qemu_test.go
index 764d7bb..6bae266 100644
--- a/internal/vm/qemu_test.go
+++ b/internal/vm/qemu_test.go
@@ -507,3 +507,38 @@ func TestShellQuote(t *testing.T) {
 		}
 	}
 }
+
+func TestGuestCommandLine(t *testing.T) {
+	got := guestCommandLine(
+		"$BPFCOMPAT_BIN --obj $BPFCOMPAT_ARTIFACT",
+		"/tmp/bpfcompat/input/probe.bpf.o",
+		"/tmp/bpfcompat/bin/loader",
+		"/tmp/bpfcompat",
+		"/tmp/bpfcompat/out/command.stdout",
+		"/tmp/bpfcompat/out/command.stderr",
+		"/tmp/bpfcompat/out/command-exit-code",
+	)
+	want := "sudo BPFCOMPAT_ARTIFACT='/tmp/bpfcompat/input/probe.bpf.o' " +
+		"BPFCOMPAT_BIN='/tmp/bpfcompat/bin/loader' " +
+		"BPFCOMPAT_REMOTE_ROOT='/tmp/bpfcompat' " +
+		"bash -lc '$BPFCOMPAT_BIN --obj $BPFCOMPAT_ARTIFACT' " +
+		">'/tmp/bpfcompat/out/command.stdout' 2>'/tmp/bpfcompat/out/command.stderr'; " +
+		"code=$?; echo \"$code\" > '/tmp/bpfcompat/out/command-exit-code'; exit 0"
+	if got != want {
+		t.Fatalf("guestCommandLine mismatch:\n got: %s\nwant: %s", got, want)
+	}
+}
+
+// A malicious/odd command must stay a single quoted operand — it cannot break
+// out of the bash -lc argument to inject host-side shell syntax.
+func TestGuestCommandLineQuotesHostileCommand(t *testing.T) {
+	got := guestCommandLine("'; reboot #", "", "", "/tmp/bpfcompat",
+		"/o/out", "/o/err", "/o/exit")
+	if !strings.Contains(got, `bash -lc ''"'"'; reboot #'`) {
+		t.Fatalf("hostile command not safely quoted: %s", got)
+	}
+	// Empty artifact/bin paths render as empty quoted strings, not bare gaps.
+	if !strings.Contains(got, "BPFCOMPAT_ARTIFACT='' BPFCOMPAT_BIN=''") {
+		t.Fatalf("empty env paths not quoted: %s", got)
+	}
+}
diff --git a/internal/vm/result.go b/internal/vm/result.go
index 9fdd854..149211b 100644
--- a/internal/vm/result.go
+++ b/internal/vm/result.go
@@ -22,3 +22,17 @@ func parseExitCodeFile(path string) (int, error) {
 	}
 	return code, nil
 }
+
+// readFileTail returns the trailing up-to-maxBytes of a file (best-effort:
+// missing/unreadable files yield ""), trimmed of surrounding whitespace. Used
+// to capture bounded stdout/stderr from command-mode validation runs.
+func readFileTail(path string, maxBytes int) string {
+	data, err := os.ReadFile(path) // #nosec G304 -- path is a run-dir file we wrote.
+	if err != nil {
+		return ""
+	}
+	if maxBytes > 0 && len(data) > maxBytes {
+		data = data[len(data)-maxBytes:]
+	}
+	return strings.TrimSpace(string(data))
+}
diff --git a/matrices/quirk-library.yaml b/matrices/quirk-library.yaml
new file mode 100644
index 0000000..5d9ebfc
--- /dev/null
+++ b/matrices/quirk-library.yaml
@@ -0,0 +1,55 @@
+# Library of known-tricky vendor kernels.
+#
+# A curated matrix of real distro kernels where "kernel version != eBPF feature
+# support" bites hardest: upstream feature boundaries, enterprise backports that
+# carry new features onto old bases, no-BTF kernels, vendor rebases, and
+# program-variant fallback bands. Each profile boots the real vendor cloud image.
+#
+# Run a .bpf.o or your own loader (command mode) against the whole library:
+#   bpfcompat test --artifact build/probe.bpf.o --matrix matrices/quirk-library.yaml --out report.json
+#   bpfcompat test --command '$BPFCOMPAT_BIN --self-test' --command-binary ./build/loader \
+#     --matrix matrices/quirk-library.yaml --out report.json
+#
+# Per-kernel quirks and expected behavior are documented in
+# docs/kernel-quirk-library.md. `required: false` marks kernels whose pass/fail
+# is artifact-dependent (feature boundaries) so the library doesn't force a
+# verdict the kernel itself decides.
+name: quirk-library
+profiles:
+  # Upstream ring-buffer boundary: ringbuf maps require >= 5.8. A ringbuf
+  # artifact fails here with UNSUPPORTED_MAP_TYPE (high) — correct, not a bug.
+  - id: ubuntu-20.04-5.4
+    required: false
+  # First upstream kernel with ring buffer support — the other side of the line.
+  - id: ubuntu-20.10-5.8
+    required: false
+  # The "version lies" case: RHEL-family 4.18 BACKPORTS ring buffer, so a
+  # ringbuf artifact that fails on upstream 5.4 PASSES on this older-numbered 4.18.
+  - id: almalinux-8-4.18
+    required: true
+  - id: rocky-8-4.18
+    required: true
+  # RHEL 9 backport base (5.14 carrying many 6.x features).
+  - id: centos-stream-9-5.14
+    required: true
+  # Backports are not uniform: Amazon's 4.14 (no embedded BTF) loads SIMPLE
+  # programs but does NOT carry the ring-buffer backport RHEL put in 4.18 — so a
+  # ringbuf probe fails here while it passes on almalinux-8-4.18. Per-vendor.
+  - id: amazon-linux-2-4.14
+    required: false
+  - id: amazon-linux-2-5.10
+    required: true
+  # Oracle UEK rebases onto a newer kernel than the EL base implies (uname shows
+  # a 5.15 uek build on an EL9 userspace) — a version-string trap.
+  - id: oracle-linux-9-uek7-5.15
+    required: true
+  # SUSE backport tier.
+  - id: opensuse-leap-15.6-6.4
+    required: true
+  # Program-variant fallback band: on 5.15 the loader must pick the *_old_x
+  # syscall variants and dump_task is unsupported (see Falco modern_bpf study).
+  - id: ubuntu-22.04-5.15
+    required: true
+  # Newer band: bpf_loop variants + both BPF iterators available.
+  - id: debian-12-6.1
+    required: true
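
Reviewer note (illustrative, not part of the patch): the matrix comments above argue that a kernel version number is a poor predictor of eBPF feature support. The sketch below checks a naive ">= 5.8" version gate against the three ring-buffer verdicts quoted in the commit message (ubuntu-20.04-5.4 fail, almalinux-8-4.18 pass, amazon-linux-2-4.14 fail). The names `observation`, `naiveRingbufSupported`, and `mispredicted` are hypothetical and do not exist in bpfcompat; the gate itself is an assumption about how a version-only check might be written.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// observation is one data point from the ring-buffer run described in the
// commit message. The type and its field names are illustrative only.
type observation struct {
	profile  string // matrix profile id
	kernel   string // major.minor taken from the profile id
	loadedOK bool   // observed verdict for a ringbuf artifact
}

// naiveRingbufSupported is a hypothetical version-only gate: it assumes ring
// buffer maps are available exactly when the kernel is >= 5.8, the upstream
// boundary named in the matrix comments.
func naiveRingbufSupported(kernel string) bool {
	parts := strings.SplitN(kernel, ".", 3)
	if len(parts) < 2 {
		return false
	}
	major, errMajor := strconv.Atoi(parts[0])
	minor, errMinor := strconv.Atoi(parts[1])
	if errMajor != nil || errMinor != nil {
		return false
	}
	return major > 5 || (major == 5 && minor >= 8)
}

// mispredicted returns the profile ids where the version gate disagrees with
// the observed verdict.
func mispredicted(obs []observation) []string {
	var out []string
	for _, o := range obs {
		if naiveRingbufSupported(o.kernel) != o.loadedOK {
			out = append(out, o.profile)
		}
	}
	return out
}

func main() {
	obs := []observation{
		{"ubuntu-20.04-5.4", "5.4", false},
		{"almalinux-8-4.18", "4.18", true},
		{"amazon-linux-2-4.14", "4.14", false},
	}
	fmt.Println(mispredicted(obs)) // prints [almalinux-8-4.18]
}
```

The gate agrees with the two kernels that lack ring buffer support and is wrong only for the RHEL-family backport, which is the case the library is meant to surface by booting the real vendor kernel instead of reading its version string.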