6 changes: 6 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,12 @@ adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html) once a
## [Unreleased]

### Added
- RHCOS evidence matrix: profiles for OpenShift 4.14 / 4.16 / 4.18 (`matrices/rhcos.yaml`)
and a recorded multi-version, multi-artifact run in `docs/evidence-rhcos.md` —
baseline load and ring-buffer load+attach pass on every release (RHEL 9.2 and
9.4 backported 5.14 kernels), and a CO-RE failure is correctly rejected on
every release (the discriminator). `make rhcos-image` takes `RHCOS_VERSION` to
stage per-version images. x86_64 only; opt-in via `BPFCOMPAT_ENABLE_RHCOS=1`.
- CoreOS (Ignition) boot support. CoreOS-family images boot via Ignition, not
cloud-init, so the executor now writes a minimal Ignition config (SSH key for
the `core` user) and passes it to QEMU via `-fw_cfg name=opt/com.coreos/config`
8 changes: 5 additions & 3 deletions Makefile
Expand Up @@ -175,13 +175,15 @@ vm-image-fcos:

# Stage an operator-supplied RHEL CoreOS (OpenShift) image for the rhcos-4.16
# profile. RHCOS ships with an OpenShift release, not a public cloud-image URL,
# so the operator provides it:
# so the operator provides it. RHCOS_VERSION picks the profile/cache slot
# (default 4.16); the image is staged at vm/cache/rhcos-$(RHCOS_VERSION).qcow2:
# make rhcos-image RHCOS_IMAGE=/path/to/rhcos-qemu.x86_64.qcow2
# make rhcos-image RHCOS_IMAGE_URL=https://internal-mirror/rhcos.qcow2.gz
# make rhcos-image RHCOS_VERSION=4.18 RHCOS_IMAGE_URL=https://mirror/rhcos.qcow2.gz
# Then run with BPFCOMPAT_ENABLE_RHCOS=1 to enable the profile.
RHCOS_VERSION ?= 4.16
rhcos-image:
RHCOS_IMAGE='$(RHCOS_IMAGE)' RHCOS_IMAGE_URL='$(RHCOS_IMAGE_URL)' \
bash vm/scripts/fetch-rhcos-image.sh vm/cache/rhcos-4.16.qcow2
bash vm/scripts/fetch-rhcos-image.sh vm/cache/rhcos-$(RHCOS_VERSION).qcow2

vm-ubuntu-22-arm64:
bash vm/scripts/fetch-cloud-image.sh \
10 changes: 5 additions & 5 deletions README.md
Expand Up @@ -114,11 +114,11 @@ different bootstrap. bpfcompat implements it (Ignition config over QEMU
-matrix matrices/rhcos.yaml -runner vm -out report.json
```

Recorded run: RHCOS `416.94…` on kernel `5.14.0-427.93.1.el9_4` (OpenShift
4.16), ring-buffer artifact load + attach **pass** —
Recorded evidence matrix: **3 OpenShift releases (4.14 / 4.16 / 4.18)** × 3
artifacts, real boots — baseline + ring-buffer load+attach **pass**, and a
CO-RE failure correctly **rejected** on every release —
[docs/evidence-rhcos.md](docs/evidence-rhcos.md). Without an image, the **RHEL /
AlmaLinux 9 (5.14)** profiles are the interim kernel approximation (RHCOS for
4.16 is the RHEL 9.4 kernel). Full guide:
AlmaLinux 9 (5.14)** profiles are the interim kernel approximation. Full guide:
[docs/rhcos-openshift.md](docs/rhcos-openshift.md).

## Try it in CI without your own KVM box
Expand Up @@ -583,7 +583,7 @@ Reference matrices (real, reproducible artifacts):
- [`docs/case-study-falco-modern-bpf.md`](docs/case-study-falco-modern-bpf.md) — Falco `modern_bpf` across 5 kernels
- [`docs/case-study-enterprise-kernels.md`](docs/case-study-enterprise-kernels.md) — RHEL/Oracle/Amazon/SUSE backported tier
- [`docs/case-study-inspektor-gadget.md`](docs/case-study-inspektor-gadget.md) — published gadgets from OCI, zero config
- [`docs/evidence-rhcos.md`](docs/evidence-rhcos.md) — RHEL CoreOS / OpenShift 4.16, load + attach inside a real RHCOS guest
- [`docs/evidence-rhcos.md`](docs/evidence-rhcos.md) — RHEL CoreOS / OpenShift 4.14·4.16·4.18 matrix, load + attach inside real RHCOS guests

Internal evidence and program docs (acceptance records, runbooks, and
planning notes — useful for contributors, not needed to use the tool):
153 changes: 92 additions & 61 deletions docs/evidence-rhcos.md
@@ -1,94 +1,125 @@
# Evidence: RHEL CoreOS (OpenShift) validation
# Evidence: RHEL CoreOS (OpenShift) validation matrix

A real validation run of a compiled eBPF artifact **inside a booted RHEL CoreOS
guest**, recorded here as committed evidence. This is the proof behind the opt-in
RHCOS path documented in [docs/rhcos-openshift.md](rhcos-openshift.md). Reproduce
it with the steps at the bottom.
Real validation runs of compiled eBPF artifacts **inside booted RHEL CoreOS
guests**, across multiple OpenShift releases. Committed as evidence behind the
opt-in RHCOS path documented in [docs/rhcos-openshift.md](rhcos-openshift.md).
Reproduce with the steps at the bottom.

> The raw run artifacts (full `report.json`, `validator-result.json`, serial log)
> Raw run artifacts (full `report.json`, `validator-result.json`, serial logs)
> are written under `evidence/rhcos/` locally; that path is git-ignored as
> high-churn output, so the decisive fields are inlined below.
> high-churn output, so the decisive fields are inlined here.

## Result
## Releases under test

| Field | Value |
|---|---|
| Profile | `rhcos-4.16-5.14` |
| Booted OS | Red Hat Enterprise Linux CoreOS `416.94.202510081640-0` (OpenShift 4.16) |
| Kernel | `5.14.0-427.93.1.el9_4.x86_64` (RHEL 9.4 base, heavily backported) |
| Arch | x86_64 |
| Boot path | Ignition via QEMU `-fw_cfg name=opt/com.coreos/config`; SSH as `core` |
| Artifact | `examples/ringbuf-modern/ringbuf_modern.bpf.o` (sha256 `569df554…21728`) |
| Load | **pass** (errno 0) |
| Attach | **pass** (1/1, best-effort) |
| Kernel BTF | present (`/sys/kernel/btf/vmlinux`, 4876642 bytes) |
| Overall | **pass** |
Each row is a real RHCOS bootimage from the public OpenShift mirror, booted via
Ignition (`-fw_cfg name=opt/com.coreos/config`), SSH as `core`. The RHCOS
version encodes the RHEL base (e.g. `416.94` = OpenShift 4.16 on RHEL 9.4), and
the kernel column is the **in-guest `uname -r`** captured at run time.

The artifact uses a BPF ring buffer, upstream since **5.8**. It loads on this
**5.14** RHCOS kernel because RHEL backports the feature — "kernel version ≠
feature support," tested by booting the real vendor kernel rather than inferred.
| OpenShift | RHCOS bootimage | RHEL base | Kernel (`uname -r`) | Kernel BTF |
|---|---|---|---|---|
| 4.14 | `414.92.202407091253` | 9.2 | `5.14.0-284.73.1.el9_2.x86_64` | present |
| 4.16 | `416.94.202510081640` | 9.4 | `5.14.0-427.93.1.el9_4.x86_64` | present |
| 4.18 | `418.94.202510081222` | 9.4 | `5.14.0-427.93.1.el9_4.x86_64` | present |

## In-guest validator output (`validator.v0.4`, key fields)
Note that the OpenShift (OCP) minor version does **not** track the kernel
linearly: 4.16 and 4.18 share the RHEL 9.4 `-427` kernel, while 4.14 runs the
RHEL 9.2 `-284` kernel. This is the "version number predicts nothing" property
that bpfcompat tests by booting the real vendor kernel. (The mirror's
`4.18/latest` bootimage is RHEL-9.4-based; later 4.18 z-streams may move to 9.6.
The table records the bootimage actually run.)
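
The version encoding and the kernel grouping described above can be sketched in a few lines of Python. This is an illustration only, not part of bpfcompat: the helper names `parse_rhcos_version` and `group_by_kernel` are hypothetical, and the parser assumes the `<OCP major+minor>.<RHEL major+minor>.<timestamp>` layout with single-digit major versions, as seen in the three bootimages listed here.

```python
import re
from collections import defaultdict

# (RHCOS bootimage version, in-guest `uname -r`) copied from the table above.
RELEASES = [
    ("414.92.202407091253", "5.14.0-284.73.1.el9_2.x86_64"),
    ("416.94.202510081640", "5.14.0-427.93.1.el9_4.x86_64"),
    ("418.94.202510081222", "5.14.0-427.93.1.el9_4.x86_64"),
]

# Assumed layout: one-digit major + minor for both OCP and RHEL, then a
# 12-digit build timestamp, optionally followed by a "-N" suffix.
_VERSION_RE = re.compile(r"(\d)(\d+)\.(\d)(\d+)\.(\d{12})(?:-\d+)?")


def parse_rhcos_version(version: str) -> dict:
    """Split e.g. '416.94.202510081640' into OpenShift, RHEL and build parts."""
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unrecognised RHCOS version: {version!r}")
    ocp_major, ocp_minor, rhel_major, rhel_minor, build = match.groups()
    return {
        "openshift": f"{ocp_major}.{ocp_minor}",
        "rhel": f"{rhel_major}.{rhel_minor}",
        "build": build,
    }


def group_by_kernel(rows) -> dict:
    """Map each kernel release to the OpenShift versions that booted it."""
    groups = defaultdict(list)
    for version, kernel in rows:
        groups[kernel].append(parse_rhcos_version(version)["openshift"])
    return dict(groups)


if __name__ == "__main__":
    for kernel, releases in group_by_kernel(RELEASES).items():
        print(kernel, "<-", ", ".join(releases))
```

Grouping by kernel rather than by OpenShift minor makes the overlap visible: two of the three releases land in the same bucket.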

## Matrix result (3 artifacts × 3 releases, real boots)

| Artifact | What it exercises | 4.14 | 4.16 | 4.18 |
|---|---|---|---|---|
| `simple-pass` | baseline program load | ✅ load | ✅ load | ✅ load |
| `ringbuf-modern` | BPF ring buffer (upstream ≥ 5.8) + attach | ✅ load + attach 1/1 | ✅ load + attach 1/1 | ✅ load + attach 1/1 |
| `core-relocation-fail` | CO-RE relocation to a non-existent type | ❌ **rejected** | ❌ **rejected** | ❌ **rejected** |

Two things this proves:

1. **Backports work, tested not inferred.** The ring buffer landed upstream in
   5.8, yet `ringbuf-modern` loads *and attaches* on RHCOS's backported 5.14
   (both RHEL 9.2 and 9.4). The verdict comes from the real kernel, not from its
   version string.
2. **The verdict discriminates.** `core-relocation-fail` is **rejected on every
   release** with `errno -22` (`EINVAL`) and classification
   `CORE_RELOCATION_FAILURE`, so the passes above are real acceptances, not a
   rubber stamp. (Its matrix targets are non-blocking, so they record a
   per-target failure without failing the run.)
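
One way to make the "discriminator" idea concrete is to check each validator record against the verdict expected for its artifact. The sketch below is hypothetical and not part of bpfcompat. It uses only the fields that appear in this document's validator records (`status`, `load.error_code`, `classification_code`); the `EXPECTED` table and the `verdict_matches` helper are names invented for illustration.

```python
import errno

# Expected top-level status per artifact, taken from the matrix table above.
EXPECTED = {
    "simple-pass": "pass",
    "ringbuf-modern": "pass",
    "core-relocation-fail": "fail",
}


def verdict_matches(artifact: str, record: dict) -> bool:
    """Return True if a validator record shows the verdict this matrix expects."""
    want = EXPECTED[artifact]
    load = record.get("load", {})
    if want == "pass":
        # A real acceptance: overall pass and a clean load.
        return record.get("status") == "pass" and load.get("error_code") == 0
    # A real rejection: the load fails with -EINVAL (-22) and is classified
    # as a CO-RE relocation failure, not some unrelated error.
    return (
        record.get("status") == "fail"
        and load.get("error_code") == -errno.EINVAL
        and record.get("classification_code") == "CORE_RELOCATION_FAILURE"
    )
```

Treating the expected rejection as a first-class outcome keeps a run honest in both directions: an unexpected pass of `core-relocation-fail` would be flagged just like an unexpected failure of `ringbuf-modern`.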

## In-guest validator output (representative — 4.16, ring buffer)

```json
{
"schema_version": "validator.v0.4",
"status": "pass",
"host": {
"release": "5.14.0-427.93.1.el9_4.x86_64",
"version": "#1 SMP PREEMPT_DYNAMIC Wed Oct 1 11:45:46 EDT 2025",
"machine": "x86_64"
},
"host": { "release": "5.14.0-427.93.1.el9_4.x86_64", "machine": "x86_64" },
"load": { "status": "pass", "error_code": 0, "error": "" },
"attach": { "mode": "best-effort", "status": "pass", "attempted": 1, "passed": 1, "failed": 0 },
"btf": { "kernel_btf_available": true, "kernel_btf_size": 4876642,
"artifact_has_btf": true, "artifact_has_btf_ext": true }
"btf": { "kernel_btf_available": true, "artifact_has_btf": true }
}
```

## Guest serial console (excerpt, ANSI stripped)
Rejection record (representative — 4.14, CO-RE failure):

```json
{
"status": "fail",
"host": { "release": "5.14.0-284.73.1.el9_2.x86_64" },
"load": { "status": "fail", "error_code": -22 },
"classification_code": "CORE_RELOCATION_FAILURE"
}
```
GRUB: Booting `Red Hat Enterprise Linux CoreOS 416.94.202510081640-0 (ostree:0)'

[0.000000] Linux version 5.14.0-427.93.1.el9_4.x86_64
(mockbuild@x86-64-03.build.eng.rdu2.redhat.com) #1 SMP PREEMPT_DYNAMIC Wed Oct 1 ...
[0.000000] Command line: ... vmlinuz-5.14.0-427.93.1.el9_4.x86_64 rw ignition.firstboot
ostree=/ostree/boot.1/rhcos/... ignition.platform.id=qemu console=ttyS0,115200n8

Welcome to Red Hat Enterprise Linux CoreOS 416.94.202510081640-0
dracut-057-54.git20250423.el9_4.1 (Initramfs)!
## Guest serial console (excerpt, ANSI stripped — 4.16)

```
GRUB: Booting `Red Hat Enterprise Linux CoreOS 416.94.202510081640-0 (ostree:0)'
[0.000000] Linux version 5.14.0-427.93.1.el9_4.x86_64 ... #1 SMP PREEMPT_DYNAMIC
[0.000000] Command line: ... ignition.firstboot ... ignition.platform.id=qemu console=ttyS0,115200n8
Welcome to Red Hat Enterprise Linux CoreOS 416.94.202510081640-0 (Initramfs)!
[1.291367] systemd[1]: Starting CoreOS Ignition User Config Setup...
[ OK ] Finished CoreOS Ignition User Config Setup.
```

`ignition.platform.id=qemu` + "CoreOS Ignition User Config Setup" confirm the
boot used the Ignition config bpfcompat delivered over `-fw_cfg`.
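
For readers unfamiliar with the Ignition path, the following is a minimal sketch of what such a config and QEMU argument could look like. It is not the code in `internal/vm/ignition.go`, which is not shown here. The spec version `3.2.0`, the helper names, and the file path are assumptions; the `name=opt/com.coreos/config,file=...` form is the one Fedora CoreOS documents for QEMU.

```python
import json


def minimal_ignition_config(ssh_public_key: str, spec_version: str = "3.2.0") -> str:
    """Build an Ignition config that only authorises one SSH key for `core`.

    The spec version is an assumption; use one the target RHCOS release accepts.
    """
    config = {
        "ignition": {"version": spec_version},
        "passwd": {
            "users": [
                {"name": "core", "sshAuthorizedKeys": [ssh_public_key]},
            ]
        },
    }
    return json.dumps(config, indent=2)


def qemu_fw_cfg_args(ignition_path: str) -> list:
    """Return the QEMU arguments that expose the config file to the guest."""
    return ["-fw_cfg", f"name=opt/com.coreos/config,file={ignition_path}"]


if __name__ == "__main__":
    # Hypothetical key and path, for illustration only.
    print(minimal_ignition_config("ssh-ed25519 AAAA... user@host"))
    print(" ".join(qemu_fw_cfg_args("/tmp/bpfcompat.ign")))
```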

## Provenance

- Image: `rhcos-4.16.51-x86_64-qemu.x86_64.qcow2.gz`, public OpenShift mirror
(`mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.16/latest/`).
Published sha256 of the `.gz`:
`92880764c1b3b61940bc209ee021b97474c4db2d9a36abcece55ddd6d8c17c95`.
- Decompressed qcow2 base-image sha256 (recorded in `report.json`):
`d03128234c5dc6217bd37ee0caf6f192107d42d39a8a6b5c9b6148b0f4f92399`.
- The pull secret is required for the OpenShift *container release payload*, not
the RHCOS boot qcow2 used here.
Images: public OpenShift mirror,
`mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/<ver>/latest/`.
The pull secret gates the container release payload, **not** these boot qcow2s.
Decompressed qcow2 sha256 (as run):

## Reproduce
| OpenShift | image | sha256 (decompressed qcow2) |
|---|---|---|
| 4.14 | `rhcos-4.14.34-x86_64-qemu.x86_64.qcow2` | `6d271daf23242570520891cc8013d7ab3e2fa5ab8ab9d37485b28b72ab61e99f` |
| 4.16 | `rhcos-4.16.51-x86_64-qemu.x86_64.qcow2` | `d03128234c5dc6217bd37ee0caf6f192107d42d39a8a6b5c9b6148b0f4f92399` |
| 4.18 | `rhcos-4.18.27-x86_64-qemu.x86_64.qcow2` | `a6f870c3fb8f5039962978980cf6a5a11cd2973a35fc2b2938106658983b18d6` |
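
To confirm that a staged image is the same bootimage as the one recorded here, its digest can be compared against the table. The sketch below is a hypothetical helper, not part of the repository, and it assumes the staged file is the decompressed qcow2. Because the mirror's `latest` directory moves over time, a mismatch may simply mean a newer bootimage was downloaded rather than a corrupted file.

```python
import hashlib

# Decompressed qcow2 digests copied from the provenance table above.
EXPECTED_SHA256 = {
    "4.14": "6d271daf23242570520891cc8013d7ab3e2fa5ab8ab9d37485b28b72ab61e99f",
    "4.16": "d03128234c5dc6217bd37ee0caf6f192107d42d39a8a6b5c9b6148b0f4f92399",
    "4.18": "a6f870c3fb8f5039962978980cf6a5a11cd2973a35fc2b2938106658983b18d6",
}


def sha256_stream(chunks) -> str:
    """Hash an iterable of byte chunks without holding them all in memory."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_chunks(path: str, chunk_size: int = 1 << 20):
    """Yield a file's contents in fixed-size chunks (1 MiB by default)."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return
            yield chunk


def matches_recorded_image(version: str, path: str) -> bool:
    """Compare a staged qcow2 against the digest recorded for that release."""
    return sha256_stream(read_chunks(path)) == EXPECTED_SHA256[version]
```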

```sh
base=https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.16/latest
URL=$(curl -fsSL "$base/sha256sum.txt" | awk '/qemu.x86_64.qcow2.gz$/{print "'"$base"'/"$2; exit}')
## Honest limits

- **x86_64 only.** OpenShift on ARM (aarch64) is real but not covered here — it
needs an ARM64-capable KVM host and an aarch64 RHCOS bootimage. Not yet run.
- **Not in public CI.** RHCOS is operator-supplied by design (no bundled image),
so it does not run on every PR like the Ubuntu/FCOS lanes; this matrix is a
recorded, reproducible operator run.
- **Bootimage, not a live cluster.** This validates the node OS + kernel, not
OpenShift-cluster-specific MachineConfig state.

make rhcos-image RHCOS_IMAGE_URL="$URL"
## Reproduce

BPFCOMPAT_ENABLE_RHCOS=1 ./bin/bpfcompat test \
-artifact examples/ringbuf-modern/ringbuf_modern.bpf.o \
-matrix matrices/rhcos.yaml -runner vm -out report.json
```sh
b=https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos
for v in 4.14 4.16 4.18; do
url=$(curl -fsSL "$b/$v/latest/sha256sum.txt" | awk -v base="$b/$v/latest" '/qemu\.x86_64\.qcow2\.gz$/ {print base "/" $2; exit}')
make rhcos-image RHCOS_VERSION="$v" RHCOS_IMAGE_URL="$url" # → vm/cache/rhcos-$v.qcow2
done

for art in simple-pass/simple_pass ringbuf-modern/ringbuf_modern core-relocation-fail/core_relocation_fail; do
BPFCOMPAT_ENABLE_RHCOS=1 ./bin/bpfcompat test \
-artifact examples/$art.bpf.o -matrix matrices/rhcos.yaml -runner vm \
-concurrency 3 -out report-$(basename $art).json
done
```

Enterprises with an internal mirror or an `openshift-install`-extracted image
pass `RHCOS_IMAGE=/path/to/rhcos.qcow2` instead of `RHCOS_IMAGE_URL`.
`RHCOS_VERSION` selects both the cache slot (`vm/cache/rhcos-<ver>.qcow2`) and the
matching profile in `matrices/rhcos.yaml`. `core-relocation-fail` is expected to
be rejected — that is the discriminator, not a regression.
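
The naming convention that ties `RHCOS_VERSION` to a cache slot and a profile can be written down as a small pre-flight check. This is a sketch under stated assumptions, not bpfcompat code: the helper names are hypothetical, and the `rhcos-<ver>-5.14` profile pattern is inferred from the three profile ids in `matrices/rhcos.yaml`.

```python
from pathlib import Path


def cache_slot(version: str) -> Path:
    """Path where `make rhcos-image RHCOS_VERSION=<version>` stages the image."""
    return Path("vm/cache") / f"rhcos-{version}.qcow2"


def profile_id(version: str, kernel_family: str = "5.14") -> str:
    """Profile id for a release, following the ids listed in matrices/rhcos.yaml."""
    return f"rhcos-{version}-{kernel_family}"


def missing_images(versions, repo_root: str = ".") -> list:
    """Return the releases whose image has not been staged yet."""
    root = Path(repo_root)
    return [v for v in versions if not (root / cache_slot(v)).is_file()]


if __name__ == "__main__":
    for version in missing_images(["4.14", "4.16", "4.18"]):
        print(f"{profile_id(version)}: stage {cache_slot(version)} first")
```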
11 changes: 8 additions & 3 deletions matrices/rhcos.yaml
@@ -1,7 +1,12 @@
# RHEL CoreOS (OpenShift) — opt-in, operator-supplied image.
# Stage the image with `make rhcos-image RHCOS_IMAGE=... ` (or RHCOS_IMAGE_URL=...)
# and run with BPFCOMPAT_ENABLE_RHCOS=1. See docs/rhcos-openshift.md.
# RHEL CoreOS (OpenShift) evidence matrix — opt-in, operator-supplied images.
# Stage each image with `make rhcos-image RHCOS_IMAGE_URL=...` (one per version,
# staged at vm/cache/rhcos-<ver>.qcow2) and run with BPFCOMPAT_ENABLE_RHCOS=1.
# See docs/rhcos-openshift.md and docs/evidence-rhcos.md.
name: rhcos
profiles:
- id: rhcos-4.14-5.14
required: false
- id: rhcos-4.16-5.14
required: false
- id: rhcos-4.18-5.14
required: false
23 changes: 23 additions & 0 deletions vm/profiles/rhcos-4.14-5.14.yaml
@@ -0,0 +1,23 @@
# RHEL CoreOS (OpenShift 4.14) — runnable with an operator-supplied image.
#
# Part of the RHCOS evidence matrix (see docs/evidence-rhcos.md). Same Ignition
# boot path as Fedora CoreOS (internal/vm/ignition.go); off by default, enable
# with BPFCOMPAT_ENABLE_RHCOS=1 once the image is staged with
# make rhcos-image RHCOS_IMAGE_URL=.../rhcos-4.14...-qemu.x86_64.qcow2.gz \
# <or> RHCOS_IMAGE=/path (stage target: vm/cache/rhcos-4.14.qcow2)
# RHCOS for OpenShift 4.14 runs a RHEL 9.x kernel (5.14, heavily backported); the
# real booted kernel is captured at runtime in the report.
id: rhcos-4.14-5.14
distro: rhcos
version: "4.14"
kernel_family: "5.14"
arch: x86_64
image:
local_path: "vm/cache/rhcos-4.14.qcow2"
boot:
memory_mb: 2048
cpus: 2
validator:
path: "/usr/local/bin/bpfcompat-validator"
capabilities:
expected_btf: true
23 changes: 23 additions & 0 deletions vm/profiles/rhcos-4.18-5.14.yaml
@@ -0,0 +1,23 @@
# RHEL CoreOS (OpenShift 4.18) — runnable with an operator-supplied image.
#
# Part of the RHCOS evidence matrix (see docs/evidence-rhcos.md). Same Ignition
# boot path as Fedora CoreOS (internal/vm/ignition.go); off by default, enable
# with BPFCOMPAT_ENABLE_RHCOS=1 once the image is staged with
# make rhcos-image RHCOS_IMAGE_URL=.../rhcos-4.18...-qemu.x86_64.qcow2.gz \
# <or> RHCOS_IMAGE=/path (stage target: vm/cache/rhcos-4.18.qcow2)
# RHCOS for OpenShift 4.18 runs a RHEL 9.x kernel (5.14, heavily backported); the
# real booted kernel is captured at runtime in the report.
id: rhcos-4.18-5.14
distro: rhcos
version: "4.18"
kernel_family: "5.14"
arch: x86_64
image:
local_path: "vm/cache/rhcos-4.18.qcow2"
boot:
memory_mb: 2048
cpus: 2
validator:
path: "/usr/local/bin/bpfcompat-validator"
capabilities:
expected_btf: true