From 80e4eeca50cb612d58cbf1922d369908bcfc966b Mon Sep 17 00:00:00 2001
From: ErenAri
Date: Sun, 21 Jun 2026 14:15:49 +0300
Subject: [PATCH] Prep v0.1.6 release + capture bpfman pre-flight design
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- CHANGELOG: [0.1.6] section (enterprise 14/14 coverage, SLSA provenance +
  attestations, stability gate, health watchdog, demo UI overhaul, docs).
- Bump action ref examples v0.1.5 -> v0.1.6 (README, quickstart, Falco case study).
- docs/design-bpfman-preflight.md: the compatibility-gate-in-front-of-bpfman
  design (Phase 2; complements bpfman, no rival loader / no BPF-running SaaS).

Tagging v0.1.6 after merge triggers release-artifacts.yml (SBOM + cosign + SLSA
build-provenance attestations) — which only fire on a v* tag.

Co-Authored-By: Claude Opus 4.8
---
 CHANGELOG.md                        | 32 +++++++++++
 README.md                           |  4 +-
 docs/case-study-falco-modern-bpf.md |  2 +-
 docs/design-bpfman-preflight.md     | 88 +++++++++++++++++++++++++++++
 docs/quickstart.md                  |  4 +-
 5 files changed, 125 insertions(+), 5 deletions(-)
 create mode 100644 docs/design-bpfman-preflight.md

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9a978f1..8856bd0 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -46,6 +46,38 @@ adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html) once a
   correct on runners without hardware virtualization (just slower). Hosted-KVM
   runners keep `-enable-kvm -cpu host`.
 
+## [0.1.6] - 2026-06-21
+
+### Added
+- **Enterprise & backported-kernel coverage (14/14 proven).** New AlmaLinux 8 /
+  Rocky 8 (4.18) profiles plus a real reference run validating
+  `load_attach` across the RHEL 8/9/10 ABI (AlmaLinux/Rocky/CentOS-Stream),
+  Oracle UEK 7/8, Amazon Linux 2 (5.10 and the no-BTF 4.14) and 2023, and
+  openSUSE Leap — documented in `docs/case-study-enterprise-kernels.md`.
+- **SLSA Build L3 build-provenance + SBOM attestations** on tag releases
+  (keyless OIDC via Sigstore/Rekor), with a verification guide
+  (`docs/verifying-releases.md`).
+- **Weekly stability gate** (`.github/workflows/stability-gate.yml`) producing an
+  archived READY/NOT-READY readiness report.
+- **Self-hosted health watchdog** for the demo (`scripts/healthcheck.sh` +
+  `packaging/systemd/bpfcompat-healthcheck.{service,timer}`) and a runbook
+  monitoring section.
+- **Docs:** evidence-schema reference (`docs/evidence-schema.md`), self-hosted-first
+  quickstart + trust model (`docs/quickstart.md`), and the Falco modern_bpf
+  reference matrix (`docs/case-study-falco-modern-bpf.md`).
+- **Demo UI:** Carbon design tokens matching the marketing site, light/dark
+  toggle, example matrix on load, live "watch it boot" matrix, shareable compat
+  badge (`/badge/.svg`) + OG social cards on `/results`, one-click live
+  example, and an in-page How-it-works/FAQ/source footer.
+
+### Fixed
+- EL/Amazon/SUSE guests now seed cloud-init via a CIDATA ConfigDrive ISO instead
+  of the SMBIOS-net seed their cloud-init ignores (fixes EL8 boot/SSH); bootstrap
+  installs `cloud-image-utils`. Enabled Amazon Linux 2 (4.14) validation.
+- Redacted absolute host audit paths (`trace_path`, `event_stream_path`) from
+  public runtime decision/select/fetch responses.
+- Validate UI blocks submitting with no artifact instead of a raw 400.
+
 ## [0.1.5] - 2026-06-11
 
 ### Fixed
diff --git a/README.md b/README.md
index 16000ee..0069330 100644
--- a/README.md
+++ b/README.md
@@ -316,7 +316,7 @@ or the Firecracker lane. See
 Suite mode (recommended — gates the whole collection):
 
 ```yaml
-- uses: Kernel-Guard/bpfcompat@v0.1.5
+- uses: Kernel-Guard/bpfcompat@v0.1.6
   with:
     suite: suites/project.yaml
     suite-out: reports/suite.json
@@ -330,7 +330,7 @@ are alive and adds the result to the suite-level collection matrix.
 Single artifact:
 
 ```yaml
-- uses: Kernel-Guard/bpfcompat@v0.1.5
+- uses: Kernel-Guard/bpfcompat@v0.1.6
   with:
     artifact: path/to/program.bpf.o
     manifest: path/to/manifest.yaml
diff --git a/docs/case-study-falco-modern-bpf.md b/docs/case-study-falco-modern-bpf.md
index 02954ff..a0c9139 100644
--- a/docs/case-study-falco-modern-bpf.md
+++ b/docs/case-study-falco-modern-bpf.md
@@ -49,7 +49,7 @@ That is the difference between "❌ it broke" and "❌ ring buffer isn't availab
 
 ```bash
 # In CI (GitHub Action), against your kernel matrix:
-- uses: Kernel-Guard/bpfcompat@v0.1.5
+- uses: Kernel-Guard/bpfcompat@v0.1.6
   with:
     artifact: build/bpf_probe.o
     matrix: matrices/mvp.yaml
diff --git a/docs/design-bpfman-preflight.md b/docs/design-bpfman-preflight.md
new file mode 100644
index 0000000..a7149ce
--- /dev/null
+++ b/docs/design-bpfman-preflight.md
@@ -0,0 +1,88 @@
+# Design sketch: bpfman pre-flight integration
+
+> Status: **design only / Phase 2.** Not built. Pursue after a first external
+> adopter; captured here so the direction isn't lost.
+
+## Principle
+
+bpfcompat decides **whether/which** an eBPF program will load; bpfman does the
+privileged **execute**. bpfcompat never loads BPF in the cluster — it only reads
+node metadata and runs its own disposable VMs on the customer's CI/runner. This
+keeps bpfcompat on its defensible wedge (pre-deployment compatibility evidence)
+with **no new kernel-exploit surface** — unlike building a rival loader or a
+multi-tenant BPF-running SaaS.
+
+## The gap it fills
+
+[bpfman](https://bpfman.io/) (CNCF; shipped by Red Hat as the OpenShift "eBPF
+Manager Operator") deploys an eBPF program from an OCI bytecode image to nodes via
+a `nodeSelector`. But a node's `status.nodeInfo.kernelVersion` is the *exact
+backported* kernel (e.g. `4.18.0-553.el8_10`), and backported enterprise kernels
+do not reveal feature support from the version. So bpfman can roll a program to
+nodes where it silently fails to load. bpfcompat answers, before the rollout:
+"loads on these node kernels, fails on those, here's the classified reason."
+
+## Data flow
+
+```
+bpfman program CR ──▶ OCI bytecode image + program/attach type ──┐
+K8s Node objects ──▶ fleet matrix {distro, kernelVersion, arch} ─┼─▶ bpfcompat run
+                         └─ map each node kernel → profile ──────┘   (VMs on CI /
+                                                                      self-hosted)
+                                                                          │
+                         classified pass/fail matrix per node kernel      │
+                         ┌── PASS on required kernels ─▶ allow rollout ───┘
+                         └── FAIL ─▶ gate / annotate (which nodes, why)
+                                     bpfman (privileged) still does the load
+```
+
+Inputs and their sources:
+1. **Artifact** — the program CR already references bytecode as an OCI image;
+   bpfcompat pulls the same image (registry-artifact support already exists).
+2. **Target kernels** — read-only K8s API: `Node.status.nodeInfo` →
+   `kernelVersion`, `osImage`, `architecture` (the "fleet-aware matrix").
+3. **Profiles** — map each distinct node kernel to a curated bpfcompat profile.
+   This is where the enterprise catalog pays off — real clusters run
+   RHEL/Amazon/Oracle nodes, now proven (`docs/case-study-enterprise-kernels.md`).
+
+## Two shapes
+
+**Shape A — CI / GitOps pre-flight (MVP, build first).** In the pipeline that
+publishes the bytecode image + bpfman CR, run the bpfcompat GitHub Action against
+the fleet matrix; block/annotate the PR if the program won't load on required node
+kernels. Zero cluster privilege, no new infra.
+
+**Shape B — admission / controller pre-flight (later).** A validating admission
+webhook intercepts a bpfman program CR before apply. Because VM validation takes
+minutes, it **cannot** run inside admission — a controller/CronJob precomputes a
+compatibility cache (image × fleet-kernel → verdict) asynchronously; the webhook
+just looks it up and denies/warns. That cache + controller is the part that starts
+to look like a real product surface.
+
+## What to build (later, if pursued)
+- `bpfcompat fleet-matrix --from-kube` — read Node objects → emit a matrix YAML,
+  mapping `(distro, kernelVersion, arch)` to a profile; flag kernels with no
+  curated profile as **"uncovered"** (honest; a catalog-growth signal).
+- `adapters/bpfman/` — given a bpfman program CR, extract OCI image +
+  program/attach type → bpfcompat artifact + manifest.
+- Documented "gate your bpfman CRs in CI" recipe (Shape A).
+- (Shape B) the evidence-cache controller + admission webhook.
+
+## Trust & security posture
+- bpfcompat never loads BPF in-cluster or on shared infra; only reads Node
+  metadata + runs disposable VMs on the customer's CI/runner. bpfman keeps the
+  privileged load. No untrusted-multi-tenant execution, no new exploit surface.
+- Read-only K8s RBAC (get/list Nodes + program CRs).
+
+## Why this shape is right
+- **Complements** bpfman (CNCF/Red Hat) instead of competing → a credible OSS
+  integration target (joins Falco/Tracee/Aya on the adoption list).
+- **Leverages the enterprise catalog moat** exactly where CO-RE/bpfman give the
+  least guarantee — backported RHEL/Amazon fleets.
+- Keeps bpfcompat on its defensible pre-deployment wedge.
+
+## Caveats
+- Phase 2; pursue only after OSS adoption signal.
+- Shape B's value depends on the async evidence cache (admission can't boot VMs).
+- Coverage is bounded by the curated catalog — "uncovered" must be reported, not
+  hidden (and it drives the paid-catalog loop).
diff --git a/docs/quickstart.md b/docs/quickstart.md
index 13fb804..7adbfd5 100644
--- a/docs/quickstart.md
+++ b/docs/quickstart.md
@@ -35,7 +35,7 @@ jobs:
     runs-on: ubuntu-latest # exposes /dev/kvm for KVM acceleration
     steps:
       - uses: actions/checkout@v4
-      - uses: Kernel-Guard/bpfcompat@v0.1.5
+      - uses: Kernel-Guard/bpfcompat@v0.1.6
         with:
           artifact: build/program.bpf.o # your compiled object
           matrix: matrices/mvp.yaml # the kernels you support
@@ -57,7 +57,7 @@ What you get:
 Shipping a whole product? Use **suite mode** to gate a collection in one run:
 
 ```yaml
-      - uses: Kernel-Guard/bpfcompat@v0.1.5
+      - uses: Kernel-Guard/bpfcompat@v0.1.6
         with:
           suite: suites/project.yaml
           suite-out: reports/suite.json