test(helm): add k3s integration test for chart deploy #160

Open
Hanu-man12 wants to merge 4 commits into optiqor:main from Hanu-man12:feat/helm-k3s-integration-test

Conversation

@Hanu-man12

What

Adds an end-to-end integration test that boots a real k3s cluster via
testcontainers, installs the kerno Helm chart, and asserts the DaemonSet
rolls out and /healthz + /readyz endpoints return HTTP 200.

Why

Fixes #149

How

  • Uses testcontainers-go/modules/k3s to spin up a k3s cluster in Docker
  • Installs the chart via helm binary with CI-safe overrides (busybox image, non-privileged securityContext)
  • Asserts DaemonSet DesiredNumberScheduled > 0 and NumberReady == Desired
  • Asserts /healthz and /readyz return HTTP 200 via client-go service lookup
  • Dumps pod events on rollout failure for easier debugging
  • Test is gated behind -tags integration so it never runs in the normal unit-test pass
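The two DaemonSet conditions in the list above reduce to a small predicate. The sketch below assumes a local `daemonSetStatus` struct as a stand-in for the two counters on `appsv1.DaemonSetStatus`; `checkRollout` is a hypothetical helper name, not something taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// daemonSetStatus is a local stand-in carrying the two counters the PR
// asserts on; a real test would read them from appsv1.DaemonSetStatus.
type daemonSetStatus struct {
	DesiredNumberScheduled int32
	NumberReady            int32
}

// checkRollout returns nil only when at least one pod is desired and every
// desired pod is ready.
func checkRollout(s daemonSetStatus) error {
	if s.DesiredNumberScheduled <= 0 {
		return errors.New("no pods scheduled: DesiredNumberScheduled is 0")
	}
	if s.NumberReady != s.DesiredNumberScheduled {
		return fmt.Errorf("rollout incomplete: %d/%d pods ready",
			s.NumberReady, s.DesiredNumberScheduled)
	}
	return nil
}

func main() {
	for _, s := range []daemonSetStatus{{0, 0}, {1, 0}, {1, 1}} {
		fmt.Printf("%+v -> %v\n", s, checkRollout(s))
	}
}
```

The separate "greater than zero" check matters because a DaemonSet that matches no nodes reports 0 desired and 0 ready, so the equality alone would pass vacuously.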

Testing

  • go build ./... passes
  • go test ./... passes
  • go vet ./... passes
  • golangci-lint run ./... passes
  • Integration test: go test -v -tags integration -timeout 10m ./tests/integration/
  • N/A — pure docs/refactor
  • sudo ./bin/bpf-verify --read 5s confirms 6/6 programs still load
  • ./scripts/verify.sh passes

Checklist

  • PR title follows Conventional Commits (test(helm): subject)
  • All commits are DCO-signed (git commit -s)
  • No unrelated changes pulled in
  • Added/updated tests for new code paths
  • Documentation updated where user-visible behavior changed
  • If a new doctor rule, paired with a chaos scenario in scripts/verify.sh

@Hanu-man12 Hanu-man12 requested a review from btwshivam as a code owner May 31, 2026 14:22
@github-actions

🚀 First PR — welcome aboard!

A few things to expect:

  1. CI: every PR runs build + race tests + lint + (eventually) the kernel matrix. If something fails, the log will tell you exactly which gate.
  2. DCO: every commit needs a Signed-off-by: trailer; git commit -s adds it automatically.
  3. Conventional Commits: PR titles like feat(doctor): add new rule or fix(bpf): handle X. We squash-merge by default.
  4. Review: a maintainer will review within 72 hours. Suggestions are conversations, not orders — push back if something doesn't fit your context.

If you get stuck, reply here or jump to Discussions. We want this PR to land.

@github-actions github-actions Bot added the level:advanced (200+ lines or 6+ files, auto-applied), testing (Tests and test coverage), and area/bpf (eBPF programs and loaders) labels May 31, 2026
Add two new Prometheus metrics for per-program eBPF load tracking:
- kerno_bpf_program_loaded{program} gauge: 1 if loaded, 0 if failed
- kerno_bpf_program_load_errors_total{program,reason} counter

Wire metrics into buildCollectors in doctor.go so each program
reports its load result individually.

Also fix gen_stub.go build tag to use !ebpf instead of architecture
exclusions so stub types compile correctly on amd64/arm64.

Fixes optiqor#25

Signed-off-by: Hanu-man12 <shawharshit116@gmail.com>

- Import internal/metrics package in doctor.go
- Set BPFProgramLoaded{program}=0 and increment
  BPFProgramLoadErrorsTotal{program,reason} on load failure
- Set BPFProgramLoaded{program}=1 on successful load
- Also emit metrics on collector registration failure

Closes optiqor#25

Signed-off-by: Hanu-man12 <shawharshit116@gmail.com>

Re-save doctor.go as UTF-8; replace 11 ??? artifacts (10 em-dashes
and 1 checkmark in --quiet mode output) with the original characters.
Corruption was introduced when the file was re-saved through a
non-UTF-8 codepage during CRLF normalisation.

Also add unit tests for BPFProgramLoaded GaugeVec and
BPFProgramLoadErrorsTotal CounterVec to verify correct registration,
label cardinality, and value semantics.

Signed-off-by: Hanu-man12 <shawharshit116@gmail.com>

Add an end-to-end integration test that:
- Boots a k3s cluster via testcontainers-go (rancher/k3s:v1.31.4-k3s1)
- Installs the kerno Helm chart using the helm binary
- Asserts the DaemonSet rolls out (DesiredNumberScheduled > 0 and
  NumberReady == DesiredNumberScheduled)
- Asserts /healthz and /readyz return HTTP 200

CI overrides (busybox image + non-privileged securityContext) allow
the test to run without a real eBPF-capable kernel.

The test is gated behind -short skip and a helm-binary check so it
does not block unit-test runs in environments without Docker or helm.

Closes optiqor#149

Signed-off-by: Hanu-man12 <shawharshit116@gmail.com>
@Hanu-man12 Hanu-man12 force-pushed the feat/helm-k3s-integration-test branch from 320bb63 to 0c8fe6c Compare May 31, 2026 14:30
@github-actions github-actions Bot removed the area/bpf eBPF programs and loaders label May 31, 2026
@Hanu-man12

Author

Hi @btwshivam 👋

The Lint and Test CI jobs are failing due to a pre-existing Go version mismatch — not caused by this PR.

Root cause:
ci.yml has GO_VERSION: "1.25" but the repo's existing transitive deps
(testcontainers-go, golang.org/x/*, otel packages) require go >= 1.25.0,
so go mod tidy automatically bumps go.mod to 1.26.0.

All checks pass locally:

  • go build ./...
  • go test ./... ✅ (all 13 packages)
  • go vet ./...

One-line fix needed in ci.yml:

GO_VERSION: "1.26"  # was "1.25"

@btwshivam btwshivam left a comment

Member

the integration test is missing its build tag so it lands in the default gate (that's the red ci), plus unrelated bpf metrics and a go 1.26 bump rode in. add the tag and split the scope.

// go test -v -tags integration -timeout 10m ./tests/integration/
//
// Requires Docker (or a compatible OCI runtime) and the `helm` binary on PATH.
package integration

Member

the package doc says run with -tags integration, but there's no //go:build integration constraint, so this compiles in the default go test ./... gate and pulls the k3s/client-go tree into the unit run. that's the red Lint and Test. add the build tag (see #164 for the pattern).
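
To illustrate the point of this comment: the constraint has to be the first line of the file, followed by a blank line, before the package doc. The sketch below evaluates such a header with the standard `go/build/constraint` package to show when the file would be compiled. The `header` layout and the `included` helper are hypothetical, not copied from the PR or from #164.

```go
package main

import (
	"fmt"
	"go/build/constraint"
	"strings"
)

// header is a hypothetical top-of-file layout for the integration test:
// the constraint line, a blank line, then the package doc and clause.
const header = `//go:build integration

// Requires Docker (or a compatible OCI runtime) and the helm binary on PATH.
package integration
`

// included reports whether a file whose first line is a //go:build
// constraint would be compiled when the given build tags are set.
func included(src string, tags ...string) (bool, error) {
	first := strings.SplitN(src, "\n", 2)[0]
	expr, err := constraint.Parse(first)
	if err != nil {
		return false, err
	}
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	return expr.Eval(func(tag string) bool { return set[tag] }), nil
}

func main() {
	def, _ := included(header)
	tagged, _ := included(header, "integration")
	fmt.Println("compiled by go test ./...:", def)
	fmt.Println("compiled by go test -tags integration ./...:", tagged)
}
```

With the constraint in place, the default run skips the file entirely, so its k3s and client-go imports never reach the unit-test build.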

Comment thread internal/cli/doctor.go
logger.Debug("failed to load eBPF program; collector disabled",
"program", r.name, "error", err)
// Emit per-program load failure metrics.
metrics.BPFProgramLoaded.WithLabelValues(r.name).Set(0)

Member

per-program bpf load metrics are a separate feature from a helm k3s test (this is the #25 work). scope creep, split it out.

Comment thread go.mod
module github.com/optiqor/kerno

-go 1.25.4
+go 1.26.0

Member

don't bump the go directive to 1.26 as a side effect of adding a test, ci runs 1.25 and nothing here needs 1.26.
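
For context on why the bump breaks CI: since Go 1.21 the go directive in go.mod is a hard minimum, so a toolchain older than the directive refuses to build the module unless it is allowed to fetch a newer toolchain. A minimal sketch of that comparison using the standard `go/version` package follows; it assumes the CI toolchain is a 1.25.x release, and `canBuild` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"go/version"
)

// canBuild reports whether a toolchain satisfies the go directive declared
// in a module's go.mod. Both arguments use the "go1.N[.P]" form.
func canBuild(toolchain, directive string) bool {
	return version.Compare(toolchain, directive) >= 0
}

func main() {
	// Assumed CI toolchain; any 1.25.x release gives the same result.
	const toolchain = "go1.25.4"
	for _, directive := range []string{"go1.25.4", "go1.26.0"} {
		fmt.Printf("toolchain %s, directive %s: builds=%t\n",
			toolchain, directive, canBuild(toolchain, directive))
	}
}
```

Keeping the directive at 1.25.x leaves the existing CI toolchain sufficient, which is what the comment asks for.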

@Hanu-man12

Author

Hi @btwshivam — I looked into this. The go 1.26.0 bump is a side effect
of testcontainers-go v0.42.0 (and its transitive deps like golang.org/x/*,
otel) declaring go 1.25 in their own go.mod. Go toolchain automatically
promotes the go directive to 1.26.0 when any transitive dep requires >= 1.25.

Two options:

  1. Pin testcontainers-go to an older version (e.g. v0.35.0) that declares go 1.23
  2. Accept go 1.26.0 in go.mod and bump GO_VERSION in ci.yml

Which approach do you prefer?

Labels

level:advanced 200+ lines or 6+ files (auto-applied) testing Tests and test coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test(helm): integration-test chart deploy on k3s

2 participants