fix(metrics): track unique cardinality label tuples by ramnnn2006 · Pull Request #210 · optiqor/kerno

ramnnn2006 · 2026-06-09T12:55:43Z

What

this PR fixes a bug in cardinalityOK where seen[metric] was incremented on every event instead of tracking distinct label combinations. it also refactors the underlying data structure and updates the test suite to cover the correct semantics.

Why

Fixes #117

cardinalityOK was acting as a raw event counter, not a cardinality tracker. one busy (syscall, comm) pair could exhaust the 5000 limit in under a second and permanently silence all observations for that category, including series that were already being tracked fine. the bug meant the cap was effectively unreachable in any real workload.

How

the refactor changes seen from map[string]int to map[string]map[string]struct{} so each metric key maps to a bounded set of distinct label tuples instead of an event count. added a labelKey param to cardinalityOK and updated all 5 callers to compute their label values before the guard and pass them joined with \x00:

recordSyscall passes sc + "\x00" + comm
recordTCP passes src + "\x00" + dst + "\x00" + comm
recordDiskIO passes dev + "\x00" + op
recordSchedDelay and recordFD pass comm

known tuples always pass through even when the set is full. only genuinely new combinations get dropped once the cap is hit.

Testing

rewrote TestCardinalityLimit to test the actual cardinality verified event counting, which is the buggy behavior):

N unique keys fill the bucket, last call returns true
N+1 unique keys, last one is correctly rejected
known key passes even when the bucket is full (the exact cas
saturating one metric has no effect on a separate metric
go build ./... passes (GOOS=linux for eBPF-dependent packages)
go test ./... passes
go vet ./... passes (non-eBPF packages)
golangci-lint run ./... passes
Tested locally
N/A — pure refactor/bug fix, no eBPF programs touched
sudo ./bin/bpf-verify --read 5s confirms 6/6 programs
./scripts/verify.sh passes

Checklist

PR title follows Conventional Commits (fix(metrics): track unique label tuples in cardinalityOK)
all commits are DCO-signed (git commit -s)
no unrelated changes pulled in
documentation updated where user-visible behavior changehanged)
added/updated tests for new code paths
if a new doctor rule, paired with a chaos scenario in `sable)

cardinalityOK was incrementing seen[metric] on every event, so a single busy (syscall, comm) pair could exhaust the 5000 limit in under a second and permanently silence all observations for that category, including already-tracked series. fix: change seen from map[string]int to map[string]map[string]struct{} and add a labelKey param to cardinalityOK. each caller now passes the actual label tuple (joined with \x00) so only genuinely new combinations count toward the cap. known tuples always pass through even when full. updated TestCardinalityLimit to test new semantics: unique keys fill the bucket, known keys still pass when full, new keys are rejected, different metrics stay isolated. Signed-off-by: ramnnn2006 <phrkvvp@gmail.com>

github-actions · 2026-06-09T12:55:52Z

🚀 First PR — welcome aboard!

A few things to expect:

CI: every PR runs build + race tests + lint + (eventually) the kernel matrix. If something fails, the log will tell you exactly which gate.
DCO: every commit needs Signed-off-by: — git commit -s adds it automatically.
Conventional Commits: PR titles like feat(doctor): add new rule or fix(bpf): handle X. We squash-merge by default.
Review: a maintainer will review within 72 hours. Suggestions are conversations, not orders — push back if something doesn't fit your context.

If you get stuck, reply here or jump to Discussions. We want this PR to land.

ramnnn2006 · 2026-06-09T13:10:20Z

hi @btwshivam , could you review one
i think this PR fits type:bug , type:refactor , type:testing and quality:clean labels here since i feel it fixes a real bug, changes the data structure, and updates the tests.
also gssoc:approved is required for the contribution to count.

happy to keep working on more issues in the repo, thanks!

ramnnn2006 requested a review from btwshivam as a code owner June 9, 2026 12:55

github-actions Bot added level:intermediate 50-200 lines or 3-5 files (auto-applied) testing Tests and test coverage labels Jun 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(metrics): track unique cardinality label tuples#210

fix(metrics): track unique cardinality label tuples#210
ramnnn2006 wants to merge 1 commit into
optiqor:mainfrom
ramnnn2006:fix/metrics-cardinality-label-tuples

ramnnn2006 commented Jun 9, 2026

Uh oh!

github-actions Bot commented Jun 9, 2026

Uh oh!

ramnnn2006 commented Jun 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ramnnn2006 commented Jun 9, 2026

What

Why

How

Testing

Checklist

Uh oh!

github-actions Bot commented Jun 9, 2026

Uh oh!

ramnnn2006 commented Jun 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant