fix: namespace exclude filtering #2118

Open
alexcastilio wants to merge 4 commits into main from fix/ns-filtering

@alexcastilio
Contributor

Description

Fix: Namespace Exclude Filtering in MetricsConfiguration CRD

Problem

The namespace exclude filtering feature in the MetricsConfiguration CRD is completely non-functional due to two separate bugs:

  1. appendExcludeList() is empty — The function body contains only a TODO comment and never populates the exclude map or manages the filtermanager. Setting spec.namespaces.exclude has no effect.

  2. updateNamespaceLists() has logic errors — Sequential if statements instead of if/else cause both include and exclude lists to be reset when only one should be active.

This means any cluster using spec.namespaces.exclude in a MetricsConfiguration CRD gets no namespace filtering at all — either no IPs are added to the filtermap (no metrics collected) or all IPs are added without filtering (eBPF map exhaustion).

Fix

Three changes in pkg/module/metrics/metrics_module.go:

  1. Implement appendExcludeList() — Mirrors the existing appendIncludeList() pattern: diffs old vs new exclude set, removes IPs for newly-excluded namespaces from filtermanager, adds IPs for newly-un-excluded namespaces. On initial activation of exclude mode, uses GetAllNamespaces() to add IPs for all non-excluded namespaces.

  2. Fix updateNamespaceLists() control flow — Replaced four sequential if blocks with a proper if/else if/else chain. Added return after the both-set error case to prevent state mutation. Clears the opposite mode before activating the new one to avoid filtermanager entry conflicts.

  3. Add doc comment to nsOfInterest() — Clarifies that when no namespace filters are configured, no namespace is considered of interest by default — pods must be individually annotated or already in the filtermap.
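
The control flow described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the code in metrics_module.go: the `nsFilter` type, the `update` method, and `errBothSet` are hypothetical names, and the behavior of the final branch (an empty spec clears both lists) is an assumption the description does not confirm.

```go
package main

import (
	"errors"
	"fmt"
)

// nsFilter is a hypothetical stand-in for the module's namespace filter state.
type nsFilter struct {
	included map[string]struct{}
	excluded map[string]struct{}
}

// errBothSet is a hypothetical error for a spec that sets both lists.
var errBothSet = errors.New("include and exclude cannot both be set")

// toSet converts a slice of namespace names into a set.
func toSet(names []string) map[string]struct{} {
	set := make(map[string]struct{}, len(names))
	for _, n := range names {
		set[n] = struct{}{}
	}
	return set
}

// update activates at most one mode per call.
func (f *nsFilter) update(include, exclude []string) error {
	if len(include) > 0 && len(exclude) > 0 {
		// Return before mutating any state.
		return errBothSet
	}
	if len(include) > 0 {
		f.excluded = toSet(nil) // clear the opposite mode first
		f.included = toSet(include)
	} else if len(exclude) > 0 {
		f.included = toSet(nil) // clear the opposite mode first
		f.excluded = toSet(exclude)
	} else {
		// Assumption: an empty spec clears both lists.
		f.included = toSet(nil)
		f.excluded = toSet(nil)
	}
	return nil
}

func main() {
	f := &nsFilter{included: toSet(nil), excluded: toSet(nil)}
	// Each line prints the returned error and the sizes of both sets.
	fmt.Println(f.update([]string{"a"}, nil), len(f.included), len(f.excluded))
	fmt.Println(f.update(nil, []string{"b"}), len(f.included), len(f.excluded))
	fmt.Println(f.update([]string{"a"}, []string{"b"}), len(f.included), len(f.excluded))
}
```

Because the both-set case returns early, the third call leaves the exclude-only state from the second call untouched.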

One change in pkg/controllers/cache/:

  1. Add GetAllNamespaces() to CacheInterface — Returns all unique namespaces that have at least one endpoint in the cache. Needed by appendExcludeList() to compute the set of non-excluded namespaces when exclude mode activates. Iterates epMap since nsMap is unused.
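
A rough sketch of how such a method could collect unique namespaces from an endpoint map is shown below. The `endpointCache` and `endpoint` types are hypothetical simplifications, the key format of `epMap` is assumed, and sorting the result is an addition for deterministic output that the description does not mention.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// endpoint is a hypothetical, reduced stand-in for a cached endpoint.
type endpoint struct {
	name      string
	namespace string
}

// endpointCache is a hypothetical stand-in for the real cache type.
type endpointCache struct {
	mu    sync.RWMutex
	epMap map[string]endpoint // assumed to be keyed by "namespace/name"
}

// GetAllNamespaces returns every namespace that has at least one endpoint.
func (c *endpointCache) GetAllNamespaces() []string {
	c.mu.RLock()
	defer c.mu.RUnlock()

	// Deduplicate namespaces while walking the endpoint map.
	seen := make(map[string]struct{})
	for _, ep := range c.epMap {
		seen[ep.namespace] = struct{}{}
	}

	out := make([]string, 0, len(seen))
	for ns := range seen {
		out = append(out, ns)
	}
	sort.Strings(out) // assumption: sorted only to make the output stable
	return out
}

func main() {
	c := &endpointCache{epMap: map[string]endpoint{
		"test-include/pod-1": {name: "pod-1", namespace: "test-include"},
		"test-include/pod-2": {name: "pod-2", namespace: "test-include"},
		"test-exclude/pod-3": {name: "pod-3", namespace: "test-exclude"},
	}}
	fmt.Println(c.GetAllNamespaces())
}
```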

Tests

Unit tests added covering:

  • TestNsOfInterest — Verifies correct behavior for include-only, exclude-only, and no-filter cases (6 subcases)
  • TestAppendExcludeList — Mirrors existing TestAppendIncludeList: add, change, remove exclude namespaces (5 subcases)
  • TestUpdateNamespaceListsExclude — Exclude-only spec populates excludedNamespaces and clears includedNamespaces
  • TestPodCallBackExclude — End-to-end: pod in non-excluded ns is tracked, pod in excluded ns is not
  • TestGetAllNamespaces — Cache returns unique namespace list from endpoint map

All existing tests pass (no regressions).
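
The include-only, exclude-only, and no-filter cases listed for TestNsOfInterest can be illustrated with a small table-driven check. This is a sketch under assumptions: the free function `nsOfInterest` below and its signature are hypothetical (the real one lives in metrics_module.go and may differ), and the six rows are illustrative, not the PR's actual subcases.

```go
package main

import "fmt"

// set is a small helper type for namespace sets.
type set map[string]struct{}

// nsOfInterest is a hypothetical reduction of the behavior described above:
// include mode admits only listed namespaces, exclude mode admits everything
// not listed, and with no filters nothing is of interest by default.
func nsOfInterest(ns string, included, excluded set) bool {
	if len(included) > 0 {
		_, ok := included[ns]
		return ok
	}
	if len(excluded) > 0 {
		_, ok := excluded[ns]
		return !ok
	}
	return false
}

func main() {
	cases := []struct {
		name     string
		included set
		excluded set
		ns       string
		want     bool
	}{
		{"include-only, listed", set{"a": {}}, nil, "a", true},
		{"include-only, unlisted", set{"a": {}}, nil, "b", false},
		{"exclude-only, listed", nil, set{"a": {}}, "a", false},
		{"exclude-only, unlisted", nil, set{"a": {}}, "b", true},
		{"no filter, any ns", nil, nil, "a", false},
		{"no filter, empty sets", set{}, set{}, "b", false},
	}
	for _, tc := range cases {
		got := nsOfInterest(tc.ns, tc.included, tc.excluded)
		fmt.Printf("%-24s got=%v want=%v\n", tc.name, got, tc.want)
	}
}
```

The same table would translate directly into subtests with the standard `testing` package and `t.Run`.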

Manual test

Scenario 1 — Namespace exclude filtering (MetricsConfiguration CRD mode)

Test:

# helm install with: enableAnnotations=false, enablePodLevel=true, operator.enabled=true

kubectl create namespace test-include
kubectl create namespace test-exclude
# Apply a MetricsConfiguration CRD with namespaces.exclude: [test-exclude]
# → Verify: CRD status is "Accepted"
kubectl run pod-include -n test-include --image=nginx
# → Verify: "Adding pod IP to ADD dirty pods cache" with test-include/pod-include in retina logs
kubectl run pod-exclude -n test-exclude --image=nginx
# → Verify: NO "Adding pod IP to ADD dirty pods cache" for test-exclude/pod-exclude
kubectl delete pod pod-include -n test-include
# → Verify: "Adding pod IP to DELETE dirty pods cache" with test-include/pod-include

Logs:

=== ADD phase (non-excluded namespace) ===
ts=2026-03-16T15:38:22.790Z level=info caller=metrics/metrics_module.go:529 msg="Adding pod IP to ADD dirty pods cache" pod name=test-include/pod-include

=== ADD phase (excluded namespace) ===
(no matching logs — correct if excluded)

=== DELETE phase ===
ts=2026-03-16T15:39:03.955Z level=info caller=metrics/metrics_module.go:532 msg="Adding pod IP to DELETE dirty pods cache" pod name=test-include/pod-include

Scenario 2 — Switch from exclude to include

Test:

# Update the MetricsConfiguration CRD from namespaces.exclude: [test-exclude] to namespaces.include: [test-include]
# → Verify: retina logs contain lines matching "Including namespaces|Appending namespaces|newly excluded|newly un-excluded"

Logs:

ts=2026-03-16T15:39:22.959Z level=info caller=metrics/metrics_module.go:184 msg="Including namespaces" namespaces=test-include
ts=2026-03-16T15:39:22.959Z level=info caller=metrics/metrics_module.go:413 msg="Appending namespaces to exclude list" namespaces=
ts=2026-03-16T15:39:22.959Z level=info caller=metrics/metrics_module.go:439 msg="Namespaces newly excluded" namespaces=
ts=2026-03-16T15:39:22.959Z level=info caller=metrics/metrics_module.go:440 msg="Namespaces newly un-excluded" namespaces=test-exclude
ts=2026-03-16T15:39:22.960Z level=info caller=metrics/metrics_module.go:348 msg="Appending namespaces to include list" namespaces=test-include
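
The "newly excluded" and "newly un-excluded" values in these logs are consistent with a plain set difference between the old and new exclude lists. A minimal sketch of that diff, assuming a hypothetical helper `diffNamespaces` that is not taken from the PR:

```go
package main

import (
	"fmt"
	"sort"
)

// diffNamespaces returns the namespaces present only in next (added)
// and only in prev (removed), each sorted for stable output.
func diffNamespaces(prev, next map[string]struct{}) (added, removed []string) {
	for ns := range next {
		if _, ok := prev[ns]; !ok {
			added = append(added, ns)
		}
	}
	for ns := range prev {
		if _, ok := next[ns]; !ok {
			removed = append(removed, ns)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	// Scenario 2: the exclude list goes from [test-exclude] to empty.
	prev := map[string]struct{}{"test-exclude": {}}
	next := map[string]struct{}{}

	newlyExcluded, newlyUnExcluded := diffNamespaces(prev, next)
	fmt.Println("newly excluded:", newlyExcluded)
	fmt.Println("newly un-excluded:", newlyUnExcluded)
}
```

With these inputs nothing is newly excluded and test-exclude is newly un-excluded, matching the two log lines above.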

Related Issue

Fixes #2085

Checklist

  • I have read the contributing documentation.
  • I signed and signed-off the commits (git commit -S -s ...). See this documentation on signing commits.
  • I have correctly attributed the author(s) of the code.
  • I have tested the changes locally.
  • I have followed the project's style guidelines.
  • I have updated the documentation, if necessary.
  • I have added tests, if applicable.

Signed-off-by: Alex Castilio dos Santos <alexsantos@microsoft.com>
@alexcastilio requested a review from a team as a code owner on March 16, 2026, 16:05
@github-actions

Retina Code Coverage Report

Total coverage increased from 33.3% to 33.5%

Increased diff

Impacted Files                          Coverage
pkg/module/metrics/metrics_module.go    64.59% ... 83.38% (18.79%) ⬆️
pkg/controllers/cache/cache.go          65.35% ... 85.4% (20.05%) ⬆️

Development

Successfully merging this pull request may close these issues.

Pod IP Deletion Leak in eBPF Filter and Namespace Filtering Issues in MetricConfiguration CRD