[REVIEW] segmentation: avoid AWS local-route false positives and add Kubernetes enforcement gates #1148

@NiXouuuu

Skill Being Reviewed

Skill name: segmentation
Skill path: skills/network/segmentation/

False Positive Analysis

Benign code that triggers a false positive:

```hcl
# AWS VPC subnets rely on the implicit local route for in-VPC reachability,
# while enforcement is done at the workload ENI through security groups.
resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = aws_vpc.prod.id
}

resource "aws_security_group" "db" {
  name   = "db"
  vpc_id = aws_vpc.prod.id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = []
  }
}
```

Why this is a false positive:

Step 2.2 treats a route with target = "local" between application and data tier as "BAD: Flat routing" and classifies "missing enforcement point between zones" as Critical. In AWS, the VPC local route is normal routing substrate, not proof of absent segmentation. AWS security groups are associated with resources and control the inbound/outbound traffic allowed to reach or leave those resources; with source security-group references, only app workloads can reach db on TCP/5432. There is no firewall ENI, but there is still a policy enforcement point at the workload attachment.

The current heuristic will over-report well-scoped cloud-native segmentation as Critical if it expects every zone boundary to be forced through an inline firewall target. NIST SP 800-207 is about policy enforcement for resource access; it does not require every PEP to be a routed firewall appliance. The skill should evaluate the effective path by combining route reachability with SG/NACL/Kubernetes/service-mesh policy, not classify local routing alone.
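A minimal sketch of that combined evaluation, written in Python for brevity. The data model (`SgRule`, `rule_allows`, `effective_path`) is hypothetical and not part of the skill; it only illustrates treating the local route as reachability and taking the verdict from the security-group rules on the destination.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SgRule:
    """One ingress rule on the destination security group (hypothetical model)."""
    protocol: str    # "tcp", "udp", or "-1" for all protocols
    from_port: int
    to_port: int
    source_sg: str   # name of the referenced source security group


def rule_allows(rule: SgRule, src_sg: str, protocol: str, port: int) -> bool:
    """True if this rule permits the flow from src_sg on protocol/port."""
    if rule.source_sg != src_sg:
        return False
    if rule.protocol == "-1":
        return True
    return rule.protocol == protocol and rule.from_port <= port <= rule.to_port


def effective_path(routed: bool, ingress: List[SgRule],
                   src_sg: str, protocol: str, port: int) -> str:
    """Combine route reachability with SG policy instead of judging the route alone."""
    if not routed:
        return "blocked: no route"
    if any(rule_allows(r, src_sg, protocol, port) for r in ingress):
        return "allowed: routed and permitted by SG"
    return "blocked: routed but denied by SG"


# Mirrors the Terraform example: only the "app" group may reach "db" on TCP/5432.
db_ingress = [SgRule("tcp", 5432, 5432, "app")]
print(effective_path(True, db_ingress, "app", "tcp", 5432))
print(effective_path(True, db_ingress, "web", "tcp", 5432))
```

In this model the local route only answers "is there a path"; a Critical finding would need the second question, "does any policy permit the flow", to come back unrestricted as well.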

Coverage Gaps

Missed variant 1: Kubernetes NetworkPolicy manifest present but not enforced

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: prod
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```

Why it should be caught:

The skill gives strong positive credit for default-deny NetworkPolicy presence, but Kubernetes makes enforcement dependent on the network plugin. The official Kubernetes docs state that a cluster must use a NetworkPolicy-capable network plugin, and creating a NetworkPolicy without an implementing controller has no effect. The review should require evidence of the CNI/plugin and enforcement mode (Calico/Cilium/Antrea/etc.), not just YAML presence.
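One way the enforcement gate could look, sketched in Python. Everything here is an assumption for illustration: the function name, the credit labels, and the plugin set, which simply repeats the plugins named above and is not an exhaustive or authoritative list.

```python
from typing import Optional

# Assumption: illustrative, non-exhaustive set taken from the plugins named in this review.
POLICY_CAPABLE_CNIS = {"calico", "cilium", "antrea"}


def default_deny_credit(has_default_deny: bool,
                        cni: Optional[str],
                        enforcement_verified: bool) -> str:
    """Grade a default-deny manifest by the enforcement evidence behind it."""
    if not has_default_deny:
        return "no-credit: manifest missing"
    if cni is None:
        return "no-credit: CNI unknown"
    if cni.lower() not in POLICY_CAPABLE_CNIS:
        return "no-credit: CNI not in the reviewed policy-capable set"
    if not enforcement_verified:
        return "partial: capable CNI, enforcement unverified"
    return "credit: enforced default-deny"


print(default_deny_credit(True, None, False))
print(default_deny_credit(True, "Cilium", False))
print(default_deny_credit(True, "calico", True))
```

The point of the intermediate "partial" grade is that naming a capable plugin is weaker evidence than a connectivity test showing the policy actually drops traffic.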

The same gap applies to hostNetwork pods and node traffic. Kubernetes documents that traffic to and from the node where a pod runs is always allowed, regardless of the pod or node IP address, and that NetworkPolicy behavior for hostNetwork pods is undefined; common implementations do not apply podSelector-based rules to them. A production cluster can therefore pass this skill with a default-deny manifest while privileged or daemon workloads bypass pod-level isolation.
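A hedged sketch of a hostNetwork inventory check, in Python over manifests that have already been parsed into dictionaries. The helper name `host_network_workloads` is hypothetical, and the sketch only looks at bare Pods and at workload kinds that carry a pod template under `spec.template`.

```python
from typing import Dict, List


def host_network_workloads(manifests: List[Dict]) -> List[str]:
    """List workloads whose pod spec sets hostNetwork: true."""
    flagged = []
    for doc in manifests:
        kind = doc.get("kind", "")
        spec = doc.get("spec", {})
        # Bare Pods carry the pod spec directly; Deployments, DaemonSets and
        # StatefulSets nest it under spec.template.spec.
        pod_spec = spec if kind == "Pod" else spec.get("template", {}).get("spec", {})
        if pod_spec.get("hostNetwork") is True:
            meta = doc.get("metadata", {})
            flagged.append(f"{kind}/{meta.get('namespace', 'default')}/{meta.get('name', '?')}")
    return flagged


manifests = [
    {"kind": "DaemonSet",
     "metadata": {"name": "node-agent", "namespace": "prod"},
     "spec": {"template": {"spec": {"hostNetwork": True}}}},
    {"kind": "Deployment",
     "metadata": {"name": "api", "namespace": "prod"},
     "spec": {"template": {"spec": {"containers": []}}}},
]
print(host_network_workloads(manifests))
```

Each flagged workload would need a separate control (node firewall rules, host-level policy in the CNI, or justification) before the namespace's default-deny can be counted as covering it.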

Missed variant 2: Additive allow policy silently defeats a default-deny baseline

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-prod-egress
  namespace: prod
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
```

Why it should be caught:

Kubernetes NetworkPolicies are additive: if any applicable policy allows traffic, that traffic is allowed. A repo can therefore contain a default-deny policy and still allow broad egress or cross-namespace traffic through another policy. The skill rates "absence of Kubernetes default-deny NetworkPolicy" as High, but it does not require checking the union of all policies selecting the same pods, nor both sides of a pod-to-pod connection (source egress and destination ingress). It therefore misses the effective-access question.
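The union check can be sketched as follows, in Python over already-parsed manifests. This is a deliberately reduced model and not a full NetworkPolicy evaluator: it assumes all policies live in the pod's namespace, handles only `matchLabels` selectors and `ipBlock` peers, and ignores ports. The helper names are hypothetical.

```python
import ipaddress
from typing import Dict, List


def selects(policy: Dict, pod_labels: Dict[str, str]) -> bool:
    """True if the policy's podSelector matches; only matchLabels is modeled."""
    wanted = policy["spec"].get("podSelector", {}).get("matchLabels", {})
    return all(pod_labels.get(k) == v for k, v in wanted.items())


def egress_allowed(policies: List[Dict], pod_labels: Dict[str, str], dest_ip: str) -> bool:
    """Evaluate the union of all egress policies selecting the pod (ipBlock peers only)."""
    applicable = [p for p in policies
                  if selects(p, pod_labels)
                  and "Egress" in p["spec"].get("policyTypes", [])]
    if not applicable:
        return True  # no egress policy selects the pod, so it is not isolated
    dest = ipaddress.ip_address(dest_ip)
    for policy in applicable:
        for rule in policy["spec"].get("egress", []):
            peers = rule.get("to", [])
            if not peers:
                return True  # a rule without "to" matches every destination
            for peer in peers:
                block = peer.get("ipBlock")
                if block and dest in ipaddress.ip_network(block["cidr"]):
                    excluded = [ipaddress.ip_network(c) for c in block.get("except", [])]
                    if not any(dest in net for net in excluded):
                        return True  # one allowing policy is enough
    return False


default_deny = {"spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]}}
allow_all = {"spec": {"podSelector": {}, "policyTypes": ["Egress"],
                      "egress": [{"to": [{"ipBlock": {"cidr": "0.0.0.0/0"}}]}]}}

print(egress_allowed([default_deny], {"app": "api"}, "8.8.8.8"))
print(egress_allowed([default_deny, allow_all], {"app": "api"}, "8.8.8.8"))
```

With only the default-deny policy the pod has no egress; adding the second manifest from this section reopens egress to any IPv4 destination even though the default-deny policy is still present in the repo.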

Edge Cases

  • AWS metadata and reserved service paths: AWS security groups do not filter traffic to several reserved destinations, including EC2 instance metadata, DNS, DHCP, ECS task metadata, time sync, and default-router reserved addresses. For segmentation reviews that claim egress isolation, the skill should explicitly check IMDSv2/session-token controls, container metadata exposure, and DNS resolver policy instead of assuming SG egress covers every path.
  • Inline-firewall requirement can be structurally wrong in managed cloud designs: Serverless, managed database, PrivateLink, service-mesh, identity-aware proxy, or SG-reference architectures may enforce access without a traditional DMZ/internal firewall hop. The skill should distinguish "no enforcement" from "non-firewall enforcement."
  • Active segmentation testing guidance is too broad for production: Step 6 suggests attempting to reach every zone and all ports. That is useful as a methodology but risky as a default instruction. The skill should require authorization, rate limits, maintenance windows, and safe probe lists, or recommend passive reachability analysis first.
  • Fail-open and propagation windows are not assessed: Kubernetes notes that a network plugin may take some time to handle a new NetworkPolicy, so enforcement can lag behind pod or policy creation. The skill should ask for rollout tests around pod startup, policy changes, and CNI failures, not just steady-state manifests.
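For the active-testing point above, a small Python sketch of what "authorization, rate limits, and safe probe lists" could mean as a precondition. The `ValidationRequest` record and `plan_probes` helper are hypothetical and send no traffic; they only decide which probes would be permitted.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

Target = Tuple[str, int]  # (host, port)


@dataclass
class ValidationRequest:
    """Hypothetical approval record for a segmentation test."""
    approved_by: Optional[str]        # named approver, or None if not authorized
    max_probes_per_minute: int        # agreed rate limit
    safe_targets: FrozenSet[Target]   # targets cleared for probing


def plan_probes(request: ValidationRequest, candidates: List[Target]) -> List[Target]:
    """Return one batch of probes that are authorized, in scope, and within the rate limit."""
    if not request.approved_by:
        raise PermissionError("no authorization on record; use passive analysis only")
    in_scope = [c for c in candidates if c in request.safe_targets]
    # Cap a single batch at the agreed per-minute limit.
    return in_scope[: request.max_probes_per_minute]


request = ValidationRequest("secops-lead", 2, frozenset({("10.0.2.10", 5432), ("10.0.2.11", 5432)}))
print(plan_probes(request, [("10.0.2.10", 5432), ("10.0.3.7", 22), ("10.0.2.11", 5432)]))
```

Anything outside the cleared list is dropped silently here; a real workflow would more likely report it back so the scope can be reviewed.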

Remediation Quality

  • Fix resolves the vulnerability
  • Fix doesn't introduce new security issues
  • Fix doesn't break functionality
  • Issues found: The skill's current remediation model is too appliance-centric. It will push some teams toward unnecessary firewall hairpinning instead of validating existing cloud-native PEPs. It should add an "effective reachability" layer before severity assignment:
    • Route reachability: VPC/subnet/routes/TGW/peering/PrivateLink.
    • Enforcement: SG/NACL/firewall/network policy/service mesh/identity-aware proxy.
    • Exceptions: AWS metadata/reserved services, Kubernetes node and hostNetwork traffic, service mesh sidecar bypass.
    • Validation: reachability analyzer, CNI conformance checks, controlled probe results, and policy union review.
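The four layers above can be sketched as a severity gate, in Python. The function name and the Medium/Low/Info labels are assumptions for illustration; only Critical and High are severities this review attributes to the skill.

```python
from typing import List


def assign_severity(routed: bool,
                    enforcement_points: List[str],
                    open_exceptions: List[str],
                    validated: bool) -> str:
    """Derive severity from effective reachability instead of route shape alone."""
    if not routed:
        return "Info"        # no path between the zones at all
    if not enforcement_points:
        return "Critical"    # reachable and nothing enforces policy on the path
    if open_exceptions:
        return "High"        # enforced, but known bypass paths are unaddressed
    if not validated:
        return "Medium"      # enforced on paper, no test or analyzer evidence
    return "Low"             # enforced, exceptions handled, and validated


# The AWS example: local route, SG enforcement at the ENI, metadata path not yet reviewed.
print(assign_severity(True, ["security-group"], ["instance-metadata"], False))
# Flat routing with no SG, NACL, or policy of any kind.
print(assign_severity(True, [], [], False))
```

Under this ordering the SG-reference design from the false-positive example would no longer be reported as Critical merely because its route target is local.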

Comparison to Other Tools

| Tool | Catches this? | Notes |
| --- | --- | --- |
| Semgrep IaC | Partial | Can flag obvious 0.0.0.0/0 or missing NetworkPolicy patterns, but usually does not compute effective multi-policy reachability. |
| CodeQL | No | Not the right tool for cloud/Kubernetes network-path semantics. |
| Checkov / tfsec | Partial | Useful for common IaC misconfigurations, but route + SG + NACL + service-mesh composition still needs manual/effective-path analysis. |
| AWS VPC Reachability Analyzer | Partial | Better for AWS path feasibility than static grep, but must be paired with policy review and workload identity context. |
| Cilium/Calico policy tooling | Partial | Stronger for Kubernetes enforcement and connectivity tests, but vendor-specific and still needs cross-policy/namespace review. |

Overall Assessment

Strengths:

  • Strong structure for zone mapping, trust-boundary documentation, DMZ review, PCI CDE scope, and micro-segmentation readiness.
  • Correctly calls out common segmentation misconceptions: VLAN-only boundaries, hub-and-spoke bypass paths, service-mesh bypass, and namespace-only Kubernetes "segmentation."
  • Good output format: zone map, trust matrix, findings, readiness score, and prioritized remediation are all practitioner-friendly.

Needs improvement:

  • The target = "local" route heuristic creates Critical false positives in AWS-style segmentation where SGs/NACLs enforce at the ENI/resource boundary.
  • Kubernetes review treats NetworkPolicy YAML presence as stronger evidence than it is; it does not require CNI enforcement proof, hostNetwork review, node-traffic exceptions, or additive policy union analysis.
  • The testing methodology is operationally risky if followed literally in production without authorization/safety constraints.
  • Severity should be based on effective reachability and enforcement evidence, not only on whether traffic traverses an inline firewall.

Priority recommendations:

  1. Replace route-only Critical findings with an effective-path matrix that combines routes, SG/NACL/firewall rules, Kubernetes policies, service mesh policy, and identity-aware controls.
  2. Add Kubernetes enforcement gates: evidence that the CNI/plugin enforces NetworkPolicy, the union of all ingress/egress policies selecting each pod, hostNetwork/node exceptions, and policy propagation/rollout behavior.
  3. Add cloud-provider exception checks for AWS metadata/reserved services and similar managed-service paths before claiming egress segmentation is complete.
  4. Rewrite Step 6 as "authorized segmentation validation" with passive reachability analysis first, scoped probes second, and explicit approval/rate-limit requirements.

Bounty Info

  • I have read and agree to the CONTRIBUTING.md bounty terms
  • Preferred payment method: PayPal
