Add `CGROUP_SOCK_ADDR` Set Support by yaakov-stein · Pull Request #493 · facebook/bpfilter

yaakov-stein · 2026-03-31T03:48:17Z

Summary

Before this change, sets (hash maps and LPM tries used for bulk matching) were only supported on packet-based hooks. This meant users who wanted to filter against a list of IPs or ports on a CONNECT/SENDMSG hook had to create one rule per value, which scales poorly.

This change adds set support for all four CGROUP_SOCK_ADDR hooks, reusing the same BPF map infrastructure (hash maps for exact match, LPM tries for prefix match) that packet hooks already use.

Refactoring: make set codegen composable/generic

Set codegen previously lived entirely in set.c and was tightly coupled to packet-based header loading. To reuse it for CGROUP_SOCK_ADDR hooks, we extracted the generic pieces (copy field to scratch, map/trie lookup) into shared helpers and moved the packet-specific logic into packet.c. Protocol checks and layer conflict detection were also hoisted out of set codegen into program.c's existing dedup loop, so they now cover set components the same way they cover individual matchers.

As a drive-by, set components are now validated individually against the hook's supported matchers.

`CGROUP_SOCK_ADDR` set codegen

With the shared helpers in place, CGROUP_SOCK_ADDR set codegen maps each key component to its bpf_sock_addr context field offset, copies the field to the scratch buffer, and calls the shared map lookup. No header parsing needed — r6 already points to the context.

Why can't we use the current set codegen?

Packet-based hooks receive a pointer to raw packet memory. The BPF program parses headers layer by layer (L2 -> L3 -> L4), and r6 points to the current header being inspected.

CGROUP_SOCK_ADDR hooks receive a bpf_sock_addr context struct instead. The kernel pre-extracts socket metadata into typed fields:

struct bpf_sock_addr {
    __u32 user_ip4;        // destination IPv4
    __u32 user_ip6[4];     // destination IPv6
    __u32 user_port;       // destination port (network order)
    __u32 msg_src_ip4;     // source IPv4  (sendmsg only)
    __u32 msg_src_ip6[4];  // source IPv6  (sendmsg only)
    ...
};

This difference has two consequences for set codegen:

No header parsing needed. r6 already points to the context. Instead of loading a header pointer and extracting fields from parsed packet data, we read directly from context field offsets (r6 + offsetof(bpf_sock_addr, field)).
BPF verifier constraints on access width. Packet memory can be read at any width. Context struct fields must be read at widths that respect field alignment. The verifier rejects misaligned reads from context pointers. The new bf_stub_load function handles this by picking the largest safe access width based on both source and destination alignment.

The diagram below shows the two paths:

Packet hooks (XDP/TC/NF/cgroup_skb):
  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
  │ Parse L2/L3  │────>│ Load header  │────>│ Copy field   │───> map lookup
  │ (dynptr)     │     │ ptr into r6  │     │ to scratch   │
  └──────────────┘     └──────────────┘     └──────────────┘

CGROUP_SOCK_ADDR:
  ┌──────────────────┐     ┌──────────────┐
  │ r6 = ctx (set    │────>│ Copy field   │───> map lookup
  │ once in prologue)│     │ to scratch   │
  └──────────────────┘     └──────────────┘

Test plan

Unit tests: validation of set component hook compatibility and cross-layer conflict detection
E2e tests: each CGROUP_SOCK_ADDR hook has dry-run parsing tests and behavioral tests covering hash sets, trie sets, and multi-component sets

github-actions · 2026-03-31T03:48:28Z

Claude review of PR #493 (`5ab9f35`)

Previous must-fix items (NULL pointer dereferences in chain.c, program.c, cgroup_sock_addr.c) have been addressed in the latest push. The bf_stub_load refactoring also corrects a pre-existing alignment check issue in the copy loop (old code checked only the initial offset, not the advancing offset). Existing inline comments from prior runs cover the remaining open suggestions and nits.

Suggestions

Redundant header loads in packet set loop — src/libbpfilter/cgen/matcher/packet.c:340 — bf_stub_load_header is called per-component in the hash-map loop even when multiple components share the same protocol layer, emitting redundant BPF instructions

Nits

Missing assert for array pointer parameter — src/libbpfilter/chain.c:72 — _bf_rule_check_layer takes layer_hdr_id (array/pointer) without asserting it, inconsistent with every other function in the file
Inconsistent behavioral test ordering — tests/e2e/hooks/cgroup_sock_addr_connect6.sh:138 — Test blocks ordered differently (multi-component first) compared to connect4.sh (single-component first)

Workflow run

src/libbpfilter/chain.c

src/libbpfilter/cgen/program.c

src/libbpfilter/cgen/cgroup_sock_addr.c

src/libbpfilter/cgen/matcher/packet.c

src/libbpfilter/cgen/stub.c

Extract the copy-to-scratch logic from bf_stub_stx_payload into a new function with explicit parameters. bf_stub_load copies size bytes from R6 + src_offset to scratch[dst_scratch_offset], using min(src, dst) alignment for access size selection. For each chunk, the largest access width is picked where both src_off and dst_off are aligned and remaining >= width. This ensures optimal width for packet access (R6 = memory pointer) and verifier-safe width for context access (R6 = ctx pointer). bf_stub_stx_payload is rewritten as a thin wrapper around bf_stub_load.

src/libbpfilter/cgen/cgroup_sock_addr.c

src/libbpfilter/chain.c

src/libbpfilter/cgen/program.c

tests/e2e/hooks/cgroup_sock_addr_connect6.sh

tests/e2e/hooks/cgroup_sock_addr_sendmsg6.sh

src/libbpfilter/cgen/matcher/set.h

src/libbpfilter/cgen/stub.h

Extract reusable set codegen helpers from the existing packet-specific set code: - bf_set_generate_map_lookup: the 5-instruction map-lookup tail shared by both trie and hash paths (load set FD, compute key pointer, call bpf_map_lookup_elem, jump to next rule on NULL). - bf_set_generate_trie_lookup: complete LPM trie key assembly and lookup. Writes prefixlen at scratch[4], copies the address to scratch[8] via bf_stub_load, then calls bf_set_generate_map_lookup. Replaces ~30 lines of IPv4/IPv6-branching trie bytecode that would otherwise be duplicated between packet and cgroup_sock_addr flavors. The trie path in bf_matcher_generate_set is inlined since it reduces to three calls with the new helpers. The hash path calls bf_set_generate_map_lookup instead of inlining the sequence.

yaakov-stein · 2026-03-31T18:51:12Z

doc/usage/bfcli.rst

      - Notes
-    * - :rspan:`1` Source address
-      - :rspan:`1` ``ip6.saddr``
+    * - :rspan:`2` Source address


drive-by fix (document ip6.d/saddr works in sets)

src/libbpfilter/cgen/program.c

src/libbpfilter/cgen/cgroup_sock_addr.c

tests/unit/libbpfilter/chain.c

tests/e2e/hooks/cgroup_sock_addr_connect4.sh

src/libbpfilter/cgen/matcher/packet.c

src/libbpfilter/cgen/cgroup_sock_addr.c

github-actions · 2026-03-31T19:15:49Z

src/libbpfilter/cgen/cgroup_sock_addr.c

@@ -204,6 +208,98 @@ static int _bf_cgroup_sock_addr_generate_port(struct bf_program *program,
                        bf_matcher_payload(matcher), 2, BPF_REG_1);
 }


Claude: nit: _bf_cgroup_sock_addr_ctx_offset is a non-trivial function mapping matcher types to bpf_sock_addr field offsets, with a sentinel return convention ((size_t)-1 for unsupported types). A brief Doxygen comment documenting the purpose, the return convention, and the handled matcher types would help maintainability. Other comparable helpers in this file (e.g., _bf_cgroup_sock_addr_generate_port) have documentation.

src/libbpfilter/cgen/matcher/set.c

For BF_MATCHER_SET, iterate set key components through the same checked_layers dedup logic as regular matchers. Extract _bf_program_check_proto to share the check between both paths.

Check each set key component against the chain's hook and track set components in the per-layer compatibility check. Extract _bf_rule_check_layer to share the logic between regular matchers and set components.

github-actions · 2026-04-01T16:50:52Z

src/libbpfilter/cgen/matcher/packet.c

+
+        if (!meta) {
+            return bf_err_r(-EINVAL, "missing meta for '%s'",
+                            bf_matcher_type_to_str(type));


Claude: suggestion: In the hash-map loop of _bf_matcher_pkt_generate_set, bf_stub_load_header is called for every set component. When multiple components share the same protocol layer (e.g. ip4.saddr and ip4.daddr are both L3), the second call redundantly reloads R6 from the same stack slot, emitting unnecessary BPF instructions.

Consider tracking the loaded layer (similar to how _bf_program_check_proto uses checked_layers) to skip redundant header loads. This reduces the generated BPF instruction count for multi-component sets with same-layer keys.

github-actions · 2026-04-01T16:50:53Z

src/libbpfilter/chain.c

+ */
+static int _bf_rule_check_layer(enum bf_matcher_type type,
+                                uint32_t layer_hdr_id[_BF_MATCHER_LAYER_MAX])
+{


Claude: nit: _bf_rule_check_layer takes layer_hdr_id as an array parameter (decays to pointer) but does not assert() it. The style guide requires assert() for pointer preconditions, and every other function in this file that takes a pointer asserts on it (e.g. _bf_rule_references_empty_set, _bf_rule_has_incompatible_matchers).

Suggested change

{

{

const struct bf_matcher_meta *meta = bf_matcher_get_meta(type);

Should be:

{ assert(layer_hdr_id); const struct bf_matcher_meta *meta = bf_matcher_get_meta(type);

github-actions · 2026-04-01T16:50:54Z

tests/e2e/hooks/cgroup_sock_addr_connect6.sh

 (! udp6_connect ${HOST_IP6_ADDR} 9991)
 ${FROM_NS} ${BFCLI} ruleset flush
+
+# (ip6.daddr, udp.dport) multi-component hash set


Claude: nit: The behavioral test blocks in connect6.sh are ordered: multi-component hash set, then single-component hash set, then trie set. In connect4.sh the order is: single-component hash, trie, multi-component. Consider using a consistent ordering across the IPv4 and IPv6 test files for readability.

yaakov-stein requested a review from qdeslandes as a code owner March 31, 2026 03:48

meta-cla bot added the cla signed label Mar 31, 2026

yaakov-stein marked this pull request as draft March 31, 2026 03:49

yaakov-stein removed the request for review from qdeslandes March 31, 2026 03:49

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/chain.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/chain.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/program.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/cgroup_sock_addr.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/matcher/packet.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/stub.c Show resolved Hide resolved

yaakov-stein force-pushed the add_csa_set_support branch 2 times, most recently from 8a5b67b to 1def2a9 Compare March 31, 2026 17:27

yaakov-stein force-pushed the add_csa_set_support branch from 1def2a9 to 557e9ea Compare March 31, 2026 17:43

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/cgroup_sock_addr.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/chain.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/program.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

tests/e2e/hooks/cgroup_sock_addr_connect6.sh Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

tests/e2e/hooks/cgroup_sock_addr_sendmsg6.sh Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/matcher/set.h Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/stub.h Show resolved Hide resolved

yaakov-stein force-pushed the add_csa_set_support branch from 557e9ea to e70a6ab Compare March 31, 2026 18:37

yaakov-stein commented Mar 31, 2026

View reviewed changes

yaakov-stein force-pushed the add_csa_set_support branch from e70a6ab to 683d9a7 Compare March 31, 2026 18:54

yaakov-stein marked this pull request as ready for review March 31, 2026 18:55

yaakov-stein requested a review from qdeslandes March 31, 2026 18:55

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/program.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/cgroup_sock_addr.c Outdated Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

tests/unit/libbpfilter/chain.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

tests/e2e/hooks/cgroup_sock_addr_connect4.sh Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/matcher/packet.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/cgroup_sock_addr.c Show resolved Hide resolved

github-actions bot reviewed Mar 31, 2026

View reviewed changes

src/libbpfilter/cgen/matcher/set.c Show resolved Hide resolved

yaakov-stein added 3 commits April 1, 2026 08:49

lib: cgen: program: hoist set protocol checks into dedup loop

f88971e

For BF_MATCHER_SET, iterate set key components through the same checked_layers dedup logic as regular matchers. Extract _bf_program_check_proto to share the check between both paths.

lib: chain: validate set components per hook and layer

a0ce260

Check each set key component against the chain's hook and track set components in the per-layer compatibility check. Extract _bf_rule_check_layer to share the logic between regular matchers and set components.

lib: cgen: packet: move set codegen from set.c to packet.c

9a79cd3

yaakov-stein force-pushed the add_csa_set_support branch from 683d9a7 to 2f9ddba Compare April 1, 2026 16:26

yaakov-stein added 3 commits April 1, 2026 09:28

lib: cgen: add cgroup_sock_addr set codegen

392e248

tests: add cgroup_sock_addr set tests

73c0dee

doc: bfcli: document cgroup_sock_addr set support and ip6 set keys

5ab9f35

yaakov-stein force-pushed the add_csa_set_support branch from 2f9ddba to 5ab9f35 Compare April 1, 2026 16:29

github-actions bot reviewed Apr 1, 2026

View reviewed changes

		@@ -204,6 +208,98 @@ static int _bf_cgroup_sock_addr_generate_port(struct bf_program *program,
		bf_matcher_payload(matcher), 2, BPF_REG_1);
		}

	{
	{
	const struct bf_matcher_meta *meta = bf_matcher_get_meta(type);

Conversation

yaakov-stein commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Refactoring: make set codegen composable/generic

CGROUP_SOCK_ADDR set codegen

Why can't we use the current set codegen?

Test plan

Uh oh!

github-actions bot commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude review of PR #493 (5ab9f35)

Suggestions

Nits

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yaakov-stein Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions bot Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yaakov-stein commented Mar 31, 2026 •

edited

Loading

`CGROUP_SOCK_ADDR` set codegen

github-actions bot commented Mar 31, 2026 •

edited

Loading

Claude review of PR #493 (`5ab9f35`)

yaakov-stein Mar 31, 2026 •

edited

Loading