Conversation
Claude review of PR #493 (5ab9f35)Previous must-fix items (NULL pointer dereferences in Suggestions
Nits
|
8a5b67b to
1def2a9
Compare
Extract the copy-to-scratch logic from bf_stub_stx_payload into a new function with explicit parameters. bf_stub_load copies size bytes from R6 + src_offset to scratch[dst_scratch_offset], using min(src, dst) alignment for access size selection. For each chunk, the largest access width is picked where both src_off and dst_off are aligned and remaining >= width. This ensures optimal width for packet access (R6 = memory pointer) and verifier-safe width for context access (R6 = ctx pointer). bf_stub_stx_payload is rewritten as a thin wrapper around bf_stub_load.
1def2a9 to
557e9ea
Compare
Extract reusable set codegen helpers from the existing packet-specific set code: - bf_set_generate_map_lookup: the 5-instruction map-lookup tail shared by both trie and hash paths (load set FD, compute key pointer, call bpf_map_lookup_elem, jump to next rule on NULL). - bf_set_generate_trie_lookup: complete LPM trie key assembly and lookup. Writes prefixlen at scratch[4], copies the address to scratch[8] via bf_stub_load, then calls bf_set_generate_map_lookup. Replaces ~30 lines of IPv4/IPv6-branching trie bytecode that would otherwise be duplicated between packet and cgroup_sock_addr flavors. The trie path in bf_matcher_generate_set is inlined since it reduces to three calls with the new helpers. The hash path calls bf_set_generate_map_lookup instead of inlining the sequence.
557e9ea to
e70a6ab
Compare
| - Notes | ||
| * - :rspan:`1` Source address | ||
| - :rspan:`1` ``ip6.saddr`` | ||
| * - :rspan:`2` Source address |
There was a problem hiding this comment.
drive-by fix (document ip6.d/saddr works in sets)
e70a6ab to
683d9a7
Compare
| @@ -204,6 +208,98 @@ static int _bf_cgroup_sock_addr_generate_port(struct bf_program *program, | |||
| bf_matcher_payload(matcher), 2, BPF_REG_1); | |||
| } | |||
There was a problem hiding this comment.
Claude: nit: _bf_cgroup_sock_addr_ctx_offset is a non-trivial function mapping matcher types to bpf_sock_addr field offsets, with a sentinel return convention ((size_t)-1 for unsupported types). A brief Doxygen comment documenting the purpose, the return convention, and the handled matcher types would help maintainability. Other comparable helpers in this file (e.g., _bf_cgroup_sock_addr_generate_port) have documentation.
For BF_MATCHER_SET, iterate set key components through the same checked_layers dedup logic as regular matchers. Extract _bf_program_check_proto to share the check between both paths.
Check each set key component against the chain's hook and track set components in the per-layer compatibility check. Extract _bf_rule_check_layer to share the logic between regular matchers and set components.
683d9a7 to
2f9ddba
Compare
2f9ddba to
5ab9f35
Compare
|
|
||
| if (!meta) { | ||
| return bf_err_r(-EINVAL, "missing meta for '%s'", | ||
| bf_matcher_type_to_str(type)); |
There was a problem hiding this comment.
Claude: suggestion: In the hash-map loop of _bf_matcher_pkt_generate_set, bf_stub_load_header is called for every set component. When multiple components share the same protocol layer (e.g. ip4.saddr and ip4.daddr are both L3), the second call redundantly reloads R6 from the same stack slot, emitting unnecessary BPF instructions.
Consider tracking the loaded layer (similar to how _bf_program_check_proto uses checked_layers) to skip redundant header loads. This reduces the generated BPF instruction count for multi-component sets with same-layer keys.
| */ | ||
| static int _bf_rule_check_layer(enum bf_matcher_type type, | ||
| uint32_t layer_hdr_id[_BF_MATCHER_LAYER_MAX]) | ||
| { |
There was a problem hiding this comment.
Claude: nit: _bf_rule_check_layer takes layer_hdr_id as an array parameter (decays to pointer) but does not assert() it. The style guide requires assert() for pointer preconditions, and every other function in this file that takes a pointer asserts on it (e.g. _bf_rule_references_empty_set, _bf_rule_has_incompatible_matchers).
| { | |
| { | |
| const struct bf_matcher_meta *meta = bf_matcher_get_meta(type); |
Should be:
{
assert(layer_hdr_id);
const struct bf_matcher_meta *meta = bf_matcher_get_meta(type);| (! udp6_connect ${HOST_IP6_ADDR} 9991) | ||
| ${FROM_NS} ${BFCLI} ruleset flush | ||
|
|
||
| # (ip6.daddr, udp.dport) multi-component hash set |
There was a problem hiding this comment.
Claude: nit: The behavioral test blocks in connect6.sh are ordered: multi-component hash set, then single-component hash set, then trie set. In connect4.sh the order is: single-component hash, trie, multi-component. Consider using a consistent ordering across the IPv4 and IPv6 test files for readability.
Summary
Before this change, sets (hash maps and LPM tries used for bulk matching) were only supported on packet-based hooks. This meant users who wanted to filter against a list of IPs or ports on a
CONNECT/SENDMSGhook had to create one rule per value, which scales poorly.This change adds set support for all four
CGROUP_SOCK_ADDRhooks, reusing the same BPF map infrastructure (hash maps for exact match, LPM tries for prefix match) that packet hooks already use.Refactoring: make set codegen composable/generic
Set codegen previously lived entirely in
set.cand was tightly coupled to packet-based header loading. To reuse it forCGROUP_SOCK_ADDRhooks, we extracted the generic pieces (copy field to scratch, map/trie lookup) into shared helpers and moved the packet-specific logic intopacket.c. Protocol checks and layer conflict detection were also hoisted out of set codegen intoprogram.c's existing dedup loop, so they now cover set components the same way they cover individual matchers.As a drive-by, set components are now validated individually against the hook's supported matchers.
CGROUP_SOCK_ADDRset codegenWith the shared helpers in place,
CGROUP_SOCK_ADDRset codegen maps each key component to itsbpf_sock_addrcontext field offset, copies the field to the scratch buffer, and calls the shared map lookup. No header parsing needed —r6already points to the context.Why can't we use the current set codegen?
Packet-based hooks receive a pointer to raw packet memory. The BPF program parses headers layer by layer (L2 -> L3 -> L4), and
r6points to the current header being inspected.CGROUP_SOCK_ADDRhooks receive abpf_sock_addrcontext struct instead. The kernel pre-extracts socket metadata into typed fields:This difference has two consequences for set codegen:
No header parsing needed.
r6already points to the context. Instead of loading a header pointer and extracting fields from parsed packet data, we read directly from context field offsets (r6 + offsetof(bpf_sock_addr, field)).BPF verifier constraints on access width. Packet memory can be read at any width. Context struct fields must be read at widths that respect field alignment. The verifier rejects misaligned reads from context pointers. The new
bf_stub_loadfunction handles this by picking the largest safe access width based on both source and destination alignment.The diagram below shows the two paths:
Test plan
CGROUP_SOCK_ADDRhook has dry-run parsing tests and behavioral tests covering hash sets, trie sets, and multi-component sets