Skip to content

lib: add bf_vector and use it in cgen#487

Open
pzmarzly wants to merge 2 commits intofacebook:mainfrom
pzmarzly:push-ukrnssyykoqn
Open

lib: add bf_vector and use it in cgen#487
pzmarzly wants to merge 2 commits intofacebook:mainfrom
pzmarzly:push-ukrnssyykoqn

Conversation

@pzmarzly
Copy link
Copy Markdown
Contributor

@pzmarzly pzmarzly commented Mar 24, 2026

I initially planned to use bf_vector in bf_hashset, but I got convinced by @yaakov-stein that it's not a good idea. However, we could use bf_vector in other places, especially where we use bf_dynbuf or where we handroll vector-like behavior.

Commits:

  • lib: core: add bf_vector - implementation. Vector doubles in size when it hits the limits. Has a helper to remove elements as well, though shrinking is not implemented.
  • cgen: use bf_vector for img - Instead of managing img, img_size, and img_cap by hand, use bf_vector.

@meta-cla meta-cla bot added the cla signed label Mar 24, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 24, 2026

Claude review of PR #487 (73c0440)

Suggestions

  • Overflow guard in bf_vector_addsrc/libbpfilter/core/vector.c:165vec->size + 1 can theoretically wrap; bf_vector_add_many already uses __builtin_add_overflow for the same pattern
  • Missing test for bf_vector_inittests/unit/libbpfilter/core/vector.c — public function only tested indirectly via bf_vector_new

Nits

  • _end naming conventionsrc/libbpfilter/include/bpfilter/core/vector.h:66 — should use __end to match __next/__r/__idx pattern in other macros
  • _BF_VECTOR_MAX_CAP commentsrc/libbpfilter/core/vector.c:16 — says "multiplying by 1.5x" but growth uses cap + cap / 2

Workflow run

@pzmarzly pzmarzly marked this pull request as draft March 24, 2026 13:30
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from 2bbe579 to 419e434 Compare March 24, 2026 14:33
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from 419e434 to e77f462 Compare March 24, 2026 15:18
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from e77f462 to 305dd19 Compare March 24, 2026 16:07
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from 305dd19 to f5a94cd Compare March 24, 2026 16:57
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from f5a94cd to e9d06a6 Compare March 24, 2026 17:37
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from e9d06a6 to 4cd032f Compare March 24, 2026 17:53
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch 2 times, most recently from 703e225 to a8759b5 Compare March 24, 2026 18:32
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from a8759b5 to c828488 Compare March 24, 2026 19:26
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from 309f17f to fba6566 Compare March 25, 2026 19:08
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from fba6566 to c37916b Compare March 25, 2026 19:18
@pzmarzly pzmarzly changed the title lib: add bf_vector and use it to replace bf_dynbuf lib: add bf_vector and use it in cgen Mar 25, 2026
@pzmarzly pzmarzly marked this pull request as ready for review March 25, 2026 19:29
@pzmarzly pzmarzly requested a review from qdeslandes as a code owner March 25, 2026 19:29
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from c37916b to 975b503 Compare March 26, 2026 14:23
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from 975b503 to dd1fe11 Compare March 26, 2026 15:24
(elem) && (elem) < (vec)->data + ((vec)->size * (vec)->elem_size); \
(elem) = (elem) + (vec)->elem_size)

/**
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: bf_vector_foreach performs pointer arithmetic on void * ((elem) + (vec)->elem_size and (vec)->data + ...), which is a GCC extension not valid in standard C.

Since vector.h is a public header under include/bpfilter/core/, any downstream consumer using this macro with -pedantic -std=c17 will get a compile error. The pedantic integration test (tests/integration/pedantic_c.sh) includes all public headers with -pedantic-errors -std=c17 -Werror — the macro definition alone won't trigger an error (unexpanded macros aren't compiled), but any call site under those flags will fail.

For comparison, bf_list_foreach in list.h avoids this by iterating over typed pointers (bf_list_node *). Consider casting through char *:

Suggested change
/**
#define bf_vector_foreach(vec, elem) \
for (void *(elem) = (vec)->data; \
(elem) && (char *)(elem) < (char *)(vec)->data + ((vec)->size * (vec)->elem_size); \
(elem) = (void *)((char *)(elem) + (vec)->elem_size))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that, but the codebase is already not C17 compliant in different places as well.

* @brief Iterate over every element of a `bf_vector`.
*
* `elem` is declared as a pointer to the element type and will point to each
* element in turn. Do not add or remove elements during iteration.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not add

This should not be an issue.

remove elements during iteration

Same here, as we don't shrink the memory buffer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not add

If that addition causes growth/reallocation, void *(elem) = (vec)->data; will point to the old memory address. I could turn the iterator into index int i = 0; and use double-for trick to have void *(elem) = (vec)->data + i * (vec)->elem_size;, but I think it's an overkill?

remove elements during iteration

But we shift the next elements left, i.e. 5th element becomes 4th, 4th becomes 3rd, etc. The foreach will skip an element because of (elem) = (elem) + (vec)->elem_size.

@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch 2 times, most recently from eb242f7 to 9c2bbd0 Compare April 1, 2026 15:36
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch 2 times, most recently from 456d0b3 to d5af70d Compare April 1, 2026 16:00
@pzmarzly pzmarzly force-pushed the push-ukrnssyykoqn branch from d5af70d to 73c0440 Compare April 1, 2026 16:07
assert(vec);
assert(elem);

r = _bf_vector_grow(vec, vec->size + 1);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: bf_vector_add passes vec->size + 1 to _bf_vector_grow without overflow protection. If vec->size is SIZE_MAX, this wraps to 0, causing _bf_vector_grow to see required <= vec->cap and return success, after which the memcpy writes past the allocated buffer.

bf_vector_add_many already guards the equivalent addition with __builtin_add_overflow (line 186). The same pattern should be applied here for consistency:

size_t required;
int r;

assert(vec);
assert(elem);

if (__builtin_add_overflow(vec->size, 1, &required))
    return -ENOMEM;

r = _bf_vector_grow(vec, required);

While SIZE_MAX elements is unreachable in practice, the inconsistency between the two functions makes the code harder to reason about.

for (void *(elem) = (vec)->data, \
*_end = (vec)->data ? \
(vec)->data + ((vec)->size * (vec)->elem_size) : \
NULL; \
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The internal sentinel variable _end uses a single-underscore prefix, while other macros in the codebase consistently use double underscores for internal variables: __next in bf_list_foreach (list.h:107), __r in EMIT (program.h:103), __idx in bf_jmpctx_get (jmp.h:60). Consider renaming to __end to match the established convention.

#include "bpfilter/helper.h"

#define _BF_VECTOR_INIT_CAP 8
// Largest number that won't overflow when multiplying by 1.5x.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The comment says "Largest number that won't overflow when multiplying by 1.5x" but the growth factor is applied as cap + cap / 2 (integer addition), not a literal multiplication. A more precise comment would be: "Largest capacity where cap + cap / 2 does not exceed SIZE_MAX."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants