scx: add lib/bpf_cpumask.h and dedupe the cpumask-alloc helpers#3635

Open
jkkm wants to merge 3 commits into sched-ext:main from jkkm:scx-lib-bpf-cpumask

@jkkm jkkm commented Jun 9, 2026

Three commits:

  1. scx: add lib/bpf_cpumask.h for bpf_cpumask kptr lifecycle helpers
  2. scx_lavd,bpfland,flash,tickless: use create_save_bpfmask() from lib
  3. scx: add idempotent init_bpfmask() and adopt in five schedulers

Problem

Schedulers hand-roll two near-identical cpumask-allocation helpers:

  • the bare bpf_cpumask_create()/bpf_kptr_xchg()/bpf_cpumask_release()
    sequence that installs a fresh mask into a kptr slot; and
  • an idempotent wrapper around it ("allocate only if the slot is empty")
    used where a mask is populated incrementally.

lib/percpu.h already had the bare helper (create_save_bpfmask), but it
was bundled with the arena scx_bitmap library and the per-CPU storage
map, so it could not be reused without pulling those in.

Change

  • Add lib/bpf_cpumask.h (depends only on <scx/common.bpf.h>):
    • create_save_bpfmask() — allocate + install, releasing any prior mask.
    • init_bpfmask() — idempotent: allocate only if the slot is empty.
    lib/percpu.h now includes it (dropping its two debug bpf_printk()s).
  • create_save_bpfmask(): adopt in scx_lavd, scx_bpfland,
    scx_flash, scx_tickless (their local copy was byte-identical).
  • init_bpfmask(): adopt in scx_bpfland, scx_flash, scx_tickless
    (had a local init_cpumask wrapper) and scx_cosmos, scx_beerland
    (had it inlined). The idempotency is load-bearing — their
    enable_primary_cpu/enable_sibling_cpu syscall progs build a mask one
    CPU at a time, so the guard must be preserved (it is, in the shared
    helper).

This header manages the kernel struct bpf_cpumask kptr object and is
deliberately distinct from lib/cpumask.h, the arena-resident
scx_bitmap reimplementation.
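
For readers unfamiliar with the pattern, the two helpers can be sketched in plain userspace C. This is not the header from the PR: `struct bpf_cpumask`, `bpf_cpumask_create()`, `bpf_cpumask_release()` and `bpf_kptr_xchg()` below are stand-in stubs written for illustration, the helper signatures are assumed, and `live_masks`, `fail_alloc` and `demo_helpers()` are hypothetical names. The real helpers operate on a `__kptr` slot under the BPF verifier.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins (assumptions) for the kernel object and kfuncs. */
struct bpf_cpumask { uint64_t bits; };

static int live_masks;	/* masks allocated and not yet released */
static int fail_alloc;	/* set to 1 to simulate an allocation failure */

static struct bpf_cpumask *bpf_cpumask_create(void)
{
	struct bpf_cpumask *m;

	if (fail_alloc)
		return NULL;
	m = calloc(1, sizeof(*m));
	if (m)
		live_masks++;
	return m;
}

static void bpf_cpumask_release(struct bpf_cpumask *m)
{
	free(m);
	live_masks--;
}

/* Plain pointer swap; the real bpf_kptr_xchg() is atomic. */
static struct bpf_cpumask *bpf_kptr_xchg(struct bpf_cpumask **slot,
					 struct bpf_cpumask *new_mask)
{
	struct bpf_cpumask *old = *slot;

	*slot = new_mask;
	return old;
}

/* Allocate + install, releasing any prior mask. */
static int create_save_bpfmask(struct bpf_cpumask **slot)
{
	struct bpf_cpumask *mask = bpf_cpumask_create();

	if (!mask)
		return -ENOMEM;
	mask = bpf_kptr_xchg(slot, mask);
	if (mask)
		bpf_cpumask_release(mask);
	return 0;
}

/* Idempotent: allocate only if the slot is empty. */
static int init_bpfmask(struct bpf_cpumask **slot)
{
	if (*slot)
		return 0;
	return create_save_bpfmask(slot);
}

/* Exercises both helpers; returns the number of leaked masks. */
static int demo_helpers(void)
{
	struct bpf_cpumask *slot = NULL;

	assert(init_bpfmask(&slot) == 0);	 /* empty slot: allocates */
	slot->bits |= UINT64_C(1) << 3;
	assert(init_bpfmask(&slot) == 0);	 /* populated slot: no-op */
	assert(slot->bits == UINT64_C(1) << 3); /* bit 3 survived */

	assert(create_save_bpfmask(&slot) == 0); /* replaces the mask */
	assert(slot->bits == 0 && live_masks == 1); /* old one released */

	fail_alloc = 1;
	assert(create_save_bpfmask(&slot) == -ENOMEM);
	fail_alloc = 0;

	bpf_cpumask_release(slot);
	return live_masks;
}
```

Calling `demo_helpers()` from a `main()` walks through the three cases described above: first allocation, idempotent re-init, and unconditional replacement with release of the previous mask.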

Left as-is (deliberate per-scheduler behavior)

  • scx_rusty / scx_layered: scx_bpf_error() (fatal) on failure.
  • scx_p2dq: treats an already-populated slot as an error.
  • scx_mitosis: inline create+xchg (no named helper).

These encode intentional error policy and aren't behavior-identical, so
they're not converted here.

Testing

cargo build + veristat on an NR_CPUS=1024 kernel: all converted
schedulers (scx_lavd, scx_bpfland, scx_flash, scx_tickless,
scx_cosmos, scx_beerland) and scx_p2dq (a lib/percpu.h consumer)
build with 0 verifier failures. Net ~130 lines of duplication removed.
No functional change (other than the dropped debug prints in percpu.h).

Kyle McMartin added 3 commits June 8, 2026 13:25

scx: add lib/bpf_cpumask.h for bpf_cpumask kptr lifecycle helpers

Several schedulers hand-roll the same
bpf_cpumask_create()/bpf_kptr_xchg()/bpf_cpumask_release() dance to
install a freshly allocated mask into a kptr slot. lib/percpu.h already
had such a helper (create_save_bpfmask), but it is bundled with the
arena scx_bitmap library and the per-CPU storage map, so it can't be
reused without pulling those in.

Move create_save_bpfmask() into a new lib/bpf_cpumask.h that depends
only on <scx/common.bpf.h> (the kfunc prototypes), so any scheduler can
allocate kptr cpumasks without the arena/percpu baggage. lib/percpu.h
now includes it.

This header is for the kernel "struct bpf_cpumask" kptr object and is
distinct from lib/cpumask.h, which is an arena-resident bitmap
(scx_bitmap) reimplementation.

The relocated helper drops the two debug bpf_printk()s the percpu.h copy
had; callers already propagate the -ENOMEM return.

No functional change for schedulers (other than the dropped debug
prints).

Signed-off-by: Kyle McMartin <jkkm@meta.com>

scx_lavd,bpfland,flash,tickless: use create_save_bpfmask() from lib

These four schedulers each defined a private cpumask-alloc helper
(calloc_cpumask) whose body is byte-for-byte the create/xchg/release
sequence now provided by lib/bpf_cpumask.h. Drop the local copies and
call the shared helper.

Only schedulers whose local helper is semantically identical are
converted here; others (cosmos/beerland's idempotent variant, rusty's
scx_bpf_error() variant, p2dq's error-on-already-set variant) have
intentionally different behavior and are left as-is.

No functional change.

Signed-off-by: Kyle McMartin <jkkm@meta.com>

scx: add idempotent init_bpfmask() and adopt in five schedulers

scx_bpfland, scx_flash, scx_tickless each had an init_cpumask() wrapper
that guards re-allocation (if the slot is already populated, do nothing)
around the create/xchg primitive; scx_cosmos and scx_beerland had the
same logic inlined into their init_cpumask(). The guard is load-bearing:
their enable_primary_cpu()/enable_sibling_cpu() syscall progs are called
once per CPU to build a shared mask incrementally, so without it each
call would xchg in a fresh empty mask and drop the bits set so far.

Hoist that idempotent variant into lib/bpf_cpumask.h as init_bpfmask()
(allocate only if the slot is empty) and switch all five schedulers to
it, removing the duplicated wrappers/inline copies.

No functional change.

Signed-off-by: Kyle McMartin <jkkm@meta.com>
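
The bit-loss scenario described in this commit message can be illustrated with a toy userspace model. Everything here is hypothetical and simplified: a "mask" is a 64-bit bitset, the kptr slot is a plain pointer, and the `toy_*` names are invented for the sketch rather than taken from any scheduler.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model (assumption): a mask is a heap-allocated 64-bit bitset. */
struct toy_mask { uint64_t bits; };

static struct toy_mask *primary;	/* stands in for the shared kptr slot */

/* Install a fresh empty mask, dropping whatever was there. */
static int toy_create_save(struct toy_mask **slot)
{
	struct toy_mask *fresh = calloc(1, sizeof(*fresh));

	if (!fresh)
		return -1;
	free(*slot);
	*slot = fresh;
	return 0;
}

/* Guarded variant: keep the existing mask if there is one. */
static int toy_init(struct toy_mask **slot)
{
	return *slot ? 0 : toy_create_save(slot);
}

/* Models one per-CPU call; `guarded` selects which allocator runs first. */
static int toy_enable_cpu(int cpu, int guarded)
{
	int err = guarded ? toy_init(&primary) : toy_create_save(&primary);

	if (err)
		return err;
	primary->bits |= UINT64_C(1) << cpu;
	return 0;
}

/* Enable CPUs 0..nr_cpus-1 (nr_cpus < 64) one call at a time. */
static uint64_t toy_build_mask(int nr_cpus, int guarded)
{
	free(primary);
	primary = NULL;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		int err = toy_enable_cpu(cpu, guarded);

		assert(err == 0);
		(void)err;
	}
	return primary ? primary->bits : 0;
}
```

In this model, `toy_build_mask(4, 1)` accumulates all four bits (0xF), while `toy_build_mask(4, 0)` ends with only the last CPU's bit (0x8), because every call swaps in a fresh empty mask before setting its bit.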

/*
* Initialize a cpumask (if not already initialized).
*/

I think these comments aren't needed?

@arighi arighi left a comment

This looks good to me, we can save some code duplication.

How about renaming init_bpfmask() to init_bpf_cpumask() and create_save_bpfmask() to create_bpf_cpumask()? I think that would make it clearer that we're dealing with a cpumask.
