scx_layered: add per-layer irq_protect with fallback policy by hodgesds · Pull Request #3586 · sched-ext/scx

hodgesds · 2026-05-18T15:17:04Z

Layers can now opt in to clearing their CPUs from every IRQ's `/proc/irq/<N>/smp_affinity` mask via a new `irq_protect: bool` field on `LayerCommon`. This is useful for GPU training workloads where collective-communication threads (e.g. NCCL) need very low jitter and must not share CPUs with NIC/GPU IRQ handlers.
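As a rough illustration of what "clearing CPUs from an `smp_affinity` mask" involves, the sketch below parses the kernel's comma-separated hexadecimal mask format (32-bit groups, most significant first), removes a set of protected CPUs, and formats the result back. The helper names `parse_mask`, `format_mask`, and `strip_protected` are hypothetical and are not taken from the PR.

```rust
use std::collections::BTreeSet;

/// Parse a kernel-style hex CPU mask such as "ff" or "00000001,0000ff00".
/// Groups are 32 bits wide and the most significant group comes first.
/// (Hypothetical helper, not the PR's code.)
fn parse_mask(s: &str) -> Option<BTreeSet<usize>> {
    let mut cpus = BTreeSet::new();
    // Walk the groups from least to most significant.
    for (i, group) in s.trim().split(',').rev().enumerate() {
        let word = u32::from_str_radix(group, 16).ok()?;
        for bit in 0..32 {
            if word & (1u32 << bit) != 0 {
                cpus.insert(i * 32 + bit);
            }
        }
    }
    Some(cpus)
}

/// Format a CPU set back into comma-separated 32-bit hex groups.
/// CPUs at or above `nr_cpus` are dropped.
fn format_mask(cpus: &BTreeSet<usize>, nr_cpus: usize) -> String {
    let nr_words = ((nr_cpus + 31) / 32).max(1);
    let mut words = vec![0u32; nr_words];
    for &cpu in cpus {
        if cpu < nr_cpus {
            words[cpu / 32] |= 1u32 << (cpu % 32);
        }
    }
    words
        .iter()
        .rev()
        .map(|w| format!("{:08x}", w))
        .collect::<Vec<_>>()
        .join(",")
}

/// Remove every protected CPU from an IRQ's home affinity.
fn strip_protected(home: &BTreeSet<usize>, protected: &BTreeSet<usize>) -> BTreeSet<usize> {
    home.difference(protected).copied().collect()
}
```

For example, with a home mask of `ff` and CPUs 4 to 7 protected, `strip_protected` leaves CPUs 0 to 3, which `format_mask` renders for an 8-CPU system as `0000000f`.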

The protector snapshots every numeric IRQ's affinity at startup, recomputes and applies masks after each `refresh_cpumasks` based on the union of the protected layers' CPUs, and restores the original masks on shutdown. Kernel-managed IRQs that reject affinity writes are recorded after the first failure and never retried.
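The lifecycle described above (snapshot, apply, restore, with rejected IRQs remembered and redundant writes skipped) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: the `IrqBackend` trait, the `Protector` struct, and the in-memory `FakeBackend` are all hypothetical names, the snapshot is passed in rather than read from `/proc/irq`, and the fallback policy for fully covered IRQs is left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

type CpuSet = BTreeSet<usize>;

/// Hypothetical abstraction over writes to /proc/irq/<N>/smp_affinity.
trait IrqBackend {
    fn write_affinity(&mut self, irq: u32, cpus: &CpuSet) -> Result<(), String>;
}

/// Hypothetical protector state; field names are assumptions.
struct Protector {
    original: BTreeMap<u32, CpuSet>, // snapshot taken at startup
    current: BTreeMap<u32, CpuSet>,  // last mask successfully written
    managed: BTreeSet<u32>,          // IRQs whose writes were rejected once
}

impl Protector {
    fn new(snapshot: BTreeMap<u32, CpuSet>) -> Self {
        Protector { current: snapshot.clone(), original: snapshot, managed: BTreeSet::new() }
    }

    /// Write one mask, skipping managed IRQs and redundant writes.
    fn set<B: IrqBackend>(&mut self, be: &mut B, irq: u32, want: CpuSet) {
        if self.managed.contains(&irq) || self.current.get(&irq) == Some(&want) {
            return;
        }
        match be.write_affinity(irq, &want) {
            Ok(()) => {
                self.current.insert(irq, want);
            }
            // Assumed to be a kernel-managed IRQ: remember it, never retry.
            Err(_) => {
                self.managed.insert(irq);
            }
        }
    }

    /// Recompute masks from the union of protected CPUs and apply them.
    fn apply<B: IrqBackend>(&mut self, be: &mut B, protected: &CpuSet) {
        let targets: Vec<(u32, CpuSet)> = self
            .original
            .iter()
            .map(|(irq, home)| (*irq, home.difference(protected).copied().collect()))
            .collect();
        for (irq, want) in targets {
            if want.is_empty() {
                continue; // fully covered: fallback policy omitted in this sketch
            }
            self.set(be, irq, want);
        }
    }

    /// Put every IRQ back on its snapshotted mask.
    fn restore<B: IrqBackend>(&mut self, be: &mut B) {
        let originals: Vec<(u32, CpuSet)> =
            self.original.iter().map(|(i, c)| (*i, c.clone())).collect();
        for (irq, home) in originals {
            self.set(be, irq, home);
        }
    }
}

/// In-memory stand-in for the real /proc writes, for illustration only.
struct FakeBackend {
    rejects: BTreeSet<u32>,      // IRQs that behave like kernel-managed IRQs
    writes: Vec<(u32, CpuSet)>,  // successful writes, in order
    attempts: usize,             // every write attempt, including rejected ones
}

impl IrqBackend for FakeBackend {
    fn write_affinity(&mut self, irq: u32, cpus: &CpuSet) -> Result<(), String> {
        self.attempts += 1;
        if self.rejects.contains(&irq) {
            return Err(format!("irq {}: write rejected", irq));
        }
        self.writes.push((irq, cpus.clone()));
        Ok(())
    }
}
```

Keeping the last written mask per IRQ is what lets a second `apply` with an unchanged protected set issue no writes at all, and the `managed` set is what prevents a rejected IRQ from being retried on later refreshes.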

A new `--irq-protect-fallback={spread,all}` flag selects what happens when an IRQ's home affinity is fully covered by protected layers (or when every CPU on the system is protected):

* `spread` (default) preserves per-IRQ locality: an IRQ whose home affinity is fully covered spills to the unprotected set, and when no CPU is unprotected, each IRQ is pinned to a single CPU, round-robin across all CPUs.
* `all` gives every IRQ the same mask: the system-wide unprotected set, or all CPUs when nothing is unprotected.
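One possible reading of the two policies is sketched below as a pure function that picks the target mask for a single IRQ. The names `Fallback`, `target_mask`, and the `idx` parameter (the IRQ's position in iteration order, used for round-robin) are hypothetical. The sketch also assumes that `all` applies the shared mask to every IRQ regardless of its home affinity, which the description suggests but does not spell out.

```rust
use std::collections::BTreeSet;

type CpuSet = BTreeSet<usize>;

/// Hypothetical mirror of --irq-protect-fallback={spread,all}.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Fallback {
    Spread,
    All,
}

/// Pick the affinity mask for one IRQ (a sketch, not the PR's code).
/// `idx` is the IRQ's position in iteration order, used for round-robin.
fn target_mask(
    home: &CpuSet,
    protected: &CpuSet,
    all_cpus: &CpuSet,
    policy: Fallback,
    idx: usize,
) -> CpuSet {
    if all_cpus.is_empty() {
        return home.clone(); // nothing sensible to choose from
    }
    let unprotected: CpuSet = all_cpus.difference(protected).copied().collect();

    // Every CPU on the system is protected.
    if unprotected.is_empty() {
        return match policy {
            Fallback::All => all_cpus.clone(),
            // Pin to a single CPU, wrapping around the CPU list.
            Fallback::Spread => all_cpus
                .iter()
                .skip(idx % all_cpus.len())
                .take(1)
                .copied()
                .collect(),
        };
    }

    match policy {
        // Assumption: every IRQ shares the system-wide unprotected set.
        Fallback::All => unprotected,
        Fallback::Spread => {
            // Keep locality where possible; spill only if nothing is left.
            let local: CpuSet = home.difference(protected).copied().collect();
            if local.is_empty() {
                unprotected
            } else {
                local
            }
        }
    }
}
```

On a 4-CPU system with CPUs 0 and 1 protected, an IRQ homed on CPUs 0 and 1 would spill to CPUs 2 and 3 under `spread`, while an IRQ homed on CPUs 1 and 2 would simply shrink to CPU 2.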

Mode transitions and per-IRQ spill events are logged once. The PR includes unit tests covering normal strip + spill, managed-IRQ skip, redundant write elision, restore, spread mode (basic + wrap), spread<->normal transition, and both All-mode behaviors.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Daniel Hodges <hodgesd@meta.com>

hodgesds force-pushed the layered-irq-protect branch from 5bfe7ef to dbfe858 on May 18, 2026 19:15

hodgesds force-pushed the layered-irq-protect branch from dbfe858 to a404122 on May 19, 2026 14:54

hodgesds wants to merge 1 commit into sched-ext:main from hodgesds:layered-irq-protect
