Skip to content

datadev/emulator: MSI-X/MSI/INTx IRQ cascade + virtual MSI domain#280

Open
ruck314 wants to merge 5 commits into
pre-releasefrom
msi-irq
Open

datadev/emulator: MSI-X/MSI/INTx IRQ cascade + virtual MSI domain#280
ruck314 wants to merge 5 commits into
pre-releasefrom
msi-irq

Conversation

@ruck314
Copy link
Copy Markdown
Contributor

@ruck314 ruck314 commented May 26, 2026

Summary

  • datadev driver: replaces the unconditional INTx assignment (dev->irq = pcidev->irq) in DataDev_Probe with a pci_alloc_irq_vectors(... PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_INTX) cascade. The kernel tries each type in priority order; probe fails only if all three are unavailable. The negotiated type is recorded in dev->irqType and logged (Init: Probe: using <kind> interrupts). pci_free_irq_vectors() is added to both the probe error path and DataDev_Remove (after free_irq, before pci_disable_device).
  • emulator: adds a software-only virtual PCI-MSI / MSI-X irq domain (virt_msi.{c,h}) and an emu_irq_mode modparam (intx / msi / msix) so the virtual host bridge can advertise any of the three IRQ capabilities. Without this, CI could only ever exercise the INTx fall-through.
  • CI: tests/test_irq_modes.sh now sweeps emu_irq_mode across all three values per CPU cell, asserting both the cfg-loop dataplane (cfgIrqHold / polled) and that datadev's probe-time cascade picked the matching branch. The GPU phase runs the legacy single-mode pass.
  • cross-kernel build: a #ifndef PCI_IRQ_INTX shim provides the new name (renamed from PCI_IRQ_LEGACY in 6.11) so the cascade builds on kernel 5.15 (Ubuntu 22.04); the emulator guards pci_msi_create_irq_domain for 6.19+. pci_irq_vector()'s signed errno is captured and rejected before being stored into the uint32_t dev->irq.
  • docs: updates the test_irq_modes.sh descriptions in test-applications.rst and ci-pipeline.rst for the new sweep, and adds the PCI_IRQ_INTX (6.11) guard to the kernel-compatibility matrix.

Why

A new FPGA target supports MSI only -- it never asserts the legacy INTx pin -- so the existing dev->irq = pcidev->irq path receives no interrupts on that hardware. Picking up MSI-X on the same cascade for free lets future FPGA targets that expose MSI-X capability use it without another driver change.

What changed

Area Files Net LoC
Driver cascade common/driver/data_dev_top.c, common/driver/dma_common.h +56 / -14
Emulator MSI/MSI-X domain emulator/driver/{Makefile, src/dma_engine.{c,h}, src/emu_main.c, src/virt_pci_host.{c,h}, src/virt_msi.{c,h} (new)} +658 / -36
CI sweep tests/test_irq_modes.sh +109 / -10
Docs docs/explanation/{ci-pipeline,kernel-compatibility}.rst, docs/reference/test-applications.rst +20 / -4

Hardware compatibility

Existing INTx-only FPGA bitstreams keep working: the cascade lands on PCI_IRQ_INTX exactly when no MSI/MSI-X cap is advertised, returning the same pcidev->irq byte 0x3C had before. Old INTx-only datadev builds running against new MSI-capable FPGA bitstreams also keep working -- without the cascade they never set the MSI/MSI-X Enable bit, so the hardware delivers on INTA# the legacy way (per PCIe spec, the function gates INTx assertion based on those Enable bits).

CI verification

End-to-end run via scripts/ci-local/run_cell.sh --container ubuntu:24.04 --load-test 1 --phase cpu on the KVM parity harness (kernel 6.17.0-1011-azure):

[PASS] irq_modes intx/cfgIrqHold=1
[PASS] irq_modes intx/cfgIrqHold=100000
[PASS] irq_modes intx/polled
[PASS] irq_modes intx -- datadev selected: using legacy INTx interrupts
[PASS] irq_modes msi/cfgIrqHold=1
[PASS] irq_modes msi/cfgIrqHold=100000
[PASS] irq_modes msi/polled
[PASS] irq_modes msi  -- datadev selected: using MSI interrupts
[PASS] irq_modes msix/cfgIrqHold=1
[PASS] irq_modes msix/cfgIrqHold=100000
[PASS] irq_modes msix/polled
[PASS] irq_modes msix -- datadev selected: using MSI-X interrupts
=== Results: 13 passed, 0 failed ===
=== ALL CPU TESTS PASSED ===

Real-hardware verification recipe

For each card type, after insmod datadev.ko:

dmesg | grep "datadev .*: Init: Probe: using"   # expect MSI-X / MSI / legacy INTx
cat /proc/interrupts | grep datadev             # expect PCI-MSIX / PCI-MSI / IO-APIC controller
data_dev/app/bin/dma_rate -p /dev/datadev_0     # baseline throughput
rmmod datadev                                   # clean teardown, no leak warnings

Repeat load/unload in the same boot to catch teardown-order bugs (free_irq -> pci_free_irq_vectors -> pci_disable_device).

Test plan

  • CI: full CPU matrix (Ubuntu 24.04, 22.04, RockyLinux 9, Debian experimental, Fedora rawhide) runs tests/test_irq_modes.sh with the emu_irq_mode sweep; all subtests per cell PASS
  • CI: GPU matrix unchanged (sweep is gated off when GPU_ENABLED=1; legacy single-mode pass runs)
  • Cross-kernel build: 5.15 (no PCI_IRQ_INTX), 6.11+ (rename), 6.19+ (pci_msi_create_irq_domain guard) all compile
  • Hardware bring-up: MSI-X-capable card, MSI-only card, INTx-only card -- each shows the matching dmesg line and dma_rate matches baseline

Notes

  • IRQF_SHARED is kept unconditionally. The kernel accepts it as a no-op on MSI/MSI-X vectors, the existing AxisG2_Irq already implements the spurious-interrupt check shared mode requires, and not changing Dma_Init's call site keeps the diff out of the shared-with-rce_*/emulator translation unit.
  • Single-vector MSI/MSI-X. The handler is unchanged; multi-vector MSI-X would be a follow-up if FW ever exposes more vectors.
  • dev->irqType is for diagnostic logging only -- the shared Dma_Init/Dma_Clean paths do not branch on it.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Comment thread emulator/driver/src/virt_msi.c Outdated
Comment thread emulator/driver/src/virt_msi.c
Comment thread emulator/driver/src/virt_pci_host.c
Comment thread tests/test_irq_modes.sh
Comment thread emulator/driver/src/virt_pci_host.h Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Comment thread common/driver/data_dev_top.c Outdated
Comment thread emulator/driver/src/emu_main.c Outdated
Comment thread emulator/driver/src/virt_msi.c Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Comment thread emulator/driver/src/virt_pci_host.c
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Comment thread common/driver/data_dev_top.c
Comment thread emulator/driver/src/virt_msi.h Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

@ruck314 ruck314 changed the title datadev: MSI-X / MSI / legacy INTx interrupt cascade datadev/emulator: MSI-X/MSI/INTx IRQ cascade + virtual MSI domain May 26, 2026
datadev: replace the unconditional INTx assignment in DataDev_Probe with
a pci_alloc_irq_vectors(... PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_INTX)
cascade. The kernel negotiates the best available type; probe fails only
if all three are unavailable. The negotiated type is recorded in
dev->irqType and logged. pci_free_irq_vectors() is released on the probe
error path and in DataDev_Remove (after free_irq, before
pci_disable_device). pci_irq_vector()'s signed errno is validated before
being stored into the uint32_t dev->irq.

emulator: add a software-only virtual PCI-MSI / MSI-X irq domain
(virt_msi.{c,h}) and an emu_irq_mode modparam (intx | msi | msix) so the
virtual host bridge can advertise any of the three IRQ capabilities,
letting CI exercise every datadev cascade branch. MSI hwirqs are
recycled, irq_work is synced on teardown, the MSI-X table/PBA live in
BAR0, and the Status Capabilities-List bit is gated on irq_mode.

CI: tests/test_irq_modes.sh sweeps emu_irq_mode {intx, msi, msix} per CPU
cell, verifying the cfgIrqHold/polled dataplane and asserting datadev's
probe-time cascade selected the matching branch. The GPU phase runs the
legacy single-mode pass.

cross-kernel: a #ifndef PCI_IRQ_INTX shim provides the name renamed from
PCI_IRQ_LEGACY in 6.11 so the cascade builds on kernel 5.15; the emulator
guards pci_msi_create_irq_domain for 6.19+.

docs: update the test_irq_modes.sh descriptions in test-applications.rst
and ci-pipeline.rst for the new sweep, and add the PCI_IRQ_INTX (6.11)
guard to the kernel-compatibility matrix.
ruck314 added 4 commits May 26, 2026 10:02
The IRQ cascade selects by advertised capability, not by what the
firmware actually delivers. A bitstream exposing both legacy INTx and
MSI lets pci_alloc_irq_vectors() prefer MSI; if the user logic only
drives the legacy pin the driver enables interrupts that never arrive,
surfacing as DMA timeouts. Require exactly one advertised interrupt
type and fail the probe with a clear message otherwise.
The register map is hardwired to a 16 MB window (2 * USER_SIZE: DMA
engine, PCIe PHY, AxiVersion, and the user region). Firmware built with
a different BAR0 size has a mismatched address map, which otherwise
surfaces as out-of-bounds register access. Validate the size at probe
and fail with a clear message to catch incompatible builds early.
…k loads

The exact-match BAR0 check broke emulator-backed CI: the emulator
requests 16 MB but shrinks the BAR when it cannot reserve that much
contiguous memory, so fragmented runners presented a 4 MB BAR and the
probe rejected it. The register map only tops out at 16 MB, so reject
solely oversized BARs (the mismatched-firmware case this guards against)
and tolerate a smaller BAR.
The virtual function asserted the INTx pin unconditionally, so in
msi/msix modes it advertised INTx plus the MSI/MSI-X capability. The
datadev probe now rejects a function exposing more than one interrupt
type, which broke the test_irq_modes.sh MSI/MSI-X subtests. Assert
INTA# only in INTx mode and leave the pin clear for MSI/MSI-X so each
mode advertises exactly one type, matching the single-IRQ-type firmware
policy the driver enforces.
@ruck314 ruck314 requested a review from JJL772 May 29, 2026 07:24
@ruck314 ruck314 marked this pull request as ready for review May 29, 2026 07:24
// Enforce that the FPGA advertises exactly one interrupt type. The cascade
// below selects by *advertised capability*, not by what the firmware can
// actually deliver: a bitstream that exposes both legacy INTx and MSI lets
// pci_alloc_irq_vectors() prefer MSI, and if the user logic only drives the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user logic means what here? FPGA firmware?

This check does not feel strictly necessary because Linux already negotiates the interrupt type if multiple are available. Is the firmware expected to only advertise one at a time? Is our firmware only advertising MSI-X or MSI right now? Does the firmware not properly respect the configured interrupt type?


// IRQ
uint32_t irq;
uint8_t irqType;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not actually used anywhere except probe(). It can be removed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants