datadev/emulator: MSI-X/MSI/INTx IRQ cascade + virtual MSI domain #280
Open
ruck314 wants to merge 5 commits into
datadev: replace the unconditional INTx assignment in DataDev_Probe with
a pci_alloc_irq_vectors(... PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_INTX)
cascade. The kernel negotiates the best available type; probe fails only
if all three are unavailable. The negotiated type is recorded in
dev->irqType and logged. pci_free_irq_vectors() is called on the probe
error path and in DataDev_Remove (after free_irq, before
pci_disable_device). The signed return value of pci_irq_vector() is
checked for a negative errno before it is stored into the uint32_t
dev->irq.
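The selection order and the errno check described above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the driver's code: the `MODEL_IRQ_*` flags, `model_pick_irq_type()`, and `model_store_irq()` are hypothetical names invented for illustration, and the real negotiation happens inside pci_alloc_irq_vectors().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PCI_IRQ_* bit flags, used by this model only. */
enum {
    MODEL_IRQ_INTX = 1 << 0,
    MODEL_IRQ_MSI  = 1 << 1,
    MODEL_IRQ_MSIX = 1 << 2,
};

/*
 * Pick one interrupt type: the first entry in priority order
 * (MSI-X, then MSI, then INTx) that is both requested by the driver
 * and advertised by the device.
 * Returns the chosen flag, or -ENOSPC when nothing matches.
 */
static int model_pick_irq_type(unsigned int requested, unsigned int advertised)
{
    static const unsigned int order[] = {
        MODEL_IRQ_MSIX, MODEL_IRQ_MSI, MODEL_IRQ_INTX,
    };
    unsigned int usable = requested & advertised;
    size_t i;

    for (i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
        if (usable & order[i])
            return (int)order[i];
    }
    return -ENOSPC;
}

/*
 * Validate a signed vector-lookup result before narrowing it into an
 * unsigned field, so a negative errno never ends up in the uint32_t.
 * On failure *irq_out is left untouched.
 */
static int model_store_irq(int vector, uint32_t *irq_out)
{
    if (vector < 0)
        return vector;
    *irq_out = (uint32_t)vector;
    return 0;
}
```

In this model a device advertising both INTx and MSI resolves to MSI, which is the behavior the later single-type policy in this PR reacts to.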
emulator: add a software-only virtual PCI-MSI / MSI-X irq domain
(virt_msi.{c,h}) and an emu_irq_mode modparam (intx | msi | msix) so the
virtual host bridge can advertise any of the three IRQ capabilities,
letting CI exercise every datadev cascade branch. MSI hwirqs are
recycled, irq_work is synced on teardown, the MSI-X table/PBA live in
BAR0, and the Status Capabilities-List bit is gated on irq_mode.
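One way to picture "MSI hwirqs are recycled" is a fixed-size bitmap allocator in which a freed number becomes available to the next allocation. The sketch below is an assumption about the general technique, not the contents of virt_msi.c; `MODEL_NR_HWIRQS`, `model_hwirq_alloc()`, and `model_hwirq_free()` are hypothetical names.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_NR_HWIRQS 32 /* assumed pool size, for illustration only */

/* Bit n set means hwirq n is currently in use. */
static uint32_t model_hwirq_map;

/* Return the lowest free hwirq number, or -ENOSPC when the pool is full. */
static int model_hwirq_alloc(void)
{
    int n;

    for (n = 0; n < MODEL_NR_HWIRQS; n++) {
        if (!(model_hwirq_map & (UINT32_C(1) << n))) {
            model_hwirq_map |= UINT32_C(1) << n;
            return n;
        }
    }
    return -ENOSPC;
}

/* Release a hwirq number so a later allocation can reuse it. */
static void model_hwirq_free(int n)
{
    if (n >= 0 && n < MODEL_NR_HWIRQS)
        model_hwirq_map &= ~(UINT32_C(1) << n);
}
```

Without the free step, repeated load/unload cycles in one boot would eventually exhaust the pool, which is the failure mode recycling avoids.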
CI: tests/test_irq_modes.sh sweeps emu_irq_mode {intx, msi, msix} per CPU
cell, verifying the cfgIrqHold/polled dataplane and asserting datadev's
probe-time cascade selected the matching branch. The GPU phase runs the
legacy single-mode pass.
cross-kernel: a #ifndef PCI_IRQ_INTX shim provides the name renamed from
PCI_IRQ_LEGACY in 6.11 so the cascade builds on kernel 5.15; the emulator
guards pci_msi_create_irq_domain for 6.19+.
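The shim pattern can be sketched as follows. To keep the fragment compilable outside the kernel tree, the first define is a stand-in for an older <linux/pci.h> that only knows the old name; that stand-in and `model_cascade_flags()` are assumptions for illustration and would not appear in the driver.

```c
/* Stand-in for a pre-rename <linux/pci.h>, which defines only the old name. */
#define PCI_IRQ_LEGACY (1 << 0)

/* Compatibility shim: supply the new name when the headers lack it. */
#ifndef PCI_IRQ_INTX
#define PCI_IRQ_INTX PCI_IRQ_LEGACY
#endif

/* Code written against the new name now builds against either header. */
static unsigned int model_cascade_flags(void)
{
    return PCI_IRQ_INTX;
}
```

On headers that already define PCI_IRQ_INTX the #ifndef block is skipped, so the shim is inert there.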
docs: update the test_irq_modes.sh descriptions in test-applications.rst
and ci-pipeline.rst for the new sweep, and add the PCI_IRQ_INTX (6.11)
guard to the kernel-compatibility matrix.
The IRQ cascade selects by advertised capability, not by what the firmware actually delivers. A bitstream exposing both legacy INTx and MSI lets pci_alloc_irq_vectors() prefer MSI; if the user logic only drives the legacy pin the driver enables interrupts that never arrive, surfacing as DMA timeouts. Require exactly one advertised interrupt type and fail the probe with a clear message otherwise.
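A minimal model of the "exactly one advertised interrupt type" rule is sketched below. It assumes the three inputs are the presence of an MSI-X capability, the presence of an MSI capability, and a nonzero Interrupt Pin register; how the driver actually gathers them is not shown in this PR text, and `struct model_irq_caps` and both functions are hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* What the function advertises, as seen from config space (model only). */
struct model_irq_caps {
    bool has_msix_cap;
    bool has_msi_cap;
    uint8_t intx_pin; /* 0 = no legacy pin, 1..4 = INTA#..INTD# */
};

/* Count how many distinct interrupt types are advertised. */
static int model_count_irq_types(const struct model_irq_caps *caps)
{
    return (caps->has_msix_cap ? 1 : 0) +
           (caps->has_msi_cap ? 1 : 0) +
           (caps->intx_pin != 0 ? 1 : 0);
}

/* Accept exactly one advertised type; reject zero or several. */
static int model_check_single_irq_type(const struct model_irq_caps *caps)
{
    return model_count_irq_types(caps) == 1 ? 0 : -EINVAL;
}
```

Under this model a bitstream exposing both the legacy pin and MSI is rejected at probe instead of silently selecting MSI.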
The register map is hardwired to a 16 MB window (2 * USER_SIZE: DMA engine, PCIe PHY, AxiVersion, and the user region). Firmware built with a different BAR0 size has a mismatched address map, which otherwise surfaces as out-of-bounds register access. Validate the size at probe and fail with a clear message to catch incompatible builds early.
…k loads The exact-match BAR0 check broke emulator-backed CI: the emulator requests 16 MB but shrinks the BAR when it cannot reserve that much contiguous memory, so fragmented runners presented a 4 MB BAR and the probe rejected it. The register map only tops out at 16 MB, so reject solely oversized BARs (the mismatched-firmware case this guards against) and tolerate a smaller BAR.
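The relaxed size policy can be sketched as a single comparison. The constants below assume USER_SIZE is 8 MB, inferred from the stated 16 MB = 2 * USER_SIZE window; `MODEL_USER_SIZE`, `MODEL_MAP_SIZE`, and `model_check_bar0_size()` are hypothetical names, not the driver's.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed: the 16 MB register window is 2 * USER_SIZE, so USER_SIZE is 8 MB. */
#define MODEL_USER_SIZE (UINT64_C(8) << 20)
#define MODEL_MAP_SIZE  (2 * MODEL_USER_SIZE)

/*
 * Reject only BARs larger than the register map (the mismatched-firmware
 * case); an exact-size or smaller BAR is tolerated.
 */
static int model_check_bar0_size(uint64_t bar_len)
{
    if (bar_len > MODEL_MAP_SIZE)
        return -EINVAL;
    return 0;
}
```

With this check the 4 MB BAR seen on fragmented CI runners passes, while a 32 MB BAR from an incompatible build is refused.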
The virtual function asserted the INTx pin unconditionally, so in msi/msix modes it advertised INTx plus the MSI/MSI-X capability. The datadev probe now rejects a function exposing more than one interrupt type, which broke the test_irq_modes.sh MSI/MSI-X subtests. Assert INTA# only in INTx mode and leave the pin clear for MSI/MSI-X so each mode advertises exactly one type, matching the single-IRQ-type firmware policy the driver enforces.
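The per-mode advertisement can be pictured as a small lookup from emu_irq_mode to what the virtual function exposes. This is a sketch, not the emulator source: the enum, the struct, and `model_cfg_for_mode()` are hypothetical, and it assumes the Capabilities-List bit is set only when an MSI or MSI-X capability is present.

```c
#include <stdbool.h>
#include <stdint.h>

enum model_irq_mode { MODEL_MODE_INTX, MODEL_MODE_MSI, MODEL_MODE_MSIX };

/* The interrupt-related parts of config space in this model. */
struct model_cfg_view {
    uint8_t intx_pin; /* 1 = INTA#, 0 = no legacy pin */
    bool cap_list;    /* Status register Capabilities-List bit */
    bool msi_cap;
    bool msix_cap;
};

/* Advertise exactly one interrupt type for the selected mode. */
static struct model_cfg_view model_cfg_for_mode(enum model_irq_mode mode)
{
    struct model_cfg_view v = { 0, false, false, false };

    switch (mode) {
    case MODEL_MODE_INTX:
        v.intx_pin = 1; /* INTA# only; no MSI/MSI-X capability */
        break;
    case MODEL_MODE_MSI:
        v.cap_list = true;
        v.msi_cap = true;
        break;
    case MODEL_MODE_MSIX:
        v.cap_list = true;
        v.msix_cap = true;
        break;
    }
    return v;
}
```

Each mode yields a view with a single interrupt type, which is what lets all three test_irq_modes.sh subtests pass the driver's single-type check.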
JJL772
reviewed
May 30, 2026
// Enforce that the FPGA advertises exactly one interrupt type. The cascade
// below selects by *advertised capability*, not by what the firmware can
// actually deliver: a bitstream that exposes both legacy INTx and MSI lets
// pci_alloc_irq_vectors() prefer MSI, and if the user logic only drives the
Member
What does "user logic" mean here? The FPGA firmware?
This check does not feel strictly necessary because Linux already negotiates the interrupt type if multiple are available. Is the firmware expected to only advertise one at a time? Is our firmware only advertising MSI-X or MSI right now? Does the firmware not properly respect the configured interrupt type?
// IRQ
uint32_t irq;
uint8_t irqType;
Member
This is not actually used anywhere except probe(). It can be removed.
Summary
- datadev: replaces the unconditional INTx assignment (dev->irq = pcidev->irq) in DataDev_Probe with a pci_alloc_irq_vectors(... PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_INTX) cascade. The kernel tries each type in priority order; probe fails only if all three are unavailable. The negotiated type is recorded in dev->irqType and logged (Init: Probe: using <kind> interrupts). pci_free_irq_vectors() is added to both the probe error path and DataDev_Remove (after free_irq, before pci_disable_device).
- emulator: adds a software-only virtual PCI-MSI / MSI-X irq domain (virt_msi.{c,h}) and an emu_irq_mode modparam (intx / msi / msix) so the virtual host bridge can advertise any of the three IRQ capabilities. Without this, CI could only ever exercise the INTx fall-through.
- CI: tests/test_irq_modes.sh now sweeps emu_irq_mode across all three values per CPU cell, asserting both the cfg-loop dataplane (cfgIrqHold / polled) and that datadev's probe-time cascade picked the matching branch. The GPU phase runs the legacy single-mode pass.
- cross-kernel: a #ifndef PCI_IRQ_INTX shim provides the new name (renamed from PCI_IRQ_LEGACY in 6.11) so the cascade builds on kernel 5.15 (Ubuntu 22.04); the emulator guards pci_msi_create_irq_domain for 6.19+.
- pci_irq_vector()'s signed errno is captured and rejected before being stored into the uint32_t dev->irq.
- docs: updates the test_irq_modes.sh descriptions in test-applications.rst and ci-pipeline.rst for the new sweep, and adds the PCI_IRQ_INTX (6.11) guard to the kernel-compatibility matrix.

Why
A new FPGA target supports MSI only -- it never asserts the legacy INTx pin -- so the existing dev->irq = pcidev->irq path receives no interrupts on that hardware. Picking up MSI-X on the same cascade for free lets future FPGA targets that expose MSI-X capability use it without another driver change.

What changed
- common/driver/data_dev_top.c, common/driver/dma_common.h
- emulator/driver/{Makefile, src/dma_engine.{c,h}, src/emu_main.c, src/virt_pci_host.{c,h}, src/virt_msi.{c,h} (new)}
- tests/test_irq_modes.sh
- docs/explanation/{ci-pipeline,kernel-compatibility}.rst, docs/reference/test-applications.rst

Hardware compatibility
Existing INTx-only FPGA bitstreams keep working: the cascade lands on PCI_IRQ_INTX exactly when no MSI/MSI-X cap is advertised, yielding the same IRQ that pcidev->irq (config-space byte 0x3C, the Interrupt Line register) provided before. Old INTx-only datadev builds running against new MSI-capable FPGA bitstreams also keep working -- without the cascade they never set the MSI/MSI-X Enable bit, so the hardware delivers on INTA# the legacy way (per PCIe spec, the function gates INTx assertion based on those Enable bits).

CI verification
End-to-end run via scripts/ci-local/run_cell.sh --container ubuntu:24.04 --load-test 1 --phase cpu on the KVM parity harness (kernel 6.17.0-1011-azure):

Real-hardware verification recipe
For each card type, after insmod datadev.ko:

Repeat load/unload in the same boot to catch teardown-order bugs (free_irq -> pci_free_irq_vectors -> pci_disable_device).

Test plan
- tests/test_irq_modes.sh with the emu_irq_mode sweep; all subtests per cell PASS
- GPU phase (GPU_ENABLED=1; legacy single-mode pass runs)
- Kernel 5.15 (no PCI_IRQ_INTX), 6.11+ (rename), 6.19+ (pci_msi_create_irq_domain guard) all compile
- dma_rate matches baseline

Notes
- IRQF_SHARED is kept unconditionally. The kernel accepts it as a no-op on MSI/MSI-X vectors, the existing AxisG2_Irq already implements the spurious-interrupt check shared mode requires, and not changing Dma_Init's call site keeps the diff out of the shared-with-rce_*/emulator translation unit.
- dev->irqType is for diagnostic logging only -- the shared Dma_Init / Dma_Clean paths do not branch on it.
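For readers unfamiliar with the spurious-interrupt check that shared mode requires, the general shape is sketched below as a userspace model. It is not a copy of AxisG2_Irq: `struct model_dev`, its `pending` field, and `model_irq_handler()` are hypothetical, and the return values stand in for the kernel's IRQ_NONE and IRQ_HANDLED.

```c
#include <stdint.h>

/* Stand-ins for the kernel's irqreturn_t values (model only). */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

/* Hypothetical device state: a pending-event word and a service counter. */
struct model_dev {
    uint32_t pending;
    uint32_t handled_count;
};

/*
 * On a shared line the handler may run for another device's interrupt,
 * so it first checks whether this device has anything pending and
 * declines the interrupt if not.
 */
static enum model_irqreturn model_irq_handler(struct model_dev *dev)
{
    if (dev->pending == 0)
        return MODEL_IRQ_NONE;

    dev->pending = 0; /* acknowledge and service the event */
    dev->handled_count++;
    return MODEL_IRQ_HANDLED;
}
```

On MSI/MSI-X vectors, which are not shared, the early return is simply never taken, which is consistent with keeping IRQF_SHARED set in all modes.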