[6.17] Backport patches to allow ATS to be always on for certain ATS-capable devices #291

nirmoy · 2026-01-26T16:04:46Z

Enable ATS to remain always on for CXL.cache devices and specific NVIDIA GPUs by adding a pci_ats_always_on() API and the matching SMMU driver support.

https://bugs.launchpad.net/ubuntu/+source/linux-nvidia/+bug/2139088

…le devices

Controlled by the IOMMU driver, ATS is usually enabled "on demand", when a device requests a translation service from its associated IOMMU HW running on the channel of a given PASID. This works even when a device has no translation on its RID, i.e. the RID is IOMMU bypassed.

On the other hand, certain PCIe devices require non-PASID ATS when their RID stream is IOMMU bypassed. Call this "always on".

For instance, the CXL spec notes in "3.2.5.13 Memory Type on CXL.cache":
"To source requests on CXL.cache, devices need to get the Host Physical Address (HPA) from the Host by means of an ATS request on CXL.io."

In other words, the CXL.cache capability relies on ATS. Otherwise, the device won't have access to the host physical memory.

Introduce a new pci_ats_always_on() for the IOMMU driver to scan a PCI device, to shift ATS policies between "on demand" and "always on".

Add the support for CXL.cache devices first. Non-CXL devices will be added in the quirks.c file.

Suggested-by: Vikram Sethi <vsethi@nvidia.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1768624180.git.nicolinc@nvidia.com)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
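The policy split described in this commit can be modeled in a few lines of userspace C. This is only a sketch and not the kernel implementation: struct model_pci_dev, its fields, and the model_* functions are hypothetical names, and the real pci_ats_always_on() would presumably inspect the device's capabilities in config space rather than precomputed flags.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few device properties the policy needs. */
struct model_pci_dev {
    bool ats_capable;       /* device exposes the PCIe ATS capability */
    bool cxl_cache_capable; /* device can source requests on CXL.cache */
};

enum model_ats_policy {
    MODEL_ATS_ON_DEMAND, /* enable ATS only when a PASID is in use */
    MODEL_ATS_ALWAYS_ON, /* keep ATS on even when the RID is IOMMU bypassed */
};

/* Model of the "always on" test: only CXL.cache devices qualify here. */
static bool model_ats_always_on(const struct model_pci_dev *dev)
{
    if (!dev->ats_capable)
        return false; /* nothing to keep on without ATS */
    return dev->cxl_cache_capable;
}

/* How an IOMMU driver might pick a policy from that test. */
static enum model_ats_policy model_ats_policy(const struct model_pci_dev *dev)
{
    return model_ats_always_on(dev) ? MODEL_ATS_ALWAYS_ON
                                    : MODEL_ATS_ON_DEMAND;
}
```

In this model, an ATS-capable CXL.cache device gets MODEL_ATS_ALWAYS_ON, while any other device stays on the default on-demand policy.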

…GPUs

Some non-CXL NVIDIA GPU devices support the non-PASID ATS function when their RIDs are IOMMU bypassed. This is slightly different from the default ATS policy, which would only enable ATS on demand: when a non-zero PASID line is enabled in SVA use cases.

Introduce a pci_dev_specific_ats_always_on() quirk function to support a list of IDs for these devices. Then, include it in pci_ats_always_on().

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1768624180.git.nicolinc@nvidia.com)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
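An ID-list quirk of this kind can be sketched as a table lookup. The following is a userspace illustration under stated assumptions, not the code in quirks.c: the model_* names are hypothetical, and the device IDs in the table are placeholders rather than the IDs the patch actually lists. Only the NVIDIA PCI vendor ID (0x10de) is real.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_VENDOR_ID_NVIDIA 0x10de

struct model_pci_id {
    uint16_t vendor;
    uint16_t device;
};

/* Placeholder device IDs for illustration; not taken from the patch. */
static const struct model_pci_id model_ats_always_on_ids[] = {
    { MODEL_VENDOR_ID_NVIDIA, 0x0001 },
    { MODEL_VENDOR_ID_NVIDIA, 0x0002 },
};

/* Model of the quirk: is this vendor/device pair on the list? */
static bool model_dev_specific_ats_always_on(uint16_t vendor, uint16_t device)
{
    size_t n = sizeof(model_ats_always_on_ids) /
               sizeof(model_ats_always_on_ids[0]);

    for (size_t i = 0; i < n; i++) {
        if (model_ats_always_on_ids[i].vendor == vendor &&
            model_ats_always_on_ids[i].device == device)
            return true;
    }
    return false;
}

/* Model of the combined check: CXL.cache capable, or listed in the quirk. */
static bool model_ats_always_on_with_quirks(bool ats_capable,
                                            bool cxl_cache_capable,
                                            uint16_t vendor, uint16_t device)
{
    if (!ats_capable)
        return false;
    return cxl_cache_capable ||
           model_dev_specific_ats_always_on(vendor, device);
}
```

Keeping the IDs in a table means supporting another device is a one-line change, and the generic check does not need to know about vendor specifics.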

When a device's default substream attaches to an identity domain, the SMMU driver currently sets the device's STE between two modes:

Mode 1: Cfg=Translate, S1DSS=Bypass, EATS=1
Mode 2: Cfg=bypass (EATS is ignored by HW)

When there is an active PASID (non-default substream), mode 1 is used. When there is no PASID support or no active PASID, mode 2 is used.

The driver will also downgrade an STE from mode 1 to mode 2 when the last active substream becomes inactive.

However, there are PCIe devices that demand ATS to be always on. For these devices, their STEs have to use mode 1, as HW ignores EATS with mode 2.

Change the driver accordingly:
- always use mode 1
- never downgrade to mode 2
- allocate and retain a CD table (see note below)

Note that these devices might not support PASID, i.e. they do non-PASID ATS. In such a case, ssid_bits is set to 0. However, s1cdmax must be set to a !0 value in order to keep the S1DSS field effective. Thus, when a master requires ats_always_on, set its s1cdmax to a minimum of 1, meaning the CD table will have a dummy entry (SSID=1) that will never be used.

Now, for these devices, arm_smmu_cdtab_allocated() will always return true, vs. false prior to this change. When the default substream is attached to an IDENTITY domain, its first CD is NULL in the table, which is a totally valid case. Thus, drop the WARN_ON().

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1768624180.git.nicolinc@nvidia.com)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
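The mode selection and the s1cdmax rule described in this commit message can be summarized in a small userspace model. This is a sketch under stated assumptions, not the arm-smmu-v3 driver code: struct model_master and the model_* functions are hypothetical, and only the field names ats_always_on, ssid_bits and s1cdmax come from the text above.

```c
#include <stdbool.h>

enum model_ste_mode {
    MODEL_STE_MODE1, /* Cfg=Translate, S1DSS=Bypass, EATS=1 */
    MODEL_STE_MODE2, /* Cfg=bypass, EATS ignored by HW */
};

/* Hypothetical per-device state for an identity-attached default substream. */
struct model_master {
    bool ats_always_on;         /* device demands ATS to be always on */
    unsigned int ssid_bits;     /* 0 when the device has no PASID support */
    unsigned int active_pasids; /* number of active non-default substreams */
};

/* Mode 1 when ATS must stay on or a PASID is active, otherwise mode 2. */
static enum model_ste_mode model_identity_ste_mode(const struct model_master *m)
{
    if (m->ats_always_on || m->active_pasids > 0)
        return MODEL_STE_MODE1;
    return MODEL_STE_MODE2;
}

/*
 * s1cdmax normally follows ssid_bits, but an always-on master needs a
 * non-zero value so that S1DSS stays effective: clamp it to at least 1.
 */
static unsigned int model_s1cdmax(const struct model_master *m)
{
    if (m->ats_always_on && m->ssid_bits == 0)
        return 1;
    return m->ssid_bits;
}
```

Because the always-on flag is checked before the PASID count, the model never falls back to mode 2 for such a master, even after its last active substream goes away.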

nirmoy · 2026-01-26T16:06:05Z

This removes the SMMU event storm ("kernel: arm-smmu-v3 arm-smmu-v3.27.auto: event: UNKNOWN client") for VR, but it still needs validation on Spark. It should allow us to remove those Spark IOMMU quirks.

nvmochs

Verified these match the LKML series.

Acked-by: Matthew R. Ochs <mochs@nvidia.com>

jamieNguyenNVIDIA · 2026-01-26T21:11:07Z

The ports look good, though there are new-line characters between the Suggested-by: and Signed-off-by: in a couple of the commits:

NVIDIA: VR: SAUCE: PCI: Allow ATS to be always on for CXL.cache capable devices
...
Suggested-by: Vikram Sethi <vsethi@nvidia.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

NVIDIA: VR: SAUCE: PCI: Allow ATS to be always on for non-CXL NVIDIA GPUs
...
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

But I won't hold up the PR over these nits, so...:
Acked-by: Jamie Nguyen <jamien@nvidia.com>

clsotog

Acked-by: Carol L Soto <csoto@nvidia.com>

nicolinc added 3 commits January 26, 2026 02:56

nirmoy requested review from clsotog, jamieNguyenNVIDIA and nvmochs January 26, 2026 16:08

nvmochs approved these changes Jan 26, 2026

clsotog approved these changes Jan 26, 2026

5 participants
