@nirmoy nirmoy commented Jan 26, 2026

Enable ATS to remain always-on for CXL.cache devices and specific NVIDIA GPUs by adding pci_ats_always_on() API and SMMU driver support.

https://bugs.launchpad.net/ubuntu/+source/linux-nvidia/+bug/2139088

…le devices

Controlled by the IOMMU driver, ATS is usually enabled "on demand", when a
device requests a translation service from its associated IOMMU HW running
on the channel of a given PASID. This is working even when a device has no
translation on its RID, i.e. RID is IOMMU bypassed.

On the other hand, certain PCIe devices require non-PASID ATS when their RID
stream is IOMMU bypassed. Call this "always on".

For instance, the CXL spec notes in "3.2.5.13 Memory Type on CXL.cache":
"To source requests on CXL.cache, devices need to get the Host Physical
 Address (HPA) from the Host by means of an ATS request on CXL.io."
In other words, the CXL.cache capability relies on ATS. Otherwise, it won't
have access to the host physical memory.

Introduce a new pci_ats_always_on() that lets the IOMMU driver scan a PCI
device and choose between the "on demand" and "always on" ATS policies.

Add the support for CXL.cache devices first. Non-CXL devices will be added
in the quirks.c file.
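
As a rough userspace model of the policy described above (not the kernel implementation), the decision could be sketched as follows. `struct model_pci_dev`, its fields, and `model_ats_always_on()` are hypothetical names invented for illustration; the real pci_ats_always_on() works on a struct pci_dev and inspects the device's capabilities. The ATS capability check is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few device properties the policy looks at. */
struct model_pci_dev {
	bool has_ats_cap;       /* device exposes the PCIe ATS capability (assumed precondition) */
	bool cxl_cache_capable; /* device advertises CXL.cache */
	bool quirk_always_on;   /* device is matched by a device-specific quirk list */
};

/*
 * Model of the policy: return true when ATS should stay enabled even though
 * the device's RID is IOMMU bypassed ("always on"), false for "on demand".
 */
static bool model_ats_always_on(const struct model_pci_dev *dev)
{
	if (dev == NULL || !dev->has_ats_cap)
		return false;

	return dev->cxl_cache_capable || dev->quirk_always_on;
}
```

In this model an IOMMU driver would call the helper once per device and pick the "always on" policy only when it returns true; every other device keeps the default "on demand" behavior.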

Suggested-by: Vikram Sethi <vsethi@nvidia.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1768624180.git.nicolinc@nvidia.com)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
…GPUs

Some non-CXL NVIDIA GPU devices support the non-PASID ATS function when their
RIDs are IOMMU bypassed. This differs slightly from the default ATS policy,
which would only enable ATS on demand: when a non-zero PASID line is enabled
in SVA use cases.

Introduce a pci_dev_specific_ats_always_on() quirk function to support a
list of IDs for these devices. Then, include it in pci_ats_always_on().
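
A quirk of this kind usually comes down to a table lookup. The following is a minimal sketch under stated assumptions: the function name, the table, and the two device IDs are placeholders made up for illustration and are not the list from the actual patch. Only the vendor ID 0x10de (NVIDIA) is a real value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_VENDOR_NVIDIA 0x10de /* NVIDIA's PCI vendor ID */

/* Placeholder device IDs for illustration only; NOT the real quirk list. */
static const uint16_t model_always_on_ids[] = { 0x0001, 0x0002 };

/* Return true if (vendor, device) is on the hypothetical always-on list. */
static bool model_dev_specific_ats_always_on(uint16_t vendor, uint16_t device)
{
	size_t i;

	if (vendor != MODEL_VENDOR_NVIDIA)
		return false;

	for (i = 0; i < sizeof(model_always_on_ids) / sizeof(model_always_on_ids[0]); i++) {
		if (model_always_on_ids[i] == device)
			return true;
	}
	return false;
}
```

Keeping the IDs in one table means a new device can be covered by adding a single entry, without touching the policy check that consumes the result.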

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1768624180.git.nicolinc@nvidia.com)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>

When a device's default substream attaches to an identity domain, the SMMU
driver currently sets the device's STE between two modes:

  Mode 1: Cfg=Translate, S1DSS=Bypass, EATS=1
  Mode 2: Cfg=bypass (EATS is ignored by HW)

When there is an active PASID (non-default substream), mode 1 is used. And
when there is no PASID support or no active PASID, mode 2 is used.

The driver will also downgrade an STE from mode 1 to mode 2, when the last
active substream becomes inactive.

However, there are PCIe devices that demand ATS to be always on. For these
devices, the STEs have to use mode 1, because HW ignores EATS in mode 2.

Change the driver accordingly:
  - always use mode 1
  - never downgrade to mode 2
  - allocate and retain a CD table (see note below)
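
The mode selection for an identity-attached default substream, before and after this change, can be modeled roughly as below. This is a simplified sketch, not the driver code: the enum, the function, and its parameters are hypothetical names, and the real driver works on STE fields rather than on an enum.

```c
#include <stdbool.h>

enum model_ste_mode {
	MODEL_STE_MODE1, /* Cfg=Translate, S1DSS=Bypass, EATS=1 */
	MODEL_STE_MODE2, /* Cfg=bypass (EATS is ignored by HW) */
};

/*
 * Pick the STE mode for a device whose default substream is attached to an
 * identity domain. active_pasids counts the active non-default substreams.
 */
static enum model_ste_mode model_identity_ste_mode(bool ats_always_on,
						    unsigned int active_pasids)
{
	/* New rule: always-on devices stay in mode 1 and never downgrade. */
	if (ats_always_on)
		return MODEL_STE_MODE1;

	/* Existing rule: mode 1 only while at least one PASID is active. */
	return active_pasids > 0 ? MODEL_STE_MODE1 : MODEL_STE_MODE2;
}
```

With ats_always_on set, the result no longer depends on active_pasids, which is what "never downgrade to mode 2" means in this model.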

Note that these devices might not support PASID, i.e. they do non-PASID ATS.
In such a case, ssid_bits is set to 0. However, s1cdmax must be set to a
non-zero value in order to keep the S1DSS field effective. Thus, when a master
requires ats_always_on, set its s1cdmax to a minimum of 1, meaning the CD table
will have a dummy entry (SSID=1) that will never be used.
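
The s1cdmax rule in the note can be written as a small helper. This sketch assumes that s1cdmax otherwise follows ssid_bits and that a CD table with s1cdmax = n holds 2^n entries; both function names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/*
 * Model of the s1cdmax choice: normally it tracks ssid_bits, but an
 * always-on master without PASID support (ssid_bits == 0) is bumped to 1
 * so that the S1DSS field stays effective.
 */
static unsigned int model_s1cdmax(unsigned int ssid_bits, bool ats_always_on)
{
	if (ats_always_on && ssid_bits == 0)
		return 1;
	return ssid_bits;
}

/* Number of CD table entries implied by s1cdmax (assumed to be 2^s1cdmax). */
static unsigned int model_cd_entries(unsigned int s1cdmax)
{
	return 1u << s1cdmax;
}
```

For an always-on master with ssid_bits = 0, this model yields a two-entry table: SSID 0 for the default substream and the unused dummy entry at SSID 1.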

Now, for these devices, arm_smmu_cdtab_allocated() will always return true,
vs. false prior to this change. When such a device's default substream is
attached to an IDENTITY domain, the first CD in the table is NULL, which is a
totally valid case. Thus, drop the WARN_ON().


Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1768624180.git.nicolinc@nvidia.com)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>

nirmoy commented Jan 26, 2026

This removes the SMMU event storm ("kernel: arm-smmu-v3 arm-smmu-v3.27.auto: event: UNKNOWN client") for VR, but it still needs validation on Spark. It should allow us to remove those Spark IOMMU quirks.


@nvmochs nvmochs left a comment

Verified these match the LKML series.

Acked-by: Matthew R. Ochs <mochs@nvidia.com>

@jamieNguyenNVIDIA

The ports look good, though there are new-line characters between the Suggested-by: and Signed-off-by: in a couple of the commits:

NVIDIA: VR: SAUCE: PCI: Allow ATS to be always on for CXL.cache capable devices
...
Suggested-by: Vikram Sethi <vsethi@nvidia.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

NVIDIA: VR: SAUCE: PCI: Allow ATS to be always on for non-CXL NVIDIA GPUs
...
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

But I won't hold up the PR over these nits, so...:
Acked-by: Jamie Nguyen <jamien@nvidia.com>


@clsotog clsotog left a comment

Acked-by: Carol L Soto <csoto@nvidia.com>
