Linux-PCI Archive mirror
From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: linux-pci@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
	Lukas Wunner <lukas@wunner.de>
Subject: Re: [PATCH v3] PCI/PTM: Do not enable PTM solely based on the capability existense
Date: Fri, 31 Oct 2025 07:09:59 +0100
Message-ID: <20251031060959.GY2912318@black.igk.intel.com>
In-Reply-To: <20251030205937.GA1648870@bhelgaas>

On Thu, Oct 30, 2025 at 03:59:37PM -0500, Bjorn Helgaas wrote:
> In subject, s/existense/existence/
> 
> Actually, I'd try to include something more specific like "enable PTM
> only if it advertises a role".

Okay.

> On Thu, Oct 30, 2025 at 02:46:05PM +0100, Mika Westerberg wrote:
> > It is not advisable to enable PTM solely based on the fact that the
> > capability exists. Instead there are separate bits in the capability
> > register that need to be set for the feature to be enabled for a given
> > component (this is suggestion from Intel PCIe folks, and also shown in
> > PCIe r7.0 sec 6.21.1 figure 6-21):
> 
> Can we start with a minimal statement of what's wrong?  Is the problem
> that 01:00.0 sent a PTM Request Message that 00:07.0 detected as an
> ACS violation?

The problem is that once the PCIe Switch is hotplugged, we get a flood of
AER errors like the ones below (here the Upstream Port is 2b:00.0; in the
previous example it was 01:00.0):

[  156.337979] pci 0000:2b:00.0: PTM enabled, 4ns granularity
[  156.350822] pcieport 0000:00:07.1: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:07.1
[  156.361417] pcieport 0000:00:07.1: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Receiver ID)
[  156.372656] pcieport 0000:00:07.1:   device [8086:e44f] error status/mask=00200000/00000000
[  156.381041] pcieport 0000:00:07.1:    [21] ACSViol                (First)
[  156.387842] pcieport 0000:00:07.1: AER:   TLP Header: 0x34000000 0x00000052 0x00000000 0x00000000
[  156.396731] pcieport 0000:00:07.1: AER: broadcast error_detected message
[  156.403498] pcieport 0000:00:07.1: AER: broadcast mmio_enabled message
[  156.410060] pcieport 0000:00:07.1: AER: broadcast resume message
[  156.416131] pcieport 0000:00:07.1: AER: device recovery successful
[  156.422345] pcieport 0000:00:07.1: AER: Uncorrectable (Non-Fatal) error message received from 0000:00:07.1

Here 00:07.1 is the PCIe Root Port.

> I guess we enabled PTM on 01:00.0 even though it doesn't advertise any
> roles in the PTM Capability, and it sent a PTM Request Message anyway?

Yes, I think so.

> Weird to expose a PTM Capability and not advertise any roles, and also
> weird to send PTM Messages when enabled in that case.
> 
> >   - PCIe Endpoint that has PTM capability must to declare requester
> >     capable
> >   - PCIe Switch Upstream Port that has PTM capability must declare
> >     at least responder capable
> >   - PCIe Root Port must declare root port capable.
> > 
> > Currently we see following:
> > 
> >   pci 0000:01:00.0: [8086:5786] type 01 class 0x060400 PCIe Switch Upstream Port
> >   pci 0000:01:00.0: PCI bridge to [bus 00]
> >   pci 0000:01:00.0:   bridge window [io  0x0000-0x0fff]
> >   pci 0000:01:00.0:   bridge window [mem 0x00000000-0x000fffff]
> >   pci 0000:01:00.0:   bridge window [mem 0x00000000-0x000fffff 64bit pref]
> 
> I don't think the windows are relevant.

Okay.

> >   pci 0000:01:00.0: PTM enabled, 4ns granularity
> >   ...
> >   pcieport 0000:00:07.0: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:07.0
> >   pcieport 0000:00:07.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Receiver ID)
> >   pcieport 0000:00:07.0:   device [8086:e44e] error status/mask=00200000/00000000
> >   pcieport 0000:00:07.0:    [21] ACSViol                (First)
> 
> Is there any Header Log info here?  I assume if there is, it would
> show a PTM Message?

I pasted it above. Does it tell you anything useful?
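
As a rough aid for reading the "TLP Header" line in the log, here is a
minimal sketch that decodes the first two header DWORDs. It is not kernel
code. The helper names are hypothetical, and the field positions (Fmt in
DW0 bits 31:29, Type in DW0 bits 28:24, Message Code in DW1 bits 7:0) and
the PTM Request message code 0x52 are assumptions taken from the PCIe Base
Specification that should be checked against the spec itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed message code for a PTM Request (PCIe Base Specification). */
#define MSG_CODE_PTM_REQUEST 0x52

/* DW0 bits 31:29: header format (001b = 4DW header, no data). */
static inline unsigned int tlp_fmt(uint32_t dw0)
{
	return (dw0 >> 29) & 0x7;
}

/* DW0 bits 28:24: TLP type. */
static inline unsigned int tlp_type(uint32_t dw0)
{
	return (dw0 >> 24) & 0x1f;
}

/* Message TLPs have Type[4:3] == 10b; Type[2:0] carries the routing. */
static inline bool tlp_is_message(uint32_t dw0)
{
	return (tlp_type(dw0) & 0x18) == 0x10;
}

/* Type[2:0] of a Message: 100b means "Local, terminate at receiver". */
static inline unsigned int tlp_msg_routing(uint32_t dw0)
{
	return tlp_type(dw0) & 0x7;
}

/* DW1 bits 7:0 of a Message TLP: the message code. */
static inline unsigned int tlp_msg_code(uint32_t dw1)
{
	return dw1 & 0xff;
}

/* True if the two DWORDs look like a PTM Request under the assumptions above. */
static inline bool tlp_is_ptm_request(uint32_t dw0, uint32_t dw1)
{
	return tlp_is_message(dw0) &&
	       tlp_msg_code(dw1) == MSG_CODE_PTM_REQUEST;
}
```

With the values from the log, DW0 = 0x34000000 gives Fmt 001b and Type
10100b (a Message routed "Local, terminate at receiver"), and DW1 =
0x00000052 gives message code 0x52, which under these assumptions is a PTM
Request.
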

> > The 01:00.0 PCIe Upstream Port has this:
> > 
> >   Capabilities: [220 v1] Precision Time Measurement
> > 		PTMCap: Requester- Responder- Root-
> > 
> > This happens because Linux sees the PTM capability and blindly enables
> > PTM which then causes the AER error to trigger.
> > 
> > Fix this by enabling PTM only if the above described criteria is met.
> > ...
> 
> > +++ b/drivers/pci/pcie/ptm.c
> > @@ -81,9 +81,24 @@ void pci_ptm_init(struct pci_dev *dev)
> >  		dev->ptm_granularity = 0;
> >  	}
> >  
> > -	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
> > -	    pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM)
> > -		pci_enable_ptm(dev, NULL);
> > +	switch (pci_pcie_type(dev)) {
> > +	case PCI_EXP_TYPE_ROOT_PORT:
> > +		/*
> > +		 * Root Port must declare Root Capable if we want to
> > +		 * enable PTM for it.
> > +		 */
> > +		if (dev->ptm_root)
> > +			pci_enable_ptm(dev, NULL);
> > +		break;
> > +	case PCI_EXP_TYPE_UPSTREAM:
> > +		/*
> > +		 * Switch Upstream Ports must at least declare Responder
> > +		 * Capable if we want to enable PTM for it.
> > +		 */
> > +		if (cap & PCI_PTM_CAP_RES)
> > +			pci_enable_ptm(dev, NULL);
> > +		break;
> > +	}
> >  }
> >  
> >  void pci_save_ptm_state(struct pci_dev *dev)
> > @@ -144,6 +159,18 @@ static int __pci_enable_ptm(struct pci_dev *dev)
> >  			return -EINVAL;
> >  	}
> >  
> > +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ENDPOINT ||
> > +	    pci_pcie_type(dev) == PCI_EXP_TYPE_LEG_END) {
> > +		u32 cap;
> > +		/*
> > +		 * PCIe Endpoint must declare Requester Capable before we
> > +		 * can enable PTM for it.
> > +		 */
> > +		pci_read_config_dword(dev, ptm + PCI_PTM_CAP, &cap);
> > +		if (!(cap & PCI_PTM_CAP_REQ))
> > +			return -EINVAL;
> > +	}
> 
> The asymmetry of testing PCI_PTM_CAP_ROOT back in pci_ptm_init() (via
> dev->ptm_root) but testing PCI_PTM_CAP_REQ here feels a little
> confusing to me.
> 
> Also, we already read PCI_PTM_CAP in pci_ptm_init(), and we did cache
> ptm_root.  Maybe we should also cache ptm_responder and ptm_requester
> and test all of them here in __pci_enable_ptm() and drop the tests in
> pci_ptm_init()?

Sure, I can do it that way too.
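
A minimal user-space sketch of the suggested shape, with all three role
bits cached once and tested in a single place, could look like the
following. It is only an illustration and not the actual follow-up patch:
struct ptm_roles and both helpers are hypothetical stand-ins for fields in
struct pci_dev and for the check inside __pci_enable_ptm(), and the
constants are copied by value from include/uapi/linux/pci_regs.h so the
example builds on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_TYPE_ENDPOINT	0x0
#define PCI_EXP_TYPE_LEG_END	0x1
#define PCI_EXP_TYPE_ROOT_PORT	0x4
#define PCI_EXP_TYPE_UPSTREAM	0x5

#define PCI_PTM_CAP_REQ		0x00000001	/* Requester capable */
#define PCI_PTM_CAP_RES		0x00000002	/* Responder capable */
#define PCI_PTM_CAP_ROOT	0x00000004	/* Root capable */

/* Hypothetical stand-in for role bits cached in struct pci_dev. */
struct ptm_roles {
	bool requester;
	bool responder;
	bool root;
};

/* Cache the advertised roles from a PTM Capability register value. */
static inline struct ptm_roles ptm_cache_roles(uint32_t cap)
{
	struct ptm_roles r = {
		.requester = cap & PCI_PTM_CAP_REQ,
		.responder = cap & PCI_PTM_CAP_RES,
		.root = cap & PCI_PTM_CAP_ROOT,
	};

	return r;
}

/* Decide whether PTM may be enabled for a given PCIe device type. */
static inline bool ptm_role_allows_enable(int pcie_type, struct ptm_roles r)
{
	switch (pcie_type) {
	case PCI_EXP_TYPE_ROOT_PORT:
		return r.root;		/* must be Root Capable */
	case PCI_EXP_TYPE_UPSTREAM:
		return r.responder;	/* at least Responder Capable */
	case PCI_EXP_TYPE_ENDPOINT:
	case PCI_EXP_TYPE_LEG_END:
		return r.requester;	/* must be Requester Capable */
	default:
		/* Assumption: other types are not restricted in this sketch. */
		return true;
	}
}
```

For the 01:00.0 Upstream Port above (PTMCap: Requester- Responder- Root-),
ptm_cache_roles(0) leaves all three bits clear, so
ptm_role_allows_enable(PCI_EXP_TYPE_UPSTREAM, ...) returns false and PTM
would stay disabled.
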

> >  	pci_read_config_dword(dev, ptm + PCI_PTM_CTRL, &ctrl);
> >  
> >  	ctrl |= PCI_PTM_CTRL_ENABLE;
> > -- 
> > 2.50.1
> > 
