From: "Pavel Shirshov" <pshirshov@7mind.io>
To: linux-pci@vger.kernel.org
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Subject: PCI/ASPM: Intel Battlemage (Arc Pro B70) bricks at boot when `pcie_aspm.policy=powersupersave` enables ASPM_L1.1 on AMD root port link
Date: Thu, 07 May 2026 23:14:04 +0100
Message-ID: <3f0598ec-1f93-4817-b54d-c7433e03830e@app.fastmail.com>

[-- Attachment #1: Type: text/plain, Size: 11349 bytes --]

The report and the patch below were generated entirely with Claude, but the quirk in the patch works.

PCI/ASPM: Intel Battlemage (Arc Pro B70) bricks at boot when
pcie_aspm.policy=powersupersave enables ASPM_L1.1 on AMD root port link

================================================================
SUMMARY
================================================================

On Linux 7.0.3, an Intel Arc Pro B70 (Battlemage / BMG-G31, GPU PCI
ID 8086:e223) plugged into an AMD Ryzen 9 5950X system fails to wake
from D3cold during PCI core enumeration when the kernel is booted
with pcie_aspm.policy=powersupersave. The card is permanently
inaccessible until reboot with a different policy.
pcie_aspm.policy=powersave (L0s+L1, no substates) works correctly.

The failure surfaces in PCI core first; downstream xe driver bind
then fails with -EPROTO:

    pcieport 0000:02:01.0: Unable to change power state from D3cold
                           to D0, device inaccessible
    pcieport 0000:02:02.0: Unable to change power state from D3cold
                           to D0, device inaccessible
    xe 0000:03:00.0: Unable to change power state from D3cold to D0,
                     device inaccessible
    xe 0000:03:00.0: [drm] Running in SR-IOV VF mode
                     [misdetected: dead config space reads as 0xff]
    xe 0000:03:00.0: [drm] *ERROR* VF: Tile0: GT0: Failed to reset
                     GuC state (-EPROTO)
    xe 0000:03:00.0: probe with driver xe failed with error -71

After the brick, "lspci -vvv -s 03:00.0" reports
"!!! Unknown header type 7f" -- the canonical signature of a PCI
device whose config space reads return all-ones, i.e. the link to the
device is dead.
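
The "header type 7f" reading follows directly from an all-ones config
read. A minimal userspace sketch, assuming the standard config-space
layout (Header Type at offset 0x0e, bit 7 the multi-function flag,
bits 6:0 the layout code). The function names are illustrative and
are not taken from lspci or the kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the Header Type register in PCI config space. */
#define CFG_HEADER_TYPE_OFFSET 0x0e

/* Bits 6:0 of Header Type select the layout; bit 7 is the multi-function flag. */
uint8_t header_layout(uint8_t header_type)
{
	return header_type & 0x7f;
}

/* Returns 1 if the buffer is non-empty and every byte is 0xff, as on a dead link. */
int config_reads_all_ones(const uint8_t *cfg, size_t len)
{
	assert(cfg != NULL);

	for (size_t i = 0; i < len; i++)
		if (cfg[i] != 0xff)
			return 0;
	return len > 0;
}
```

Feeding 0xff into header_layout() yields 0x7f, which is not a defined
header layout, hence the "Unknown header type" message.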


================================================================
HARDWARE
================================================================

CPU / root complex:
    AMD Ryzen 9 5950X (Starship/Matisse). The root port hosting the
    BMG card is 0000:00:01.1 -- "Advanced Micro Devices, Inc. [AMD]
    Starship/Matisse GPP Bridge" (subsystem 1022:1453).

GPU:
    Intel Arc Pro B70 -- 8086:e223 (BMG-G31, subsystem 8086:1701).

On-card topology -- the card has a two-layer on-board PCIe switch:
    0000:01:00.0  Intel 8086:e2ff -- BMG card upstream switch port,
                                     PCIe 5.0 x16 capable (currently
                                     downgraded to Gen4 x16).
    0000:02:01.0  Intel 8086:e2f0 -- BMG card downstream switch
                                     port, PCIe Gen1 x1 internal.
    0000:03:00.0  Intel 8086:e223 -- GPU endpoint, PCIe Gen1 x1
                                     internal.

Other:
    BIOS has PCIe ASPM enabled in firmware. pcie_aspm=force is NOT
    set on the kernel command line. Motherboard: ASRock X570
    (specifics in attached dmidecode.txt).


================================================================
REPRODUCER
================================================================

Boot a 7.0 kernel (observed on 7.0.3) with a kernel command line containing:

    pcie_aspm.policy=powersupersave xe.force_probe=*

(Also reproduces under earlier 6.x kernels.)

Reverting the cmdline to "pcie_aspm.policy=powersave" and rebooting
restores the card. No firmware reset is required between attempts --
the brick is purely a runtime link-state failure during kernel boot.
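
When switching between the two command lines it helps to confirm
which policy the kernel actually selected. The running policy is
exposed in /sys/module/pcie_aspm/parameters/policy, with the active
entry in square brackets. A small sketch of extracting it; the helper
name is hypothetical, and reading the file is left to the caller:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed entry of a policy line such as
 * "default performance [powersave] powersupersave" into out.
 * Returns 0 on success, -1 if there is no bracketed entry or it
 * does not fit in out.
 */
int active_aspm_policy(const char *line, char *out, size_t outlen)
{
	assert(line != NULL && out != NULL);

	const char *open = strchr(line, '[');
	if (!open)
		return -1;
	const char *close = strchr(open, ']');
	if (!close)
		return -1;

	size_t n = (size_t)(close - open - 1);
	if (n + 1 > outlen)
		return -1;

	memcpy(out, open + 1, n);
	out[n] = '\0';
	return 0;
}
```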


================================================================
ASPM NEGOTIATION
================================================================

Captured with "lspci -vvv" on a working policy=powersave boot
(attached: 20260507-204348-powersave-7.0.3.tar.zst).

Link 1: 00:01.1 AMD root  <->  01:00.0 BMG upstream
    Upstream end (AMD root port, L1SubCap):
        PCI-PM_L1.2-  PCI-PM_L1.1+  ASPM_L1.2-  ASPM_L1.1+
    Downstream end (BMG upstream switch port, L1SubCap):
        PCI-PM_L1.2+  PCI-PM_L1.1+  ASPM_L1.2+  ASPM_L1.1+
    Active L1SubCtl1 under policy=powersave:
        PCI-PM_L1.2-  PCI-PM_L1.1-  ASPM_L1.2-  ASPM_L1.1-

Link 2: 01:00.0  <->  02:01.0   (card-internal switch)
    No L1SS capability advertised on either end.

Link 3: 02:01.0  <->  03:00.0   (card-internal to GPU)
    No L1SS capability advertised on either end.

Conclusion: only Link 1 -- the platform-facing AMD<->BMG link -- is
L1SS-capable on both ends, and the intersection is ASPM_L1.1 only
(the AMD GPP root port advertises L1.1 but not L1.2). With
policy=powersupersave, the kernel arms ASPM_L1.1 on this link. After
that, every D3cold->D0 transition fails.
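
The intersection argument can be written out as bit arithmetic. A
minimal userspace sketch, assuming the L1 PM Substates Capabilities
bit layout from the PCIe specification (bits 0 to 3 for PCI-PM L1.2,
PCI-PM L1.1, ASPM L1.2, ASPM L1.1, bit 4 for "L1 PM Substates
Supported"). It models only the capability intersection, not the
kernel's full ASPM policy logic, and all names are illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* L1 PM Substates Capabilities register, low bits. */
#define L1SS_CAP_PCIPM_L1_2 0x01u
#define L1SS_CAP_PCIPM_L1_1 0x02u
#define L1SS_CAP_ASPM_L1_2  0x04u
#define L1SS_CAP_ASPM_L1_1  0x08u
#define L1SS_CAP_SUPPORTED  0x10u
#define L1SS_CAP_SUBSTATES  0x0fu

/* Values matching the L1SubCap flags reported for Link 1. */
#define AMD_ROOT_L1SS_CAP (L1SS_CAP_SUPPORTED | L1SS_CAP_PCIPM_L1_1 | L1SS_CAP_ASPM_L1_1)
#define BMG_UP_L1SS_CAP   (L1SS_CAP_SUPPORTED | L1SS_CAP_SUBSTATES)

/* A substate is usable on a link only if both ends advertise it. */
uint32_t l1ss_common(uint32_t upstream_cap, uint32_t downstream_cap)
{
	uint32_t both = upstream_cap & downstream_cap;

	/* Without the Supported bit on both ends, no substate is usable. */
	if (!(both & L1SS_CAP_SUPPORTED))
		return 0;
	return both & L1SS_CAP_SUBSTATES;
}

/* Self-check against the values in this report. */
void check_reported_link(void)
{
	uint32_t common = l1ss_common(AMD_ROOT_L1SS_CAP, BMG_UP_L1SS_CAP);

	assert(common & L1SS_CAP_ASPM_L1_1);    /* ASPM L1.1 is common */
	assert(!(common & L1SS_CAP_ASPM_L1_2)); /* ASPM L1.2 is not */
}
```

On these values the only ASPM substate left in the intersection is
L1.1, which is the one powersupersave ends up enabling.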

Both ends advertise multi-retimer support (Retimer+ 2Retimers+ on
the AMD root port and on the BMG upstream port). Retimers + L1SS
have a history of wake-recovery problems on other platforms; this
may be the same class of issue.


================================================================
TIMELINE -- failed boot, kernel 7.0.3
================================================================

Excerpted from dmesg-relevant.txt in the powersupersave capture:

    28.792s  pcieport 0000:00:01.1: PME: Signaling with IRQ 48
                     [AMD root port for BMG]
    28.842s  pcieport 0000:02:01.0: Unable to change power state from
                     D3cold to D0, device inaccessible
    28.843s  pcieport 0000:02:02.0: Unable to change power state from
                     D3cold to D0, device inaccessible
    ...
    29.034s  xe 0000:03:00.0: Unable to change power state from
                     D3cold to D0, device inaccessible
    29.035s  xe 0000:03:00.0: [drm] Running in SR-IOV VF mode
    29.035s  xe 0000:03:00.0: [drm] *ERROR* VF: Tile0: GT0: Failed
                     to reset GuC state (-EPROTO)
    29.035s  xe 0000:03:00.0: probe with driver xe failed with
                     error -71

The first failed wake at 28.842s is on 02:01.0, the immediate parent
bridge of the BMG GPU, and it occurs before the first xe message at
29.034s. This indicates the failure is in the PCI/ASPM layer, not in
xe; xe just sees the resulting dead config space and misclassifies
the PF as a VF.


================================================================
WORKING-POLICY LSPCI EXCERPTS  (relevant capabilities)
================================================================

policy=powersave baseline, root port 00:01.1:

    LnkCap:  Speed 16GT/s, Width x16, ASPM L1, Exit Latency L1 <64us
    LnkCtl:  ASPM L1 Enabled
    LnkSta:  Speed 16GT/s, Width x16
    Capabilities: [370 v1] L1 PM Substates
        L1SubCap:  PCI-PM_L1.2- PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1+
                   L1_PM_Substates+
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
        L1SubCtl2:

policy=powersave baseline, BMG upstream 01:00.0:

    LnkCap:  Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 <32us
    LnkCtl:  ASPM L1 Enabled
    LnkSta:  Speed 16GT/s (downgraded), Width x16
    Capabilities: [244 v1] L1 PM Substates
        L1SubCap:  PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
                   L1_PM_Substates+
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
        L1SubCtl2: T_PwrOn=14us
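
The "(downgraded)" note on the BMG upstream port is expected here:
the port advertises 32GT/s but the AMD root port tops out at 16GT/s.
A short sketch of that comparison, assuming the usual Link
Capabilities / Link Status field layout (speed code in bits 3:0, lane
count in bits 9:4) and the conventional speed codes. The helper names
are illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* Link Capabilities / Link Status field layout. */
#define LNK_SPEED_MASK  0x000fu   /* bits 3:0: speed code */
#define LNK_WIDTH_MASK  0x03f0u   /* bits 9:4: lane count */
#define LNK_WIDTH_SHIFT 4

/* Conventional speed codes: 1 = 2.5GT/s ... 4 = 16GT/s, 5 = 32GT/s. */
enum { SPEED_16GT = 4, SPEED_32GT = 5 };

/* Build a register value from a speed code and a lane count. */
uint32_t lnk_encode(unsigned speed_code, unsigned width)
{
	assert(speed_code >= 1 && speed_code <= 15 && width <= 63);
	return speed_code | (width << LNK_WIDTH_SHIFT);
}

/* 1 if the negotiated speed is below what this port advertises. */
int speed_downgraded(uint32_t lnkcap, uint16_t lnksta)
{
	return (lnksta & LNK_SPEED_MASK) < (lnkcap & LNK_SPEED_MASK);
}

/* 1 if the negotiated width is below what this port advertises. */
int width_downgraded(uint32_t lnkcap, uint16_t lnksta)
{
	return (lnksta & LNK_WIDTH_MASK) < (lnkcap & LNK_WIDTH_MASK);
}
```

With the values above, the BMG port (32GT/s capable, running at
16GT/s x16) is speed-downgraded but not width-downgraded, while the
root port runs at its own maximum.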


================================================================
PROPOSED FIX
================================================================

Disable all four L1SS substates (ASPM_L1.1, ASPM_L1.2, PCI-PM_L1.1,
PCI-PM_L1.2) on the BMG card's upstream switch port (8086:e2ff) via
a DECLARE_PCI_FIXUP_FINAL. Standard ASPM L1 still applies, so the
link still benefits from the deepest link state the BMG silicon
handles correctly. The quirk keys on the card upstream port, which
is shared across the BMG product family, so it covers all current
BMG SKUs without enumerating individual GPU-endpoint IDs.

The patch is in the attached intel-bmg-disable-l1ss.patch. With the
patch applied, pcie_aspm.policy=powersupersave boots cleanly on this
hardware (verification in progress at time of report).

Empirical narrowing -- ASPM_L1.1 specifically is the trigger.

  An intermediate version of the quirk passed only
  PCIE_LINK_STATE_L1_1 | PCIE_LINK_STATE_L1_2 to
  pci_disable_link_state(), leaving the PCI-PM substate bits armed.
  After applying that variant, lspci reported

      L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1-

  on the BMG upstream port -- i.e. both ASPM substate bits were
  clear while PCI-PM_L1.1 stayed armed (PCI-PM_L1.2 is not
  advertised by the AMD root port, so it was never enabled) -- yet
  the system booted, xe bound, and the GPU operated normally. Combined
  with the AMD root port advertising only ASPM_L1.1+ (not L1.2),
  this isolates ASPM_L1.1 as the specific bit whose activation
  bricks the BMG card. The PCI-PM L1.x substates were also disabled
  in the final patch for hygiene, but they are not load-bearing for
  the fix on this hardware (the GPU does not enter D3hot during
  normal operation, so PCI-PM substates are inert).
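
  Checking which substate bits a given quirk variant left armed
  comes down to reading the '+' / '-' suffixes in the L1SubCtl1
  line. A minimal sketch of that check, assuming the lspci -vvv
  flag formatting shown in this report; the helper name is
  hypothetical and it expects the full flag name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up a flag such as "ASPM_L1.1" in an lspci line where each flag
 * is printed with a trailing '+' (set) or '-' (clear).
 * Returns 1 if set, 0 if clear, -1 if the flag is not present.
 */
int lspci_flag(const char *line, const char *name)
{
	assert(line != NULL && name != NULL && name[0] != '\0');

	size_t len = strlen(name);

	/* Walk every occurrence until one is followed by '+' or '-'. */
	for (const char *p = strstr(line, name); p; p = strstr(p + 1, name)) {
		if (p[len] == '+')
			return 1;
		if (p[len] == '-')
			return 0;
	}
	return -1;
}
```

  Applied to the intermediate-variant line above, this reports
  ASPM_L1.1 and ASPM_L1.2 as clear and PCI-PM_L1.1 as set.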

Remaining open questions for review:

  1. Is the underlying defect in the AMD Starship root port (cannot
     wake the link from ASPM_L1.1) or in the BMG e2ff upstream port
     (cannot exit ASPM_L1.1 cleanly)? If the former, future BMG
     cards on Intel platforms may not need this quirk; if the
     latter, the quirk is correct for BMG everywhere. We do not
     have a non-AMD reproducer to disambiguate.

  2. Should the quirk also apply to the AMD Starship/Matisse GPP
     Bridge itself (1022:1483 / 1022:1484-class IDs, see
     lspci-nn.txt)? That would be a broader brushstroke but might
     protect other devices presenting the same negotiation.


================================================================
WORKAROUND IN USE
================================================================

Until the quirk lands upstream, downstream users on this hardware
must boot with pcie_aspm.policy=powersave (or default), losing
~25 W of idle savings that the deeper substates would otherwise
provide.


================================================================
ATTACHMENTS
================================================================

Tarballs produced by debug/20260507-aspm-capture.sh:

    20260507-204348-powersave-7.0.3.tar.zst
        -- working baseline

    20260507-205055-powersupersave-7.0.3.tar.zst
        -- failed reproduction

Each tarball contains:

    manifest.txt           kernel, policy, hostname, GPU BDFs
    cmdline.txt            kernel command line
    uname.txt              kernel version
    nixos.txt              userspace metadata
    dmidecode.txt          BIOS/board info
    lspci-tree.txt         PCI topology
    lspci-nn.txt           PCI device list
    lspci-vvv-all.txt      full system lspci -vvv

    gpu-03_00_0/           per-device captures for the GPU and
                           every PCI ancestor up to the root
                           complex:
        lspci-vvv.txt          GPU
        parent-0-02_01_0.txt   BMG card-internal downstream switch
        parent-1-01_00_0.txt   BMG card upstream port (e2ff)
        parent-2-00_01_1.txt   AMD root port
        sysfs.txt              selected sysfs attributes

    dmesg-full.txt                full kernel ring buffer
    dmesg-relevant.txt            filtered for PCI/xe/ASPM/L1
    journal-kernel-current-boot.txt
    journal-kernel-prev-boot.txt
    drivers.txt                   xe / i915 driver state,
                                  /sys/class/drm

Patch: intel-bmg-disable-l1ss.patch  (attached separately)

NixOS 26.05 (nixpkgsRevision:
    549bd84d6279f9852cae6225e372cc67fb91a4c1)

Kernel:
    7.0.3 #1-NixOS SMP PREEMPT_DYNAMIC Thu Apr 30 09:13:05 UTC 2026

[-- Attachment #2: 20260507-204348-powersave-7.0.3.tar.zst --]
[-- Type: application/zstd, Size: 189717 bytes --]

[-- Attachment #3: 20260507-205055-powersupersave-7.0.3.tar.zst --]
[-- Type: application/zstd, Size: 61237 bytes --]

[-- Attachment #4: intel-bmg-disable-l1ss.patch --]
[-- Type: text/x-patch; name="intel-bmg-disable-l1ss.patch", Size: 4555 bytes --]

PCI/ASPM: disable L1.1/L1.2 substates for Intel Battlemage discrete GPU upstream port

Intel Battlemage (BMG-G21 / BMG-G31, e.g. Arc Pro B70) discrete GPU cards
expose a two-layer on-card PCIe switch:

    AMD/Intel root port  <->  8086:e2ff (BMG card upstream)
                              8086:e2f0 (BMG card downstream)
                              8086:e22x (BMG GPU endpoint, e.g. e223 = Arc Pro B70)

The platform-facing link (root port <-> 8086:e2ff) is the only link in the
chain that advertises L1 PM Substates support on both ends. On AMD
Starship/Matisse (Ryzen 5xxx) root ports, the intersection is ASPM_L1.1
only (the AMD port advertises L1.1 but not L1.2). When pcie_aspm.policy=
powersupersave arms ASPM_L1.1 on this link, the BMG card cannot recover
from the resulting low-power state on subsequent D3cold->D0 transition,
leaving the device permanently inaccessible:

    pcieport 0000:02:01.0: Unable to change power state from D3cold to D0, device inaccessible
    xe 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
    xe 0000:03:00.0: [drm] Running in SR-IOV VF mode    [misdetected: dead config space]
    xe 0000:03:00.0: [drm] *ERROR* VF: Tile0: GT0: Failed to reset GuC state (-EPROTO)
    xe 0000:03:00.0: probe with driver xe failed with error -71

Reproduces deterministically on Linux 7.0.3 with an Arc Pro B70 in an
AMD Ryzen 9 5950X system. pcie_aspm.policy=powersave (L0s+L1 only, no
substates) works correctly; pcie_aspm.policy=powersupersave bricks the
card on every boot. The 6.x-era blanket `no_d3cold` quirk for Battlemage
was narrowed to ASUS NUC13 only in 7.0, but that change is orthogonal:
the failure here is link-state, not device-state, and surfaces
regardless of d3cold_allowed.

Disable all four L1SS substates (ASPM_L1.1, ASPM_L1.2, PCI-PM_L1.1,
PCI-PM_L1.2) on the BMG card's upstream port via a final PCI fixup.
Standard ASPM L1 still applies, so the link still benefits from the
deepest link state the BMG silicon actually handles correctly. The
quirk is keyed on the upstream-port device ID 0xe2ff so it covers
all current Battlemage SKUs (the GPU-endpoint ID varies by SKU, but
the upstream switch is shared).

Empirical narrowing (verified post-fix): with a partial mask that
disabled only ASPM_L1.{1,2} but left PCI-PM_L1.{1,2} armed, the
system boots and operates correctly. This isolates ASPM_L1.1 as the
specific trigger of the brick (the AMD root port advertises ASPM_L1.1
but not ASPM_L1.2, so ASPM_L1.2 cannot have been activated). The
PCI-PM substates only activate during D3hot transitions which the
GPU does not undergo during normal use; they are disabled here for
hygiene rather than necessity.

Reported-by: Pavel Shirshov <pavel@7mind.io>
Signed-off-by: <FILL IN BEFORE SUBMITTING UPSTREAM>

--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6289,6 +6289,34 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b0, aspm_l1_acceptable_latency
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency);
+
+/*
+ * Intel Battlemage discrete GPU cards (BMG-G21 / BMG-G31; Arc B580,
+ * Arc Pro B50/B60/B70) expose a two-layer on-card PCIe switch. The
+ * platform-facing link, between the host root port and the card's
+ * upstream switch port (PCI device ID 0xe2ff), is the only link in the
+ * chain advertising L1 PM Substates on both ends. On at least AMD
+ * Starship/Matisse root ports, where the intersection is ASPM_L1.1
+ * only, arming L1.1 leaves the BMG card unable to wake from D3cold:
+ *
+ *   pcieport 0000:02:01.0: Unable to change power state from D3cold
+ *                          to D0, device inaccessible
+ *   xe 0000:03:00.0: probe with driver xe failed with error -71
+ *
+ * Reproduces deterministically with pcie_aspm.policy=powersupersave,
+ * works correctly with policy=powersave (no substates). Disable L1SS
+ * substates on the BMG card upstream port; standard L1 ASPM is
+ * unaffected.
+ */
+static void quirk_intel_bmg_no_l1ss(struct pci_dev *dev)
+{
+	pci_disable_link_state(dev, PCIE_LINK_STATE_L1_2 |
+				    PCIE_LINK_STATE_L1_1 |
+				    PCIE_LINK_STATE_L1_2_PCIPM |
+				    PCIE_LINK_STATE_L1_1_PCIPM);
+	pci_info(dev, "intel-bmg-aspm-quirk: L1.1/L1.2 substates disabled on BMG upstream port\n");
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0xe2ff, quirk_intel_bmg_no_l1ss);
 #endif

 #ifdef CONFIG_PCIE_DPC
