From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: D Scott Phillips <scott@os.amperecomputing.com>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
linux-pci@vger.kernel.org,
"Michał Winiarski" <michal.winiarski@intel.com>,
"Igor Mammedov" <imammedo@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
"Mika Westerberg" <mika.westerberg@linux.intel.com>
Subject: Re: [PATCH 24/25] PCI: Perform reset_resource() and build fail list in sync
Date: Wed, 25 Jun 2025 20:45:34 +0300 (EEST) [thread overview]
Message-ID: <e20e3171-7aa5-0646-7934-e6b10cdefc4e@linux.intel.com> (raw)
In-Reply-To: <86plf0lgit.fsf@scott-ph-mail.amperecomputing.com>
[-- Attachment #1: Type: text/plain, Size: 10743 bytes --]
On Wed, 18 Jun 2025, D Scott Phillips wrote:
> Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> writes:
>
> > Resetting resource is problematic as it prevent attempting to allocate
> > the resource later, unless something in between restores the resource.
> > Similarly, if fail_head does not contain all resources that were reset,
> > those resource cannot be restored later.
> >
> > The entire reset/restore cycle adds complexity and leaving resources
> > into reseted state causes issues to other code such as for checks done
> > in pci_enable_resources(). Take a small step towards not resetting
> > resources by delaying reset until the end of resource assignment and
> > build failure list (fail_head) in sync with the reset to avoid leaving
> > behind resources that cannot be restored (for the case where the caller
> > provides fail_head in the first place to allow restore somewhere in the
> > callchain, as is not all callers pass non-NULL fail_head).
> >
> > The Expansion ROM check is temporarily left in place while building the
> > failure list until the upcoming change which reworks optional resource
> > handling.
> >
> > Ideally, whole resource reset could be removed but doing that in a big
> > step would make the impact non-tractable due to complexity of all
> > related code.
> >
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>
> Hi Ilpo, I'm seeing a crash on arm64 at boot that I bisected to this
> change. I don't think it's the same as the other issues reported. I've
> confirmed the crash is still there after your follow up patches. The
> crash itself is below[1].
>
> It looks like the problem begins when:
>
> amdgpu_device_resize_fb_bar()
> pci_resize_resource()
> pci_reassign_bridge_resources()
> __pci_bus_size_bridges()
>
> adds pci_hotplug_io_size to `realloc_head`. The io resource allocation
> has failed earlier because the root port doesn't have an io window[2].
>
> Then with this patch, pci_reassign_bridge_resources()'s call to
> __pci_bridge_assign_resources() now returns the io added space for
> hotplug in the `failed` list where the old code dropped it and did not.
>
> That sends pci_reassign_bridge_resources() into the `cleanup:` path,
> where I think the cleanup code doesn't properly release the resources
> that were assigned by __pci_bridge_assign_resources() and there's a
> conflict reported in pci_claim_resource() where a restored resource is
> found as conflicting with itself:
>
> > pcieport 000d:00:01.0: bridge window [mem 0x340000000000-0x340017ffffff 64bit pref]: can't claim; address conflict with PCI Bus 000d:01 [mem 0x340000000000-0x340017ffffff 64bit pref]
>
> Setting `pci=hpiosize=0` avoids this crash, as does this change:
>
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 16d5d390599a..59ece11702da 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -2442,7 +2442,7 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
> LIST_HEAD(saved);
> LIST_HEAD(added);
> LIST_HEAD(failed);
> - unsigned int i;
> + unsigned int i, relevant_fails;
> int ret;
>
> down_read(&pci_bus_sem);
> @@ -2490,7 +2490,16 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
> __pci_bridge_assign_resources(bridge, &added, &failed);
> BUG_ON(!list_empty(&added));
>
> - if (!list_empty(&failed)) {
> + relevant_fails = 0;
> + list_for_each_entry(dev_res, &failed, list) {
> + restore_dev_resource(dev_res);
> + if (((dev_res->res->flags ^ type) & PCI_RES_TYPE_MASK) == 0)
> + relevant_fails++;
> + }
> + free_list(&failed);
> +
> + /* Cleanup if we had failures in resources of interest */
> + if (relevant_fails != 0) {
> ret = -ENOSPC;
> goto cleanup;
> }
> @@ -2509,11 +2518,6 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
> return 0;
>
> cleanup:
> - /* Restore size and flags */
> - list_for_each_entry(dev_res, &failed, list)
> - restore_dev_resource(dev_res);
> - free_list(&failed);
> -
> /* Revert to the old configuration */
> list_for_each_entry(dev_res, &saved, list) {
> struct resource *res = dev_res->res;
>
> I don't know this code well enough to know if that changes is completely
> bonkers or what.
Hi again,
Thanks for all the details what you think went wrong, it was really
useful. I think you have it towards the right direction but a more
targetted seems enough to address this (this needs to be confirmed, please
test the patch below).
The most correct solution would be to make all the resource fitting code
to focus on the resources that match the type filter. However, that looks
way too scary change at the moment to implement, and especially, let it
end up into stable (to fix this issue). So it looks this somewhat band-aid
solution similar to your attempt might be better as a fix for now.
In medium term, I'd want to avoid using type as a filter and base all
such decisions on matching the bridge window resource the dev resource
belongs to. I've some work towards that direction already which removes
lots of complexity in which bridge window is going to be selected as
there will be a single place to make always the same decision. That change
is also going to simplify the internal interfaces between functions very
noticably (but the change require more testing before I've enough
confidence to submit it). That work doesn't cover this resize side yet but
it should be extended there as well.
So please test this somewhat band-aid patch:
From 971686ed85e341e7234f8fe8b666140187f63ad1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen@linux.intel.com>
Date: Wed, 25 Jun 2025 20:30:43 +0300
Subject: [PATCH 1/1] PCI: Fix failure detectiong during resource resize
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Since the commit 96336ec70264 ("PCI: Perform reset_resource() and build
fail list in sync") the failed list is always built and returned to let
the caller decide if what to do with the failures. The caller may want
to retry resource fitting and assignment and before that can happen,
the resources should be restored to their original state (a reset
effectively clears the struct resource), which requires returning them
on the failed list so that the original state remains stored in the
associated struct pci_dev_resource.
Resource resizing is different from the ordinary resource fitting and
assignment in that it only considers part of the resources. This means
failures for other resource types are not relevant at all and should be
ignored. As resize doesn't unassign such unrelated resources, those
resource ending up into the failed list implies assignment of that
resource must have failed before resize too. The check in
pci_reassign_bridge_resources() to decide if the whole assignment is
successful, however, is based on list emptiness which may cause false
negatives when the failed list resources with unrelated type.
If the failed list is not empty, call pci_required_resource_failed()
and extend it to be able to filter on specific resource types too (if
provided).
Calling pci_required_resource_failed() at this point is slightly
problematic because the resource itself is reset when the failed list
is constructed in __assign_resources_sorted(). As a result,
pci_resource_is_optional() does not have access to the original
resource flags. This could be worked around by restoring and
re-reseting the resource around the call to pci_resource_is_optional(),
however, it shouldn't cause issue as resource resizing is meant for
64-bit prefetchable resources according to Christian König (see the
Link which unfortunately doesn't point directly to Christian's reply
because lore didn't store that email at all).
Link: https://lore.kernel.org/all/c5d1b5d8-8669-5572-75a7-0b480f581ac1@linux.intel.com/
Reported-by: D Scott Phillips <scott@os.amperecomputing.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
---
drivers/pci/setup-bus.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 07c3d021a47e..8284bbdc44b4 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -28,6 +28,10 @@
#include <linux/acpi.h>
#include "pci.h"
+#define PCI_RES_TYPE_MASK \
+ (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
+ IORESOURCE_MEM_64)
+
unsigned int pci_flags;
EXPORT_SYMBOL_GPL(pci_flags);
@@ -384,13 +388,19 @@ static bool pci_need_to_release(unsigned long mask, struct resource *res)
}
/* Return: @true if assignment of a required resource failed. */
-static bool pci_required_resource_failed(struct list_head *fail_head)
+static bool pci_required_resource_failed(struct list_head *fail_head,
+ unsigned long type)
{
struct pci_dev_resource *fail_res;
+ type &= ~PCI_RES_TYPE_MASK;
+
list_for_each_entry(fail_res, fail_head, list) {
int idx = pci_resource_num(fail_res->dev, fail_res->res);
+ if (type && (fail_res->flags & PCI_RES_TYPE_MASK) != type)
+ continue;
+
if (!pci_resource_is_optional(fail_res->dev, idx))
return true;
}
@@ -504,7 +514,7 @@ static void __assign_resources_sorted(struct list_head *head,
}
/* Without realloc_head and only optional fails, nothing more to do. */
- if (!pci_required_resource_failed(&local_fail_head) &&
+ if (!pci_required_resource_failed(&local_fail_head, 0) &&
list_empty(realloc_head)) {
list_for_each_entry(save_res, &save_head, list) {
struct resource *res = save_res->res;
@@ -1704,10 +1714,6 @@ static void __pci_bridge_assign_resources(const struct pci_dev *bridge,
}
}
-#define PCI_RES_TYPE_MASK \
- (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
- IORESOURCE_MEM_64)
-
static void pci_bridge_release_resources(struct pci_bus *bus,
unsigned long type)
{
@@ -2445,8 +2451,12 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
free_list(&added);
if (!list_empty(&failed)) {
- ret = -ENOSPC;
- goto cleanup;
+ if (pci_required_resource_failed(&failed, type)) {
+ ret = -ENOSPC;
+ goto cleanup;
+ }
+ /* Only resources with unrelated types failed (again) */
+ free_list(&failed);
}
list_for_each_entry(dev_res, &saved, list) {
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
--
2.39.5
next prev parent reply other threads:[~2025-06-25 17:45 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-16 17:56 [PATCH 00/25] PCI: Resource fitting/assignment fixes and cleanups Ilpo Järvinen
2025-02-14 9:59 ` Xiaochun Lee
2024-12-16 17:56 ` [PATCH 01/25] PCI: Remove add_align overwrite unrelated to size0 Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 02/25] PCI: size0 is unrelated to add_align Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 03/25] PCI: Simplify size1 assignment logic Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 04/25] PCI: Optional bridge window size too may need relaxing Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 05/25] PCI: Fix old_size lower bound in calculate_iosize() too Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 06/25] PCI: Use SZ_* instead of literals in setup-bus.c Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 07/25] PCI: resource_set_range/size() conversions Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 08/25] PCI: Add a helper to identify IOV resources Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 09/25] PCI: Check resource_size() separately Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 10/25] PCI: Add pci_resource_num() helper Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 11/25] PCI: Add dev & res local variables to resource assignment funcs Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 12/25] PCI: Converge return paths in __assign_resources_sorted() Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 13/25] PCI: Refactor pdev_sort_resources() & __dev_sort_resources() Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 14/25] PCI: Use while loop and break instead of gotos Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 15/25] PCI: Rename retval to ret Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 16/25] PCI: Consolidate assignment loop next round preparation Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 17/25] PCI: Remove wrong comment from pci_reassign_resource() Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 18/25] PCI: Add restore_dev_resource() Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 19/25] PCI: Extend enable to check for any optional resource Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 20/25] PCI: Always have realloc_head in __assign_resources_sorted() Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 21/25] PCI: Indicate optional resource assignment failures Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 22/25] PCI: Add debug print when releasing resources before retry Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 23/25] PCI: Use res->parent to check is resource is assigned Ilpo Järvinen
2024-12-16 17:56 ` [PATCH 24/25] PCI: Perform reset_resource() and build fail list in sync Ilpo Järvinen
2025-04-01 2:35 ` Guenter Roeck
2025-04-01 10:18 ` Ilpo Järvinen
2025-04-01 12:07 ` Ilpo Järvinen
2025-04-01 14:15 ` Guenter Roeck
2025-04-01 17:38 ` Ilpo Järvinen
2025-04-11 19:37 ` Ondřej Jirman
2025-04-14 9:52 ` Ilpo Järvinen
2025-04-14 12:19 ` Ondřej Jirman
2025-04-14 13:15 ` Ilpo Järvinen
2025-04-14 13:43 ` Ondřej Jirman
2025-04-14 13:52 ` Ilpo Järvinen
2025-04-01 13:28 ` Guenter Roeck
2025-05-06 15:03 ` Tudor Ambarus
2025-05-06 15:53 ` Ilpo Järvinen
2025-05-28 11:22 ` Tudor Ambarus
2025-05-28 11:39 ` Tudor Ambarus
2025-05-28 13:09 ` Tudor Ambarus
2025-05-30 6:55 ` Ilpo Järvinen
2025-05-30 6:38 ` Ilpo Järvinen
2025-05-30 14:48 ` Ilpo Järvinen
2025-06-02 14:40 ` Tudor Ambarus
2025-06-02 15:08 ` Ilpo Järvinen
2025-06-02 18:42 ` Tudor Ambarus
2025-06-03 8:13 ` Ilpo Järvinen
2025-06-03 10:36 ` Tudor Ambarus
2025-06-03 10:48 ` Tudor Ambarus
2025-06-03 11:43 ` Tudor Ambarus
2025-06-03 14:23 ` Ilpo Järvinen
2025-06-03 14:43 ` Ilpo Järvinen
2025-06-03 14:13 ` Ilpo Järvinen
2025-06-03 15:25 ` Tudor Ambarus
2025-06-03 17:03 ` Ilpo Järvinen
2025-06-03 17:09 ` Ilpo Järvinen
2025-06-02 12:32 ` Tudor Ambarus
2025-06-19 0:30 ` D Scott Phillips
2025-06-24 12:48 ` Ilpo Järvinen
2025-06-25 17:45 ` Ilpo Järvinen [this message]
2025-06-25 20:33 ` D Scott Phillips
2025-06-26 9:22 ` Ilpo Järvinen
2025-06-26 14:53 ` D Scott Phillips
2024-12-16 17:56 ` [PATCH 25/25] PCI: Rework optional resource handling Ilpo Järvinen
2025-02-13 21:46 ` [PATCH 00/25] PCI: Resource fitting/assignment fixes and cleanups Bjorn Helgaas
2025-02-14 8:18 ` Xiaochun XC17 Li | 李小春 Xavier
2025-02-14 11:53 ` Ilpo Järvinen
2025-02-14 21:28 ` Bjorn Helgaas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e20e3171-7aa5-0646-7934-e6b10cdefc4e@linux.intel.com \
--to=ilpo.jarvinen@linux.intel.com \
--cc=bhelgaas@google.com \
--cc=imammedo@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=michal.winiarski@intel.com \
--cc=mika.westerberg@linux.intel.com \
--cc=scott@os.amperecomputing.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.