From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>, <linux-pm@vger.kernel.org>,
	<loongarch@lists.linux.dev>, <linux-acpi@vger.kernel.org>,
	<linux-arch@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
	<x86@kernel.org>, Russell King <linux@armlinux.org.uk>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Miguel Luis <miguel.luis@oracle.com>,
	"James Morse" <james.morse@arm.com>,
	Salil Mehta <salil.mehta@huawei.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, <linuxarm@huawei.com>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>, <justin.he@arm.com>,
	<jianyong.wu@arm.com>
Subject: Re: [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs
Date: Thu, 25 Apr 2024 13:31:50 +0100
Message-ID: <20240425133150.000009fa@Huawei.com>
In-Reply-To: <86il06rd19.wl-maz@kernel.org>

On Wed, 24 Apr 2024 16:33:22 +0100
Marc Zyngier <maz@kernel.org> wrote:

> On Wed, 24 Apr 2024 13:54:38 +0100,
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > 
> > On Tue, 23 Apr 2024 13:01:21 +0100
> > Marc Zyngier <maz@kernel.org> wrote:
> >   
> > > On Mon, 22 Apr 2024 11:40:20 +0100,
> > > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:  
> > > > 
> > > > On Thu, 18 Apr 2024 14:54:07 +0100
> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:  
> 
> [...]
> 
> > > >     
> > > > > +	/*
> > > > > +	 * Capable but disabled CPUs can be brought online later. What about
> > > > > +	 * the redistributor? ACPI doesn't want to say!
> > > > > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > > > > +	 * Otherwise, prevent such CPUs from being brought online.
> > > > > +	 */
> > > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > > > > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > > > > +		set_cpu_present(cpu, false);
> > > > > +		set_cpu_possible(cpu, false);
> > > > > +		return 0;
> > > > > +	}    
> > > 
> > > It seems dangerous to clear those this late in the game, given how
> > > disconnected from the architecture code this is. Are we sure that
> > > nothing has sampled these cpumasks beforehand?  
> > 
> > Hi Marc,
> > 
> > Any firmware that does this is being considered as buggy already
> > but given it is firmware and the spec doesn't say much about this,
> > there is always the possibility.  
> 
> There is no shortage of broken firmware out there, and I expect this
> trend to progress.
> 
> > Not much happens between the point where these are set up and
> > the point where the gic inits and this code runs, but even if careful
> > review showed it was fine today, it will be fragile to future changes.
> > 
> > I'm not sure there is a huge disadvantage for such broken firmware in
> > clearing these masks from the point of view of what is used throughout
> > the rest of the kernel. Here I think we are just looking to prevent the CPU
> > being onlined later.  
> 
> I totally agree on the goal, I simply question the way you get to it.
> 
> > 
> > We could add a set_cpu_broken() with appropriate mask.
> > Given this is very arm64 specific I'm not sure Rafael will be keen on
> > us checking such a mask in the generic ACPI code, but we could check it in
> > arch_register_cpu() and just not register the cpu if it matches.
> > That will cover the vCPU hotplug case.
> >
> > Does that sound sensible, or would you prefer something else?  
> 
> 
> Such a 'broken_rdists' mask is exactly what I have in mind, just
> keeping it private to the GIC driver, and not expose it anywhere else.
> You can then fail the hotplug event early, and avoid changing the
> global masks from within the GIC driver. At least, we don't mess with
> the internals of the kernel, and the CPU is properly marked as dead
> (that mechanism should already work).
> 
> I'd expect the handling side to look like this (will not compile, but
> you'll get the idea):

Hi Marc,

In general this looks good - but...

I haven't gotten to the bottom of why yet (and it might be a side
effect of how I hacked the test, lying in minimal fashion by
just frigging the MADT read functions), but the hotplug flow only gets
as far as calling __cpu_up() before it seems to enter an infinite loop.
That is, it never gets far enough to fail this test.

It is getting stuck in a PSCI CPU_ON call. I'm guessing that something
we didn't get to in the earlier GICv3 calls before bailing out is blocking that?
It looks like it gets to
SMCCC smc
and is never seen again.

Any ideas on where to look? The one advantage so far of the higher-level
approach is that we never tried the hotplug callbacks at all, so we avoided
hitting that call. One (slightly horrible) solution that might avoid this would
be to add another cpuhp state very early on and fail at that stage.
I'm not keen on doing that without a better explanation than I have so far!
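
To make the "fail at an early state" idea concrete, here is a minimal
userspace sketch. It is not kernel code: every sketch_* name is
hypothetical, and it assumes the early check is ordered before the step
that asks firmware to start the CPU, so a CPU flagged as broken never
reaches the firmware call at all.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4u

/* Hypothetical per-CPU flag: set when the CPU must never be onlined. */
static bool sketch_cpu_broken[SKETCH_NR_CPUS];

/* Counts how often the stand-in "firmware" was asked to start a CPU. */
static int sketch_firmware_calls;

typedef int (*sketch_step_fn)(unsigned int cpu);

/* Early step: refuse CPUs that were flagged as broken. */
static int sketch_early_check(unsigned int cpu)
{
	return sketch_cpu_broken[cpu] ? -ENODEV : 0;
}

/* Later step: stand-in for the call into firmware that starts the CPU. */
static int sketch_bringup(unsigned int cpu)
{
	(void)cpu;
	sketch_firmware_calls++;
	return 0;
}

/* Steps run in array order, so the early check precedes the bring-up. */
static const sketch_step_fn sketch_steps[] = {
	sketch_early_check,
	sketch_bringup,
};

/* Walk the steps in order and stop at the first one that fails. */
static int sketch_cpu_up(unsigned int cpu)
{
	size_t i;

	if (cpu >= SKETCH_NR_CPUS)
		return -EINVAL;

	for (i = 0; i < sizeof(sketch_steps) / sizeof(sketch_steps[0]); i++) {
		int ret = sketch_steps[i](cpu);

		if (ret)
			return ret;
	}
	return 0;
}
```

With sketch_cpu_broken[1] set, sketch_cpu_up(1) fails in the early step
and sketch_firmware_calls is left unchanged, which is the property the
early state would be added for.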

Thanks,

J

 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 6fb276504bcc..e8f02bfd0e21 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1009,6 +1009,9 @@ static int __gic_populate_rdist(struct redist_region *region, void __iomem *ptr)
>  	u64 typer;
>  	u32 aff;
>  
> +	if (cpumask_test_cpu(smp_processor_id(), &broken_rdists))
> +		return 1;
> +
>  	/*
>  	 * Convert affinity to a 32bit value that can be matched to
>  	 * GICR_TYPER bits [63:32].
> @@ -1260,14 +1263,15 @@ static int gic_dist_supports_lpis(void)
>  		!gicv3_nolpi);
>  }
>  
> -static void gic_cpu_init(void)
> +static int gic_cpu_init(void)
>  {
>  	void __iomem *rbase;
> -	int i;
> +	int ret, i;
>  
>  	/* Register ourselves with the rest of the world */
> -	if (gic_populate_rdist())
> -		return;
> +	ret = gic_populate_rdist();
> +	if (ret)
> +		return ret;
>  
>  	gic_enable_redist(true);
>  
> @@ -1286,6 +1290,8 @@ static void gic_cpu_init(void)
>  
>  	/* initialise system registers */
>  	gic_cpu_sys_reg_init();
> +
> +	return 0;
>  }
>  
>  #ifdef CONFIG_SMP
> @@ -1295,7 +1301,11 @@ static void gic_cpu_init(void)
>  
>  static int gic_starting_cpu(unsigned int cpu)
>  {
> -	gic_cpu_init();
> +	int ret;
> +
> +	ret = gic_cpu_init();
> +	if (ret)
> +		return ret;
>  
>  	if (gic_dist_supports_lpis())
>  		its_cpu_init();
> 
> But the question is: do you rely on these masks having been
> "corrected" anywhere else?
> 
> Thanks,
> 
> 	M.
> 
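
The approach in the quoted diff (a mask kept private to the driver, plus
an init path that returns an error instead of void) can be modelled in
plain userspace C. This is only a sketch under stated assumptions: the
sketch_* names and the 64-bit mask are hypothetical stand-ins for the
kernel's cpumask helpers, and the -ENODEV return value is an assumption
(the quoted diff returns 1 from __gic_populate_rdist()).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 64u

/* Private to the "driver": CPUs whose redistributor is unusable. */
static uint64_t broken_rdists;

/* Record, at probe time, that this CPU's redistributor cannot be used. */
static void sketch_mark_rdist_broken(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	broken_rdists |= UINT64_C(1) << cpu;
}

static bool sketch_rdist_is_broken(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	return (broken_rdists & (UINT64_C(1) << cpu)) != 0;
}

/* Stand-in for gic_cpu_init(): report failure rather than return void. */
static int sketch_cpu_init(unsigned int cpu)
{
	if (sketch_rdist_is_broken(cpu))
		return -ENODEV;	/* assumed error code, see lead-in */

	/* Redistributor and system register setup would go here. */
	return 0;
}

/* Stand-in for gic_starting_cpu(): pass the error up to the caller. */
static int sketch_starting_cpu(unsigned int cpu)
{
	int ret = sketch_cpu_init(cpu);

	if (ret)
		return ret;

	/* LPI/ITS setup would go here. */
	return 0;
}
```

The global present/possible masks are never touched here; only the
driver-private mask decides whether the starting callback fails.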

