From: Thomas Gleixner <tglx@linutronix.de>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	linux-pm@vger.kernel.org, loongarch@lists.linux.dev,
	linux-acpi@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	x86@kernel.org, Miguel Luis <miguel.luis@oracle.com>,
	James Morse <james.morse@arm.com>,
	Salil Mehta <salil.mehta@huawei.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	linuxarm@huawei.com, justin.he@arm.com, jianyong.wu@arm.com
Subject: Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
Date: Sat, 13 Apr 2024 01:23:48 +0200
Message-ID: <878r1iyxkr.ffs@tglx>
In-Reply-To: <ZhmtO6zBExkQGZLk@shell.armlinux.org.uk>

Russell!

On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:
> On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:
>> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
>> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
>> > being taken - so I've no idea why the "make_present" case takes these
>> > locks.
>> 
>> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
>> boot must hold the appropriate write locks. Otherwise it would be
>> possible to online a CPU which just got marked present, but the
>> registration has not completed yet.
>
> Yes. As far as I've been able to determine, arch_register_cpu()
> doesn't manipulate any of the CPU masks. All it seems to be doing
> is initialising the struct cpu, registering the embedded struct
> device, and setting up the sysfs links to its NUMA node.
>
> There is nothing obvious in there which manipulates any CPU masks, and
> this is rather my fundamental point when I said "I couldn't find
> anything in arch_register_cpu() that depends on ...".
>
> If there is something, then comments in the code would be a useful aid
> because it's highly non-obvious where such a manipulation is located,
> and hence why the locks are necessary.

acpi_processor_hotadd_init()
...
         acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);

That ends up in fiddling with cpu_present_mask.

I grant you that arch_register_cpu() does not touch the mask itself, but
it might rely on the external locking too. I could not be bothered to
figure that out.
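
As a rough illustration of the rule quoted above, here is a userspace
sketch (not kernel code). A pthread mutex stands in for the
cpu_maps_update_begin()/cpus_write_lock() pair, and the names
model_hot_add(), model_cpu_up() and hotplug_lock are made up for the
example. The point is only that marking a CPU present and completing its
registration happen as one step under the lock, so the online path can
never observe the intermediate state:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Hypothetical stand-in for the kernel's hotplug write-side locking. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

static bool present[NR_CPUS];     /* models cpu_present_mask */
static bool registered[NR_CPUS];  /* models "registration completed" */
static bool online[NR_CPUS];      /* models cpu_online_mask */

/* Hot add: mask update and registration form one locked section. */
int model_hot_add(unsigned int cpu)
{
	assert(cpu < NR_CPUS);

	pthread_mutex_lock(&hotplug_lock);
	present[cpu] = true;      /* what acpi_map_cpu() ends up doing */
	registered[cpu] = true;   /* what arch_register_cpu() completes */
	pthread_mutex_unlock(&hotplug_lock);
	return 0;
}

/* Online: takes the same lock, so present implies registered here. */
int model_cpu_up(unsigned int cpu)
{
	int ret = -1;

	assert(cpu < NR_CPUS);

	pthread_mutex_lock(&hotplug_lock);
	if (present[cpu] && registered[cpu]) {
		online[cpu] = true;
		ret = 0;
	}
	pthread_mutex_unlock(&hotplug_lock);
	return ret;
}
```

Without the lock in model_hot_add(), a concurrent model_cpu_up() could
run between the two assignments, which is the window described above.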

>> Define "real hotplug" :)
>> 
>> Real physical hotplug does not really exist. That's at least true for
>> x86, where the physical hotplug support was chased for a while, but
>> never ended up in production.
>> 
>> Though virtualization happily jumped on it to hot add/remove CPUs
>> to/from a guest.
>> 
>> There are limitations to this and we learned it the hard way on X86. At
>> the end we came up with the following restrictions:
>> 
>>     1) All possible CPUs have to be advertised at boot time via firmware
>>        (ACPI/DT/whatever) independent of them being present at boot time
>>        or not.
>> 
>>        That guarantees proper sizing and ensures that associations
>>        between hardware entities and software representations and the
>>        resulting topology are stable for the lifetime of a system.
>> 
>>        It is really required to know the full topology of the system at
>>        boot time especially with hybrid CPUs where some of the cores
>>        have hyperthreading and the others do not.
>> 
>> 
>>     2) Hot add can only mark an already registered (possible) CPU
>>        present. Adding non-registered CPUs after boot is not possible.
>> 
>>        The CPU must have been registered in #1 already to ensure that
>>        the system topology does not suddenly change in an incompatible
>>        way at run-time.
>> 
>> The same restriction would apply to real physical hotplug. I don't think
>> that's any different for ARM64 or any other architecture.
>
> This makes me wonder whether the Arm64 has been barking up the wrong
> tree then, and whether the whole "present" vs "enabled" thing comes
> from a misunderstanding as far as a CPU goes.
>
> However, there is a big difference between the two. On x86, a processor
> is just a processor. On Arm64, a "processor" is a slice of the system
> (includes the interrupt controller, PMUs etc) and we must enumerate
> those even when the processor itself is not enabled. This is the whole
> reason there's a difference between "present" and "enabled" and why
> there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> The processor never actually goes away in arm64, it's just prevented
> from being used.

It's the same on X86 at least in the physical world.

Thanks,

        tglx

