All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Hector Martin <marcan@marcan.st>
To: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Zayd Qumsieh <zayd_qumsieh@apple.com>,
	Justin Lu <ih_justin@apple.com>,
	Ryan Houdek <Houdek.Ryan@fex-emu.org>,
	Mark Brown <broonie@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Mateusz Guzik <mjguzik@gmail.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Miguel Luis <miguel.luis@oracle.com>,
	Joey Gouly <joey.gouly@arm.com>,
	Christoph Paasch <cpaasch@apple.com>,
	Kees Cook <keescook@chromium.org>,
	Sami Tolvanen <samitolvanen@google.com>,
	Baoquan He <bhe@redhat.com>,
	Joel Granados <j.granados@samsung.com>,
	Dawei Li <dawei.li@shingroup.cn>,
	Andrew Morton <akpm@linux-foundation.org>,
	Florent Revest <revest@chromium.org>,
	David Hildenbrand <david@redhat.com>,
	Stefan Roesch <shr@devkernel.io>,
	Andy Chiu <andy.chiu@sifive.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Oleg Nesterov <oleg@redhat.com>, Helge Deller <deller@gmx.de>,
	Zev Weiss <zev@bewilderbeest.net>,
	Ondrej Mosnacek <omosnace@redhat.com>,
	Miguel Ojeda <ojeda@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Asahi Linux <asahi@lists.linux.dev>
Subject: Re: [PATCH 0/4] arm64: Support the TSO memory model
Date: Fri, 12 Apr 2024 03:43:36 +0900	[thread overview]
Message-ID: <f6484dcd-ebf6-4b6f-be17-69b05539e33b@marcan.st> (raw)
In-Reply-To: <28ab55b3-e699-4487-b332-f1f20a6b22a1@marcan.st>



On 2024/04/11 23:19, Hector Martin wrote:
>>
>> An alternative option is to go down the SPARC RMO route and just enable
>> TSO statically (although presumably in the firmware) for Apple silicon.
>> I'm assuming that has a performance impact for native code?
> 
> Correct. We already have this as a bootloader option, but it is not
> desirable. Plus, userspace code still needs a way to *discover* that TSO
> is enabled for correctness, so it can automatically decide whether to
> use stronger or weaker instructions.

To add some numbers to this (I was just made aware of this paper):

https://www.sra.uni-hannover.de/Publications/2023/tosting-arcs23/wrenger_23_arcs.pdf

Using TSO globally has, on average, a 9% performance hit, so that is
clearly off the table as a general solution.

Meanwhile, more detailed microbenchmarks often show TSO as having better
performance than outright using acquire/release instructions without
TSO. Therefore, just giving up on TSO and using acq/rel semantics for
emulators is also not an acceptable solution.

Additionally, the general load/store instructions on ARM have more
flexible addressing modes than the synchronizing ones, and since general
x86 emulation requires *all* loads and stores to be like this in a
non-TSO model (without much more complex/expensive program analysis to
determine where this can be elided), the perf impact is definitely worse
for emulation (e.g. stack accesses are affected) than for a
microbenchmark where only the "target" test instructions are being modified.

- Hector

WARNING: multiple messages have this Message-ID (diff)
From: Hector Martin <marcan@marcan.st>
To: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Zayd Qumsieh <zayd_qumsieh@apple.com>,
	Justin Lu <ih_justin@apple.com>,
	Ryan Houdek <Houdek.Ryan@fex-emu.org>,
	Mark Brown <broonie@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Mateusz Guzik <mjguzik@gmail.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Miguel Luis <miguel.luis@oracle.com>,
	Joey Gouly <joey.gouly@arm.com>,
	Christoph Paasch <cpaasch@apple.com>,
	Kees Cook <keescook@chromium.org>,
	Sami Tolvanen <samitolvanen@google.com>,
	Baoquan He <bhe@redhat.com>,
	Joel Granados <j.granados@samsung.com>,
	Dawei Li <dawei.li@shingroup.cn>,
	Andrew Morton <akpm@linux-foundation.org>,
	Florent Revest <revest@chromium.org>,
	David Hildenbrand <david@redhat.com>,
	Stefan Roesch <shr@devkernel.io>,
	Andy Chiu <andy.chiu@sifive.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Oleg Nesterov <oleg@redhat.com>, Helge Deller <deller@gmx.de>,
	Zev Weiss <zev@bewilderbeest.net>,
	Ondrej Mosnacek <omosnace@redhat.com>,
	Miguel Ojeda <ojeda@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Asahi Linux <asahi@lists.linux.dev>
Subject: Re: [PATCH 0/4] arm64: Support the TSO memory model
Date: Fri, 12 Apr 2024 03:43:36 +0900	[thread overview]
Message-ID: <f6484dcd-ebf6-4b6f-be17-69b05539e33b@marcan.st> (raw)
In-Reply-To: <28ab55b3-e699-4487-b332-f1f20a6b22a1@marcan.st>



On 2024/04/11 23:19, Hector Martin wrote:
>>
>> An alternative option is to go down the SPARC RMO route and just enable
>> TSO statically (although presumably in the firmware) for Apple silicon.
>> I'm assuming that has a performance impact for native code?
> 
> Correct. We already have this as a bootloader option, but it is not
> desirable. Plus, userspace code still needs a way to *discover* that TSO
> is enabled for correctness, so it can automatically decide whether to
> use stronger or weaker instructions.

To add some numbers to this (I was just made aware of this paper):

https://www.sra.uni-hannover.de/Publications/2023/tosting-arcs23/wrenger_23_arcs.pdf

Using TSO globally has, on average, a 9% performance hit, so that is
clearly off the table as a general solution.

Meanwhile, more detailed microbenchmarks often show TSO as having better
performance than outright using acquire/release instructions without
TSO. Therefore, just giving up on TSO and using acq/rel semantics for
emulators is also not an acceptable solution.

Additionally, the general load/store instructions on ARM have more
flexible addressing modes than the synchronizing ones, and since general
x86 emulation requires *all* loads and stores to be like this in a
non-TSO model (without much more complex/expensive program analysis to
determine where this can be elided), the perf impact is definitely worse
for emulation (e.g. stack accesses are affected) than for a
microbenchmark where only the "target" test instructions are being modified.

- Hector

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2024-04-11 18:43 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-11  0:51 [PATCH 0/4] arm64: Support the TSO memory model Hector Martin
2024-04-11  0:51 ` Hector Martin
2024-04-11  0:51 ` [PATCH 1/4] prctl: Introduce PR_{SET,GET}_MEM_MODEL Hector Martin
2024-04-11  0:51   ` Hector Martin
2024-04-11  0:51 ` [PATCH 2/4] arm64: Implement PR_{GET,SET}_MEM_MODEL for always-TSO CPUs Hector Martin
2024-04-11  0:51   ` Hector Martin
2024-04-11  0:51 ` [PATCH 3/4] arm64: Introduce scaffolding to add ACTLR_EL1 to thread state Hector Martin
2024-04-11  0:51   ` Hector Martin
2024-04-11  0:51 ` [PATCH 4/4] arm64: Implement Apple IMPDEF TSO memory model control Hector Martin
2024-04-11  0:51   ` Hector Martin
2024-04-11  1:37 ` [PATCH 0/4] arm64: Support the TSO memory model Neal Gompa
2024-04-11  1:37   ` Neal Gompa
2024-04-11 13:28 ` Will Deacon
2024-04-11 13:28   ` Will Deacon
2024-04-11 14:19   ` Hector Martin
2024-04-11 14:19     ` Hector Martin
2024-04-11 18:43     ` Hector Martin [this message]
2024-04-11 18:43       ` Hector Martin
2024-04-16  2:22       ` Zayd Qumsieh
2024-04-16  2:22         ` Zayd Qumsieh
2024-04-19 16:58         ` Will Deacon
2024-04-19 16:58           ` Will Deacon
2024-04-19 18:05           ` Catalin Marinas
2024-04-19 18:05             ` Catalin Marinas
2024-04-19 16:58     ` Will Deacon
2024-04-19 16:58       ` Will Deacon
2024-04-20 11:37       ` Marc Zyngier
2024-04-20 11:37         ` Marc Zyngier
2024-05-02  0:10         ` Zayd Qumsieh
2024-05-02  0:10           ` Zayd Qumsieh
2024-05-02 13:25           ` Marc Zyngier
2024-05-02 13:25             ` Marc Zyngier
2024-05-06  8:20             ` Jonas Oberhauser
2024-05-06  8:20               ` Jonas Oberhauser
2024-04-20 12:13       ` Eric Curtin
2024-04-20 12:13         ` Eric Curtin
2024-04-20 12:15         ` Eric Curtin
2024-04-20 12:15           ` Eric Curtin
2024-05-06 11:21         ` Sergio Lopez Pascual
2024-05-06 11:21           ` Sergio Lopez Pascual
2024-05-06 16:12           ` Marc Zyngier
2024-05-06 16:12             ` Marc Zyngier
2024-05-06 16:20             ` Eric Curtin
2024-05-06 16:20               ` Eric Curtin
2024-05-06 22:04             ` Sergio Lopez Pascual
2024-05-06 22:04               ` Sergio Lopez Pascual
2024-05-02  0:16   ` Zayd Qumsieh
2024-05-02  0:16     ` Zayd Qumsieh
2024-05-07 10:24   ` Alex Bennée
2024-05-07 10:24     ` Alex Bennée
2024-05-07 14:52     ` Ard Biesheuvel
2024-05-07 14:52       ` Ard Biesheuvel
2024-05-09 11:13       ` Catalin Marinas
2024-05-09 11:13         ` Catalin Marinas
2024-05-09 12:31         ` Neal Gompa
2024-05-09 12:31           ` Neal Gompa
2024-05-09 12:56           ` Catalin Marinas
2024-05-09 12:56             ` Catalin Marinas
2024-04-16  2:11 ` Zayd Qumsieh
2024-04-16  2:11   ` Zayd Qumsieh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f6484dcd-ebf6-4b6f-be17-69b05539e33b@marcan.st \
    --to=marcan@marcan.st \
    --cc=Houdek.Ryan@fex-emu.org \
    --cc=akpm@linux-foundation.org \
    --cc=andy.chiu@sifive.com \
    --cc=anshuman.khandual@arm.com \
    --cc=ardb@kernel.org \
    --cc=asahi@lists.linux.dev \
    --cc=bhe@redhat.com \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=cpaasch@apple.com \
    --cc=david@redhat.com \
    --cc=dawei.li@shingroup.cn \
    --cc=deller@gmx.de \
    --cc=ih_justin@apple.com \
    --cc=j.granados@samsung.com \
    --cc=joey.gouly@arm.com \
    --cc=josh@joshtriplett.org \
    --cc=keescook@chromium.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=miguel.luis@oracle.com \
    --cc=mjguzik@gmail.com \
    --cc=ojeda@kernel.org \
    --cc=oleg@redhat.com \
    --cc=oliver.upton@linux.dev \
    --cc=omosnace@redhat.com \
    --cc=revest@chromium.org \
    --cc=samitolvanen@google.com \
    --cc=shr@devkernel.io \
    --cc=will@kernel.org \
    --cc=zayd_qumsieh@apple.com \
    --cc=zev@bewilderbeest.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.