From: "Maciej W. Rozycki" <macro@orcam.me.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>,
Arnd Bergmann <arnd@kernel.org>,
linux-alpha@vger.kernel.org, Arnd Bergmann <arnd@arndb.de>,
Richard Henderson <richard.henderson@linaro.org>,
Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
Matt Turner <mattst88@gmail.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Marc Zyngier <maz@kernel.org>,
linux-kernel@vger.kernel.org, Michael Cree <mcree@orcon.net.nz>,
Frank Scheiner <frank.scheiner@web.de>
Subject: Re: [PATCH 00/14] alpha: cleanups for 6.10
Date: Thu, 30 May 2024 23:57:29 +0100 (BST)
Message-ID: <alpine.DEB.2.21.2405302115130.23854@angie.orcam.me.uk>
In-Reply-To: <CAHk-=wi7WfDSfunEXmCqDnH+55gumjhDar-KO_=66ziuP33piw@mail.gmail.com>
On Wed, 29 May 2024, Linus Torvalds wrote:
> > The only difference here is that with
> > hardware read-modify-write operations atomicity for sub-word accesses is
> > guaranteed by the ISA, however for software read-modify-write it has to be
> > explicitly coded using the usual load-locked/store-conditional sequence in
> > a loop.
>
> I have some bad news for you: the old alpha CPU's not only screwed up
> the byte/word design, they _also_ screwed up the
> load-locked/store-conditional.
>
> You'd think that LL/SC would be done at a cacheline level, like any
> sane person would do.
>
> But no.
>
> The 21064 actually did atomicity with an external pin on the bus, the
> same way people used to do before caches even existed.

 Umm, 8086's LOCK#, anyone?

> Yes, it has an internal L1 D$, but it is a write-through cache, and
> clearly things like cache coherency weren't designed for. In fact,
> LL/SC is even documented to not work in the external L2 cache
> ("Bcache" - don't ask me why the odd naming).

 Board cache, I suppose.

> So LL/SC on the 21064 literally works on external memory.
>
> Quoting the reference manual:
>
> "A.6 Load Locked and Store Conditional
> The 21064 provides the ability to perform locked memory accesses through
> the LDxL (Load_Locked) and STxC (Store_Conditional) cycle command pair.
> The LDxL command forces the 21064 to bypass the Bcache and request data
> directly from the external memory interface. The memory interface logic must
> set a special interlock flag as it returns the data, and may optionally
> keep the locked address"
>
> End result: a LL/SC pair is very very slow. It was incredibly slow
> even for the time. I had benchmarks, I can't recall them, but I'd like
> to say "hundreds of cycles". Maybe thousands.

 Interesting and disappointing, given how many years the Alpha designers
had to learn from the MIPS R4000, which they had already borrowed from
after all and which they had first-hand experience with from the R4000
DECstation systems built at their WSE facility.  Hmm, I wonder if patent
avoidance was involved.

> So actual reliable byte operations are not realistically possible on
> the early alpha CPU's. You can do them with LL/SC, sure, but
> performance would be so horrendously bad that it would be just sad.

 Hmm, performance with a 30-year-old system?  Who cares!  It mattered 30
years ago, maybe 25.  And the performance of a system that runs slowly is
still infinitely better than that of a system that doesn't boot anymore,
isn't it?

> The 21064A had some "fast lock" mode which allows the data from the
> LDQ_L to come from the Bcache. So it still isn't exactly fast, and it
> still didn't work at CPU core speeds, but at least it worked with the
> external cache.
>
> Compilers will generate the sequence that DEC specified, which isn't
> thread-safe.
>
> In fact, it's worse than "not thread safe". It's not even safe on UP
> with interrupts, or even signals in user space.

 Ouch, I find it a surprising oversight.  Come to think of it, the plain
unlocked read-modify-write sequences are indeed unsafe.  I don't suppose
any old DECies are still around, but any idea how this was sorted in DEC's
own commercial operating systems (Digital UNIX and OpenVMS)?
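
 To make the failure mode concrete, here is a minimal model in plain C of
why the unlocked sequence loses data even on UP.  This is only a sketch:
the names `store_byte_unlocked' and `handler_sets_byte0' are made up for
illustration, the callback merely stands in for an interrupt or signal
handler, and the container is simplified to a 32-bit word (as I recall the
real non-BWX sequence works on an aligned quadword).

```c
#include <stdint.h>

/* Hypothetical model, not DEC's or GCC's code: a byte store on a non-BWX
   target done as plain load word / insert byte / plain store word.  The
   `intr' callback stands in for an interrupt or signal handler that
   happens to fire between the load and the store.  */
static void store_byte_unlocked(uint32_t *word, unsigned idx, uint8_t val,
                                void (*intr)(uint32_t *))
{
    unsigned shift = idx * 8;           /* little-endian byte lane */
    uint32_t tmp = *word;               /* plain load of the container */

    if (intr)
        intr(word);                     /* handler runs here */

    tmp = (tmp & ~((uint32_t)0xff << shift)) | ((uint32_t)val << shift);
    *word = tmp;                        /* plain store of the container */
}

/* Handler that updates the neighbouring byte 0 of the same word.  */
static void handler_sets_byte0(uint32_t *word)
{
    *word = (*word & ~(uint32_t)0xff) | 0x55;
}
```

 Starting from 0x11223344, storing 0xaa to byte 1 with the handler firing
in between leaves 0x1122aa44: the handler's 0x55 in byte 0 is silently
overwritten with the stale 0x44, even though the two writes were to
different bytes.
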
So this seems like something that needs to be sorted in the compiler, by
always using a locked sequence for 8-bit and 16-bit writes with non-BWX
targets. I can surely do it myself, not a big deal, and I reckon such a
change to GCC should be pretty compact and self-contained, as all the bits
are already within `alpha_expand_mov_nobwx' anyway.
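
 Roughly what I have in mind for the generated code, expressed as a sketch
in portable C11 rather than as the actual GCC change.  The helper name
`store_byte_locked' is made up, the compare-exchange loop merely stands in
for an LDL_L/STL_C pair with a retry branch, and relaxed ordering is my
assumption for an ordinary store.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: store one byte without disturbing
   its neighbours by retrying a read-modify-write of the containing
   aligned 32-bit word until no one else has changed it in between.  */
static void store_byte_locked(_Atomic uint32_t *word, unsigned idx,
                              uint8_t val)
{
    unsigned shift = idx * 8;           /* little-endian byte lane */
    uint32_t mask = (uint32_t)0xff << shift;
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t next;

    do {
        /* Rebuild from the freshest value; on failure `old' is
           reloaded by the compare-exchange itself.  */
        next = (old & ~mask) | ((uint32_t)val << shift);
    } while (!atomic_compare_exchange_weak_explicit(word, &old, next,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
}
```

 With this, a concurrent or interrupting update to another byte of the
same word makes the store fail and retry with the new contents instead of
overwriting them.  For example, `store_byte_locked(&w, 1, 0xaa)' on a word
holding 0x11223344 leaves 0x1122aa44.
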
 I'm not sure if Richard will be happy to accept it, but it seems to me
the right thing to do at this point and with that in place there should be
no safety concern for RCU or anything with the old Alphas, with no effort
at all on the Linux side as all the burden will be on the compiler.  We
may want to probe for the associated compiler option though and bail out
if unsupported.

 Will it be enough to keep Linux support at least until the next obstacle?

  Maciej