All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: paulmck@kernel.org
Cc: Boqun Feng <boqun.feng@gmail.com>, Marco Elver <elver@google.com>,
	 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	 Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	 syzbot <syzbot+b7c3ba8cdc2f6cf83c21@syzkaller.appspotmail.com>,
	 linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com,
	 Nathan Chancellor <nathan@kernel.org>,
	Arnd Bergmann <arnd@kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	 Jiri Slaby <jirislaby@kernel.org>
Subject: Re: [PATCH v3] tty: tty_io: remove hung_up_tty_fops
Date: Sat, 4 May 2024 12:11:10 -0700	[thread overview]
Message-ID: <CAHk-=wi8mArAxxkO78CTSVRCyjim4hpGbzf2NFxNMAdXWR3oJA@mail.gmail.com> (raw)
In-Reply-To: <7aab574e-b6ba-4038-802f-2d64af09d802@paulmck-laptop>

On Sat, 4 May 2024 at 11:18, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> Here is my current thoughts for possible optimizations of non-volatile
> memory_order_relaxed atomics:
>
> o       Loads from the same variable that can legitimately be
>         reordered to be adjacent to one another can be fused
>         into a single load.

Let's call this "Rule 1"

I think you can extend this to also "can be forwarded from a previous store".

I also think you're too strict in saying "fused into a single load".
Let me show an example below.

> o       Stores to the same variable that can legitimately be
>         reordered to be adjacent to one another can be replaced
>         by the last store in the series.

Rule 2.

Ack, although again, I think you're being a bit too strict in your
language, and the rule can be relaxed.

> o       Loads and stores may not be invented.

Rule 3.

I think you can and should relax this. You can invent loads all you want.

> o       The only way that a computation based on the value from
>         a given load can instead use some other load is if the
>         two loads are fused into a single load.

Rule 4.

I'm not convinced that makes sense, and I don't think it's true as written.

I think I understand what you are trying to say, but I think you're
saying it in a way that only confuses a compiler person.

In particular, the case I do not think is true is very much the
"spill" case: if you have code like this:

    a = expression involving '__atomic_load_n(xyz, RELAXED)'

then it's perfectly fine to spill the result of that load and reload
the value. So the "computation based on the value" *is* actually based
on "some other load" (the reload).

I really *REALLY* think you need to explain the semantics in concrete
terms that a compiler writer would understand and agree with.

So to explain your rules to an actual compiler person (and relax the
semantics a bit) I would rewrite your rules as:

 Rule 1: a strictly dominating load can be replaced by the value of a
preceding load or store

 Ruie 2: a strictly dominating store can remove preceding stores

 Rule 3: stores cannot be done speculatively (put another way: a
subsequent dominating store can only *remove* a store entirely, it
can't turn the store into one with speculative data)

 Rule 4: loads cannot be rematerialized (ie a load can be *combined*
as per Rule 1, but a load cannot be *split* into two loads)

Anyway, let's get to the examples of *why* I think your language was
bad and your rules were too strict.

Let's start with your Rule 3, where you said:

 - Loads and stores may not be invented

and while I think this should be very very true for stores, I think
inventing loads is not only valid, but a good idea. Example:

    if (a)
        b = __atomic_load_n(ptr) + 1;

can perfectly validly just be written as

    tmp = __atomic_load_n(ptr);
    if (a)
        b = tmp+1;

which in turn may allow other optimizations (ie depending on how 'b'
is used, the conditional may go away entirely, and you just end up
with 'b = tmp+!!a').

There's nothing wrong with extra loads that aren't used.

And to make that more explicit, let's look at Rule 1:

Example of Rule 1 (together with the above case):

    if (a)
        b = __atomic_load_n(ptr) + 1;
    else
        b =  __atomic_load_n(ptr) + 2;
    c = __atomic_load_n(ptr) + 3;

then that can perfectly validly re-write this all as

    tmp = __atomic_load_n(ptr);
    b = a ? tmp+1 : tmp+2;
    c = tmp + 3;

because my version of Rule 1 allows the dominating load used for 'c'
to be replaced by the value of a preceding load that was used for 'a'
and 'b'.

And to give an example of Rule 2, where you said "reordered to be
adjacent", I'm saying that all that matters is being strictly
dominant, so

    if (a)
        __atomic_store_n(ptr,1);
    else
        __atomic_store_n(ptr,2);
    __atomic_store_n(ptr,3);

can be perfectly validly be combined into just

    __atomic_store_n(ptr,3);

because the third store completely dominates the two others, even if
in the intermediate form they are not necessarily ever "adjacent".

(Your "adjacency" model might still be valid in how you could turn
first of the first stores to be a fall-through, then remove it, and
then turn the other to be a fall-through and then remove it, so maybe
your language isn't _tecnically_ wrong, But I think the whole
"dominating store" is how a compiler writer would think about it).

Anyway, the above are all normal optimizations that compilers
*already* do, and the only real issue is that with memory ordering,
the "dominance" thing will need to take into account the ordering
requirements of other memory operations with stricter memory ordering
in between. So, for example, if you have

    __atomic_store_n(ptr,1, RELAXED);
    __atomic_load_n(otherptr,2, ACQUIRE);
    __atomic_store_n(ptr,2, RELAXED);

then the second store doesn't dominate the first store, because
there's a stricter memory ordering instruction in between.

But I think the *important* rule is that "Rule 4", which is what
actually gives us the true "atomicity" rule:

 Loads cannot be rematerialized

IOW, when we do a

    a = __atomic_load_n(ptr);

and use the value of 'a' at any point later, it can *never* have
turned into a re-load of that ptr value. Yes, the load might have been
combined with an earlier one, and the load may have been moved up
earlier (by the "you can invent loads and then  combine them with the
later real one" rule).

So even atomic loads can be combined, they can be moved to outside of
loops etc. BUT THEY CANNOT BE RE-DONE.

I think that's really the fundamnetally different rule, and the one
that makes us use 'volatile' in many cases.

So we have a "READ_ONCE()" macro, but in many cases it's really
"READ_AT_MOST_ONCE()". It's usually not that you can't optimize the
read away - you can - but you absolutely must not turn it into two
reads.

Turning a read into two reads is what makes those TOCTOU issues problematic.

Honestly, I'd love to just have that "Rule 4" be there for *all*
variable accesses. Not just __atomic ones. If we had a compiler flag
like "-fno-TUCTOU", I'd enable it.

                    Linus

  reply	other threads:[~2024-05-04 19:11 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-21  8:18 [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4) syzbot
2023-04-21  8:21 ` Dmitry Vyukov
2023-04-21 15:12   ` Tetsuo Handa
2023-04-21 16:02     ` Tetsuo Handa
2023-04-23 23:34     ` Al Viro
2023-04-23 23:55       ` Tetsuo Handa
2023-04-24  0:44         ` Al Viro
2023-04-24  1:09           ` Tetsuo Handa
2023-04-25 14:47             ` Tetsuo Handa
2023-04-25 16:03               ` Al Viro
2023-04-25 22:09                 ` Tetsuo Handa
2023-04-26 11:05                   ` [PATCH] tty: tty_io: remove hung_up_tty_fops Tetsuo Handa
2023-04-28 16:27                     ` Nathan Chancellor
2023-04-28 16:41                       ` Tetsuo Handa
2023-04-28 17:11                         ` Al Viro
2023-04-29 10:43                           ` Tetsuo Handa
2023-04-28 17:31                         ` Greg Kroah-Hartman
2023-04-29 15:21                           ` Guenter Roeck
2023-05-01 18:42                             ` Geert Uytterhoeven
2023-05-14  1:02                     ` [PATCH v2] " Tetsuo Handa
2023-05-30 10:44                       ` Greg Kroah-Hartman
2023-05-30 11:57                         ` Tetsuo Handa
2023-05-30 12:51                           ` Greg Kroah-Hartman
2024-04-27  6:20                             ` [PATCH v3] " Tetsuo Handa
2024-04-27 19:02                               ` Linus Torvalds
2024-04-28 10:19                                 ` Tetsuo Handa
2024-04-28 18:50                                   ` Linus Torvalds
2024-04-29 13:55                                     ` Marco Elver
2024-04-29 15:38                                       ` Linus Torvalds
2024-05-01 18:45                                         ` Paul E. McKenney
2024-05-01 18:56                                           ` Linus Torvalds
2024-05-01 19:02                                             ` Paul E. McKenney
2024-05-01 20:14                                               ` Marco Elver
2024-05-01 21:06                                                 ` Linus Torvalds
2024-05-01 21:20                                                   ` Linus Torvalds
2024-05-01 21:49                                                     ` Paul E. McKenney
2024-05-01 22:32                                                       ` Paul E. McKenney
2024-05-02 16:37                                                         ` Boqun Feng
2024-05-03 23:59                                                           ` Paul E. McKenney
2024-05-04  0:14                                                             ` Linus Torvalds
2024-05-04  5:08                                                               ` Paul E. McKenney
2024-05-04 17:50                                                                 ` Linus Torvalds
2024-05-04 18:18                                                                   ` Paul E. McKenney
2024-05-04 19:11                                                                     ` Linus Torvalds [this message]
2024-05-04 19:25                                                                       ` Linus Torvalds
2024-05-04 22:17                                                                         ` Paul E. McKenney
2024-05-04 22:04                                                                       ` Paul E. McKenney
2024-05-02 14:14                                                   ` Marco Elver
2024-05-02 16:42                                                     ` Tetsuo Handa
2024-05-02 17:20                                                       ` Marco Elver
2024-05-02 17:29                                                       ` Linus Torvalds
2024-05-02 18:14                                                         ` Al Viro
2024-05-02 19:29                                                           ` Marco Elver
2024-05-02 23:54                                                         ` Tetsuo Handa
2024-05-03  1:12                                                           ` Linus Torvalds
2023-04-23 13:28   ` [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4) Tetsuo Handa
2023-04-23 14:00     ` Greg Kroah-Hartman
2023-04-23 14:03     ` Greg Kroah-Hartman
2023-04-23 14:17       ` Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wi8mArAxxkO78CTSVRCyjim4hpGbzf2NFxNMAdXWR3oJA@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=arnd@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=dvyukov@google.com \
    --cc=elver@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jirislaby@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nathan@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=syzbot+b7c3ba8cdc2f6cf83c21@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.