From: Nicholas Piggin <npiggin@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	"joel@joelfernandes.org" <joel@joelfernandes.org>,
	"kaleshsingh@google.com" <kaleshsingh@google.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"mpe@ellerman.id.au" <mpe@ellerman.id.au>
Subject: Re: [PATCH v7 00/11] Speedup mremap on ppc64
Date: Wed, 16 Jun 2021 11:44:39 +1000
Message-ID: <1623807215.t2mo6ahd0q.astroid@bobo.none>
In-Reply-To: <CAHk-=wipa02d8tN-fCYJ=iH915yHtFr6wEDBcOeFtawVVF4niQ@mail.gmail.com>

Excerpts from Linus Torvalds's message of June 9, 2021 3:10 am:
> On Mon, Jun 7, 2021 at 3:10 AM Nick Piggin <npiggin@gmail.com> wrote:
>>
>> I'd really rather not do this, I'm not sure if micro benchmark captures everything.
> 
> I don't much care what powerpc code does _internally_ for this
> architecture-specific mis-design issue, but I really don't want to see
> more complex generic interfaces unless you have better hard numbers
> for them.
> 
> So far the numbers are: "no observable difference".
> 
> It would have to be not just observable, but actually meaningful for
> me to go "ok, we'll add this crazy flag that nobody else cares about".

Fair enough, will have to try to get more numbers then, I suppose.

> 
> And honestly, from everything I've seen on page table walker caches:
> they are great, but once you start remapping big ranges and
> invalidating megabytes of TLBs, the walker caches just aren't going
> to be your issue.

Remapping big ranges is going to have to invalidate intermediate caches
(aka PWC), so is unmapping. So we're stuck with the big hammer PWC 
invalidate there anyway.

It's mprotect and friends that would care here, possibly some THP thing...
but I guess those are probably down the list a little way.

I'm a bit less concerned about the PWCs that might be caching the regions
of the big mprotect() we just did, and more concerned about the effect
of flushing all unrelated caches, including on all other CPUs a threaded
program is running on. HANA and Java are threaded and do mremaps,
unfortunately.


> 
> But: numbers talk.  I'd take the sane generic interfaces as a first
> cut. If somebody then has really compelling numbers, we can _then_
> look at that "optimize for odd page table walker cache situation"
> case.

Yep okay. It's not the end of the world (or if it is we'd be able to get
numbers presumably).

> And in the meantime, maybe you can talk to the hardware people and
> tell them that you want the "flush range" capability to work right,
> and that if the walker cache is _so_ important they shouldn't
> have made it an all-or-nothing flush.

I have, more than once :(

Fixing that would fix munmap etc cases as well, so yeah.

Thanks,
Nick



