From: Nicholas Piggin <npiggin@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
    Christophe Leroy <christophe.leroy@csgroup.eu>,
    "joel@joelfernandes.org" <joel@joelfernandes.org>,
    "kaleshsingh@google.com" <kaleshsingh@google.com>,
    "Kirill A . Shutemov" <kirill@shutemov.name>,
    "linux-mm@kvack.org" <linux-mm@kvack.org>,
    "linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
    "mpe@ellerman.id.au" <mpe@ellerman.id.au>
Subject: Re: [PATCH v7 00/11] Speedup mremap on ppc64
Date: Wed, 16 Jun 2021 11:44:39 +1000
Message-ID: <1623807215.t2mo6ahd0q.astroid@bobo.none>
In-Reply-To: <CAHk-=wipa02d8tN-fCYJ=iH915yHtFr6wEDBcOeFtawVVF4niQ@mail.gmail.com>

Excerpts from Linus Torvalds's message of June 9, 2021 3:10 am:
> On Mon, Jun 7, 2021 at 3:10 AM Nick Piggin <npiggin@gmail.com> wrote:
>>
>> I'd really rather not do this, I'm not sure if micro benchmark captures everything.
>
> I don't much care what powerpc code does _internally_ for this
> architecture-specific mis-design issue, but I really don't want to see
> more complex generic interfaces unless you have better hard numbers
> for them.
>
> So far the numbers are: "no observable difference".
>
> It would have to be not just observable, but actually meaningful for
> me to go "ok, we'll add this crazy flag that nobody else cares about".

Fair enough, will have to try to get more numbers then I suppose.

>
> And honestly, from everything I've seen on page table walker caches:
> they are great, but once you start remapping big ranges and
> invalidating megabytes of TLB's, the walker caches just aren't going
> to be your issue.

Remapping big ranges is going to have to invalidate intermediate caches
(aka PWC), and so is unmapping. So we're stuck with the big hammer PWC
invalidate there anyway.

It's mprotect and friends that would care here, possibly some THP
thing... but I guess those are probably down the list a little way.

I'm a bit less concerned about the PWCs that might be caching the
regions of the big mprotect() we just did, and more concerned about the
effect of flushing all unrelated caches, including on all other CPUs a
threaded program is running on.

HANA and Java are threaded and do mremaps, unfortunately.

>
> But: numbers talk. I'd take the sane generic interfaces as a first
> cut. If somebody then has really compelling numbers, we can _then_
> look at that "optimize for odd page table walker cache situation"
> case.

Yep okay. It's not the end of the world (or if it is we'd be able to get
numbers presumably).

> And in the meantime, maybe you can talk to the hardware people and
> tell them that you want the "flush range" capability to work right,
> and that if the walker cache is _so_ important they shouldn't
> have made it an all-or-nothing flush.

I have, more than once :( Fixing that would fix munmap etc cases as
well, so yeah.

Thanks,
Nick