Re: [PATCH v2 4/6] mm: drop VMA lock before waiting for migration

LKML Archive mirror
 help / color / mirror / Atom feed

From: Peter Xu <peterx@redhat.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, willy@infradead.org,
	hannes@cmpxchg.org, mhocko@suse.com, josef@toxicpanda.com,
	jack@suse.cz, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com,
	michel@lespinasse.org, liam.howlett@oracle.com,
	jglisse@google.com, vbabka@suse.cz, minchan@google.com,
	dave@stgolabs.net, punit.agrawal@bytedance.com,
	lstoakes@gmail.com, hdanton@sina.com, apopple@nvidia.com,
	ying.huang@intel.com, david@redhat.com, yuzhao@google.com,
	dhowells@redhat.com, hughd@google.com, viro@zeniv.linux.org.uk,
	brauner@kernel.org, pasha.tatashin@soleen.com,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v2 4/6] mm: drop VMA lock before waiting for migration
Date: Mon, 12 Jun 2023 14:34:30 -0400	[thread overview]
Message-ID: <ZIdlNj+X2HDwfCeN@x1n> (raw)
In-Reply-To: <CAJuCfpGZvhBUdfNHojXwqZbspuhy0bstjT+-JMfwgmnqTnkoHA@mail.gmail.com>

On Mon, Jun 12, 2023 at 09:07:38AM -0700, Suren Baghdasaryan wrote:
> On Mon, Jun 12, 2023 at 6:36 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Fri, Jun 09, 2023 at 06:29:43PM -0700, Suren Baghdasaryan wrote:
> > > On Fri, Jun 9, 2023 at 3:30 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > > >
> > > > On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > > > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > > > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > > > > lock was dropped while in handle_mm_fault().
> > > > > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > > > > there are no guarantees it was not freed.
> > > > >
> > > > > Then vma lock behaves differently from mmap read lock, am I right?  Can we
> > > > > still make them match on behaviors, or there's reason not to do so?
> > > >
> > > > I think we could match their behavior by also dropping mmap_lock here
> > > > when fault is handled under mmap_lock (!(fault->flags &
> > > > FAULT_FLAG_VMA_LOCK)).
> > > > I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> > > > mmap_lock in do_page_fault(), so indeed, I might be able to use
> > > > VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> > > > instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> > > > of reusing existing flags?
> > > Sorry, I meant VM_FAULT_VMA_UNLOCKED, not FAULT_FLAG_VMA_LOCK in the
> > > above reply.
> > >
> > > I took a closer look into using VM_FAULT_COMPLETED instead of
> > > VM_FAULT_VMA_UNLOCKED but when we fall back from per-vma lock to
> > > mmap_lock we need to retry with an indication that the per-vma lock
> > > was dropped. Returning (VM_FAULT_RETRY | VM_FAULT_COMPLETE) to
> > > indicate such state seems strange to me ("retry" and "complete" seem
> >
> > Not relevant to this migration patch, but for the whole idea I was thinking
> > whether it should just work if we simply:
> >
> >         fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> > -       vma_end_read(vma);
> > +       if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> > +               vma_end_read(vma);
> >
> > ?
> 
> Today when we can't handle a page fault under per-vma locks we return
> VM_FAULT_RETRY, in which case per-vma lock is dropped and the fault is

Oh I see what I missed.  I think it may not be a good idea to reuse
VM_FAULT_RETRY just for that, because it makes VM_FAULT_RETRY even more
complicated - now it adds one more case where the lock is not released,
that's when PER_VMA even if !NOWAIT.

> retried under mmap_lock. The condition you suggest above would not
> drop per-vma lock for VM_FAULT_RETRY case and would break the current
> fallback mechanism.
> However your suggestion gave me an idea. I could indicate that per-vma
> lock got dropped using vmf structure (like Matthew suggested before)
> and once handle_pte_fault(vmf) returns I could check if it returned
> VM_FAULT_RETRY but per-vma lock is still held.
> If that happens I can
> call vma_end_read() before returning from __handle_mm_fault(). That
> way any time handle_mm_fault() returns VM_FAULT_RETRY per-vma lock
> will be already released, so your condition in do_page_fault() will
> work correctly. That would eliminate the need for a new
> VM_FAULT_VMA_UNLOCKED flag. WDYT?

Sounds good.

So probably that's the major pain for now with the legacy fallback (I'll
have commented if I noticed it with the initial vma lock support..).  I
assume that'll go away as long as we recover the VM_FAULT_RETRY semantics
to before.

-- 
Peter Xu

next prev parent reply	other threads:[~2023-06-12 18:35 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-09  0:51 [PATCH v2 0/6] Per-vma lock support for swap and userfaults Suren Baghdasaryan
2023-06-09  0:51 ` [PATCH v2 1/6] swap: remove remnants of polling from read_swap_cache_async Suren Baghdasaryan
2023-06-09  1:57   ` Huang, Ying
2023-06-09  3:13     ` Ming Lei
2023-06-09 18:50       ` Suren Baghdasaryan
2023-06-12  4:53   ` Christoph Hellwig
2023-06-09  0:51 ` [PATCH v2 2/6] mm: handle swap page faults under VMA lock if page is uncontended Suren Baghdasaryan
2023-06-09 20:25   ` Peter Xu
2023-06-09 20:35     ` Matthew Wilcox
2023-06-09 20:45       ` Peter Xu
2023-06-09 22:34         ` Suren Baghdasaryan
2023-06-12 13:59           ` Peter Xu
2023-06-12 16:09             ` Suren Baghdasaryan
2023-06-09  0:51 ` [PATCH v2 3/6] mm: add missing VM_FAULT_RESULT_TRACE name for VM_FAULT_COMPLETED Suren Baghdasaryan
2023-06-09 20:34   ` Peter Xu
2023-06-09  0:51 ` [PATCH v2 4/6] mm: drop VMA lock before waiting for migration Suren Baghdasaryan
2023-06-09 20:42   ` Peter Xu
2023-06-09 22:30     ` Suren Baghdasaryan
2023-06-10  1:29       ` Suren Baghdasaryan
2023-06-12 13:36         ` Peter Xu
2023-06-12 16:07           ` Suren Baghdasaryan
2023-06-12 18:34             ` Peter Xu [this message]
2023-06-12 18:44               ` Suren Baghdasaryan
2023-06-12 18:57                 ` Peter Xu
2023-06-12 13:28       ` Peter Xu
2023-06-12 15:41         ` Suren Baghdasaryan
2023-06-09  0:51 ` [PATCH v2 5/6] mm: implement folio wait under VMA lock Suren Baghdasaryan
2023-06-09 15:03   ` Matthew Wilcox
2023-06-09 18:49     ` Suren Baghdasaryan
2023-06-09 18:55       ` Suren Baghdasaryan
2023-06-09 19:36         ` Matthew Wilcox
2023-06-09 22:49           ` Suren Baghdasaryan
2023-06-09 20:54   ` Peter Xu
2023-06-09 22:48     ` Suren Baghdasaryan
2023-06-12 13:56       ` Peter Xu
2023-06-09  0:51 ` [PATCH v2 6/6] mm: handle userfaults " Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZIdlNj+X2HDwfCeN@x1n \
    --to=peterx@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=brauner@kernel.org \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dhowells@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=hdanton@sina.com \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=jglisse@google.com \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@android.com \
    --cc=laurent.dufour@fr.ibm.com \
    --cc=ldufour@linux.ibm.com \
    --cc=liam.howlett@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=mhocko@suse.com \
    --cc=michel@lespinasse.org \
    --cc=minchan@google.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=punit.agrawal@bytedance.com \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).