Re: [PATCH v2 2/2] mm/hugetlb: support write-faults in shared mappings

From: David Hildenbrand <david@redhat.com>
To: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, stable <stable@vger.kernel.org>,
	linux-s390 <linux-s390@vger.kernel.org>
Subject: Re: [PATCH v2 2/2] mm/hugetlb: support write-faults in shared mappings
Date: Mon, 15 Aug 2022 17:07:32 +0200
Message-ID: <CADFyXm7-0zXDG+ZHjft95aAAiSZh_RyAqgJw1nGsALwEL1XKiw@mail.gmail.com>
In-Reply-To: <20220815153549.0288a9c6@thinkpad>

On Mon, Aug 15, 2022 at 3:36 PM Gerald Schaefer
<gerald.schaefer@linux.ibm.com> wrote:
>
> On Thu, 11 Aug 2022 11:59:09 -0700
> Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> > On 08/11/22 12:34, David Hildenbrand wrote:
> > > If we ever get a write-fault on a write-protected page in a shared mapping,
> > > we'd be in trouble (again). Instead, we can simply map the page writable.
> > >
> > <snip>
> > >
> > > Reason is that uffd-wp doesn't clear the uffd-wp PTE bit when
> > > unregistering and consequently keeps the PTE writeprotected. Reason for
> > > this is to avoid the additional overhead when unregistering. Note
> > > that this is the case also for !hugetlb and that we will end up with
> > > writable PTEs that still have the uffd-wp PTE bit set once we return
> > > from hugetlb_wp(). I'm not touching the uffd-wp PTE bit for now, because it
> > > seems to be a generic thing -- wp_page_reuse() also doesn't clear it.
> > >
> > > VM_MAYSHARE handling in hugetlb_fault() for FAULT_FLAG_WRITE
> > > indicates that MAP_SHARED handling was at least envisioned, but could never
> > > have worked as expected.
> > >
> > > While at it, make sure that we never end up in hugetlb_wp() on write
> > > faults without VM_WRITE, because we don't support maybe_mkwrite()
> > > semantics as commonly used in the !hugetlb case -- for example, in
> > > wp_page_reuse().
> >
> > Nit,
> > to me 'make sure that we never end up in hugetlb_wp()' implies that
> > we would check for the condition in callers as opposed to first thing in
> > hugetlb_wp().  However, I am OK with the description as is.
>

Hi Gerald,

> Is that new WARN_ON_ONCE() in hugetlb_wp() meant to indicate a real bug?

Most probably, unless I am missing something important.

Something triggers a fault with FAULT_FLAG_WRITE on a VMA without
VM_WRITE, and without the check hugetlb_wp() would map the PTE writable.
Consequently, we'd have a writable PTE inside a VMA that does not have
write permissions, which is dubious. My check prevents that and bails
out.
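
To illustrate what the check guards against, here is a minimal userspace
sketch. It is not the code from the patch: the *_model types, the
wp_result enum and hugetlb_wp_model() are hypothetical simplifications,
and only the VM_WRITE and FAULT_FLAG_WRITE names are taken from the
kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model constants: names mirror the kernel's, everything else is simplified. */
#define VM_WRITE		0x2UL
#define FAULT_FLAG_WRITE	0x1U

/* Hypothetical result codes, used only by this sketch. */
enum wp_result { WP_MAPPED_WRITABLE, WP_BAILED_OUT, WP_NOT_A_WRITE_FAULT };

struct vma_model { unsigned long vm_flags; };
struct pte_model { bool writable; };

/* Models the guard: never map a PTE writable in a VMA without VM_WRITE. */
enum wp_result hugetlb_wp_model(const struct vma_model *vma,
				struct pte_model *pte,
				unsigned int fault_flags)
{
	assert(vma != NULL && pte != NULL);

	if (!(fault_flags & FAULT_FLAG_WRITE))
		return WP_NOT_A_WRITE_FAULT;

	/* The real check would WARN_ON_ONCE() here before bailing out. */
	if (!(vma->vm_flags & VM_WRITE))
		return WP_BAILED_OUT;

	pte->writable = true;
	return WP_MAPPED_WRITABLE;
}
```

In this model, a write fault on a VMA without VM_WRITE leaves the PTE
untouched and reports the bail-out, instead of silently producing a
writable PTE.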

Ordinary (!hugetlb) faults have maybe_mkwrite() semantics (e.g., for
FOLL_FORCE or breaking COW), so PTEs are never mapped writable if the
VMA does not have write permissions.
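
For comparison, maybe_mkwrite() semantics can be sketched in userspace
as follows. Again, this is only a model under simplifying assumptions:
struct vma_model, struct pte_model and maybe_mkwrite_model() are
hypothetical stand-ins, not the kernel's types or helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VM_WRITE 0x2UL	/* name mirrors the kernel's flag */

struct vma_model { unsigned long vm_flags; };
struct pte_model { bool writable; };

/*
 * Models maybe_mkwrite(): the PTE only becomes writable if the VMA
 * allows writes; otherwise it is returned unchanged (still R/O).
 */
struct pte_model maybe_mkwrite_model(struct pte_model pte,
				     const struct vma_model *vma)
{
	assert(vma != NULL);

	if (vma->vm_flags & VM_WRITE)
		pte.writable = true;
	return pte;
}
```

With these semantics, a forced write (e.g., via FOLL_FORCE) into a VMA
without VM_WRITE can be handled while the PTE stays mapped R/O, which is
what hugetlb does not support.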

I suspect that either

a) Some write fault misses a protection check and ends up triggering a
fault with FAULT_FLAG_WRITE where we should actually fail early.

b) The write fault is valid and some VMA misses proper flags (VM_WRITE).

c) The write fault is valid (e.g., for breaking COW or FOLL_FORCE) and
we'd actually want maybe_mkwrite semantics.

> It is triggered by libhugetlbfs testcase "HUGETLB_ELFMAP=R linkhuge_rw"
> (at least on s390), and crashes our CI, because it runs with panic_on_warn
> enabled.
>
> Not sure if this means that we have a bug elsewhere, allowing us to
> get to the WARN in hugetlb_wp().

That's what I suspect. Do you have a backtrace?

Note that I'm on vacation this week and might not reply as fast as usual.

