From: Zi Yan <ziy@nvidia.com>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: John Hubbard <jhubbard@nvidia.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Zi Yan <zi.yan@cs.rutgers.edu>,
	"Aneesh Kumar K.V" <aneesh.kumar@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] mm: Fix race between __split_huge_pmd_locked() and GUP-fast
Date: Mon, 29 Apr 2024 10:41:20 -0400
Message-ID: <28269666-40B5-4610-8547-81FF68B7613B@nvidia.com>
In-Reply-To: <45d59bbe-b8ec-4725-8a4d-c715130036a3@arm.com>

On 29 Apr 2024, at 2:17, Anshuman Khandual wrote:

> On 4/28/24 02:18, Zi Yan wrote:
>> On 27 Apr 2024, at 16:45, Zi Yan wrote:
>>
>>> On 27 Apr 2024, at 15:11, John Hubbard wrote:
>>>
>>>> On 4/27/24 8:14 AM, Zi Yan wrote:
>>>>> On 27 Apr 2024, at 0:41, John Hubbard wrote:
>>>>>> On 4/25/24 10:07 AM, Ryan Roberts wrote:
>>>>>>> __split_huge_pmd_locked() can be called for a present THP, devmap or
>>>>>>> (non-present) migration entry. It calls pmdp_invalidate()
>>>>>>> unconditionally on the pmdp and only determines if it is present or not
>>>>>>> based on the returned old pmd. This is a problem for the migration entry
>>>>>>> case because pmd_mkinvalid(), called by pmdp_invalidate() must only be
>>>>>>> called for a present pmd.
>>>>>>>
>>>>>>> On arm64 at least, pmd_mkinvalid() will mark the pmd such that any
>>>>>>> future call to pmd_present() will return true. And therefore any
>>>>>>> lockless pgtable walker could see the migration entry pmd in this state
>>>>>>> and start interpreting the fields as if it were present, leading to
>>>>>>> BadThings (TM). GUP-fast appears to be one such lockless pgtable walker.
>>>>>>> I suspect the same is possible on other architectures.
>>>>>>>
>>>>>>> Fix this by only calling pmdp_invalidate() for a present pmd. And for
>>>>>> Yes, this seems like a good design decision (after reading through the
>>>>>> discussion that you all had in the other threads).
>>>>> This will only be good for arm64 and does not prevent other arch developers
>>>>> to write code breaking arm64, since only arm64's pmd_mkinvalid() can turn
>>>>> a swap entry to a pmd_present() entry.
>>>> Well, let's characterize it in a bit more detail, then:
>>>>
>>>> 1) This patch will make things better for arm64. That's important!
>>>>
>>>> 2) Equally important, this patch does not make anything worse for
>>>>    other CPU arches.
>>>>
>>>> 3) This patch represents a new design constraint on the CPU arch
>>>>    layer, and thus requires documentation and whatever enforcement
>>>>    we can provide, in order to keep future code out of trouble.
>>>>
>>>> 3.a) See the VM_WARN_ON() hunks below.
>>>>
>>>> 3.b) I like the new design constraint, because it is reasonable and
>>>>      clearly understandable: don't invalidate a non-present page
>>>>      table entry.
>>>>
>>>> I do wonder if there is somewhere else that this should be documented?
>> In terms of documentation, at least in Documentation/mm/arch_pgtable_helpers.rst,
>> pmd_mkinvalid() entry needs to add "do not call on an invalid entry as
>> it breaks arm64"
>
> s/invalid/non-present ?					^^^^^^^^^^^^^
>
> But validation via mm/debug_vm_pgtable.c would require a predictable return
> value from pmd_mkinvalid() e.g return old pmd when the entry is not present.
>
> 	ASSERT(pmd = pmd_mkinvalid(pmd)) - when pmd is not present
>
> Otherwise, wondering how the semantics could be validated in the test.

I thought about checking this in mm/debug_vm_pgtable.c but concluded it is
impossible. We want to make sure no one uses pmd_mkinvalid() on
!pmd_present() entries, but that requires checking pmd_mkinvalid()'s input
entries at code-writing time. A runtime test like mm/debug_vm_pgtable.c does
not help.

I even thought about changing the pmd_mkinvalid() input parameter type to
a new pmd_invalid_t, so that the type system could enforce it, but when we
read a PMD entry there is no way to determine statically whether it is valid;
we have to inspect the bits.
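
To illustrate the limitation, here is a toy userspace sketch, not kernel code.
All demo_ names and the bit position are hypothetical, and the wrapper type
stands in for the pmd_invalid_t idea above. The distinct type does stop a raw
entry from being passed directly, but the only way to obtain one is a runtime
check of the bits, so the check has moved rather than disappeared:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy types; these are not kernel definitions. */
typedef struct { uint64_t val; } demo_pmd_t;         /* raw entry as read from the table */
typedef struct { uint64_t val; } demo_present_pmd_t; /* entry that passed the present check */

#define DEMO_PMD_VALID (1ULL << 0) /* made-up "present" bit */

/* The only way to obtain a demo_present_pmd_t is this runtime inspection. */
static bool demo_pmd_to_present(demo_pmd_t raw, demo_present_pmd_t *out)
{
	if (!(raw.val & DEMO_PMD_VALID))
		return false;
	out->val = raw.val;
	return true;
}

/* Takes only the checked type, so passing a raw demo_pmd_t does not compile. */
static demo_pmd_t demo_pmd_mkinvalid(demo_present_pmd_t pmd)
{
	demo_pmd_t ret = { pmd.val & ~DEMO_PMD_VALID };
	return ret;
}

/* Small self-check exercising both a present and a non-present entry. */
static void demo_selftest(void)
{
	demo_pmd_t present = { 0x1000ULL | DEMO_PMD_VALID };
	demo_pmd_t swap = { 0xABC0ULL }; /* valid bit clear */
	demo_present_pmd_t checked;

	assert(demo_pmd_to_present(present, &checked));
	assert(!(demo_pmd_mkinvalid(checked).val & DEMO_PMD_VALID));
	assert(!demo_pmd_to_present(swap, &checked));
}
```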

To me, the most future-proof method is to make arm64's pmd_mkinvalid() return
without changing anything if the input entry is !pmd_present(). This aligns
arm64's pmd_mkinvalid() with other arches' pmd_mkinvalid() semantics, so that
code using pmd_mkinvalid() that runs on arches other than arm64 would also
work on arm64. But I am not going to insist on this and will let Ryan make
the call.
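
A minimal sketch of that guard, written as a userspace toy model rather than
the real arm64 implementation. The toy_ names and both bit positions are
assumptions made up for illustration. The unguarded variant shows the
problem: it turns a non-present entry into one that looks present. The
guarded variant returns such an entry unchanged:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: names and bit positions are hypothetical, not arm64's. */
typedef struct { uint64_t val; } toy_pmd_t;

#define TOY_PMD_VALID           (1ULL << 0)  /* made-up hardware valid bit */
#define TOY_PMD_PRESENT_INVALID (1ULL << 59) /* made-up "present but invalidated" marker */

static bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd.val & (TOY_PMD_VALID | TOY_PMD_PRESENT_INVALID)) != 0;
}

/* Unguarded: sets the marker on any input, including non-present entries. */
static toy_pmd_t toy_pmd_mkinvalid_unguarded(toy_pmd_t pmd)
{
	pmd.val |= TOY_PMD_PRESENT_INVALID;
	pmd.val &= ~TOY_PMD_VALID;
	return pmd;
}

/* Guarded: a !present entry is returned without changing anything. */
static toy_pmd_t toy_pmd_mkinvalid(toy_pmd_t pmd)
{
	if (!toy_pmd_present(pmd))
		return pmd;
	return toy_pmd_mkinvalid_unguarded(pmd);
}

/* Small self-check with a toy "migration entry" (valid bit clear). */
static void toy_selftest(void)
{
	toy_pmd_t migration = { 0xABC0ULL };

	assert(!toy_pmd_present(migration));
	/* Without the guard, the entry starts to look present. */
	assert(toy_pmd_present(toy_pmd_mkinvalid_unguarded(migration)));
	/* With the guard, it is untouched and still non-present. */
	assert(toy_pmd_mkinvalid(migration).val == migration.val);
	assert(!toy_pmd_present(toy_pmd_mkinvalid(migration)));
}
```

For a present entry the guarded variant behaves the same as the unguarded
one, so callers that already follow the rule would see no difference.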


--
Best Regards,
Yan, Zi
