From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Zi Yan <ziy@nvidia.com>, John Hubbard <jhubbard@nvidia.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] mm: Fix race between __split_huge_pmd_locked() and GUP-fast
Date: Mon, 29 Apr 2024 11:01:08 +0530
Message-ID: <755b4f9e-437f-468f-a43e-c93742ac9828@arm.com>
In-Reply-To: <6F2BB00A-DBCD-4482-B16E-B71A02847F0D@nvidia.com>



On 4/27/24 20:37, Zi Yan wrote:
> On 27 Apr 2024, at 0:25, John Hubbard wrote:
> 
>> On 4/26/24 7:53 AM, Zi Yan wrote:
>>
>> Hi Zi (and Ryan)!
>>
>>>>>>> lockless pgtable walker could see the migration entry pmd in this state
>>>>>>> and start interpreting the fields as if it were present, leading to
>>>>>>> BadThings (TM). GUP-fast appears to be one such lockless pgtable walker.
>>>>>> Could you please explain how bad things might happen ?
>>>>> See 2 places where pmdp_get_lockless() is called in gup.c, without the PTL.
>>>>> These could both return the swap pte for which pmd_mkinvalid() has been called.
>>>>> In both cases, this would lead to the pmd_present() check erroneously returning
>>>>> true, eventually causing incorrect interpretation of the pte fields. e.g.:
>>>>>
>>>>> gup_pmd_range()
>>>>>    pmd_t pmd = pmdp_get_lockless(pmdp);
>>>>>    gup_huge_pmd(pmd, ...)
>>>>>      page = nth_page(pmd_page(orig), (addr & ~PMD_MASK) >> PAGE_SHIFT);
>>>>>
>>>>> page is guff.
>>>>>
>>>>> Let me know what you think!
>>> Add JohnH to check GUP code.
>> Ryan is correct about this behavior.
>>
>> By the way, remember that gup is not the only lockless page table
>> walker: there is also the CPU hardware itself, which inconveniently
>> refuses to bother with taking page table locks. 🙂
>>
>> So if we have code that can make a non-present PTE appear to be present
>> to any of these page walkers, whether software or hardware, it's
>> definitely Not Good and will lead directly to bugs.
> This issue does not bother hardware, because the PTE_VALID/PMD_SECT_VALID
> is always unset and hardware always sees this PMD as invalid. It is a pure
> software issue, since for THP splitting, we do not want hardware to access
> the page but still allow the kernel to use pmd_page() to get the pfn, so
> pmd_present() returns true even if PTE_VALID/PMD_SECT_VALID is unset by
> setting and checking PMD_PRESENT_INVALID bit. pmd_mkinvalid() sets
> PMD_PRESENT_INVALID, turning a migration entry from !pmd_present() to
> pmd_present(), while it is always an invalid PMD to hardware.

Agreed, this is not a HW issue at all. The MMU sees such an entry as invalid
even if pmd_present() returns true.
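
To make the software-only nature of the problem concrete, here is a minimal
user-space sketch of the bit logic described above. It is not the kernel
code: the model_* names, the bit positions and the migration-entry encoding
are assumptions chosen purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions: assumptions for this sketch, not kernel headers. */
#define MODEL_VALID           (UINT64_C(1) << 0)   /* the only bit the MMU checks here */
#define MODEL_PROT_NONE       (UINT64_C(1) << 58)  /* software: present but no access */
#define MODEL_PRESENT_INVALID (UINT64_C(1) << 59)  /* software: present, HW-invalid */

typedef uint64_t model_pmd_t;

/* What the hardware walker sees: valid only if the valid bit is set. */
static bool model_hw_valid(model_pmd_t pmd)
{
	return (pmd & MODEL_VALID) != 0;
}

/* What software sees: present if any of the three marker bits is set. */
static bool model_pmd_present(model_pmd_t pmd)
{
	return (pmd & (MODEL_VALID | MODEL_PROT_NONE | MODEL_PRESENT_INVALID)) != 0;
}

/* Hide the entry from hardware while keeping it "present" for software. */
static model_pmd_t model_pmd_mkinvalid(model_pmd_t pmd)
{
	pmd |= MODEL_PRESENT_INVALID;
	pmd &= ~MODEL_VALID;
	return pmd;
}

/* Hypothetical check: a non-present entry turns "present" after mkinvalid. */
void model_check_migration_entry(void)
{
	/* Assumed encoding: payload bits only, no marker bits set. */
	model_pmd_t mig = UINT64_C(0x1234) << 12;

	assert(!model_pmd_present(mig));                      /* starts non-present */
	assert(model_pmd_present(model_pmd_mkinvalid(mig)));  /* software now misled */
	assert(!model_hw_valid(model_pmd_mkinvalid(mig)));    /* hardware unaffected */
}
```

In this model the invalidated migration entry never has its valid bit set, so
a hardware walk is unaffected, while a lockless software walker that trusts
the presence check would go on to decode swap-entry bits as a pfn.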
