LKML Archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] mm/gup: Fix hugepd for longterm R/O pin on Power
@ 2024-04-28 19:01 Peter Xu
  2024-04-28 19:01 ` [PATCH 1/2] mm/gup: Fix hugepd handling in hugetlb rework Peter Xu
  2024-04-28 19:01 ` [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests Peter Xu
  0 siblings, 2 replies; 8+ messages in thread
From: Peter Xu @ 2024-04-28 19:01 UTC (permalink / raw
  To: linux-kernel, linux-mm
  Cc: Christophe Leroy, peterx, David Hildenbrand, Andrew Morton,
	Aneesh Kumar K . V, Lorenzo Stoakes, John Hubbard, linuxppc-dev,
	Muchun Song, Jason Gunthorpe

This series should apply to both mm-stable and mm-unstable, I am not sure
whether it's even applicable to apply on mm-stable directly, but perhaps
not urgently needed either.  Anyway, it'll apply to either tree.  It also
means cc stable is not needed even if I had the Fixes attached.

Patch 1 fixes that bug in mm-stable, patch 2 enhances the gup_longterm to
be able to discover such issue.

In general, the previous hugetlb rework [1] on gup-slow introduced an issue
with R/O longterm pin.  Nobody yet found it in either a real report or test
case, probably because our test case doesn't yet cover it (not before patch
2), and it's also a pretty rare path: it only happens with Power longterm
R/O pins on a page cache that is installed as a hugepd read-only.

Please read each of the patch for details.

I retested "./run_vmtests.sh -t gup_test -a" on a Power8 system with a
Power8 VM, with 16MB hugepd hugepd entries installed.  Note that I tested
exactly the same matrix before, but patch 2 will change gup_longterm test,
so it's actually slightly different test carried out, and the new test
(gup_longterm.c, when apply patch 2 only) will hang mm-stable on Andrew's
tree with that 16MB huge page.

Thanks,

[1] https://lore.kernel.org/r/20240327152332.950956-1-peterx@redhat.com

Peter Xu (2):
  mm/gup: Fix hugepd handling in hugetlb rework
  mm/selftests: Don't prefault in gup_longterm tests

 mm/gup.c                                  | 64 ++++++++++++++---------
 tools/testing/selftests/mm/gup_longterm.c | 12 +++--
 2 files changed, 48 insertions(+), 28 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] mm/gup: Fix hugepd handling in hugetlb rework
  2024-04-28 19:01 [PATCH 0/2] mm/gup: Fix hugepd for longterm R/O pin on Power Peter Xu
@ 2024-04-28 19:01 ` Peter Xu
  2024-04-29  7:17   ` David Hildenbrand
  2024-04-28 19:01 ` [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests Peter Xu
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Xu @ 2024-04-28 19:01 UTC (permalink / raw
  To: linux-kernel, linux-mm
  Cc: Christophe Leroy, peterx, David Hildenbrand, Andrew Morton,
	Aneesh Kumar K . V, Lorenzo Stoakes, John Hubbard, linuxppc-dev,
	Muchun Song, Jason Gunthorpe

Commit a12083d721d7 added hugepd handling for gup-slow, reusing gup-fast
functions.  follow_hugepd() correctly took the vma pointer in, however
didn't pass it over into the lower functions, which was overlooked.

The issue is gup_fast_hugepte() uses the vma pointer to make the correct
decision on whether an unshare is needed for a FOLL_PIN|FOLL_LONGTERM.  Now
without vma ponter it will constantly return "true" (needs an unshare) for
a page cache, even though in the SHARED case it will be wrong to unshare.

The other problem is, even if an unshare is needed, it now returns 0 rather
than -EMLINK, which will not trigger a follow up FAULT_FLAG_UNSHARE fault.
That will need to be fixed too when the unshare is wanted.

gup_longterm test didn't expose this issue in the past because it didn't
yet test R/O unshare in this case, another separate patch will enable that
in future tests.

Fix it by passing vma correctly to the bottom, rename gup_fast_hugepte()
back to gup_hugepte() as it is shared between the fast/slow paths, and also
allow -EMLINK to be returned properly by gup_hugepte() even though gup-fast
will take it the same as zero.

Reported-by: David Hildenbrand <david@redhat.com>
Fixes: a12083d721d7 ("mm/gup: handle hugepd for follow_page()")
Signed-off-by: Peter Xu <peterx@redhat.com>
---

Note: The target commit to be fixed should just been moved into mm-stable,
so no need to cc stable.
---
 mm/gup.c | 64 ++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 39 insertions(+), 25 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 2f7baf96f655..ca0f5cedce9b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -525,9 +525,17 @@ static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
 	return (__boundary - 1 < end - 1) ? __boundary : end;
 }
 
-static int gup_fast_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
-		unsigned long end, unsigned int flags, struct page **pages,
-		int *nr)
+/*
+ * Returns 1 if succeeded, 0 if failed, -EMLINK if unshare needed.
+ *
+ * NOTE: for the same entry, gup-fast and gup-slow can return different
+ * results (0 v.s. -EMLINK) depending on whether vma is available.  This is
+ * the expected behavior, where we simply want gup-fast to fallback to
+ * gup-slow to take the vma reference first.
+ */
+static int gup_hugepte(struct vm_area_struct *vma, pte_t *ptep, unsigned long sz,
+		       unsigned long addr, unsigned long end, unsigned int flags,
+		       struct page **pages, int *nr)
 {
 	unsigned long pte_end;
 	struct page *page;
@@ -559,9 +567,9 @@ static int gup_fast_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
 		return 0;
 	}
 
-	if (!pte_write(pte) && gup_must_unshare(NULL, flags, &folio->page)) {
+	if (!pte_write(pte) && gup_must_unshare(vma, flags, &folio->page)) {
 		gup_put_folio(folio, refs, flags);
-		return 0;
+		return -EMLINK;
 	}
 
 	*nr += refs;
@@ -577,19 +585,22 @@ static int gup_fast_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
  * of the other folios. See writable_file_mapping_allowed() and
  * gup_fast_folio_allowed() for more information.
  */
-static int gup_fast_hugepd(hugepd_t hugepd, unsigned long addr,
-		unsigned int pdshift, unsigned long end, unsigned int flags,
-		struct page **pages, int *nr)
+static int gup_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
+		      unsigned long addr, unsigned int pdshift,
+		      unsigned long end, unsigned int flags,
+		      struct page **pages, int *nr)
 {
 	pte_t *ptep;
 	unsigned long sz = 1UL << hugepd_shift(hugepd);
 	unsigned long next;
+	int ret;
 
 	ptep = hugepte_offset(hugepd, addr, pdshift);
 	do {
 		next = hugepte_addr_end(addr, end, sz);
-		if (!gup_fast_hugepte(ptep, sz, addr, end, flags, pages, nr))
-			return 0;
+		ret = gup_hugepte(vma, ptep, sz, addr, end, flags, pages, nr);
+		if (ret != 1)
+			return ret;
 	} while (ptep++, addr = next, addr != end);
 
 	return 1;
@@ -613,22 +624,25 @@ static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
 	h = hstate_vma(vma);
 	ptep = hugepte_offset(hugepd, addr, pdshift);
 	ptl = huge_pte_lock(h, vma->vm_mm, ptep);
-	ret = gup_fast_hugepd(hugepd, addr, pdshift, addr + PAGE_SIZE,
-			      flags, &page, &nr);
+	ret = gup_hugepd(vma, hugepd, addr, pdshift, addr + PAGE_SIZE,
+			 flags, &page, &nr);
 	spin_unlock(ptl);
 
-	if (ret) {
+	if (ret == 1) {
+		/* GUP succeeded */
 		WARN_ON_ONCE(nr != 1);
 		ctx->page_mask = (1U << huge_page_order(h)) - 1;
 		return page;
 	}
 
-	return NULL;
+	/* ret can be either 0 (translates to NULL) or negative */
+	return ERR_PTR(ret);
 }
 #else /* CONFIG_ARCH_HAS_HUGEPD */
-static inline int gup_fast_hugepd(hugepd_t hugepd, unsigned long addr,
-		unsigned int pdshift, unsigned long end, unsigned int flags,
-		struct page **pages, int *nr)
+static inline int gup_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
+			     unsigned long addr, unsigned int pdshift,
+			     unsigned long end, unsigned int flags,
+			     struct page **pages, int *nr)
 {
 	return 0;
 }
@@ -3261,8 +3275,8 @@ static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
 			 * architecture have different format for hugetlbfs
 			 * pmd format and THP pmd format
 			 */
-			if (!gup_fast_hugepd(__hugepd(pmd_val(pmd)), addr,
-					     PMD_SHIFT, next, flags, pages, nr))
+			if (gup_hugepd(NULL, __hugepd(pmd_val(pmd)), addr,
+				       PMD_SHIFT, next, flags, pages, nr) != 1)
 				return 0;
 		} else if (!gup_fast_pte_range(pmd, pmdp, addr, next, flags,
 					       pages, nr))
@@ -3291,8 +3305,8 @@ static int gup_fast_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr,
 					       pages, nr))
 				return 0;
 		} else if (unlikely(is_hugepd(__hugepd(pud_val(pud))))) {
-			if (!gup_fast_hugepd(__hugepd(pud_val(pud)), addr,
-					     PUD_SHIFT, next, flags, pages, nr))
+			if (gup_hugepd(NULL, __hugepd(pud_val(pud)), addr,
+				       PUD_SHIFT, next, flags, pages, nr) != 1)
 				return 0;
 		} else if (!gup_fast_pmd_range(pudp, pud, addr, next, flags,
 					       pages, nr))
@@ -3318,8 +3332,8 @@ static int gup_fast_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr,
 			return 0;
 		BUILD_BUG_ON(p4d_leaf(p4d));
 		if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) {
-			if (!gup_fast_hugepd(__hugepd(p4d_val(p4d)), addr,
-					     P4D_SHIFT, next, flags, pages, nr))
+			if (gup_hugepd(NULL, __hugepd(p4d_val(p4d)), addr,
+				       P4D_SHIFT, next, flags, pages, nr) != 1)
 				return 0;
 		} else if (!gup_fast_pud_range(p4dp, p4d, addr, next, flags,
 					       pages, nr))
@@ -3347,8 +3361,8 @@ static void gup_fast_pgd_range(unsigned long addr, unsigned long end,
 					       pages, nr))
 				return;
 		} else if (unlikely(is_hugepd(__hugepd(pgd_val(pgd))))) {
-			if (!gup_fast_hugepd(__hugepd(pgd_val(pgd)), addr,
-					      PGDIR_SHIFT, next, flags, pages, nr))
+			if (gup_hugepd(NULL, __hugepd(pgd_val(pgd)), addr,
+				       PGDIR_SHIFT, next, flags, pages, nr) != 1)
 				return;
 		} else if (!gup_fast_p4d_range(pgdp, pgd, addr, next, flags,
 					       pages, nr))
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests
  2024-04-28 19:01 [PATCH 0/2] mm/gup: Fix hugepd for longterm R/O pin on Power Peter Xu
  2024-04-28 19:01 ` [PATCH 1/2] mm/gup: Fix hugepd handling in hugetlb rework Peter Xu
@ 2024-04-28 19:01 ` Peter Xu
  2024-04-29  7:28   ` David Hildenbrand
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Xu @ 2024-04-28 19:01 UTC (permalink / raw
  To: linux-kernel, linux-mm
  Cc: Christophe Leroy, peterx, David Hildenbrand, Andrew Morton,
	Aneesh Kumar K . V, Lorenzo Stoakes, John Hubbard, linuxppc-dev,
	Muchun Song, Jason Gunthorpe

Prefault, especially with RW, makes the GUP test too easy, and may not yet
reach the core of the test.

For example, R/O longterm pins will just hit, pte_write()==true for
whatever cases, the unsharing logic won't be ever tested.

This patch remove the prefault.  This tortures more code paths at least to
cover the unshare care for R/O longterm pins, in which case the first R/O
GUP attempt will fault in the page R/O first, then the 2nd will go through
the unshare path, checking whether an unshare is needed.

Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tools/testing/selftests/mm/gup_longterm.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
index ad168d35b23b..488e32186246 100644
--- a/tools/testing/selftests/mm/gup_longterm.c
+++ b/tools/testing/selftests/mm/gup_longterm.c
@@ -119,10 +119,16 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
 	}
 
 	/*
-	 * Fault in the page writable such that GUP-fast can eventually pin
-	 * it immediately.
+	 * Explicitly avoid pre-faulting in the page, this can help testing
+	 * more code paths.
+	 *
+	 * Take example of an upcoming R/O pin test, if we RW prefault the
+	 * page, such pin will directly skip R/O unsharing and the longterm
+	 * pin will success mostly always.  When not prefaulted, R/O
+	 * longterm pin will first fault in a RO page, then the 2nd round
+	 * it'll go via the unshare check.  Otherwise those paths aren't
+	 * covered.
 	 */
-	memset(mem, 0, size);
 
 	switch (type) {
 	case TEST_TYPE_RO:
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/gup: Fix hugepd handling in hugetlb rework
  2024-04-28 19:01 ` [PATCH 1/2] mm/gup: Fix hugepd handling in hugetlb rework Peter Xu
@ 2024-04-29  7:17   ` David Hildenbrand
  0 siblings, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2024-04-29  7:17 UTC (permalink / raw
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Christophe Leroy, Andrew Morton, Aneesh Kumar K . V,
	Lorenzo Stoakes, John Hubbard, linuxppc-dev, Muchun Song,
	Jason Gunthorpe

On 28.04.24 21:01, Peter Xu wrote:
> Commit a12083d721d7 added hugepd handling for gup-slow, reusing gup-fast
> functions.  follow_hugepd() correctly took the vma pointer in, however
> didn't pass it over into the lower functions, which was overlooked.
> 
> The issue is gup_fast_hugepte() uses the vma pointer to make the correct
> decision on whether an unshare is needed for a FOLL_PIN|FOLL_LONGTERM.  Now
> without vma ponter it will constantly return "true" (needs an unshare) for
> a page cache, even though in the SHARED case it will be wrong to unshare.
> 
> The other problem is, even if an unshare is needed, it now returns 0 rather
> than -EMLINK, which will not trigger a follow up FAULT_FLAG_UNSHARE fault.
> That will need to be fixed too when the unshare is wanted.
> 
> gup_longterm test didn't expose this issue in the past because it didn't
> yet test R/O unshare in this case, another separate patch will enable that
> in future tests.
> 
> Fix it by passing vma correctly to the bottom, rename gup_fast_hugepte()
> back to gup_hugepte() as it is shared between the fast/slow paths, and also
> allow -EMLINK to be returned properly by gup_hugepte() even though gup-fast
> will take it the same as zero.
> 
> Reported-by: David Hildenbrand <david@redhat.com>
> Fixes: a12083d721d7 ("mm/gup: handle hugepd for follow_page()")
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---

LGTM

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests
  2024-04-28 19:01 ` [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests Peter Xu
@ 2024-04-29  7:28   ` David Hildenbrand
  2024-04-29 13:10     ` Peter Xu
  0 siblings, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2024-04-29  7:28 UTC (permalink / raw
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Christophe Leroy, Andrew Morton, Aneesh Kumar K . V,
	Lorenzo Stoakes, John Hubbard, linuxppc-dev, Muchun Song,
	Jason Gunthorpe

On 28.04.24 21:01, Peter Xu wrote:
> Prefault, especially with RW, makes the GUP test too easy, and may not yet
> reach the core of the test.
> 
> For example, R/O longterm pins will just hit, pte_write()==true for
> whatever cases, the unsharing logic won't be ever tested.
> 
> This patch remove the prefault.  This tortures more code paths at least to
> cover the unshare care for R/O longterm pins, in which case the first R/O
> GUP attempt will fault in the page R/O first, then the 2nd will go through
> the unshare path, checking whether an unshare is needed.
> 
> Cc: David Hildenbrand <david@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   tools/testing/selftests/mm/gup_longterm.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
> index ad168d35b23b..488e32186246 100644
> --- a/tools/testing/selftests/mm/gup_longterm.c
> +++ b/tools/testing/selftests/mm/gup_longterm.c
> @@ -119,10 +119,16 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
>   	}
>   
>   	/*
> -	 * Fault in the page writable such that GUP-fast can eventually pin
> -	 * it immediately.
> +	 * Explicitly avoid pre-faulting in the page, this can help testing
> +	 * more code paths.
> +	 *
> +	 * Take example of an upcoming R/O pin test, if we RW prefault the
> +	 * page, such pin will directly skip R/O unsharing and the longterm
> +	 * pin will success mostly always.  When not prefaulted, R/O
> +	 * longterm pin will first fault in a RO page, then the 2nd round
> +	 * it'll go via the unshare check.  Otherwise those paths aren't
> +	 * covered.
>   	 */
This will mean that GUP-fast never succeeds, which removes quite some testing
coverage for most other tests here.

Note that the main motivation of this test was to test gup_fast_folio_allowed(),
where we had issues with GUP-fast during development.

Would the following also get the job done?

diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
index ad168d35b23b7..e917a7c58d571 100644
--- a/tools/testing/selftests/mm/gup_longterm.c
+++ b/tools/testing/selftests/mm/gup_longterm.c
@@ -92,7 +92,7 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
  {
  	__fsword_t fs_type = get_fs_type(fd);
  	bool should_work;
-	char *mem;
+	char tmp, *mem;
  	int ret;
  
  	if (ftruncate(fd, size)) {
@@ -119,10 +119,19 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
  	}
  
  	/*
-	 * Fault in the page writable such that GUP-fast can eventually pin
-	 * it immediately.
+	 * Fault in the page such that GUP-fast might be able to pin it
+	 * immediately. To cover more cases, don't fault in pages writable when
+	 * R/O pinning.
  	 */
-	memset(mem, 0, size);
+	switch (type) {
+	case TEST_TYPE_RO:
+	case TEST_TYPE_RO_FAST:
+		tmp = *mem;
+		asm volatile("" : "+r" (tmp));
+		break;
+	default:
+		memset(mem, 0, size);
+	};
  
  	switch (type) {
  	case TEST_TYPE_RO:
-- 
2.44.0


-- 
Cheers,

David / dhildenb


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests
  2024-04-29  7:28   ` David Hildenbrand
@ 2024-04-29 13:10     ` Peter Xu
  2024-04-29 13:26       ` David Hildenbrand
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Xu @ 2024-04-29 13:10 UTC (permalink / raw
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Christophe Leroy, Andrew Morton,
	Aneesh Kumar K . V, Lorenzo Stoakes, John Hubbard, linuxppc-dev,
	Muchun Song, Jason Gunthorpe

On Mon, Apr 29, 2024 at 09:28:15AM +0200, David Hildenbrand wrote:
> On 28.04.24 21:01, Peter Xu wrote:
> > Prefault, especially with RW, makes the GUP test too easy, and may not yet
> > reach the core of the test.
> > 
> > For example, R/O longterm pins will just hit, pte_write()==true for
> > whatever cases, the unsharing logic won't be ever tested.
> > 
> > This patch remove the prefault.  This tortures more code paths at least to
> > cover the unshare care for R/O longterm pins, in which case the first R/O
> > GUP attempt will fault in the page R/O first, then the 2nd will go through
> > the unshare path, checking whether an unshare is needed.
> > 
> > Cc: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   tools/testing/selftests/mm/gup_longterm.c | 12 +++++++++---
> >   1 file changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
> > index ad168d35b23b..488e32186246 100644
> > --- a/tools/testing/selftests/mm/gup_longterm.c
> > +++ b/tools/testing/selftests/mm/gup_longterm.c
> > @@ -119,10 +119,16 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
> >   	}
> >   	/*
> > -	 * Fault in the page writable such that GUP-fast can eventually pin
> > -	 * it immediately.
> > +	 * Explicitly avoid pre-faulting in the page, this can help testing
> > +	 * more code paths.
> > +	 *
> > +	 * Take example of an upcoming R/O pin test, if we RW prefault the
> > +	 * page, such pin will directly skip R/O unsharing and the longterm
> > +	 * pin will success mostly always.  When not prefaulted, R/O
> > +	 * longterm pin will first fault in a RO page, then the 2nd round
> > +	 * it'll go via the unshare check.  Otherwise those paths aren't
> > +	 * covered.
> >   	 */
> This will mean that GUP-fast never succeeds, which removes quite some testing
> coverage for most other tests here.
> 
> Note that the main motivation of this test was to test gup_fast_folio_allowed(),
> where we had issues with GUP-fast during development.

Ah I didn't notice that, as I thought that whitelists memfd ones.

> 
> Would the following also get the job done?
> 
> diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
> index ad168d35b23b7..e917a7c58d571 100644
> --- a/tools/testing/selftests/mm/gup_longterm.c
> +++ b/tools/testing/selftests/mm/gup_longterm.c
> @@ -92,7 +92,7 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
>  {
>  	__fsword_t fs_type = get_fs_type(fd);
>  	bool should_work;
> -	char *mem;
> +	char tmp, *mem;
>  	int ret;
>  	if (ftruncate(fd, size)) {
> @@ -119,10 +119,19 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
>  	}
>  	/*
> -	 * Fault in the page writable such that GUP-fast can eventually pin
> -	 * it immediately.
> +	 * Fault in the page such that GUP-fast might be able to pin it
> +	 * immediately. To cover more cases, don't fault in pages writable when
> +	 * R/O pinning.
>  	 */
> -	memset(mem, 0, size);
> +	switch (type) {
> +	case TEST_TYPE_RO:
> +	case TEST_TYPE_RO_FAST:
> +		tmp = *mem;
> +		asm volatile("" : "+r" (tmp));
> +		break;
> +	default:
> +		memset(mem, 0, size);
> +	};
>  	switch (type) {
>  	case TEST_TYPE_RO:

Yes this could work too.

The test patch here doesn't need to rush. David, how about you prepare a
better and verified patch and post it separately, making sure to cover all
the things we used to cover plus the unshare?  IIUC it used to be not
touched because of pte_write() always returns true with a write prefault.

Then we let patch 1 go through first, and drop this one?

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests
  2024-04-29 13:10     ` Peter Xu
@ 2024-04-29 13:26       ` David Hildenbrand
  2024-04-29 13:51         ` Peter Xu
  0 siblings, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2024-04-29 13:26 UTC (permalink / raw
  To: Peter Xu
  Cc: linux-kernel, linux-mm, Christophe Leroy, Andrew Morton,
	Aneesh Kumar K . V, Lorenzo Stoakes, John Hubbard, linuxppc-dev,
	Muchun Song, Jason Gunthorpe

On 29.04.24 15:10, Peter Xu wrote:
> On Mon, Apr 29, 2024 at 09:28:15AM +0200, David Hildenbrand wrote:
>> On 28.04.24 21:01, Peter Xu wrote:
>>> Prefault, especially with RW, makes the GUP test too easy, and may not yet
>>> reach the core of the test.
>>>
>>> For example, R/O longterm pins will just hit, pte_write()==true for
>>> whatever cases, the unsharing logic won't be ever tested.
>>>
>>> This patch remove the prefault.  This tortures more code paths at least to
>>> cover the unshare care for R/O longterm pins, in which case the first R/O
>>> GUP attempt will fault in the page R/O first, then the 2nd will go through
>>> the unshare path, checking whether an unshare is needed.
>>>
>>> Cc: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>    tools/testing/selftests/mm/gup_longterm.c | 12 +++++++++---
>>>    1 file changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
>>> index ad168d35b23b..488e32186246 100644
>>> --- a/tools/testing/selftests/mm/gup_longterm.c
>>> +++ b/tools/testing/selftests/mm/gup_longterm.c
>>> @@ -119,10 +119,16 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
>>>    	}
>>>    	/*
>>> -	 * Fault in the page writable such that GUP-fast can eventually pin
>>> -	 * it immediately.
>>> +	 * Explicitly avoid pre-faulting in the page, this can help testing
>>> +	 * more code paths.
>>> +	 *
>>> +	 * Take example of an upcoming R/O pin test, if we RW prefault the
>>> +	 * page, such pin will directly skip R/O unsharing and the longterm
>>> +	 * pin will success mostly always.  When not prefaulted, R/O
>>> +	 * longterm pin will first fault in a RO page, then the 2nd round
>>> +	 * it'll go via the unshare check.  Otherwise those paths aren't
>>> +	 * covered.
>>>    	 */
>> This will mean that GUP-fast never succeeds, which removes quite some testing
>> coverage for most other tests here.
>>
>> Note that the main motivation of this test was to test gup_fast_folio_allowed(),
>> where we had issues with GUP-fast during development.
> 
> Ah I didn't notice that, as I thought that whitelists memfd ones.
> 
>>
>> Would the following also get the job done?
>>
>> diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
>> index ad168d35b23b7..e917a7c58d571 100644
>> --- a/tools/testing/selftests/mm/gup_longterm.c
>> +++ b/tools/testing/selftests/mm/gup_longterm.c
>> @@ -92,7 +92,7 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
>>   {
>>   	__fsword_t fs_type = get_fs_type(fd);
>>   	bool should_work;
>> -	char *mem;
>> +	char tmp, *mem;
>>   	int ret;
>>   	if (ftruncate(fd, size)) {
>> @@ -119,10 +119,19 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
>>   	}
>>   	/*
>> -	 * Fault in the page writable such that GUP-fast can eventually pin
>> -	 * it immediately.
>> +	 * Fault in the page such that GUP-fast might be able to pin it
>> +	 * immediately. To cover more cases, don't fault in pages writable when
>> +	 * R/O pinning.
>>   	 */
>> -	memset(mem, 0, size);
>> +	switch (type) {
>> +	case TEST_TYPE_RO:
>> +	case TEST_TYPE_RO_FAST:
>> +		tmp = *mem;
>> +		asm volatile("" : "+r" (tmp));
>> +		break;
>> +	default:
>> +		memset(mem, 0, size);
>> +	};
>>   	switch (type) {
>>   	case TEST_TYPE_RO:
> 
> Yes this could work too.
> 
> The test patch here doesn't need to rush. David, how about you prepare a
> better and verified patch and post it separately, making sure to cover all
> the things we used to cover plus the unshare?  IIUC it used to be not
> touched because of pte_write() always returns true with a write prefault.
> 
> Then we let patch 1 go through first, and drop this one?

Whatever you prefer!

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests
  2024-04-29 13:26       ` David Hildenbrand
@ 2024-04-29 13:51         ` Peter Xu
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Xu @ 2024-04-29 13:51 UTC (permalink / raw
  To: David Hildenbrand, Andrew Morton
  Cc: linux-kernel, linux-mm, Christophe Leroy, Andrew Morton,
	Aneesh Kumar K . V, Lorenzo Stoakes, John Hubbard, linuxppc-dev,
	Muchun Song, Jason Gunthorpe

On Mon, Apr 29, 2024 at 03:26:22PM +0200, David Hildenbrand wrote:
> > The test patch here doesn't need to rush. David, how about you prepare a
> > better and verified patch and post it separately, making sure to cover all
> > the things we used to cover plus the unshare?  IIUC it used to be not
> > touched because of pte_write() always returns true with a write prefault.
> > 
> > Then we let patch 1 go through first, and drop this one?
> 
> Whatever you prefer!

Thanks!

Andrew, would you consider taking patch 1 but ignore this patch 2? Or do
you prefer me to resend?

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-04-29 13:51 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-28 19:01 [PATCH 0/2] mm/gup: Fix hugepd for longterm R/O pin on Power Peter Xu
2024-04-28 19:01 ` [PATCH 1/2] mm/gup: Fix hugepd handling in hugetlb rework Peter Xu
2024-04-29  7:17   ` David Hildenbrand
2024-04-28 19:01 ` [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests Peter Xu
2024-04-29  7:28   ` David Hildenbrand
2024-04-29 13:10     ` Peter Xu
2024-04-29 13:26       ` David Hildenbrand
2024-04-29 13:51         ` Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).