LKML Archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] unrestrict process_madvise() for current process
@ 2024-09-23 16:03 Lorenzo Stoakes
  2024-09-23 16:03 ` [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise() Lorenzo Stoakes
  2024-09-23 16:03 ` [PATCH 2/2] selftests/mm: add test for process_madvise PR_MADV_SELF flag use Lorenzo Stoakes
  0 siblings, 2 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2024-09-23 16:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Liam R . Howlett, Suren Baghdasaryan,
	Arnd Bergmann, Shakeel Butt, linux-api, linux-mm, linux-kernel,
	Minchan Kim

The process_madvise() call was introduced in commit ecb8ac8b1f14
("mm/madvise: introduce process_madvise() syscall: an external memory
hinting API") as a means of performing madvise() operations on another
process.

However, as it provides the means by which to perform multiple madvise()
operations in a batch via an iovec, it is useful to utilise the same
interface for performing operations on the current process rather than a
remote one.

Using this interface targeting the current process is cumbersome - a pidfd
needs to be setup for the current pid, and we are limited to only a subset
of madvise() operations, a limitation sensible for manipulating remote
processes but not meaningful when manipulating the current one.

Commit 22af8caff7d1 ("mm/madvise: process_madvise() drop capability check
if same mm") removed the need for a caller invoking process_madvise() on
its own pidfd to possess the CAP_SYS_NICE capability, however this leaves
the restrictions on operation in place and the cumbersome need for a 'self
pidfd'.

This patch series eliminates both limitations:

1. The restriction on permitted operations is removed when operating
   on the current process.

2. A new flag is introduced - PR_MADV_SELF - which eliminates the need for
   a pidfd - if this flag is set, the pidfd argument is ignored and the
   operation is simply applied to the current process.

Therefore a user can simply invoke:

	process_madvise(0, iovec, n, MADV_..., PR_MADV_SELF);

And perform any madvise() operation they like on the n ranges specified by
the iovec parameter.

This series also introduces a series of self-tests for this feature
asserting that the flag functions as expected.

Lorenzo Stoakes (2):
  mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  selftests/mm: add test for process_madvise PR_MADV_SELF flag use

 include/uapi/asm-generic/mman-common.h       |   2 +
 mm/madvise.c                                 |  58 +++++++---
 tools/testing/selftests/mm/.gitignore        |   1 +
 tools/testing/selftests/mm/Makefile          |   1 +
 tools/testing/selftests/mm/process_madvise.c | 115 +++++++++++++++++++
 5 files changed, 161 insertions(+), 16 deletions(-)
 create mode 100644 tools/testing/selftests/mm/process_madvise.c

--
2.46.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  2024-09-23 16:03 [PATCH 0/2] unrestrict process_madvise() for current process Lorenzo Stoakes
@ 2024-09-23 16:03 ` Lorenzo Stoakes
  2024-09-23 18:56   ` Shakeel Butt
                     ` (3 more replies)
  2024-09-23 16:03 ` [PATCH 2/2] selftests/mm: add test for process_madvise PR_MADV_SELF flag use Lorenzo Stoakes
  1 sibling, 4 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2024-09-23 16:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Liam R . Howlett, Suren Baghdasaryan,
	Arnd Bergmann, Shakeel Butt, linux-api, linux-mm, linux-kernel,
	Minchan Kim

process_madvise() was conceived as a useful means for performing a vector
of madvise() operations on a remote process's address space.

However it's useful to be able to do so on the current process also. It is
currently rather clunky to do this (requiring a pidfd to be opened for the
current process) and introduces unnecessary overhead in incrementing
reference counts for the task and mm.

Avoid all of this by providing a PR_MADV_SELF flag, which causes
process_madvise() to simply ignore the pidfd parameter and instead apply
the operation to the current process.

Since we are operating on our own process, no restrictions need be applied
on behaviors we can perform, so do not limit these in that case.

Also extend the case of a user specifying the current process via pidfd to
not be restricted on behaviors which can be performed.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 include/uapi/asm-generic/mman-common.h |  2 +
 mm/madvise.c                           | 58 +++++++++++++++++++-------
 2 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 6ce1f1ceb432..8f59f23dee09 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -87,4 +87,6 @@
 #define PKEY_ACCESS_MASK	(PKEY_DISABLE_ACCESS |\
 				 PKEY_DISABLE_WRITE)
 
+#define PR_MADV_SELF	(1<<0)		/* process_madvise() flag - apply to self */
+
 #endif /* __ASM_GENERIC_MMAN_COMMON_H */
diff --git a/mm/madvise.c b/mm/madvise.c
index ff139e57cca2..549b36d1463c 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1208,7 +1208,8 @@ madvise_behavior_valid(int behavior)
 	}
 }
 
-static bool process_madvise_behavior_valid(int behavior)
+/* Can we invoke process_madvise() on a remote mm for the specified behavior? */
+static bool process_madvise_remote_valid(int behavior)
 {
 	switch (behavior) {
 	case MADV_COLD:
@@ -1477,6 +1478,28 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior)
 	return do_madvise(current->mm, start, len_in, behavior);
 }
 
+/* Perform an madvise operation over a vector of addresses and lengths. */
+static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
+			      int behavior)
+{
+	ssize_t ret = 0;
+	size_t total_len;
+
+	total_len = iov_iter_count(iter);
+
+	while (iov_iter_count(iter)) {
+		ret = do_madvise(mm, (unsigned long)iter_iov_addr(iter),
+				 iter_iov_len(iter), behavior);
+		if (ret < 0)
+			break;
+		iov_iter_advance(iter, iter_iov_len(iter));
+	}
+
+	ret = (total_len - iov_iter_count(iter)) ? : ret;
+
+	return ret;
+}
+
 SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 		size_t, vlen, int, behavior, unsigned int, flags)
 {
@@ -1486,10 +1509,9 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 	struct iov_iter iter;
 	struct task_struct *task;
 	struct mm_struct *mm;
-	size_t total_len;
 	unsigned int f_flags;
 
-	if (flags != 0) {
+	if (flags & ~PR_MADV_SELF) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -1498,13 +1520,26 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 	if (ret < 0)
 		goto out;
 
+	/*
+	 * Perform an madvise operation on the current process. No restrictions
+	 * need be applied, nor do we need to pin the task or mm_struct.
+	 */
+	if (flags & PR_MADV_SELF) {
+		ret = vector_madvise(current->mm, &iter, behavior);
+		goto free_iov;
+	}
+
 	task = pidfd_get_task(pidfd, &f_flags);
 	if (IS_ERR(task)) {
 		ret = PTR_ERR(task);
 		goto free_iov;
 	}
 
-	if (!process_madvise_behavior_valid(behavior)) {
+	/*
+	 * We need only perform this check if we are attempting to manipulate a
+	 * remote process's address space.
+	 */
+	if (mm != current->mm && !process_madvise_remote_valid(behavior)) {
 		ret = -EINVAL;
 		goto release_task;
 	}
@@ -1518,24 +1553,15 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 
 	/*
 	 * Require CAP_SYS_NICE for influencing process performance. Note that
-	 * only non-destructive hints are currently supported.
+	 * only non-destructive hints are currently supported for remote
+	 * processes.
 	 */
 	if (mm != current->mm && !capable(CAP_SYS_NICE)) {
 		ret = -EPERM;
 		goto release_mm;
 	}
 
-	total_len = iov_iter_count(&iter);
-
-	while (iov_iter_count(&iter)) {
-		ret = do_madvise(mm, (unsigned long)iter_iov_addr(&iter),
-					iter_iov_len(&iter), behavior);
-		if (ret < 0)
-			break;
-		iov_iter_advance(&iter, iter_iov_len(&iter));
-	}
-
-	ret = (total_len - iov_iter_count(&iter)) ? : ret;
+	ret = vector_madvise(mm, &iter, behavior);
 
 release_mm:
 	mmput(mm);
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] selftests/mm: add test for process_madvise PR_MADV_SELF flag use
  2024-09-23 16:03 [PATCH 0/2] unrestrict process_madvise() for current process Lorenzo Stoakes
  2024-09-23 16:03 ` [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise() Lorenzo Stoakes
@ 2024-09-23 16:03 ` Lorenzo Stoakes
  1 sibling, 0 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2024-09-23 16:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Liam R . Howlett, Suren Baghdasaryan,
	Arnd Bergmann, Shakeel Butt, linux-api, linux-mm, linux-kernel,
	Minchan Kim

Add a new process_madvise() selftest, and add a test for the newly
introduced PR_MADV_SELF flag.

Assert that we can perform a vector operation of an operation that would
not be permitted on a remote mm (MADV_DONTNEED in this instance) on ones in
our own mm and that the operation is correctly peformed.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 tools/testing/selftests/mm/.gitignore        |   1 +
 tools/testing/selftests/mm/Makefile          |   1 +
 tools/testing/selftests/mm/process_madvise.c | 115 +++++++++++++++++++
 3 files changed, 117 insertions(+)
 create mode 100644 tools/testing/selftests/mm/process_madvise.c

diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore
index da030b43e43b..a875376601b7 100644
--- a/tools/testing/selftests/mm/.gitignore
+++ b/tools/testing/selftests/mm/.gitignore
@@ -51,3 +51,4 @@ hugetlb_madv_vs_map
 mseal_test
 seal_elf
 droppable
+process_madvise
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 02e1204971b0..7503ec177cd2 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -79,6 +79,7 @@ TEST_GEN_FILES += hugetlb_fault_after_madv
 TEST_GEN_FILES += hugetlb_madv_vs_map
 TEST_GEN_FILES += hugetlb_dio
 TEST_GEN_FILES += droppable
+TEST_GEN_FILES += process_madvise
 
 ifneq ($(ARCH),arm64)
 TEST_GEN_FILES += soft-dirty
diff --git a/tools/testing/selftests/mm/process_madvise.c b/tools/testing/selftests/mm/process_madvise.c
new file mode 100644
index 000000000000..7a29b77d837d
--- /dev/null
+++ b/tools/testing/selftests/mm/process_madvise.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#define _GNU_SOURCE
+#include "../kselftest_harness.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/uio.h>
+
+/* May not be available in host system yet. */
+#ifndef PR_MADV_SELF
+#define PR_MADV_SELF	(1<<0)
+#endif
+
+FIXTURE(process_madvise)
+{
+	unsigned long page_size;
+};
+
+FIXTURE_SETUP(process_madvise)
+{
+	self->page_size = (unsigned long)sysconf(_SC_PAGESIZE);
+};
+
+FIXTURE_TEARDOWN(process_madvise)
+{
+}
+
+static void populate_range(char *ptr, size_t len)
+{
+	memset(ptr, 'x', len);
+}
+
+static bool is_range_zeroed(char *ptr, size_t len)
+{
+	size_t i;
+
+	for (i = 0; i < len; i++) {
+		if (ptr[i] != '\0')
+			return false;
+	}
+
+	return true;
+}
+
+TEST_F(process_madvise, pr_madv_self)
+{
+	const unsigned long page_size = self->page_size;
+	struct iovec vec[3];
+	char *ptr_region, *ptr, *ptr2, *ptr3;
+
+	/* Establish a region in which to place VMAs. */
+	ptr_region = mmap(NULL, 100 * page_size, PROT_NONE,
+			  MAP_PRIVATE | MAP_ANON, -1, 0);
+	ASSERT_NE(ptr_region, MAP_FAILED);
+
+	/* Place a 5 page mapping offset by one page into the region. */
+	ptr = mmap(&ptr_region[page_size], 5 * page_size,
+		   PROT_READ | PROT_WRITE,
+		   MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
+	ASSERT_NE(ptr, MAP_FAILED);
+	populate_range(ptr, 5 * page_size);
+	vec[0].iov_base = ptr;
+	vec[0].iov_len = 5 * page_size;
+	/* Free the PROT_NONE region before this region. */
+	ASSERT_EQ(munmap(ptr_region, page_size), 0);
+
+	/* Place a 10 page mapping in the middle of the region. */
+	ptr2 = mmap(&ptr_region[50 * page_size], 10 * page_size,
+		    PROT_READ | PROT_WRITE,
+		    MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
+	ASSERT_NE(ptr2, MAP_FAILED);
+	populate_range(ptr2, 10 * page_size);
+	vec[1].iov_base = ptr2;
+	vec[1].iov_len = 10 * page_size;
+	/* Free the PROT_NONE region before this region. */
+	ASSERT_EQ(munmap(&ptr_region[6 * page_size], 44 * page_size), 0);
+
+	/* Place a 3 page mapping at the end of the region, offset by 1. */
+	ptr3 = mmap(&ptr_region[96 * page_size], 3 * page_size,
+		    PROT_READ | PROT_WRITE,
+		    MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
+	ASSERT_NE(ptr3, MAP_FAILED);
+	populate_range(ptr3, 3 * page_size);
+	vec[2].iov_base = ptr3;
+	vec[2].iov_len = 3 * page_size;
+	/* Free the PROT_NONE region before this region. */
+	ASSERT_EQ(munmap(&ptr_region[60 * page_size], 36 * page_size), 0);
+	/* Free the PROT_NONE region after this region. */
+	ASSERT_EQ(munmap(&ptr_region[99 * page_size], page_size), 0);
+
+	/*
+	 * OK now we should have three distinct regions of memory. Zap
+	 * them with MADV_DONTNEED. This should clear the populated ranges and
+	 * we can then assert on them being zeroed.
+	 *
+	 * The function returns the number of bytes advised, so assert this is
+	 * equal to the total size of the three regions.
+	 */
+	ASSERT_EQ(process_madvise(0, vec, 3, MADV_DONTNEED, PR_MADV_SELF),
+		  (5 + 10 + 3) * page_size);
+
+	/* Make sure these ranges are now zeroed. */
+	ASSERT_TRUE(is_range_zeroed(ptr, 5 * page_size));
+	ASSERT_TRUE(is_range_zeroed(ptr2, 10 * page_size));
+	ASSERT_TRUE(is_range_zeroed(ptr2, 3 * page_size));
+
+	/* Cleanup. */
+	ASSERT_EQ(munmap(ptr, 5 * page_size), 0);
+	ASSERT_EQ(munmap(ptr2, 10 * page_size), 0);
+	ASSERT_EQ(munmap(ptr3, 3 * page_size), 0);
+}
+
+TEST_HARNESS_MAIN
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  2024-09-23 16:03 ` [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise() Lorenzo Stoakes
@ 2024-09-23 18:56   ` Shakeel Butt
  2024-09-23 19:34     ` Lorenzo Stoakes
  2024-09-23 21:20   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Shakeel Butt @ 2024-09-23 18:56 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, Vlastimil Babka, Liam R . Howlett,
	Suren Baghdasaryan, Arnd Bergmann, linux-api, linux-mm,
	linux-kernel, Minchan Kim

On Mon, Sep 23, 2024 at 05:03:56PM GMT, Lorenzo Stoakes wrote:
[...]
>  SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>  		size_t, vlen, int, behavior, unsigned int, flags)
>  {
> @@ -1486,10 +1509,9 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>  	struct iov_iter iter;
>  	struct task_struct *task;
>  	struct mm_struct *mm;
> -	size_t total_len;
>  	unsigned int f_flags;
>  
> -	if (flags != 0) {
> +	if (flags & ~PR_MADV_SELF) {
>  		ret = -EINVAL;
>  		goto out;
>  	}
> @@ -1498,13 +1520,26 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>  	if (ret < 0)
>  		goto out;
>  
> +	/*
> +	 * Perform an madvise operation on the current process. No restrictions
> +	 * need be applied, nor do we need to pin the task or mm_struct.
> +	 */
> +	if (flags & PR_MADV_SELF) {
> +		ret = vector_madvise(current->mm, &iter, behavior);
> +		goto free_iov;
> +	}
> +
>  	task = pidfd_get_task(pidfd, &f_flags);
>  	if (IS_ERR(task)) {
>  		ret = PTR_ERR(task);
>  		goto free_iov;
>  	}
>  
> -	if (!process_madvise_behavior_valid(behavior)) {
> +	/*
> +	 * We need only perform this check if we are attempting to manipulate a
> +	 * remote process's address space.
> +	 */
> +	if (mm != current->mm && !process_madvise_remote_valid(behavior)) {

Move the above check after mm is initialized i.e. mm = mm_access().

Shakeel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  2024-09-23 18:56   ` Shakeel Butt
@ 2024-09-23 19:34     ` Lorenzo Stoakes
  2024-09-23 21:49       ` Arnd Bergmann
  0 siblings, 1 reply; 11+ messages in thread
From: Lorenzo Stoakes @ 2024-09-23 19:34 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrew Morton, Vlastimil Babka, Liam R . Howlett,
	Suren Baghdasaryan, Arnd Bergmann, linux-api, linux-mm,
	linux-kernel, Minchan Kim

On Mon, Sep 23, 2024 at 11:56:06AM GMT, Shakeel Butt wrote:
> On Mon, Sep 23, 2024 at 05:03:56PM GMT, Lorenzo Stoakes wrote:
> [...]
> >  SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> >  		size_t, vlen, int, behavior, unsigned int, flags)
> >  {
> > @@ -1486,10 +1509,9 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> >  	struct iov_iter iter;
> >  	struct task_struct *task;
> >  	struct mm_struct *mm;
> > -	size_t total_len;
> >  	unsigned int f_flags;
> >
> > -	if (flags != 0) {
> > +	if (flags & ~PR_MADV_SELF) {
> >  		ret = -EINVAL;
> >  		goto out;
> >  	}
> > @@ -1498,13 +1520,26 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> >  	if (ret < 0)
> >  		goto out;
> >
> > +	/*
> > +	 * Perform an madvise operation on the current process. No restrictions
> > +	 * need be applied, nor do we need to pin the task or mm_struct.
> > +	 */
> > +	if (flags & PR_MADV_SELF) {
> > +		ret = vector_madvise(current->mm, &iter, behavior);
> > +		goto free_iov;
> > +	}
> > +
> >  	task = pidfd_get_task(pidfd, &f_flags);
> >  	if (IS_ERR(task)) {
> >  		ret = PTR_ERR(task);
> >  		goto free_iov;
> >  	}
> >
> > -	if (!process_madvise_behavior_valid(behavior)) {
> > +	/*
> > +	 * We need only perform this check if we are attempting to manipulate a
> > +	 * remote process's address space.
> > +	 */
> > +	if (mm != current->mm && !process_madvise_remote_valid(behavior)) {
>
> Move the above check after mm is initialized i.e. mm = mm_access().
>
> Shakeel

Ugh, sorry silly one there! Reflexively put that check in the original position.

Enclose a quick fix-patch for it, will fix on any respin also.

----8<----
From dc09e0edf1cf71a89cc4cfc3ec73fdae3c2ab86c Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date: Mon, 23 Sep 2024 20:33:07 +0100
Subject: [PATCH] mm/madvise: retrieve mm before checking

---
 mm/madvise.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 549b36d1463c..49d12f98b677 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1535,20 +1535,20 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 		goto free_iov;
 	}

+	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
+	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
+	if (IS_ERR_OR_NULL(mm)) {
+		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
+		goto release_task;
+	}
+
 	/*
 	 * We need only perform this check if we are attempting to manipulate a
 	 * remote process's address space.
 	 */
 	if (mm != current->mm && !process_madvise_remote_valid(behavior)) {
 		ret = -EINVAL;
-		goto release_task;
-	}
-
-	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
-	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
-	if (IS_ERR_OR_NULL(mm)) {
-		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
-		goto release_task;
+		goto release_mm;
 	}

 	/*
--
2.46.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  2024-09-23 16:03 ` [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise() Lorenzo Stoakes
  2024-09-23 18:56   ` Shakeel Butt
@ 2024-09-23 21:20   ` kernel test robot
  2024-09-23 21:30   ` kernel test robot
  2024-09-24  3:15   ` kernel test robot
  3 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2024-09-23 21:20 UTC (permalink / raw)
  To: Lorenzo Stoakes, Andrew Morton
  Cc: llvm, oe-kbuild-all, Linux Memory Management List,
	Vlastimil Babka, Liam R . Howlett, Suren Baghdasaryan,
	Arnd Bergmann, Shakeel Butt, linux-api, linux-kernel, Minchan Kim

Hi Lorenzo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on arnd-asm-generic/master soc/for-next linus/master v6.11 next-20240923]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-madvise-introduce-PR_MADV_SELF-flag-to-process_madvise/20240924-000845
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/077be0d59cb1047870a84c87c62e7b027af1c75d.1727106751.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
config: arm-aspeed_g4_defconfig (https://download.01.org/0day-ci/archive/20240924/202409240527.pAgR35QJ-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 8663a75fa2f31299ab8d1d90288d9df92aadee88)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240924/202409240527.pAgR35QJ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409240527.pAgR35QJ-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from mm/madvise.c:9:
   In file included from include/linux/mman.h:5:
   In file included from include/linux/mm.h:2198:
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   In file included from mm/madvise.c:21:
   include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      47 |         __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
         |                                    ~~~~~~~~~~~ ^ ~~~
   include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      49 |                                 NR_ZONE_LRU_BASE + lru, nr_pages);
         |                                 ~~~~~~~~~~~~~~~~ ^ ~~~
>> mm/madvise.c:1542:6: warning: variable 'mm' is uninitialized when used here [-Wuninitialized]
    1542 |         if (mm != current->mm && !process_madvise_remote_valid(behavior)) {
         |             ^~
   mm/madvise.c:1511:22: note: initialize the variable 'mm' to silence this warning
    1511 |         struct mm_struct *mm;
         |                             ^
         |                              = NULL
   4 warnings generated.


vim +/mm +1542 mm/madvise.c

  1502	
  1503	SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
  1504			size_t, vlen, int, behavior, unsigned int, flags)
  1505	{
  1506		ssize_t ret;
  1507		struct iovec iovstack[UIO_FASTIOV];
  1508		struct iovec *iov = iovstack;
  1509		struct iov_iter iter;
  1510		struct task_struct *task;
  1511		struct mm_struct *mm;
  1512		unsigned int f_flags;
  1513	
  1514		if (flags & ~PR_MADV_SELF) {
  1515			ret = -EINVAL;
  1516			goto out;
  1517		}
  1518	
  1519		ret = import_iovec(ITER_DEST, vec, vlen, ARRAY_SIZE(iovstack), &iov, &iter);
  1520		if (ret < 0)
  1521			goto out;
  1522	
  1523		/*
  1524		 * Perform an madvise operation on the current process. No restrictions
  1525		 * need be applied, nor do we need to pin the task or mm_struct.
  1526		 */
  1527		if (flags & PR_MADV_SELF) {
  1528			ret = vector_madvise(current->mm, &iter, behavior);
  1529			goto free_iov;
  1530		}
  1531	
  1532		task = pidfd_get_task(pidfd, &f_flags);
  1533		if (IS_ERR(task)) {
  1534			ret = PTR_ERR(task);
  1535			goto free_iov;
  1536		}
  1537	
  1538		/*
  1539		 * We need only perform this check if we are attempting to manipulate a
  1540		 * remote process's address space.
  1541		 */
> 1542		if (mm != current->mm && !process_madvise_remote_valid(behavior)) {

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  2024-09-23 16:03 ` [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise() Lorenzo Stoakes
  2024-09-23 18:56   ` Shakeel Butt
  2024-09-23 21:20   ` kernel test robot
@ 2024-09-23 21:30   ` kernel test robot
  2024-09-24  3:15   ` kernel test robot
  3 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2024-09-23 21:30 UTC (permalink / raw)
  To: Lorenzo Stoakes, Andrew Morton
  Cc: oe-kbuild-all, Linux Memory Management List, Vlastimil Babka,
	Liam R . Howlett, Suren Baghdasaryan, Arnd Bergmann, Shakeel Butt,
	linux-api, linux-kernel, Minchan Kim

Hi Lorenzo,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on arnd-asm-generic/master soc/for-next linus/master v6.11 next-20240923]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-madvise-introduce-PR_MADV_SELF-flag-to-process_madvise/20240924-000845
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/077be0d59cb1047870a84c87c62e7b027af1c75d.1727106751.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
config: parisc-allnoconfig (https://download.01.org/0day-ci/archive/20240924/202409240556.LgM8vOIF-lkp@intel.com/config)
compiler: hppa-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240924/202409240556.LgM8vOIF-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409240556.LgM8vOIF-lkp@intel.com/

All errors (new ones prefixed by >>):

   mm/madvise.c: In function '__do_sys_process_madvise':
>> mm/madvise.c:1514:22: error: 'PR_MADV_SELF' undeclared (first use in this function)
    1514 |         if (flags & ~PR_MADV_SELF) {
         |                      ^~~~~~~~~~~~
   mm/madvise.c:1514:22: note: each undeclared identifier is reported only once for each function it appears in


vim +/PR_MADV_SELF +1514 mm/madvise.c

  1502	
  1503	SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
  1504			size_t, vlen, int, behavior, unsigned int, flags)
  1505	{
  1506		ssize_t ret;
  1507		struct iovec iovstack[UIO_FASTIOV];
  1508		struct iovec *iov = iovstack;
  1509		struct iov_iter iter;
  1510		struct task_struct *task;
  1511		struct mm_struct *mm;
  1512		unsigned int f_flags;
  1513	
> 1514		if (flags & ~PR_MADV_SELF) {

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  2024-09-23 19:34     ` Lorenzo Stoakes
@ 2024-09-23 21:49       ` Arnd Bergmann
  2024-09-24  7:49         ` Lorenzo Stoakes
  0 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2024-09-23 21:49 UTC (permalink / raw)
  To: Lorenzo Stoakes, Shakeel Butt
  Cc: Andrew Morton, Vlastimil Babka, Liam R. Howlett,
	Suren Baghdasaryan, linux-api, linux-mm, linux-kernel,
	Minchan Kim

On Mon, Sep 23, 2024, at 19:34, Lorenzo Stoakes wrote:
> On Mon, Sep 23, 2024 at 11:56:06AM GMT, Shakeel Butt wrote:
>
> +	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
> +	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
> +	if (IS_ERR_OR_NULL(mm)) {
> +		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
> +		goto release_task;
> +	}

Any chance we can fix mm_access() to not be able to return
a NULL pointer and an error pointer?  IS_ERR_OR_NULL() is
usually an indication of a confusing API, and this is
clearly one of them, given that only one of the
callers actually wants the NULL value instead of -ESRCH.

    Arnd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  2024-09-23 16:03 ` [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise() Lorenzo Stoakes
                     ` (2 preceding siblings ...)
  2024-09-23 21:30   ` kernel test robot
@ 2024-09-24  3:15   ` kernel test robot
  2024-09-24  8:47     ` Lorenzo Stoakes
  3 siblings, 1 reply; 11+ messages in thread
From: kernel test robot @ 2024-09-24  3:15 UTC (permalink / raw)
  To: Lorenzo Stoakes, Andrew Morton
  Cc: llvm, oe-kbuild-all, Linux Memory Management List,
	Vlastimil Babka, Liam R . Howlett, Suren Baghdasaryan,
	Arnd Bergmann, Shakeel Butt, linux-api, linux-kernel, Minchan Kim

Hi Lorenzo,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on soc/for-next linus/master v6.11 next-20240923]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-madvise-introduce-PR_MADV_SELF-flag-to-process_madvise/20240924-000845
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/077be0d59cb1047870a84c87c62e7b027af1c75d.1727106751.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
config: mips-ip32_defconfig (https://download.01.org/0day-ci/archive/20240924/202409241034.6ilzMh4w-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 8663a75fa2f31299ab8d1d90288d9df92aadee88)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240924/202409241034.6ilzMh4w-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409241034.6ilzMh4w-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from mm/madvise.c:9:
   In file included from include/linux/mman.h:5:
   In file included from include/linux/mm.h:2198:
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   In file included from mm/madvise.c:21:
   include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      47 |         __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
         |                                    ~~~~~~~~~~~ ^ ~~~
   include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      49 |                                 NR_ZONE_LRU_BASE + lru, nr_pages);
         |                                 ~~~~~~~~~~~~~~~~ ^ ~~~
>> mm/madvise.c:1514:15: error: use of undeclared identifier 'PR_MADV_SELF'
    1514 |         if (flags & ~PR_MADV_SELF) {
         |                      ^
   mm/madvise.c:1527:14: error: use of undeclared identifier 'PR_MADV_SELF'
    1527 |         if (flags & PR_MADV_SELF) {
         |                     ^
   3 warnings and 2 errors generated.


vim +/PR_MADV_SELF +1514 mm/madvise.c

  1502	
  1503	SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
  1504			size_t, vlen, int, behavior, unsigned int, flags)
  1505	{
  1506		ssize_t ret;
  1507		struct iovec iovstack[UIO_FASTIOV];
  1508		struct iovec *iov = iovstack;
  1509		struct iov_iter iter;
  1510		struct task_struct *task;
  1511		struct mm_struct *mm;
  1512		unsigned int f_flags;
  1513	
> 1514		if (flags & ~PR_MADV_SELF) {

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  2024-09-23 21:49       ` Arnd Bergmann
@ 2024-09-24  7:49         ` Lorenzo Stoakes
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2024-09-24  7:49 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Shakeel Butt, Andrew Morton, Vlastimil Babka, Liam R. Howlett,
	Suren Baghdasaryan, linux-api, linux-mm, linux-kernel,
	Minchan Kim

On Mon, Sep 23, 2024 at 09:49:43PM GMT, Arnd Bergmann wrote:
> On Mon, Sep 23, 2024, at 19:34, Lorenzo Stoakes wrote:
> > On Mon, Sep 23, 2024 at 11:56:06AM GMT, Shakeel Butt wrote:
> >
> > +	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
> > +	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
> > +	if (IS_ERR_OR_NULL(mm)) {
> > +		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
> > +		goto release_task;
> > +	}
>
> Any chance we can fix mm_access() to not be able to return
> a NULL pointer and an error pointer?  IS_ERR_OR_NULL() is
> usually an indication of a confusing API, and this is
> clearly one of them, given that only one of the
> callers actually wants the NULL value instead of -ESRCH.
>
>     Arnd

Agreed, this should be fixed. I think it'd be a bit out of the scope of
this series so will send something separately for this.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
  2024-09-24  3:15   ` kernel test robot
@ 2024-09-24  8:47     ` Lorenzo Stoakes
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2024-09-24  8:47 UTC (permalink / raw)
  To: kernel test robot
  Cc: Andrew Morton, llvm, oe-kbuild-all, Linux Memory Management List,
	Vlastimil Babka, Liam R . Howlett, Suren Baghdasaryan,
	Arnd Bergmann, Shakeel Butt, linux-api, linux-kernel, Minchan Kim

On Tue, Sep 24, 2024 at 11:15:17AM GMT, kernel test robot wrote:
> Hi Lorenzo,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on akpm-mm/mm-everything]
> [also build test ERROR on soc/for-next linus/master v6.11 next-20240923]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-madvise-introduce-PR_MADV_SELF-flag-to-process_madvise/20240924-000845
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link:    https://lore.kernel.org/r/077be0d59cb1047870a84c87c62e7b027af1c75d.1727106751.git.lorenzo.stoakes%40oracle.com
> patch subject: [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise()
> config: mips-ip32_defconfig (https://download.01.org/0day-ci/archive/20240924/202409241034.6ilzMh4w-lkp@intel.com/config)
> compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 8663a75fa2f31299ab8d1d90288d9df92aadee88)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240924/202409241034.6ilzMh4w-lkp@intel.com/reproduce)



>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202409241034.6ilzMh4w-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
>    In file included from mm/madvise.c:9:
>    In file included from include/linux/mman.h:5:
>    In file included from include/linux/mm.h:2198:
>    include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
>      518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
>          |                               ~~~~~~~~~~~ ^ ~~~
>    In file included from mm/madvise.c:21:
>    include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
>       47 |         __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
>          |                                    ~~~~~~~~~~~ ^ ~~~
>    include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
>       49 |                                 NR_ZONE_LRU_BASE + lru, nr_pages);
>          |                                 ~~~~~~~~~~~~~~~~ ^ ~~~
> >> mm/madvise.c:1514:15: error: use of undeclared identifier 'PR_MADV_SELF'
>     1514 |         if (flags & ~PR_MADV_SELF) {
>          |                      ^
>    mm/madvise.c:1527:14: error: use of undeclared identifier 'PR_MADV_SELF'
>     1527 |         if (flags & PR_MADV_SELF) {
>          |                     ^
>    3 warnings and 2 errors generated.

OK looks like mman-common.h is insufficient for some arches, will fix up and send out a v2.

>
>
> vim +/PR_MADV_SELF +1514 mm/madvise.c
>
>   1502
>   1503	SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>   1504			size_t, vlen, int, behavior, unsigned int, flags)
>   1505	{
>   1506		ssize_t ret;
>   1507		struct iovec iovstack[UIO_FASTIOV];
>   1508		struct iovec *iov = iovstack;
>   1509		struct iov_iter iter;
>   1510		struct task_struct *task;
>   1511		struct mm_struct *mm;
>   1512		unsigned int f_flags;
>   1513
> > 1514		if (flags & ~PR_MADV_SELF) {
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-09-24  8:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-23 16:03 [PATCH 0/2] unrestrict process_madvise() for current process Lorenzo Stoakes
2024-09-23 16:03 ` [PATCH 1/2] mm/madvise: introduce PR_MADV_SELF flag to process_madvise() Lorenzo Stoakes
2024-09-23 18:56   ` Shakeel Butt
2024-09-23 19:34     ` Lorenzo Stoakes
2024-09-23 21:49       ` Arnd Bergmann
2024-09-24  7:49         ` Lorenzo Stoakes
2024-09-23 21:20   ` kernel test robot
2024-09-23 21:30   ` kernel test robot
2024-09-24  3:15   ` kernel test robot
2024-09-24  8:47     ` Lorenzo Stoakes
2024-09-23 16:03 ` [PATCH 2/2] selftests/mm: add test for process_madvise PR_MADV_SELF flag use Lorenzo Stoakes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).