From: Dave Hansen <dave@sr71.net>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, Dave Hansen <dave@sr71.net>,
	dave.hansen@linux.intel.com, linux-arch@vger.kernel.org
Subject: [PATCH 18/34] mm: add gup flag to indicate "foreign" mm access
Date: Thu, 03 Dec 2015 17:14:49 -0800
Message-ID: <20151204011449.768C774C@viggo.jf.intel.com>
In-Reply-To: <20151204011424.8A36E365@viggo.jf.intel.com>


From: Dave Hansen <dave.hansen@linux.intel.com>

We try to enforce protection keys in software the same way that we
do in hardware.  (See long example below).

But, we only want to do this when accessing our *own* process's
memory.  If GDB set PKRU[6].AD=1 (disable access to PKEY 6), then
tried to PTRACE_POKE a target process which just happened to have
some mprotect_pkey(pkey=6) memory, we do *not* want to deny the
debugger access to that memory.  PKRU is fundamentally a
thread-local structure and we do not want to enforce it on access
to _another_ thread's data.
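
As a rough userspace illustration of the GDB case above (not kernel code),
the sketch below models PKRU as a 32-bit value with two bits per key,
access-disable in the low bit and write-disable in the high bit of each
pair, which is the x86 layout as far as I know.  The function and macro
names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKRU_AD_BIT		0x1u	/* access-disable bit of a key's field */
#define PKRU_WD_BIT		0x2u	/* write-disable bit of a key's field */
#define PKRU_BITS_PER_PKEY	2
#define NR_PKEYS		16

/* Hypothetical model: does this thread's PKRU value allow access to 'pkey'? */
static bool model_pkru_allows_pkey(uint32_t pkru, int pkey, bool write)
{
	assert(pkey >= 0 && pkey < NR_PKEYS);

	uint32_t bits = (pkru >> (pkey * PKRU_BITS_PER_PKEY)) & 0x3u;

	if (bits & PKRU_AD_BIT)
		return false;		/* all data access disabled */
	if (write && (bits & PKRU_WD_BIT))
		return false;		/* writes disabled */
	return true;
}

/*
 * Hypothetical model of the decision described above: an access to
 * somebody else's memory is never judged by the accessor's own PKRU.
 */
static bool model_access_permitted(uint32_t accessor_pkru, int vma_pkey,
				   bool write, bool foreign)
{
	if (foreign)
		return true;
	return model_pkru_allows_pkey(accessor_pkru, vma_pkey, write);
}
```

With accessor_pkru = PKRU_AD_BIT << 12 (that is, PKRU[6].AD=1), a
non-foreign write to pkey-6 memory is refused while the same write
marked foreign is allowed, which is the behaviour we want for the
debugger.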

This gets especially tricky when we have workqueues or other
delayed-work mechanisms that might run in a random process's context.
We can check that we only enforce pkeys when operating on our *own* mm,
but delayed work gets performed when a random user context is active.
We might end up with a situation where a delayed-work gup fails when
running randomly under its "own" task but succeeds when running under
another process.  We want to avoid that.
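
The inconsistency can be shown with a small userspace model (again,
not kernel code; every name here is invented for illustration).  It
assumes the only test available is "is the target mm the same as the
current task's mm?" and contrasts that with a caller that states
outright that the access is foreign.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_mm {
	int id;
};

struct model_task {
	struct model_mm *mm;
	bool pkey_denied;	/* this thread's PKRU denies the target pkey */
};

/* Example fixtures: the mm that owns the memory and an unrelated one. */
static struct model_mm owner_mm = { 1 };
static struct model_mm other_mm = { 2 };
static const struct model_task owner_task = { &owner_mm, true };
static const struct model_task other_task = { &other_mm, false };

/* Heuristic: enforce pkeys only if the target mm is the current one. */
static bool heuristic_access_ok(const struct model_task *cur,
				const struct model_mm *target)
{
	assert(cur != NULL && target != NULL);

	if (cur->mm != target)
		return true;	/* looks foreign, so no enforcement */
	return !cur->pkey_denied;
}

/* Explicit flag: the caller says the access is foreign. */
static bool flagged_access_ok(const struct model_task *cur,
			      const struct model_mm *target, bool foreign)
{
	if (foreign)
		return true;
	return heuristic_access_ok(cur, target);
}
```

Here the same delayed access to owner_mm fails when it happens to run
under owner_task and succeeds under other_task, while the flagged
variant gives the same answer in both contexts.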

To avoid that, we add a GUP flag: FOLL_FOREIGN and a fault flag:
FAULT_FLAG_FOREIGN.  They indicate that we are walking an mm
which is not guaranteed to be the same as current->mm and should
not be subject to protection key enforcement.
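
A minimal userspace sketch of how the two flags relate, assuming the
FOREIGN values defined by this patch and the usual 0x01 for the two
WRITE flags.  The helper names are hypothetical; the real translation
happens in faultin_page().

```c
#include <assert.h>
#include <stdbool.h>

#define FOLL_WRITE		0x01u
#define FOLL_FOREIGN		0x2000u	/* value taken from this patch */
#define FAULT_FLAG_WRITE	0x01u
#define FAULT_FLAG_FOREIGN	0x80u	/* value taken from this patch */

/* Hypothetical helper: carry the gup flags over into fault flags. */
static unsigned int model_gup_to_fault_flags(unsigned int gup_flags)
{
	unsigned int fault_flags = 0;

	/* This model only understands the two flags shown here. */
	assert((gup_flags & ~(FOLL_WRITE | FOLL_FOREIGN)) == 0);

	if (gup_flags & FOLL_WRITE)
		fault_flags |= FAULT_FLAG_WRITE;
	if (gup_flags & FOLL_FOREIGN)
		fault_flags |= FAULT_FLAG_FOREIGN;
	return fault_flags;
}

/* Hypothetical helper: should pkey enforcement be skipped for this fault? */
static bool model_fault_is_foreign(unsigned int fault_flags)
{
	return !!(fault_flags & FAULT_FLAG_FOREIGN);
}
```

The point of having both flags is that a caller which starts in gup
with FOLL_FOREIGN keeps its "foreign" status when gup has to fall
back to the fault path.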

Thanks to Jerome Glisse for pointing out this scenario.

*** Why do we enforce protection keys in software?? ***

Imagine that we disabled access to the memory pointed to by 'buf'.
Then, we implemented sys_read() like this:

	sys_read(fd, buf, len...)
	{
		struct page *page = follow_page(buf);
		void *buf_mapped = kmap(page);
		memcpy(buf_mapped, fd_data, len);
		...
	}

This writes to 'buf' via a *kernel* mapping, without a protection
key.  While this implementation does the same thing:

	sys_read(fd, buf, len...)
	{
		copy_to_user(buf, fd_data, len);
		...
	}

but would hit a protection key fault because the userspace 'buf'
mapping has a protection key set.

To provide consistency, and to make key-protected memory work
as much like mprotect()ed memory as possible, we try to enforce
the same protections as the hardware would when the *kernel* walks
the page tables (and other mm structures).
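
The consistency argument can be modelled in userspace as follows.
This is only a sketch: the struct, the function names and the -1
return value (standing in for a protection-key fault) are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct model_buf {
	char data[16];
	bool pkey_denied;	/* current thread's PKRU denies this buffer */
};

/* Example fixture: a buffer whose protection key is currently denied. */
static struct model_buf denied_buf = { .pkey_denied = true };

/* Path 1: copy through the user mapping; "hardware" checks the key. */
static int read_via_user_mapping(struct model_buf *buf, const char *src,
				 size_t len)
{
	assert(len <= sizeof(buf->data));

	if (buf->pkey_denied)
		return -1;	/* stands in for a protection-key fault */
	memcpy(buf->data, src, len);
	return 0;
}

/* Path 2: copy through a kernel alias; only software can check the key. */
static int read_via_kernel_alias(struct model_buf *buf, const char *src,
				 size_t len, bool software_check)
{
	assert(len <= sizeof(buf->data));

	if (software_check && buf->pkey_denied)
		return -1;
	memcpy(buf->data, src, len);
	return 0;
}
```

Without the software check the two paths disagree on denied_buf; with
it, both refuse the write.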

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-arch@vger.kernel.org
---

 b/arch/powerpc/include/asm/mmu_context.h   |    3 ++-
 b/arch/s390/include/asm/mmu_context.h      |    3 ++-
 b/arch/unicore32/include/asm/mmu_context.h |    3 ++-
 b/arch/x86/include/asm/mmu_context.h       |    5 +++--
 b/drivers/iommu/amd_iommu_v2.c             |    8 +++++---
 b/include/asm-generic/mm_hooks.h           |    3 ++-
 b/include/linux/mm.h                       |    2 ++
 b/mm/gup.c                                 |   15 ++++++++++-----
 b/mm/ksm.c                                 |   10 ++++++++--
 b/mm/memory.c                              |    3 ++-
 10 files changed, 38 insertions(+), 17 deletions(-)

diff -puN arch/powerpc/include/asm/mmu_context.h~pkeys-12-gup-fault-foreign-flag arch/powerpc/include/asm/mmu_context.h
--- a/arch/powerpc/include/asm/mmu_context.h~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.223698386 -0800
+++ b/arch/powerpc/include/asm/mmu_context.h	2015-12-03 16:21:26.241699202 -0800
@@ -148,7 +148,8 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
-static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+		bool write, bool foreign)
 {
 	/* by default, allow everything */
 	return true;
diff -puN arch/s390/include/asm/mmu_context.h~pkeys-12-gup-fault-foreign-flag arch/s390/include/asm/mmu_context.h
--- a/arch/s390/include/asm/mmu_context.h~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.224698431 -0800
+++ b/arch/s390/include/asm/mmu_context.h	2015-12-03 16:21:26.242699248 -0800
@@ -130,7 +130,8 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
-static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+		bool write, bool foreign)
 {
 	/* by default, allow everything */
 	return true;
diff -puN arch/unicore32/include/asm/mmu_context.h~pkeys-12-gup-fault-foreign-flag arch/unicore32/include/asm/mmu_context.h
--- a/arch/unicore32/include/asm/mmu_context.h~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.226698522 -0800
+++ b/arch/unicore32/include/asm/mmu_context.h	2015-12-03 16:21:26.242699248 -0800
@@ -97,7 +97,8 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
-static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+		bool write, bool foreign)
 {
 	/* by default, allow everything */
 	return true;
diff -puN arch/x86/include/asm/mmu_context.h~pkeys-12-gup-fault-foreign-flag arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.228698613 -0800
+++ b/arch/x86/include/asm/mmu_context.h	2015-12-03 16:21:26.242699248 -0800
@@ -299,10 +299,11 @@ static inline bool vma_is_foreign(struct
 	return false;
 }
 
-static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+		bool write, bool foreign)
 {
 	/* allow access if the VMA is not one from this process */
-	if (vma_is_foreign(vma))
+	if (foreign || vma_is_foreign(vma))
 		return true;
 	return __pkru_allows_pkey(vma_pkey(vma), write);
 }
diff -puN drivers/iommu/amd_iommu_v2.c~pkeys-12-gup-fault-foreign-flag drivers/iommu/amd_iommu_v2.c
--- a/drivers/iommu/amd_iommu_v2.c~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.229698658 -0800
+++ b/drivers/iommu/amd_iommu_v2.c	2015-12-03 16:21:26.243699293 -0800
@@ -500,9 +500,11 @@ static void do_fault(struct work_struct
 	struct mm_struct *mm;
 	struct vm_area_struct *vma;
 	u64 address;
-	int ret, write;
+	int ret, flags = 0;
 
-	write = !!(fault->flags & PPR_FAULT_WRITE);
+	if (fault->flags & PPR_FAULT_WRITE)
+		flags = FAULT_FLAG_WRITE;
+	flags |= FAULT_FLAG_FOREIGN;
 
 	mm = fault->state->mm;
 	address = fault->address;
@@ -523,7 +525,7 @@ static void do_fault(struct work_struct
 		goto out;
 	}
 
-	ret = handle_mm_fault(mm, vma, address, write);
+	ret = handle_mm_fault(mm, vma, address, flags);
 	if (ret & VM_FAULT_ERROR) {
 		/* failed to service fault */
 		up_read(&mm->mmap_sem);
diff -puN include/asm-generic/mm_hooks.h~pkeys-12-gup-fault-foreign-flag include/asm-generic/mm_hooks.h
--- a/include/asm-generic/mm_hooks.h~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.231698749 -0800
+++ b/include/asm-generic/mm_hooks.h	2015-12-03 16:21:26.243699293 -0800
@@ -26,7 +26,8 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
-static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+		bool write, bool foreign)
 {
 	/* by default, allow everything */
 	return true;
diff -puN include/linux/mm.h~pkeys-12-gup-fault-foreign-flag include/linux/mm.h
--- a/include/linux/mm.h~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.233698839 -0800
+++ b/include/linux/mm.h	2015-12-03 16:21:26.244699338 -0800
@@ -232,6 +232,7 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_KILLABLE	0x10	/* The fault task is in SIGKILL killable region */
 #define FAULT_FLAG_TRIED	0x20	/* Second try */
 #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
+#define FAULT_FLAG_FOREIGN	0x80	/* faulting for non current tsk/mm */
 
 /*
  * vm_fault is filled by the the pagefault handler and passed to the vma's
@@ -2138,6 +2139,7 @@ static inline struct page *follow_page(s
 #define FOLL_MIGRATION	0x400	/* wait for page to replace migration entry */
 #define FOLL_TRIED	0x800	/* a retry, previous pass started an IO */
 #define FOLL_MLOCK	0x1000	/* lock present pages */
+#define FOLL_FOREIGN	0x2000	/* we are working on non-current tsk/mm */
 
 typedef int (*pte_fn_t)(pte_t *pte, pgtable_t token, unsigned long addr,
 			void *data);
diff -puN mm/gup.c~pkeys-12-gup-fault-foreign-flag mm/gup.c
--- a/mm/gup.c~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.234698885 -0800
+++ b/mm/gup.c	2015-12-03 16:21:26.245699384 -0800
@@ -310,6 +310,8 @@ static int faultin_page(struct task_stru
 		return -ENOENT;
 	if (*flags & FOLL_WRITE)
 		fault_flags |= FAULT_FLAG_WRITE;
+	if (*flags & FOLL_FOREIGN)
+		fault_flags |= FAULT_FLAG_FOREIGN;
 	if (nonblocking)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 	if (*flags & FOLL_NOWAIT)
@@ -360,11 +362,13 @@ static int faultin_page(struct task_stru
 static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 {
 	vm_flags_t vm_flags = vma->vm_flags;
+	int write = (gup_flags & FOLL_WRITE);
+	int foreign = (gup_flags & FOLL_FOREIGN);
 
 	if (vm_flags & (VM_IO | VM_PFNMAP))
 		return -EFAULT;
 
-	if (gup_flags & FOLL_WRITE) {
+	if (write) {
 		if (!(vm_flags & VM_WRITE)) {
 			if (!(gup_flags & FOLL_FORCE))
 				return -EFAULT;
@@ -392,7 +396,7 @@ static int check_vma_flags(struct vm_are
 		if (!(vm_flags & VM_MAYREAD))
 			return -EFAULT;
 	}
-	if (!arch_vma_access_permitted(vma, (gup_flags & FOLL_WRITE)))
+	if (!arch_vma_access_permitted(vma, write, foreign))
 		return -EFAULT;
 	return 0;
 }
@@ -562,7 +566,8 @@ EXPORT_SYMBOL(__get_user_pages);
 
 bool vma_permits_fault(struct vm_area_struct *vma, unsigned int fault_flags)
 {
-	bool write = !!(fault_flags & FAULT_FLAG_WRITE);
+	bool write   = !!(fault_flags & FAULT_FLAG_WRITE);
+	bool foreign = !!(fault_flags & FAULT_FLAG_FOREIGN);
 	vm_flags_t vm_flags = write ? VM_WRITE : VM_READ;
 
 	if (!(vm_flags & vma->vm_flags))
@@ -570,9 +575,9 @@ bool vma_permits_fault(struct vm_area_st
 
 	/*
 	 * The architecture might have a hardware protection
-	 * mechanism other than read/write that can deny access
+	 * mechanism other than read/write that can deny access.
 	 */
-	if (!arch_vma_access_permitted(vma, write))
+	if (!arch_vma_access_permitted(vma, write, foreign))
 		return false;
 
 	return true;
diff -puN mm/ksm.c~pkeys-12-gup-fault-foreign-flag mm/ksm.c
--- a/mm/ksm.c~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.236698975 -0800
+++ b/mm/ksm.c	2015-12-03 16:21:26.246699429 -0800
@@ -359,6 +359,10 @@ static inline bool ksm_test_exit(struct
  * in case the application has unmapped and remapped mm,addr meanwhile.
  * Could a ksm page appear anywhere else?  Actually yes, in a VM_PFNMAP
  * mmap of /dev/mem or /dev/kmem, where we would not want to touch it.
+ *
+ * FAULT_FLAG/FOLL_FOREIGN are because we do this outside the context
+ * of the process that owns 'vma'.  We also do not want to enforce
+ * protection keys here anyway.
  */
 static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
 {
@@ -367,12 +371,14 @@ static int break_ksm(struct vm_area_stru
 
 	do {
 		cond_resched();
-		page = follow_page(vma, addr, FOLL_GET | FOLL_MIGRATION);
+		page = follow_page(vma, addr,
+				FOLL_GET | FOLL_MIGRATION | FOLL_FOREIGN);
 		if (IS_ERR_OR_NULL(page))
 			break;
 		if (PageKsm(page))
 			ret = handle_mm_fault(vma->vm_mm, vma, addr,
-							FAULT_FLAG_WRITE);
+							FAULT_FLAG_WRITE |
+							FAULT_FLAG_FOREIGN);
 		else
 			ret = VM_FAULT_WRITE;
 		put_page(page);
diff -puN mm/memory.c~pkeys-12-gup-fault-foreign-flag mm/memory.c
--- a/mm/memory.c~pkeys-12-gup-fault-foreign-flag	2015-12-03 16:21:26.238699066 -0800
+++ b/mm/memory.c	2015-12-03 16:21:26.247699474 -0800
@@ -3345,7 +3345,8 @@ static int __handle_mm_fault(struct mm_s
 	pmd_t *pmd;
 	pte_t *pte;
 
-	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE))
+	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
+					    flags & FAULT_FLAG_FOREIGN))
 		return VM_FAULT_SIGSEGV;
 
 	if (unlikely(is_vm_hugetlb_page(vma)))
_
