All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Hansen <dave@sr71.net>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, Dave Hansen <dave@sr71.net>,
	dave.hansen@linux.intel.com
Subject: [PATCH 17/34] x86, pkeys: check VMAs and PTEs for protection keys
Date: Thu, 03 Dec 2015 17:14:48 -0800	[thread overview]
Message-ID: <20151204011448.23DC574D@viggo.jf.intel.com> (raw)
In-Reply-To: <20151204011424.8A36E365@viggo.jf.intel.com>


From: Dave Hansen <dave.hansen@linux.intel.com>

Today, for normal faults and page table walks, we check the VMA
and/or PTE to ensure that it is compatible with the action.  For
instance, if we get a write fault on a non-writeable VMA, we
SIGSEGV.

We try to do the same thing for protection keys.  Basically, we
try to make sure that if a user does this:

	mprotect(ptr, size, PROT_NONE);
	*ptr = foo;

they see the same effects with protection keys when they do this:

	mprotect(ptr, size, PROT_READ|PROT_WRITE);
	set_pkey(ptr, size, 4);
	wrpkru(0xffffff3f); // access disable pkey 4
	*ptr = foo;

The state to do that checking is in the VMA, but we also
sometimes have to do it on the page tables only, like when doing
a get_user_pages_fast() where we have no VMA.

We add two functions and expose them to generic code:

	arch_pte_access_permitted(pte_flags, write)
	arch_vma_access_permitted(vma, write)

These are, of course, backed up in x86 arch code with checks
against the PTE or VMA's protection key.

But, there are also cases where we do not want to respect
protection keys.  When we ptrace(), for instance, we do not want
to apply the tracer's PKRU permissions to the PTEs from the
process being traced.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/powerpc/include/asm/mmu_context.h   |   11 ++++++
 b/arch/s390/include/asm/mmu_context.h      |   11 ++++++
 b/arch/unicore32/include/asm/mmu_context.h |   11 ++++++
 b/arch/x86/include/asm/mmu_context.h       |   49 +++++++++++++++++++++++++++++
 b/arch/x86/include/asm/pgtable.h           |   29 +++++++++++++++++
 b/arch/x86/mm/fault.c                      |   21 +++++++++++-
 b/arch/x86/mm/gup.c                        |    4 ++
 b/include/asm-generic/mm_hooks.h           |   11 ++++++
 b/mm/gup.c                                 |   18 ++++++++--
 b/mm/memory.c                              |    4 ++
 10 files changed, 165 insertions(+), 4 deletions(-)

diff -puN arch/powerpc/include/asm/mmu_context.h~pkeys-11-pte-fault arch/powerpc/include/asm/mmu_context.h
--- a/arch/powerpc/include/asm/mmu_context.h~pkeys-11-pte-fault	2015-12-03 16:21:25.567668634 -0800
+++ b/arch/powerpc/include/asm/mmu_context.h	2015-12-03 16:21:25.585669451 -0800
@@ -148,5 +148,16 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
diff -puN arch/s390/include/asm/mmu_context.h~pkeys-11-pte-fault arch/s390/include/asm/mmu_context.h
--- a/arch/s390/include/asm/mmu_context.h~pkeys-11-pte-fault	2015-12-03 16:21:25.569668725 -0800
+++ b/arch/s390/include/asm/mmu_context.h	2015-12-03 16:21:25.586669496 -0800
@@ -130,4 +130,15 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
 #endif /* __S390_MMU_CONTEXT_H */
diff -puN arch/unicore32/include/asm/mmu_context.h~pkeys-11-pte-fault arch/unicore32/include/asm/mmu_context.h
--- a/arch/unicore32/include/asm/mmu_context.h~pkeys-11-pte-fault	2015-12-03 16:21:25.570668770 -0800
+++ b/arch/unicore32/include/asm/mmu_context.h	2015-12-03 16:21:25.586669496 -0800
@@ -97,4 +97,15 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
 #endif
diff -puN arch/x86/include/asm/mmu_context.h~pkeys-11-pte-fault arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~pkeys-11-pte-fault	2015-12-03 16:21:25.572668861 -0800
+++ b/arch/x86/include/asm/mmu_context.h	2015-12-03 16:21:25.586669496 -0800
@@ -263,4 +263,53 @@ static inline int vma_pkey(struct vm_are
 	return pkey;
 }
 
+static inline bool __pkru_allows_pkey(u16 pkey, bool write)
+{
+	u32 pkru = read_pkru();
+
+	if (!__pkru_allows_read(pkru, pkey))
+		return false;
+	if (write && !__pkru_allows_write(pkru, pkey))
+		return false;
+
+	return true;
+}
+
+/*
+ * We only want to enforce protection keys on the current process
+ * because we effectively have no access to PKRU for other
+ * processes or any way to tell *which * PKRU in a threaded
+ * process we could use.
+ *
+ * So do not enforce things if the VMA is not from the current
+ * mm, or if we are in a kernel thread.
+ */
+static inline bool vma_is_foreign(struct vm_area_struct *vma)
+{
+	if (!current->mm)
+		return true;
+	/*
+	 * Should PKRU be enforced on the access to this VMA?  If
+	 * the VMA is from another process, then PKRU has no
+	 * relevance and should not be enforced.
+	 */
+	if (current->mm != vma->vm_mm)
+		return true;
+
+	return false;
+}
+
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* allow access if the VMA is not one from this process */
+	if (vma_is_foreign(vma))
+		return true;
+	return __pkru_allows_pkey(vma_pkey(vma), write);
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	return __pkru_allows_pkey(pte_flags_pkey(pte_flags(pte)), write);
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff -puN arch/x86/include/asm/pgtable.h~pkeys-11-pte-fault arch/x86/include/asm/pgtable.h
--- a/arch/x86/include/asm/pgtable.h~pkeys-11-pte-fault	2015-12-03 16:21:25.574668952 -0800
+++ b/arch/x86/include/asm/pgtable.h	2015-12-03 16:21:25.587669541 -0800
@@ -910,6 +910,35 @@ static inline pte_t pte_swp_clear_soft_d
 }
 #endif
 
+#define PKRU_AD_BIT 0x1
+#define PKRU_WD_BIT 0x2
+
+static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * 2;
+	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
+}
+
+static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * 2;
+	/*
+	 * Access-disable disables writes too so we need to check
+	 * both bits here.
+	 */
+	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
+}
+
+static inline u16 pte_flags_pkey(unsigned long pte_flags)
+{
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+	/* ifdef to avoid doing 59-bit shift on 32-bit values */
+	return (pte_flags & _PAGE_PKEY_MASK) >> _PAGE_BIT_PKEY_BIT0;
+#else
+	return 0;
+#endif
+}
+
 #include <asm-generic/pgtable.h>
 #endif	/* __ASSEMBLY__ */
 
diff -puN arch/x86/mm/fault.c~pkeys-11-pte-fault arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-11-pte-fault	2015-12-03 16:21:25.575668997 -0800
+++ b/arch/x86/mm/fault.c	2015-12-03 16:21:25.587669541 -0800
@@ -897,6 +897,16 @@ bad_area(struct pt_regs *regs, unsigned
 	__bad_area(regs, error_code, address, NULL, SEGV_MAPERR);
 }
 
+static inline bool bad_area_access_from_pkeys(unsigned long error_code,
+		struct vm_area_struct *vma)
+{
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return false;
+	if (error_code & PF_PK)
+		return true;
+	return false;
+}
+
 static noinline void
 bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 		      unsigned long address, struct vm_area_struct *vma)
@@ -906,7 +916,7 @@ bad_area_access_error(struct pt_regs *re
 	 * But, doing it this way allows compiler optimizations
 	 * if pkeys are compiled out.
 	 */
-	if (boot_cpu_has(X86_FEATURE_OSPKE) && (error_code & PF_PK))
+	if (bad_area_access_from_pkeys(error_code, vma))
 		__bad_area(regs, error_code, address, vma, SEGV_PKUERR);
 	else
 		__bad_area(regs, error_code, address, vma, SEGV_ACCERR);
@@ -1081,6 +1091,15 @@ int show_unhandled_signals = 1;
 static inline int
 access_error(unsigned long error_code, struct vm_area_struct *vma)
 {
+	/*
+	 * Access or read was blocked by protection keys. We do
+	 * this check before any others because we do not want
+	 * to, for instance, confuse a protection-key-denied
+	 * write with one for which we should do a COW.
+	 */
+	if (error_code & PF_PK)
+		return 1;
+
 	if (error_code & PF_WRITE) {
 		/* write, present and write, not present: */
 		if (unlikely(!(vma->vm_flags & VM_WRITE)))
diff -puN arch/x86/mm/gup.c~pkeys-11-pte-fault arch/x86/mm/gup.c
--- a/arch/x86/mm/gup.c~pkeys-11-pte-fault	2015-12-03 16:21:25.577669088 -0800
+++ b/arch/x86/mm/gup.c	2015-12-03 16:21:25.588669587 -0800
@@ -10,6 +10,7 @@
 #include <linux/highmem.h>
 #include <linux/swap.h>
 
+#include <asm/mmu_context.h>
 #include <asm/pgtable.h>
 
 static inline pte_t gup_get_pte(pte_t *ptep)
@@ -74,6 +75,9 @@ static inline int pte_allows_gup(pte_t p
 		return 0;
 	if (write && !(pte_write(pte)))
 		return 0;
+	/* This one checks memory protection keys. */
+	if (!arch_pte_access_permitted(pte, write))
+		return 0;
 	return 1;
 }
 
diff -puN include/asm-generic/mm_hooks.h~pkeys-11-pte-fault include/asm-generic/mm_hooks.h
--- a/include/asm-generic/mm_hooks.h~pkeys-11-pte-fault	2015-12-03 16:21:25.579669178 -0800
+++ b/include/asm-generic/mm_hooks.h	2015-12-03 16:21:25.588669587 -0800
@@ -26,4 +26,15 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
 #endif	/* _ASM_GENERIC_MM_HOOKS_H */
diff -puN mm/gup.c~pkeys-11-pte-fault mm/gup.c
--- a/mm/gup.c~pkeys-11-pte-fault	2015-12-03 16:21:25.580669224 -0800
+++ b/mm/gup.c	2015-12-03 16:21:25.589669632 -0800
@@ -13,6 +13,7 @@
 #include <linux/rwsem.h>
 #include <linux/hugetlb.h>
 
+#include <asm/mmu_context.h>
 #include <asm/pgtable.h>
 #include <asm/tlbflush.h>
 
@@ -391,6 +392,8 @@ static int check_vma_flags(struct vm_are
 		if (!(vm_flags & VM_MAYREAD))
 			return -EFAULT;
 	}
+	if (!arch_vma_access_permitted(vma, (gup_flags & FOLL_WRITE)))
+		return -EFAULT;
 	return 0;
 }
 
@@ -559,13 +562,19 @@ EXPORT_SYMBOL(__get_user_pages);
 
 bool vma_permits_fault(struct vm_area_struct *vma, unsigned int fault_flags)
 {
-	vm_flags_t vm_flags;
-
-	vm_flags = (fault_flags & FAULT_FLAG_WRITE) ? VM_WRITE : VM_READ;
+	bool write = !!(fault_flags & FAULT_FLAG_WRITE);
+	vm_flags_t vm_flags = write ? VM_WRITE : VM_READ;
 
 	if (!(vm_flags & vma->vm_flags))
 		return false;
 
+	/*
+	 * The architecture might have a hardware protection
+	 * mechanism other than read/write that can deny access
+	 */
+	if (!arch_vma_access_permitted(vma, write))
+		return false;
+
 	return true;
 }
 
@@ -1102,6 +1111,9 @@ static int gup_pte_range(pmd_t pmd, unsi
 			pte_protnone(pte) || (write && !pte_write(pte)))
 			goto pte_unmap;
 
+		if (!arch_pte_access_permitted(pte, write))
+			goto pte_unmap;
+
 		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
 		page = pte_page(pte);
 
diff -puN mm/memory.c~pkeys-11-pte-fault mm/memory.c
--- a/mm/memory.c~pkeys-11-pte-fault	2015-12-03 16:21:25.582669314 -0800
+++ b/mm/memory.c	2015-12-03 16:21:25.590669677 -0800
@@ -64,6 +64,7 @@
 #include <linux/userfaultfd_k.h>
 
 #include <asm/io.h>
+#include <asm/mmu_context.h>
 #include <asm/pgalloc.h>
 #include <asm/uaccess.h>
 #include <asm/tlb.h>
@@ -3344,6 +3345,9 @@ static int __handle_mm_fault(struct mm_s
 	pmd_t *pmd;
 	pte_t *pte;
 
+	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE))
+		return VM_FAULT_SIGSEGV;
+
 	if (unlikely(is_vm_hugetlb_page(vma)))
 		return hugetlb_fault(mm, vma, address, flags);
 
_

WARNING: multiple messages have this Message-ID (diff)
From: Dave Hansen <dave@sr71.net>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, Dave Hansen <dave@sr71.net>,
	dave.hansen@linux.intel.com
Subject: [PATCH 17/34] x86, pkeys: check VMAs and PTEs for protection keys
Date: Thu, 03 Dec 2015 17:14:48 -0800	[thread overview]
Message-ID: <20151204011448.23DC574D@viggo.jf.intel.com> (raw)
In-Reply-To: <20151204011424.8A36E365@viggo.jf.intel.com>


From: Dave Hansen <dave.hansen@linux.intel.com>

Today, for normal faults and page table walks, we check the VMA
and/or PTE to ensure that it is compatible with the action.  For
instance, if we get a write fault on a non-writeable VMA, we
SIGSEGV.

We try to do the same thing for protection keys.  Basically, we
try to make sure that if a user does this:

	mprotect(ptr, size, PROT_NONE);
	*ptr = foo;

they see the same effects with protection keys when they do this:

	mprotect(ptr, size, PROT_READ|PROT_WRITE);
	set_pkey(ptr, size, 4);
	wrpkru(0xffffff3f); // access disable pkey 4
	*ptr = foo;

The state to do that checking is in the VMA, but we also
sometimes have to do it on the page tables only, like when doing
a get_user_pages_fast() where we have no VMA.

We add two functions and expose them to generic code:

	arch_pte_access_permitted(pte_flags, write)
	arch_vma_access_permitted(vma, write)

These are, of course, backed up in x86 arch code with checks
against the PTE or VMA's protection key.

But, there are also cases where we do not want to respect
protection keys.  When we ptrace(), for instance, we do not want
to apply the tracer's PKRU permissions to the PTEs from the
process being traced.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/powerpc/include/asm/mmu_context.h   |   11 ++++++
 b/arch/s390/include/asm/mmu_context.h      |   11 ++++++
 b/arch/unicore32/include/asm/mmu_context.h |   11 ++++++
 b/arch/x86/include/asm/mmu_context.h       |   49 +++++++++++++++++++++++++++++
 b/arch/x86/include/asm/pgtable.h           |   29 +++++++++++++++++
 b/arch/x86/mm/fault.c                      |   21 +++++++++++-
 b/arch/x86/mm/gup.c                        |    4 ++
 b/include/asm-generic/mm_hooks.h           |   11 ++++++
 b/mm/gup.c                                 |   18 ++++++++--
 b/mm/memory.c                              |    4 ++
 10 files changed, 165 insertions(+), 4 deletions(-)

diff -puN arch/powerpc/include/asm/mmu_context.h~pkeys-11-pte-fault arch/powerpc/include/asm/mmu_context.h
--- a/arch/powerpc/include/asm/mmu_context.h~pkeys-11-pte-fault	2015-12-03 16:21:25.567668634 -0800
+++ b/arch/powerpc/include/asm/mmu_context.h	2015-12-03 16:21:25.585669451 -0800
@@ -148,5 +148,16 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
diff -puN arch/s390/include/asm/mmu_context.h~pkeys-11-pte-fault arch/s390/include/asm/mmu_context.h
--- a/arch/s390/include/asm/mmu_context.h~pkeys-11-pte-fault	2015-12-03 16:21:25.569668725 -0800
+++ b/arch/s390/include/asm/mmu_context.h	2015-12-03 16:21:25.586669496 -0800
@@ -130,4 +130,15 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
 #endif /* __S390_MMU_CONTEXT_H */
diff -puN arch/unicore32/include/asm/mmu_context.h~pkeys-11-pte-fault arch/unicore32/include/asm/mmu_context.h
--- a/arch/unicore32/include/asm/mmu_context.h~pkeys-11-pte-fault	2015-12-03 16:21:25.570668770 -0800
+++ b/arch/unicore32/include/asm/mmu_context.h	2015-12-03 16:21:25.586669496 -0800
@@ -97,4 +97,15 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
 #endif
diff -puN arch/x86/include/asm/mmu_context.h~pkeys-11-pte-fault arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~pkeys-11-pte-fault	2015-12-03 16:21:25.572668861 -0800
+++ b/arch/x86/include/asm/mmu_context.h	2015-12-03 16:21:25.586669496 -0800
@@ -263,4 +263,53 @@ static inline int vma_pkey(struct vm_are
 	return pkey;
 }
 
+static inline bool __pkru_allows_pkey(u16 pkey, bool write)
+{
+	u32 pkru = read_pkru();
+
+	if (!__pkru_allows_read(pkru, pkey))
+		return false;
+	if (write && !__pkru_allows_write(pkru, pkey))
+		return false;
+
+	return true;
+}
+
+/*
+ * We only want to enforce protection keys on the current process
+ * because we effectively have no access to PKRU for other
+ * processes or any way to tell *which * PKRU in a threaded
+ * process we could use.
+ *
+ * So do not enforce things if the VMA is not from the current
+ * mm, or if we are in a kernel thread.
+ */
+static inline bool vma_is_foreign(struct vm_area_struct *vma)
+{
+	if (!current->mm)
+		return true;
+	/*
+	 * Should PKRU be enforced on the access to this VMA?  If
+	 * the VMA is from another process, then PKRU has no
+	 * relevance and should not be enforced.
+	 */
+	if (current->mm != vma->vm_mm)
+		return true;
+
+	return false;
+}
+
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* allow access if the VMA is not one from this process */
+	if (vma_is_foreign(vma))
+		return true;
+	return __pkru_allows_pkey(vma_pkey(vma), write);
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	return __pkru_allows_pkey(pte_flags_pkey(pte_flags(pte)), write);
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff -puN arch/x86/include/asm/pgtable.h~pkeys-11-pte-fault arch/x86/include/asm/pgtable.h
--- a/arch/x86/include/asm/pgtable.h~pkeys-11-pte-fault	2015-12-03 16:21:25.574668952 -0800
+++ b/arch/x86/include/asm/pgtable.h	2015-12-03 16:21:25.587669541 -0800
@@ -910,6 +910,35 @@ static inline pte_t pte_swp_clear_soft_d
 }
 #endif
 
+#define PKRU_AD_BIT 0x1
+#define PKRU_WD_BIT 0x2
+
+static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * 2;
+	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
+}
+
+static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * 2;
+	/*
+	 * Access-disable disables writes too so we need to check
+	 * both bits here.
+	 */
+	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
+}
+
+static inline u16 pte_flags_pkey(unsigned long pte_flags)
+{
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+	/* ifdef to avoid doing 59-bit shift on 32-bit values */
+	return (pte_flags & _PAGE_PKEY_MASK) >> _PAGE_BIT_PKEY_BIT0;
+#else
+	return 0;
+#endif
+}
+
 #include <asm-generic/pgtable.h>
 #endif	/* __ASSEMBLY__ */
 
diff -puN arch/x86/mm/fault.c~pkeys-11-pte-fault arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-11-pte-fault	2015-12-03 16:21:25.575668997 -0800
+++ b/arch/x86/mm/fault.c	2015-12-03 16:21:25.587669541 -0800
@@ -897,6 +897,16 @@ bad_area(struct pt_regs *regs, unsigned
 	__bad_area(regs, error_code, address, NULL, SEGV_MAPERR);
 }
 
+static inline bool bad_area_access_from_pkeys(unsigned long error_code,
+		struct vm_area_struct *vma)
+{
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return false;
+	if (error_code & PF_PK)
+		return true;
+	return false;
+}
+
 static noinline void
 bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 		      unsigned long address, struct vm_area_struct *vma)
@@ -906,7 +916,7 @@ bad_area_access_error(struct pt_regs *re
 	 * But, doing it this way allows compiler optimizations
 	 * if pkeys are compiled out.
 	 */
-	if (boot_cpu_has(X86_FEATURE_OSPKE) && (error_code & PF_PK))
+	if (bad_area_access_from_pkeys(error_code, vma))
 		__bad_area(regs, error_code, address, vma, SEGV_PKUERR);
 	else
 		__bad_area(regs, error_code, address, vma, SEGV_ACCERR);
@@ -1081,6 +1091,15 @@ int show_unhandled_signals = 1;
 static inline int
 access_error(unsigned long error_code, struct vm_area_struct *vma)
 {
+	/*
+	 * Access or read was blocked by protection keys. We do
+	 * this check before any others because we do not want
+	 * to, for instance, confuse a protection-key-denied
+	 * write with one for which we should do a COW.
+	 */
+	if (error_code & PF_PK)
+		return 1;
+
 	if (error_code & PF_WRITE) {
 		/* write, present and write, not present: */
 		if (unlikely(!(vma->vm_flags & VM_WRITE)))
diff -puN arch/x86/mm/gup.c~pkeys-11-pte-fault arch/x86/mm/gup.c
--- a/arch/x86/mm/gup.c~pkeys-11-pte-fault	2015-12-03 16:21:25.577669088 -0800
+++ b/arch/x86/mm/gup.c	2015-12-03 16:21:25.588669587 -0800
@@ -10,6 +10,7 @@
 #include <linux/highmem.h>
 #include <linux/swap.h>
 
+#include <asm/mmu_context.h>
 #include <asm/pgtable.h>
 
 static inline pte_t gup_get_pte(pte_t *ptep)
@@ -74,6 +75,9 @@ static inline int pte_allows_gup(pte_t p
 		return 0;
 	if (write && !(pte_write(pte)))
 		return 0;
+	/* This one checks memory protection keys. */
+	if (!arch_pte_access_permitted(pte, write))
+		return 0;
 	return 1;
 }
 
diff -puN include/asm-generic/mm_hooks.h~pkeys-11-pte-fault include/asm-generic/mm_hooks.h
--- a/include/asm-generic/mm_hooks.h~pkeys-11-pte-fault	2015-12-03 16:21:25.579669178 -0800
+++ b/include/asm-generic/mm_hooks.h	2015-12-03 16:21:25.588669587 -0800
@@ -26,4 +26,15 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
 #endif	/* _ASM_GENERIC_MM_HOOKS_H */
diff -puN mm/gup.c~pkeys-11-pte-fault mm/gup.c
--- a/mm/gup.c~pkeys-11-pte-fault	2015-12-03 16:21:25.580669224 -0800
+++ b/mm/gup.c	2015-12-03 16:21:25.589669632 -0800
@@ -13,6 +13,7 @@
 #include <linux/rwsem.h>
 #include <linux/hugetlb.h>
 
+#include <asm/mmu_context.h>
 #include <asm/pgtable.h>
 #include <asm/tlbflush.h>
 
@@ -391,6 +392,8 @@ static int check_vma_flags(struct vm_are
 		if (!(vm_flags & VM_MAYREAD))
 			return -EFAULT;
 	}
+	if (!arch_vma_access_permitted(vma, (gup_flags & FOLL_WRITE)))
+		return -EFAULT;
 	return 0;
 }
 
@@ -559,13 +562,19 @@ EXPORT_SYMBOL(__get_user_pages);
 
 bool vma_permits_fault(struct vm_area_struct *vma, unsigned int fault_flags)
 {
-	vm_flags_t vm_flags;
-
-	vm_flags = (fault_flags & FAULT_FLAG_WRITE) ? VM_WRITE : VM_READ;
+	bool write = !!(fault_flags & FAULT_FLAG_WRITE);
+	vm_flags_t vm_flags = write ? VM_WRITE : VM_READ;
 
 	if (!(vm_flags & vma->vm_flags))
 		return false;
 
+	/*
+	 * The architecture might have a hardware protection
+	 * mechanism other than read/write that can deny access
+	 */
+	if (!arch_vma_access_permitted(vma, write))
+		return false;
+
 	return true;
 }
 
@@ -1102,6 +1111,9 @@ static int gup_pte_range(pmd_t pmd, unsi
 			pte_protnone(pte) || (write && !pte_write(pte)))
 			goto pte_unmap;
 
+		if (!arch_pte_access_permitted(pte, write))
+			goto pte_unmap;
+
 		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
 		page = pte_page(pte);
 
diff -puN mm/memory.c~pkeys-11-pte-fault mm/memory.c
--- a/mm/memory.c~pkeys-11-pte-fault	2015-12-03 16:21:25.582669314 -0800
+++ b/mm/memory.c	2015-12-03 16:21:25.590669677 -0800
@@ -64,6 +64,7 @@
 #include <linux/userfaultfd_k.h>
 
 #include <asm/io.h>
+#include <asm/mmu_context.h>
 #include <asm/pgalloc.h>
 #include <asm/uaccess.h>
 #include <asm/tlb.h>
@@ -3344,6 +3345,9 @@ static int __handle_mm_fault(struct mm_s
 	pmd_t *pmd;
 	pte_t *pte;
 
+	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE))
+		return VM_FAULT_SIGSEGV;
+
 	if (unlikely(is_vm_hugetlb_page(vma)))
 		return hugetlb_fault(mm, vma, address, flags);
 
_

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2015-12-04  1:19 UTC|newest]

Thread overview: 145+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-04  1:14 [PATCH 00/34] x86: Memory Protection Keys (v5) Dave Hansen
2015-12-04  1:14 ` Dave Hansen
2015-12-04  1:14 ` Dave Hansen
2015-12-04  1:14 ` [PATCH 01/34] mm, gup: introduce concept of "foreign" get_user_pages() Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 02/34] x86, fpu: add placeholder for Processor Trace XSAVE state Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 03/34] x86, pkeys: Add Kconfig option Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 04/34] x86, pkeys: cpuid bit definition Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 05/34] x86, pkeys: define new CR4 bit Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 06/34] x86, pkeys: add PKRU xsave fields and data structure(s) Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 07/34] x86, pkeys: PTE bits for storing protection key Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 08/34] x86, pkeys: new page fault error code bit: PF_PK Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 09/34] x86, pkeys: store protection in high VMA flags Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 14:17   ` Thomas Gleixner
2015-12-08 14:17     ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 10/34] x86, pkeys: arch-specific protection bits Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 15:15   ` [PATCH 10/34] x86, pkeys: arch-specific protection bitsy Thomas Gleixner
2015-12-08 15:15     ` Thomas Gleixner
2015-12-08 16:34     ` Dave Hansen
2015-12-08 16:34       ` Dave Hansen
2015-12-08 17:24       ` Thomas Gleixner
2015-12-08 17:24         ` Thomas Gleixner
2015-12-08 18:06         ` Dave Hansen
2015-12-08 18:29           ` Thomas Gleixner
2015-12-08 18:29             ` Thomas Gleixner
2015-12-08 18:35             ` Thomas Gleixner
2015-12-08 18:35               ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 11/34] x86, pkeys: pass VMA down in to fault signal generation code Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 12/34] signals, pkeys: notify userspace about protection key faults Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 13/34] x86, pkeys: fill in pkey field in siginfo Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 14/34] x86, pkeys: add functions to fetch PKRU Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 15:18   ` Thomas Gleixner
2015-12-08 15:18     ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 15/34] mm: factor out VMA fault permission checking Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 17:26   ` Thomas Gleixner
2015-12-08 17:26     ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 16/34] x86, mm: simplify get_user_pages() PTE bit handling Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 18:01   ` Thomas Gleixner
2015-12-08 18:01     ` Thomas Gleixner
2015-12-08 18:30     ` Dave Hansen
2015-12-08 18:30       ` Dave Hansen
2015-12-04  1:14 ` Dave Hansen [this message]
2015-12-04  1:14   ` [PATCH 17/34] x86, pkeys: check VMAs and PTEs for protection keys Dave Hansen
2015-12-08 18:11   ` Thomas Gleixner
2015-12-08 18:11     ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 18/34] mm: add gup flag to indicate "foreign" mm access Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 19/34] x86, pkeys: optimize fault handling in access_error() Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 18:14   ` Thomas Gleixner
2015-12-08 18:14     ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 20/34] x86, pkeys: differentiate instruction fetches Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 18:17   ` Thomas Gleixner
2015-12-08 18:17     ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 21/34] x86, pkeys: dump PKRU with other kernel registers Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 18:19   ` Thomas Gleixner
2015-12-08 18:19     ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 22/34] x86, pkeys: dump PTE pkey in /proc/pid/smaps Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 18:20   ` Thomas Gleixner
2015-12-08 18:20     ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 23/34] x86, pkeys: add Kconfig prompt to existing config option Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 18:21   ` Thomas Gleixner
2015-12-08 18:21     ` Thomas Gleixner
2015-12-04  1:14 ` [PATCH 24/34] mm, multi-arch: pass a protection key in to calc_vm_flag_bits() Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-04  1:14 ` [PATCH 25/34] x86, pkeys: add arch_validate_pkey() Dave Hansen
2015-12-04  1:14   ` Dave Hansen
2015-12-08 18:39   ` Thomas Gleixner
2015-12-08 18:39     ` Thomas Gleixner
2015-12-04  1:15 ` [PATCH 26/34] mm: implement new mprotect_key() system call Dave Hansen
2015-12-04  1:15   ` Dave Hansen
2015-12-05  6:50   ` Michael Kerrisk (man-pages)
2015-12-05  6:50     ` Michael Kerrisk (man-pages)
2015-12-05  6:50     ` Michael Kerrisk (man-pages)
2015-12-07 16:44     ` Dave Hansen
2015-12-07 16:44       ` Dave Hansen
2015-12-09 11:08       ` Michael Kerrisk (man-pages)
2015-12-09 11:08         ` Michael Kerrisk (man-pages)
2015-12-09 15:48         ` Dave Hansen
2015-12-09 15:48           ` Dave Hansen
2015-12-09 16:45           ` Michael Kerrisk (man-pages)
2015-12-09 16:45             ` Michael Kerrisk (man-pages)
2015-12-09 16:45             ` Michael Kerrisk (man-pages)
2015-12-09 17:05             ` Dave Hansen
2015-12-09 17:05               ` Dave Hansen
2015-12-09 17:05               ` Dave Hansen
2015-12-11 20:13               ` Michael Kerrisk (man-pages)
2015-12-11 20:13                 ` Michael Kerrisk (man-pages)
2015-12-04  1:15 ` [PATCH 27/34] x86, pkeys: make mprotect_key() mask off additional vm_flags Dave Hansen
2015-12-04  1:15   ` Dave Hansen
2015-12-08 18:41   ` Thomas Gleixner
2015-12-08 18:41     ` Thomas Gleixner
2015-12-04  1:15 ` [PATCH 28/34] x86: wire up mprotect_key() system call Dave Hansen
2015-12-04  1:15   ` Dave Hansen
2015-12-08 18:44   ` Thomas Gleixner
2015-12-08 18:44     ` Thomas Gleixner
2015-12-08 18:44     ` Thomas Gleixner
2015-12-08 19:06     ` Dave Hansen
2015-12-08 19:06       ` Dave Hansen
2015-12-08 20:38       ` Thomas Gleixner
2015-12-08 20:38         ` Thomas Gleixner
2015-12-08 20:38         ` Thomas Gleixner
2015-12-04  1:15 ` [PATCH 29/34] x86: separate out LDT init from context init Dave Hansen
2015-12-04  1:15   ` Dave Hansen
2015-12-08 18:45   ` Thomas Gleixner
2015-12-08 18:45     ` Thomas Gleixner
2015-12-04  1:15 ` [PATCH 30/34] x86, fpu: allow setting of XSAVE state Dave Hansen
2015-12-04  1:15   ` Dave Hansen
2015-12-08 18:48   ` Thomas Gleixner
2015-12-08 18:48     ` Thomas Gleixner
2015-12-04  1:15 ` [PATCH 31/34] x86, pkeys: allocation/free syscalls Dave Hansen
2015-12-04  1:15   ` Dave Hansen
2015-12-04  1:15 ` [PATCH 32/34] x86, pkeys: add pkey set/get syscalls Dave Hansen
2015-12-04  1:15   ` Dave Hansen
2015-12-04  1:15 ` [PATCH 33/34] x86, pkeys: actually enable Memory Protection Keys in CPU Dave Hansen
2015-12-04  1:15   ` Dave Hansen
2015-12-04  1:15 ` [PATCH 34/34] x86, pkeys: Documentation Dave Hansen
2015-12-04  1:15   ` Dave Hansen
2015-12-04 23:31 ` [PATCH 00/34] x86: Memory Protection Keys (v5) Andy Lutomirski
2015-12-04 23:31   ` Andy Lutomirski
2015-12-04 23:38   ` Dave Hansen
2015-12-04 23:38     ` Dave Hansen
2015-12-04 23:38     ` Dave Hansen
2015-12-11 20:16     ` Andy Lutomirski
2015-12-11 20:16       ` Andy Lutomirski
2015-12-11 20:16       ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151204011448.23DC574D@viggo.jf.intel.com \
    --to=dave@sr71.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.