From: Yinghai Lu <yinghai@kernel.org>
To: Kees Cook <keescook@chromium.org>,
	"H. Peter Anvin" <hpa@zytor.com>, Baoquan He <bhe@redhat.com>
Cc: linux-kernel@vger.kernel.org, Yinghai Lu <yinghai@kernel.org>,
	Jiri Kosina <jkosina@suse.cz>,
	Matt Fleming <matt.fleming@intel.com>
Subject: [PATCH 10/42] x86, 64bit: Set ident_mapping for kaslr
Date: Tue,  7 Jul 2015 13:19:56 -0700
Message-ID: <1436300428-21163-11-git-send-email-yinghai@kernel.org>
In-Reply-To: <1436300428-21163-1-git-send-email-yinghai@kernel.org>

Currently aslr only supports randomization within a nearby range, and the
new range still uses the old mapping. It also does not support a new range
above 4G.

We need an ident mapping for the new range before we can decompress to
the new output location and later run the kernel from there.

In this patch, we add ident mappings for all needed ranges.
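
As a rough illustration of what "ident mapping for a range" costs with 2M
pages, the sketch below rounds a range out to 2M boundaries (the same
alignment fill_pagetable() applies with round_down()/round_up() on
PMD_SIZE) and counts the 2M entries it covers. The sketch_* names are
hypothetical and exist only for this example; they are not part of the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* 2 MiB: the size covered by one level-2 (PMD) large-page entry on x86-64. */
#define SKETCH_PMD_SIZE (UINT64_C(1) << 21)

/* Round an address down to the enclosing 2 MiB boundary. */
static uint64_t sketch_round_down_2m(uint64_t addr)
{
	return addr & ~(SKETCH_PMD_SIZE - 1);
}

/* Round an address up to the next 2 MiB boundary (no-op if aligned). */
static uint64_t sketch_round_up_2m(uint64_t addr)
{
	return (addr + SKETCH_PMD_SIZE - 1) & ~(SKETCH_PMD_SIZE - 1);
}

/* Number of 2 MiB entries needed to identity-map [start, start + size). */
static uint64_t sketch_pmd_entries(uint64_t start, uint64_t size)
{
	uint64_t lo, hi;

	assert(start + size >= start);	/* the range must not wrap around */

	lo = sketch_round_down_2m(start);
	hi = sketch_round_up_2m(start + size);

	return (hi - lo) / SKETCH_PMD_SIZE;
}
```

A small range that straddles a 2M boundary, such as 0x1ff000 with size
0x2000, needs two entries even though it is only 8K long.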

First, to let aslr put the random VO above 4G, we must set up an ident
mapping for the new range when we come in via the startup_32 path.

Second, when booting from a 64bit bootloader, the bootloader sets up the
ident mapping and boots via ZO (arch/x86/boot/compressed/vmlinux)
startup_64. The pages used for that pagetable need to be avoided when we
select the new random VO (vmlinux) base. Otherwise the decompressor would
overwrite them while decompressing.
One way would be to walk the pagetable and find every page used by it
for each mem_avoid check, but that needs extra code and may require
increasing the mem_avoid array size to hold them.
The other way is to create a new ident mapping instead. Pages for the
pagetable then come from the _pgtable section of ZO, and they are in the
mem_avoid array already. This way, we can reuse the code for ident
mapping.
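
To show why keeping the pagetable pages inside an already avoided region
is convenient, here is a minimal sketch of an overlap check against a list
of avoided regions. It assumes half-open ranges and non-empty regions;
struct sketch_mem_vector and the sketch_* helpers are hypothetical names
for this example and are not the code in aslr.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A physical memory range [start, start + size). */
struct sketch_mem_vector {
	uint64_t start;
	uint64_t size;
};

/* True if the two half-open ranges share at least one byte. */
static bool sketch_overlaps(const struct sketch_mem_vector *a,
			    const struct sketch_mem_vector *b)
{
	if (a->start + a->size <= b->start)
		return false;	/* a ends before b begins */
	if (a->start >= b->start + b->size)
		return false;	/* a begins after b ends */
	return true;
}

/* True if the candidate range hits any of the n avoided regions. */
static bool sketch_must_avoid(const struct sketch_mem_vector *cand,
			      const struct sketch_mem_vector *avoid,
			      size_t n)
{
	size_t i;

	assert(avoid != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (sketch_overlaps(cand, &avoid[i]))
			return true;
	return false;
}
```

With this shape of check, any pagetable page that lives inside an
existing avoided region is rejected for free, with no extra array entry.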

The _pgtable will be shared by the 32bit and 64bit paths to reduce
init_size, as ZO _rodata to _end now contributes to init_size.

We need to increase the pgt buffer size.
When booting via startup_64, we need to cover the old VO, params, cmdline
and the new VO. In the extreme case all of them could cross a 512G
boundary, which needs (2+2)*4 pages with 2M mappings. We need 2 more for
the first 2M for vga ram, plus one for level4. The total is 19 pages.
When booting via startup_32, aslr may move the new VO above 4G, so we need
to set an extra ident mapping for the new VO. The pgt buffer starts at
offset 6 pages into _pgtable. It should need at most (2+2) pages when the
new VO crosses a 512G boundary.
So 19 pages make both paths happy.
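
The page budget above can be written out as a small calculation. This is
only a sketch of the arithmetic under one reading of the text: a range
that crosses a 512G boundary touches two level4 entries, so it needs two
level-3 pages and two level-2 pages, and the first 2M needs one of each.
The SKETCH_* constants and sketch_* functions are hypothetical names for
this example.

```c
#include <assert.h>

enum {
	SKETCH_RANGES		= 4,	/* old VO, params, cmdline, new VO */
	SKETCH_L3_PER_RANGE	= 2,	/* level-3 pages if a range crosses 512G */
	SKETCH_L2_PER_RANGE	= 2,	/* level-2 pages if a range crosses 512G */
	SKETCH_VGA_PAGES	= 2,	/* level-3 + level-2 page for the first 2M */
	SKETCH_LEVEL4_PAGES	= 1,	/* the top-level table itself */
	SKETCH_INIT_PGT_PAGES	= 6,	/* pages already built by startup_32 */
};

/* Worst-case pages when booting via startup_64 (a fresh pagetable). */
static int sketch_pgt_pages_startup_64(int verbose_bootup)
{
	int per_range = SKETCH_L3_PER_RANGE + SKETCH_L2_PER_RANGE;
	int pages = SKETCH_LEVEL4_PAGES + SKETCH_RANGES * per_range;

	assert(verbose_bootup == 0 || verbose_bootup == 1);

	/* The vga ram mapping is only needed with verbose bootup. */
	if (verbose_bootup)
		pages += SKETCH_VGA_PAGES;
	return pages;
}

/* Worst-case pages when booting via startup_32 (reuse the first 6 pages). */
static int sketch_pgt_pages_startup_32(void)
{
	return SKETCH_INIT_PGT_PAGES +
	       SKETCH_L3_PER_RANGE + SKETCH_L2_PER_RANGE;
}
```

Under these assumptions the startup_64 path needs 19 pages with verbose
bootup and 17 without, and the startup_32 path needs 10, so a 19 page
buffer covers both.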

Cc: Kees Cook <keescook@chromium.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/Makefile   |  3 ++
 arch/x86/boot/compressed/aslr.c     | 14 ++++++
 arch/x86/boot/compressed/head_64.S  |  4 +-
 arch/x86/boot/compressed/misc.h     | 11 +++++
 arch/x86/boot/compressed/misc_pgt.c | 91 +++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/boot.h         | 19 ++++++++
 6 files changed, 140 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/boot/compressed/misc_pgt.c

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index e12a93c..66461b4 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -58,6 +58,9 @@ vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
 
 vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
 vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/aslr.o
+ifdef CONFIG_X86_64
+	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/misc_pgt.o
+endif
 
 $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
 
diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index d753fb3..0990c78 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -151,6 +151,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	 */
 	mem_avoid[0].start = input;
 	mem_avoid[0].size = (output + init_size) - input;
+	fill_pagetable(input, (output + init_size) - input);
 
 	/* Avoid initrd. */
 	initrd_start  = (u64)real_mode->ext_ramdisk_image << 32;
@@ -159,6 +160,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	initrd_size |= real_mode->hdr.ramdisk_size;
 	mem_avoid[1].start = initrd_start;
 	mem_avoid[1].size = initrd_size;
+	/* don't need to set mapping for initrd */
 
 	/* Avoid kernel command line. */
 	cmd_line  = (u64)real_mode->ext_cmd_line_ptr << 32;
@@ -169,10 +171,19 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 		;
 	mem_avoid[2].start = cmd_line;
 	mem_avoid[2].size = cmd_line_size;
+	fill_pagetable(cmd_line, cmd_line_size);
 
 	/* Avoid params */
 	mem_avoid[3].start = (unsigned long)real_mode;
 	mem_avoid[3].size = sizeof(*real_mode);
+	fill_pagetable((unsigned long)real_mode, sizeof(*real_mode));
+
+	/* don't need to set mapping for setup_data */
+
+#ifdef CONFIG_X86_VERBOSE_BOOTUP
+	/* for video ram */
+	fill_pagetable(0, PMD_SIZE);
+#endif
 }
 
 /* Does this memory vector overlap a known avoided area? */
@@ -330,6 +341,9 @@ unsigned char *choose_kernel_location(unsigned char *input,
 		goto out;
 
 	choice = random;
+
+	fill_pagetable(choice, output_run_size);
+	switch_pagetable();
 out:
 	return (unsigned char *)choice;
 }
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 3691451..075bb15 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -126,7 +126,7 @@ ENTRY(startup_32)
 	/* Initialize Page tables to 0 */
 	leal	pgtable(%ebx), %edi
 	xorl	%eax, %eax
-	movl	$((4096*6)/4), %ecx
+	movl	$(BOOT_INIT_PGT_SIZE/4), %ecx
 	rep	stosl
 
 	/* Build Level 4 */
@@ -478,4 +478,4 @@ boot_stack_end:
 	.section ".pgtable","a",@nobits
 	.balign 4096
 pgtable:
-	.fill 6*4096, 1, 0
+	.fill BOOT_PGT_SIZE, 1, 0
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 40b4546..0104c0be 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -73,6 +73,17 @@ unsigned char *choose_kernel_location(unsigned char *input,
 }
 #endif
 
+#ifdef CONFIG_X86_64
+void fill_pagetable(unsigned long start, unsigned long size);
+void switch_pagetable(void);
+extern unsigned char _pgtable[];
+#else
+static inline void fill_pagetable(unsigned long start, unsigned long size)
+{ }
+static inline void switch_pagetable(void)
+{ }
+#endif
+
 #ifdef CONFIG_EARLY_PRINTK
 /* early_serial_console.c */
 extern int early_serial_base;
diff --git a/arch/x86/boot/compressed/misc_pgt.c b/arch/x86/boot/compressed/misc_pgt.c
new file mode 100644
index 0000000..954811e
--- /dev/null
+++ b/arch/x86/boot/compressed/misc_pgt.c
@@ -0,0 +1,91 @@
+#define __pa(x)  ((unsigned long)(x))
+#define __va(x)  ((void *)((unsigned long)(x)))
+
+#include "misc.h"
+
+#include <asm/init.h>
+#include <asm/pgtable.h>
+
+#include "../../mm/ident_map.c"
+#include "../string.h"
+
+struct alloc_pgt_data {
+	unsigned char *pgt_buf;
+	unsigned long pgt_buf_size;
+	unsigned long pgt_buf_offset;
+};
+
+static void *alloc_pgt_page(void *context)
+{
+	struct alloc_pgt_data *d = (struct alloc_pgt_data *)context;
+	unsigned char *p = (unsigned char *)d->pgt_buf;
+
+	if (d->pgt_buf_offset >= d->pgt_buf_size) {
+		debug_putstr("out of pgt_buf in misc_pgt.c\n");
+		return NULL;
+	}
+
+	p += d->pgt_buf_offset;
+	d->pgt_buf_offset += PAGE_SIZE;
+
+	return p;
+}
+
+/*
+ * Use a normal definition of memset() from string.c. There are already
+ * included header files which expect a definition of memset() and by
+ * the time we define memset macro, it is too late.
+ */
+#undef memset
+
+unsigned long __force_order;
+static struct alloc_pgt_data pgt_data;
+static struct x86_mapping_info mapping_info;
+static pgd_t *level4p;
+
+void fill_pagetable(unsigned long start, unsigned long size)
+{
+	unsigned long end = start + size;
+
+	if (!level4p) {
+		pgt_data.pgt_buf_offset = 0;
+		mapping_info.alloc_pgt_page = alloc_pgt_page;
+		mapping_info.context = &pgt_data;
+		mapping_info.pmd_flag = __PAGE_KERNEL_LARGE_EXEC;
+
+		/*
+		 * come from startup_32 ?
+		 * then cr3 is _pgtable, we can reuse it.
+		 */
+		level4p = (pgd_t *)read_cr3();
+		if ((unsigned long)level4p == (unsigned long)_pgtable) {
+			pgt_data.pgt_buf = (unsigned char *)_pgtable +
+						 BOOT_INIT_PGT_SIZE;
+			pgt_data.pgt_buf_size = BOOT_PGT_SIZE -
+						 BOOT_INIT_PGT_SIZE;
+			memset((unsigned char *)pgt_data.pgt_buf, 0,
+				pgt_data.pgt_buf_size);
+			debug_putstr("boot via startup_32\n");
+		} else {
+			pgt_data.pgt_buf = (unsigned char *)_pgtable;
+			pgt_data.pgt_buf_size = BOOT_PGT_SIZE;
+			memset((unsigned char *)pgt_data.pgt_buf, 0,
+				pgt_data.pgt_buf_size);
+			debug_putstr("boot via startup_64\n");
+			level4p = (pgd_t *)alloc_pgt_page(&pgt_data);
+		}
+	}
+
+	/* align boundary to 2M */
+	start = round_down(start, PMD_SIZE);
+	end = round_up(end, PMD_SIZE);
+	if (start >= end)
+		return;
+
+	kernel_ident_mapping_init(&mapping_info, level4p, start, end);
+}
+
+void switch_pagetable(void)
+{
+	write_cr3((unsigned long)level4p);
+}
diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h
index 4fa687a..7b23908 100644
--- a/arch/x86/include/asm/boot.h
+++ b/arch/x86/include/asm/boot.h
@@ -32,7 +32,26 @@
 #endif /* !CONFIG_KERNEL_BZIP2 */
 
 #ifdef CONFIG_X86_64
+
 #define BOOT_STACK_SIZE	0x4000
+
+#define BOOT_INIT_PGT_SIZE (6*4096)
+#ifdef CONFIG_RANDOMIZE_BASE
+/*
+ * 1 page for level4, 2 pages for first 2M.
+ * (2+2)*4 pages for kernel, param, cmd_line, random kernel
+ * if all cross 512G boundary.
+ * So total will be 19 pages.
+ */
+#ifdef CONFIG_X86_VERBOSE_BOOTUP
+#define BOOT_PGT_SIZE (19*4096)
+#else
+#define BOOT_PGT_SIZE (17*4096)
+#endif
+#else
+#define BOOT_PGT_SIZE BOOT_INIT_PGT_SIZE
+#endif
+
 #else
 #define BOOT_STACK_SIZE	0x1000
 #endif
-- 
1.8.4.5

