* [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
@ 2015-07-07 20:19 Yinghai Lu
  2015-07-07 20:19 ` [PATCH 01/42] x86, kasl: Remove not needed parameter for choose_kernel_location Yinghai Lu
                   ` (43 more replies)
  0 siblings, 44 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

These are the patches I sent before, now rebased on v4.2-rc1. The earlier
posting was rejected by Ingo because of the changelogs.

Kees Cook said that he would like to try improving the changelogs
to get things moving.

The first part is kaslr related:
1. Put the compressed kernel (ZO) near the end of the buffer before
   decompressing, so we can find the ZO position easily when searching for
   a kaslr buffer.
2. Kill the run_size calculation shell scripts.
3. Create a new ident mapping for 64-bit kaslr, so we can cover
   a random kernel base above 4G.
4. 7 patches from Baoquan He that randomize the physical and virtual
   addresses separately, as I already used his patches to test the ident
   mapping code.
5. Some debug patches for boot/kaslr.

The second part is setup_data related:
Currently setup_data is reserved via memblock and e820, and different
handlers deal with it in different ways, which is confusing.
1. SETUP_E820_EXT: is consumed early and is not copied or accessed again,
        so the reserved memory is wasted.
2. SETUP_EFI: is accessed via ioremap every time at early stage.
        Memory is wasted.
3. SETUP_DTB: is copied locally.
        Memory is wasted.
4. SETUP_PCI: is accessed via ioremap for every PCI device, even at run time.
Also, setup_data is exported to debugfs for debugging purposes.
This series lets every handler decide how to handle its own data
and no longer reserves setup_data generically, so memory is not
wasted and memblock/e820 stay page aligned.
1. Do not touch E820 anymore.
2. Copy SETUP_EFI to an __initdata variable and access it without ioremap.
3. SETUP_DTB: reserve, copy locally, and free.
4. SETUP_PCI: reserve locally and convert to a list, to avoid keeping ioremap.
5. Export SETUP_PCI via sysfs.
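
The per-handler policy described in this list can be sketched as a walk
over the setup_data list with one action per type. This is only an
illustration, not code from the series: the SETUP_* values are the ones
defined in bootparam.h, but struct demo_setup_data, enum demo_action and
both functions are hypothetical names, and the node uses a plain pointer
where the real struct setup_data stores a 64-bit physical address.

```c
#include <stddef.h>
#include <stdint.h>

/* setup_data type values from arch/x86/include/uapi/asm/bootparam.h */
enum {
	SETUP_NONE     = 0,
	SETUP_E820_EXT = 1,
	SETUP_DTB      = 2,
	SETUP_PCI      = 3,
	SETUP_EFI      = 4,
};

/* Hypothetical actions, one per handler policy in the list above. */
enum demo_action {
	DEMO_IGNORE,            /* unknown type: leave untouched */
	DEMO_CONSUME_EARLY,     /* SETUP_E820_EXT: parsed once, nothing kept */
	DEMO_COPY_TO_INITDATA,  /* SETUP_EFI: copy, then no ioremap needed */
	DEMO_RESERVE_COPY_FREE, /* SETUP_DTB: reserve, copy locally, free */
	DEMO_RESERVE_AND_LIST,  /* SETUP_PCI: reserve locally, keep as a list */
};

/*
 * Simplified node (assumption): the real struct setup_data stores "next"
 * as a 64-bit physical address and is followed by "len" bytes of payload.
 */
struct demo_setup_data {
	const struct demo_setup_data *next;
	uint32_t type;
	uint32_t len;
};

/* Map a setup_data type to the action its handler would take. */
enum demo_action demo_action_for(uint32_t type)
{
	switch (type) {
	case SETUP_E820_EXT: return DEMO_CONSUME_EARLY;
	case SETUP_EFI:      return DEMO_COPY_TO_INITDATA;
	case SETUP_DTB:      return DEMO_RESERVE_COPY_FREE;
	case SETUP_PCI:      return DEMO_RESERVE_AND_LIST;
	default:             return DEMO_IGNORE;
	}
}

/* Walk the list and count how many nodes end up with the given action. */
size_t demo_count_action(const struct demo_setup_data *head,
			 enum demo_action want)
{
	size_t n = 0;

	for (const struct demo_setup_data *d = head; d != NULL; d = d->next)
		if (demo_action_for(d->type) == want)
			n++;
	return n;
}
```

The point of the sketch is that the decision lives with each type's
handler rather than in one generic reservation pass over all nodes.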

The third part is some small cleanup patches.

The patches are available at
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-v4.3-next

Thanks

Yinghai


Baoquan He (7):
  x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
  x86, kaslr: Introduce struct slot_area to manage randomization slot info
  x86, kaslr: Add two functions which will be used later
  x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
  x86, kaslr: Randomize physical and virtual address of kernel separately
  x86, kaslr: Add support of kernel physical address randomization above 4G
  x86, kaslr: Remove useless codes

Yinghai Lu (35):
  x86, kasl: Remove not needed parameter for choose_kernel_location
  x86, boot: Move compressed kernel to end of buffer before decompressing
  x86, boot: Fix run_size calculation
  x86, kaslr: Kill not needed and wrong run_size calculation code.
  x86, kaslr: rename output_size to output_run_size
  x86, kaslr: Consolidate mem_avoid array filling
  x86, boot: Move z_extract_offset calculation to header.S
  x86, kaslr: Get correct max_addr for relocs pointer
  x86, boot: Split kernel_ident_mapping_init to another file
  x86, 64bit: Set ident_mapping for kaslr
  x86, boot: Add checking for memcpy
  x86, kaslr: Allow random address could be below loaded address
  x86, boot: Add printf support for early console in compressed/misc.c
  x86, boot: Add more debug printout in compressed/misc.c
  x86, setup: Check early serial console per string instead of one char
  x86, setup: Use puts() instead of printf() in edd code
  x86: Setup early console as early as possible in x86_start_kernel()
  x86, boot: print compression suffix in decompress stage
  x86: remove not needed clear_page calling
  x86: restore end_of_ram to E820_RAM
  x86, boot: Allow 64bit EFI kernel to be loaded above 4G
  x86: Find correct 64 bit ramdisk address for microcode early update
  x86: Kill E820_RESERVED_KERN
  x86, efi: Copy SETUP_EFI data and access directly
  x86, of: Let add_dtb reserve setup_data locally
  x86, boot: Add add_pci handler for SETUP_PCI
  x86: Kill not used setup_data handling code
  x86, boot, PCI: Convert SETUP_PCI data to list
  x86, boot, PCI: Copy SETUP_PCI rom to kernel space
  x86, boot, PCI: Export SETUP_PCI data via sysfs
  x86: Fix typo in mark_rodata_ro
  x86, 64bit: add pfn_range_is_highmapped()
  x86, 64bit: remove highmap for not needed ranges
  x86, 64bit: Add __pa_high/__va_high
  x86: fix msr print again

 Documentation/x86/boot.txt                  |  19 ++
 arch/x86/boot/Makefile                      |  13 +-
 arch/x86/boot/compressed/Makefile           |  21 +-
 arch/x86/boot/compressed/aslr.c             | 258 ++++++++++++++++-------
 arch/x86/boot/compressed/eboot.c            |  15 +-
 arch/x86/boot/compressed/head_32.S          |  14 +-
 arch/x86/boot/compressed/head_64.S          |  22 +-
 arch/x86/boot/compressed/misc.c             | 129 +++++++++---
 arch/x86/boot/compressed/misc.h             |  41 +++-
 arch/x86/boot/compressed/misc_pgt.c         |  91 ++++++++
 arch/x86/boot/compressed/mkpiggy.c          |  28 +--
 arch/x86/boot/compressed/printf.c           |   5 +
 arch/x86/boot/compressed/string.c           |  28 ++-
 arch/x86/boot/compressed/vmlinux.lds.S      |   1 +
 arch/x86/boot/edd.c                         |   4 +-
 arch/x86/boot/header.S                      |  34 ++-
 arch/x86/boot/tty.c                         |  14 +-
 arch/x86/include/asm/boot.h                 |  19 ++
 arch/x86/include/asm/efi.h                  |   2 +-
 arch/x86/include/asm/page.h                 |   5 +
 arch/x86/include/asm/pci.h                  |   4 +
 arch/x86/include/asm/pgtable_64.h           |   2 +
 arch/x86/include/asm/processor.h            |   1 -
 arch/x86/include/asm/prom.h                 |   9 +-
 arch/x86/include/asm/setup.h                |   5 +
 arch/x86/include/uapi/asm/bootparam.h       |   1 +
 arch/x86/include/uapi/asm/e820.h            |   8 -
 arch/x86/kernel/asm-offsets.c               |   2 +
 arch/x86/kernel/cpu/common.c                |  61 +++---
 arch/x86/kernel/cpu/microcode/amd_early.c   |  10 +-
 arch/x86/kernel/cpu/microcode/intel_early.c |   8 +-
 arch/x86/kernel/devicetree.c                |  39 ++--
 arch/x86/kernel/e820.c                      |  18 +-
 arch/x86/kernel/head.c                      |  26 +++
 arch/x86/kernel/head32.c                    |   1 +
 arch/x86/kernel/head64.c                    |  21 +-
 arch/x86/kernel/kdebugfs.c                  | 142 -------------
 arch/x86/kernel/setup.c                     |  79 ++-----
 arch/x86/kernel/tboot.c                     |   3 +-
 arch/x86/kernel/vmlinux.lds.S               |   1 +
 arch/x86/mm/ident_map.c                     |  74 +++++++
 arch/x86/mm/init_64.c                       | 173 +++++++--------
 arch/x86/mm/pageattr.c                      |  16 +-
 arch/x86/pci/common.c                       | 313 ++++++++++++++++++++++++++--
 arch/x86/platform/efi/efi.c                 |  13 +-
 arch/x86/platform/efi/efi_64.c              |  10 +-
 arch/x86/platform/efi/quirks.c              |  23 +-
 arch/x86/tools/calc_run_size.sh             |  42 ----
 drivers/tty/serial/8250/8250_early.c        |  17 ++
 kernel/printk/printk.c                      |  11 +-
 50 files changed, 1235 insertions(+), 661 deletions(-)
 create mode 100644 arch/x86/boot/compressed/misc_pgt.c
 create mode 100644 arch/x86/boot/compressed/printf.c
 create mode 100644 arch/x86/mm/ident_map.c
 delete mode 100644 arch/x86/tools/calc_run_size.sh

-- 
1.8.4.5



* [PATCH 01/42] x86, kasl: Remove not needed parameter for choose_kernel_location
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 20:57   ` Kees Cook
  2015-07-07 20:19 ` [PATCH 02/42] x86, boot: Move compressed kernel to end of buffer before decompressing Yinghai Lu
                   ` (42 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

real_mode is a global variable, so we do not need to pass it around.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/aslr.c | 5 ++---
 arch/x86/boot/compressed/misc.c | 2 +-
 arch/x86/boot/compressed/misc.h | 6 ++----
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index d7b1f65..71520c4 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -295,8 +295,7 @@ static unsigned long find_random_addr(unsigned long minimum,
 	return slots_fetch_random();
 }
 
-unsigned char *choose_kernel_location(struct boot_params *boot_params,
-				      unsigned char *input,
+unsigned char *choose_kernel_location(unsigned char *input,
 				      unsigned long input_size,
 				      unsigned char *output,
 				      unsigned long output_size)
@@ -316,7 +315,7 @@ unsigned char *choose_kernel_location(struct boot_params *boot_params,
 	}
 #endif
 
-	boot_params->hdr.loadflags |= KASLR_FLAG;
+	real_mode->hdr.loadflags |= KASLR_FLAG;
 
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init((unsigned long)input, input_size,
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index a107b93..ebf72ce 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -404,7 +404,7 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 	 * the entire decompressed kernel plus relocation table, or the
 	 * entire decompressed kernel plus .bss and .brk sections.
 	 */
-	output = choose_kernel_location(real_mode, input_data, input_len, output,
+	output = choose_kernel_location(input_data, input_len, output,
 					output_len > run_size ? output_len
 							      : run_size);
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 805d25c..8c96cc5 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -56,8 +56,7 @@ int cmdline_find_option_bool(const char *option);
 
 #if CONFIG_RANDOMIZE_BASE
 /* aslr.c */
-unsigned char *choose_kernel_location(struct boot_params *boot_params,
-				      unsigned char *input,
+unsigned char *choose_kernel_location(unsigned char *input,
 				      unsigned long input_size,
 				      unsigned char *output,
 				      unsigned long output_size);
@@ -65,8 +64,7 @@ unsigned char *choose_kernel_location(struct boot_params *boot_params,
 bool has_cpuflag(int flag);
 #else
 static inline
-unsigned char *choose_kernel_location(struct boot_params *boot_params,
-				      unsigned char *input,
+unsigned char *choose_kernel_location(unsigned char *input,
 				      unsigned long input_size,
 				      unsigned char *output,
 				      unsigned long output_size)
-- 
1.8.4.5



* [PATCH 02/42] x86, boot: Move compressed kernel to end of buffer before decompressing
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
  2015-07-07 20:19 ` [PATCH 01/42] x86, kasl: Remove not needed parameter for choose_kernel_location Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 21:22   ` Kees Cook
  2015-07-07 20:19 ` [PATCH 03/42] x86, boot: Fix run_size calculation Yinghai Lu
                   ` (41 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

This lets us find the ZO position easily at run time when searching for
a kaslr buffer.

The current code uses extract_offset to control where the kernel is copied.
That puts the copied kernel in the middle of the buffer when the kernel run
size is bigger than the buffer size needed for decompression.

Current layout:
when init_size is the same as kernel run_size:
                                        run_size
0              extract_offset          init_size
|------------------|------------------------|
   VO text/data                   VO bss/brk
                   input ZO text ZO data

This patch moves ZO to the end of the buffer instead of the middle.
When init_size is bigger than the kernel run size, the layout becomes:

0                            run_size    init_size
|--------------------------------|----------|
   VO text/data        VO bss/brk
                       input ZO text ZO data

We already have init_size as the buffer size, so we can find the end easily
when copying ZO before decompressing.
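
The address arithmetic done by the new head_32.S/head_64.S code can be
sketched in C. This is a reading of the assembly in this patch, not code
from it: the function and parameter names are made up, and zo_end stands
for the link-time value of the ZO _end symbol, which is assumed here to
equal the ZO image size because ZO is linked at address 0.

```c
#include <stdint.h>

/*
 * Where ZO is copied before decompression: shift the load address by
 * (init_size - zo_end) so that the copy ends exactly at the end of the
 * init_size buffer.
 */
uint64_t demo_zo_copy_addr(uint64_t load_addr, uint64_t init_size,
			   uint64_t zo_end)
{
	return load_addr + (init_size - zo_end);
}

/*
 * Output address passed to decompress_kernel: undo the same shift, which
 * yields the start of the buffer again.
 */
uint64_t demo_output_addr(uint64_t zo_copy_addr, uint64_t init_size,
			  uint64_t zo_end)
{
	return zo_copy_addr - (init_size - zo_end);
}
```

With a made-up load address of 0x1000000, init_size of 0x1800000 and
zo_end of 0x700000, the copy lands at 0x2100000 and ends at 0x2800000,
which is the end of the buffer; the output address is 0x1000000 again.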

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/head_32.S     | 11 +++++++++--
 arch/x86/boot/compressed/head_64.S     |  8 ++++++--
 arch/x86/boot/compressed/mkpiggy.c     |  7 ++-----
 arch/x86/boot/compressed/vmlinux.lds.S |  1 +
 arch/x86/boot/header.S                 |  2 +-
 arch/x86/kernel/asm-offsets.c          |  1 +
 arch/x86/kernel/vmlinux.lds.S          |  1 +
 7 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 8ef964d..0c140f9 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -148,7 +148,9 @@ preferred_addr:
 1:
 
 	/* Target address to relocate to for decompression */
-	addl	$z_extract_offset, %ebx
+	movl    BP_init_size(%esi), %eax
+	subl    $_end, %eax
+	addl    %eax, %ebx
 
 	/* Set up the stack */
 	leal	boot_stack_end(%ebx), %esp
@@ -210,8 +212,13 @@ relocated:
 				/* push arguments for decompress_kernel: */
 	pushl	$z_run_size	/* size of kernel with .bss and .brk */
 	pushl	$z_output_len	/* decompressed length, end of relocs */
-	leal	z_extract_offset_negative(%ebx), %ebp
+
+	movl    BP_init_size(%esi), %eax
+	subl    $_end, %eax
+	movl    %ebx, %ebp
+	subl    %eax, %ebp
 	pushl	%ebp		/* output address */
+
 	pushl	$z_input_len	/* input_len */
 	leal	input_data(%ebx), %eax
 	pushl	%eax		/* input_data */
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index b0c0d16..67dd8d3 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -102,7 +102,9 @@ ENTRY(startup_32)
 1:
 
 	/* Target address to relocate to for decompression */
-	addl	$z_extract_offset, %ebx
+	movl	BP_init_size(%esi), %eax
+	subl	$_end, %eax
+	addl	%eax, %ebx
 
 /*
  * Prepare for entering 64 bit mode
@@ -330,7 +332,9 @@ preferred_addr:
 1:
 
 	/* Target address to relocate to for decompression */
-	leaq	z_extract_offset(%rbp), %rbx
+	movl	BP_init_size(%rsi), %ebx
+	subl	$_end, %ebx
+	addq	%rbp, %rbx
 
 	/* Set up the stack */
 	leaq	boot_stack_end(%rbx), %rsp
diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
index d8222f2..5faad09 100644
--- a/arch/x86/boot/compressed/mkpiggy.c
+++ b/arch/x86/boot/compressed/mkpiggy.c
@@ -83,11 +83,8 @@ int main(int argc, char *argv[])
 	printf("z_input_len = %lu\n", ilen);
 	printf(".globl z_output_len\n");
 	printf("z_output_len = %lu\n", (unsigned long)olen);
-	printf(".globl z_extract_offset\n");
-	printf("z_extract_offset = 0x%lx\n", offs);
-	/* z_extract_offset_negative allows simplification of head_32.S */
-	printf(".globl z_extract_offset_negative\n");
-	printf("z_extract_offset_negative = -0x%lx\n", offs);
+	printf(".globl z_min_extract_offset\n");
+	printf("z_min_extract_offset = 0x%lx\n", offs);
 	printf(".globl z_run_size\n");
 	printf("z_run_size = %lu\n", run_size);
 
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 34d047c..e24e0a0 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -70,5 +70,6 @@ SECTIONS
 		_epgtable = . ;
 	}
 #endif
+	. = ALIGN(PAGE_SIZE);	/* keep ZO size page aligned */
 	_end = .;
 }
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index 16ef025..9bfab22 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -440,7 +440,7 @@ setup_data:		.quad 0			# 64-bit physical pointer to
 
 pref_address:		.quad LOAD_PHYSICAL_ADDR	# preferred load addr
 
-#define ZO_INIT_SIZE	(ZO__end - ZO_startup_32 + ZO_z_extract_offset)
+#define ZO_INIT_SIZE	(ZO__end - ZO_startup_32 + ZO_z_min_extract_offset)
 #define VO_INIT_SIZE	(VO__end - VO__text)
 #if ZO_INIT_SIZE > VO_INIT_SIZE
 #define INIT_SIZE ZO_INIT_SIZE
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 8e3d22a1..d2e00bc 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -87,6 +87,7 @@ void common(void) {
 	OFFSET(BP_hardware_subarch, boot_params, hdr.hardware_subarch);
 	OFFSET(BP_version, boot_params, hdr.version);
 	OFFSET(BP_kernel_alignment, boot_params, hdr.kernel_alignment);
+	OFFSET(BP_init_size, boot_params, hdr.init_size);
 	OFFSET(BP_pref_address, boot_params, hdr.pref_address);
 	OFFSET(BP_code32_start, boot_params, hdr.code32_start);
 
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 00bf300..5816920 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -325,6 +325,7 @@ SECTIONS
 		__brk_limit = .;
 	}
 
+	. = ALIGN(PAGE_SIZE);		/* keep VO_INIT_SIZE page aligned */
 	_end = .;
 
         STABS_DEBUG
-- 
1.8.4.5



* [PATCH 03/42] x86, boot: Fix run_size calculation
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
  2015-07-07 20:19 ` [PATCH 01/42] x86, kasl: Remove not needed parameter for choose_kernel_location Yinghai Lu
  2015-07-07 20:19 ` [PATCH 02/42] x86, boot: Move compressed kernel to end of buffer before decompressing Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 22:15   ` Kees Cook
  2015-07-07 20:19 ` [PATCH 04/42] x86, kaslr: Kill not needed and wrong run_size calculation code Yinghai Lu
                   ` (40 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Junjie Mao, Josh Triplett, Matt Fleming,
	Andrew Morton

While looking at the boot code to add memory mapping for kaslr
with 64-bit above-4G support, I found that e6023367d779 ("x86, kaslr: Prevent
.bss from overlaping initrd") and later commits introduced a way to get the
kernel run_size and pass it around.  The run_size calculation was first done
in perl and later changed to shell scripts.

First, the calculation in the shell script is not right:
it uses the .bss offset in the file plus the .bss/.brk section sizes.

   run_size=$(( $offsetA + $sizeA + $sizeB ))

Idx Name          Size      VMA               LMA               File off  Algn
...
 24 .bss          000a1000  ffffffff825e0000  00000000025e0000  019e0000  2**12
                  ALLOC
 25 .brk          00026000  ffffffff82681000  0000000002681000  019e0000  2**0
                  ALLOC

With these sections, run_size will be 27947008.

That includes extra size that is not needed, because:
1. there are holes between the sections in the file, to keep them aligned in the file.
2. the text starts at offset 0x200000 in the ELF file.

  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  ...
  [25] .bss              NOBITS           ffffffff825e0000  019e0000
       00000000000a1000  0000000000000000  WA       0     0     4096
  [26] .brk              NOBITS           ffffffff82681000  019e0000
       0000000000026000  0000000000000000  WA       0     0     1

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000200000 0xffffffff81000000 0x0000000001000000
                 0x00000000013a9000 0x00000000013a9000  R E    200000
  LOAD           0x0000000001600000 0xffffffff82400000 0x0000000002400000
                 0x00000000000ed000 0x00000000000ed000  RW     200000
  LOAD           0x0000000001800000 0x0000000000000000 0x00000000024ed000
                 0x0000000000013698 0x0000000000013698  RW     200000
  LOAD           0x0000000001901000 0xffffffff82501000 0x0000000002501000
                 0x00000000000df000 0x00000000001a6000  RWE    200000
  NOTE           0x0000000000e9d7dc 0xffffffff81c9d7dc 0x0000000001c9d7dc
                 0x0000000000000024 0x0000000000000024         4

 Section to Segment mapping:
  Segment Sections...
   00     .text .notes ..
   01     .data .vvar
   02     .data..percpu
   03     .init.text ... .bss .brk
   04     .notes

During decompress_kernel, parse_elf moves the sections forward to their run-time positions.

   parse_elf: [0x009a000000-0x009b3a8fff] <=== [0x009a200000-0x009b5a8fff]
   parse_elf: [0x009b400000-0x009b4ecfff] <=== [0x009b600000-0x009b6ecfff]
   parse_elf: [0x009b4ed000-0x009b500697] <=== [0x009b800000-0x009b813697]
   parse_elf: [0x009b501000-0x009b5dffff] <=== [0x009b901000-0x009b9dffff]

Second, it is not necessary. As run_size is a simple constant, we don't
need to pass it around, and we already have voffset.h for that.

We can share voffset.h between misc.c and header.S instead of adding
another way to get run_size.
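
The size of the error can be checked with the numbers from the section
dump above. A minimal sketch in C follows; the function names are made up
for illustration. The VO _text and _end values used in the note after the
code are not stated in this mail: they are assumed from the first LOAD
segment's VirtAddr (0xffffffff81000000) and from the .brk address plus its
size (0xffffffff82681000 + 0x26000 = 0xffffffff826a7000).

```c
#include <stdint.h>

/*
 * What calc_run_size.sh computes: file offset of .bss plus the sizes of
 * .bss and .brk.
 */
uint64_t demo_script_run_size(uint64_t bss_file_off, uint64_t bss_size,
			      uint64_t brk_size)
{
	return bss_file_off + bss_size + brk_size;
}

/* What the patch uses instead: VO__end - VO__text from voffset.h. */
uint64_t demo_vo_run_size(uint64_t vo_text, uint64_t vo_end)
{
	return vo_end - vo_text;
}
```

For the dump above, demo_script_run_size(0x19e0000, 0xa1000, 0x26000)
gives 27947008, while demo_vo_run_size(0xffffffff81000000,
0xffffffff826a7000) gives 0x16a7000 (23752704). Under the stated
assumptions the script overestimates by 0x400000 bytes (4 MiB).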

In this patch, we move the voffset.h creation code to boot/compressed/Makefile.

The dependency was:
boot/header.S ==> boot/voffset.h ==> vmlinux
boot/header.S ==> compressed/vmlinux ==> compressed/misc.c
Now it becomes:
boot/header.S ==> compressed/vmlinux ==> compressed/misc.c ==> boot/voffset.h ==> vmlinux

Use a macro in misc.c to replace the passed run_size.
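
For readers who do not want to decode the sed expression in the Makefile
hunk below, here is a rough C equivalent of what the voffset rule does to
one line of nm output. It is an approximation, not a replacement: the
function name is hypothetical, and unlike the sed pattern it accepts any
whitespace between fields and does not insist that the symbol name ends
the line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Turn one nm line such as "ffffffff81000000 T _text" into
 * "#define VO__text _AC(0xffffffff81000000,UL)".
 * Returns 1 if a define was written to out, 0 if the line was skipped.
 */
int demo_voffset_line(const char *nm_line, char *out, size_t out_len)
{
	char addr[32];
	char type;
	char name[64];

	if (sscanf(nm_line, "%31[0-9a-fA-F] %c %63s", addr, &type, name) != 3)
		return 0;
	/* Same symbol type letters as the sed pattern. */
	if (strchr("ABCDGRSTVW", type) == NULL)
		return 0;
	/* Only _text and _end are exported to voffset.h. */
	if (strcmp(name, "_text") != 0 && strcmp(name, "_end") != 0)
		return 0;
	snprintf(out, out_len, "#define VO_%s _AC(0x%s,UL)", name, addr);
	return 1;
}
```

The _AC(...,UL) wrapper is what this patch adds to the generated defines,
so the same header can be included from both header.S and misc.c.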

Fixes: e6023367d779 ("x86, kaslr: Prevent .bss from overlaping initrd")
Cc: Junjie Mao <eternal.n08@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/Makefile            | 11 +----------
 arch/x86/boot/compressed/Makefile | 12 ++++++++++++
 arch/x86/boot/compressed/misc.c   |  3 +++
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index 57bbf2f..4d27e8b 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -77,15 +77,6 @@ $(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
 
 SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
 
-sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|_end\)$$/\#define VO_\2 0x\1/p'
-
-quiet_cmd_voffset = VOFFSET $@
-      cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
-
-targets += voffset.h
-$(obj)/voffset.h: vmlinux FORCE
-	$(call if_changed,voffset)
-
 sed-zoffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|input_data\|_end\|z_.*\)$$/\#define ZO_\2 0x\1/p'
 
 quiet_cmd_zoffset = ZOFFSET $@
@@ -97,7 +88,7 @@ $(obj)/zoffset.h: $(obj)/compressed/vmlinux FORCE
 
 
 AFLAGS_header.o += -I$(obj)
-$(obj)/header.o: $(obj)/voffset.h $(obj)/zoffset.h
+$(obj)/header.o: $(obj)/zoffset.h
 
 LDFLAGS_setup.elf	:= -T
 $(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 0a291cd..d9fee82 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -40,6 +40,18 @@ LDFLAGS_vmlinux := -T
 hostprogs-y	:= mkpiggy
 HOST_EXTRACFLAGS += -I$(srctree)/tools/include
 
+sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
+
+quiet_cmd_voffset = VOFFSET $@
+      cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
+
+targets += ../voffset.h
+
+$(obj)/../voffset.h: vmlinux FORCE
+	$(call if_changed,voffset)
+
+$(obj)/misc.o: $(obj)/../voffset.h
+
 vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
 	$(obj)/string.o $(obj)/cmdline.o \
 	$(obj)/piggy.o $(obj)/cpuflags.o
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index ebf72ce..a88b591 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -11,6 +11,7 @@
 
 #include "misc.h"
 #include "../string.h"
+#include "../voffset.h"
 
 /* WARNING!!
  * This code is compiled with -fPIC and it is relocated dynamically
@@ -393,6 +394,8 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 	lines = real_mode->screen_info.orig_video_lines;
 	cols = real_mode->screen_info.orig_video_cols;
 
+	run_size = VO__end - VO__text;
+
 	console_init();
 	debug_putstr("early console in decompress_kernel\n");
 
-- 
1.8.4.5



* [PATCH 04/42] x86, kaslr: Kill not needed and wrong run_size calculation code.
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (2 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 03/42] x86, boot: Fix run_size calculation Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 22:18   ` Kees Cook
  2015-07-07 20:19 ` [PATCH 05/42] x86, kaslr: rename output_size to output_run_size Yinghai Lu
                   ` (39 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Josh Triplett, Matt Fleming,
	Andrew Morton, Ard Biesheuvel, Junjie Mao

We now use a simple and correct way to get run_size, so remove the code
for the wrong run_size calculation.

Fixes: e6023367d779 ("x86, kaslr: Prevent .bss from overlaping initrd")
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Junjie Mao <eternal.n08@gmail.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/Makefile  |  4 +---
 arch/x86/boot/compressed/head_32.S |  3 +--
 arch/x86/boot/compressed/head_64.S |  3 ---
 arch/x86/boot/compressed/misc.c    |  6 ++----
 arch/x86/boot/compressed/mkpiggy.c |  9 ++------
 arch/x86/tools/calc_run_size.sh    | 42 --------------------------------------
 6 files changed, 6 insertions(+), 61 deletions(-)
 delete mode 100644 arch/x86/tools/calc_run_size.sh

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index d9fee82..50daea7 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -104,10 +104,8 @@ suffix-$(CONFIG_KERNEL_XZ)	:= xz
 suffix-$(CONFIG_KERNEL_LZO) 	:= lzo
 suffix-$(CONFIG_KERNEL_LZ4) 	:= lz4
 
-RUN_SIZE = $(shell $(OBJDUMP) -h vmlinux | \
-	     $(CONFIG_SHELL) $(srctree)/arch/x86/tools/calc_run_size.sh)
 quiet_cmd_mkpiggy = MKPIGGY $@
-      cmd_mkpiggy = $(obj)/mkpiggy $< $(RUN_SIZE) > $@ || ( rm -f $@ ; false )
+      cmd_mkpiggy = $(obj)/mkpiggy $< > $@ || ( rm -f $@ ; false )
 
 targets += piggy.S
 $(obj)/piggy.S: $(obj)/vmlinux.bin.$(suffix-y) $(obj)/mkpiggy FORCE
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 0c140f9..122b32f 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -210,7 +210,6 @@ relocated:
  * Do the decompression, and jump to the new kernel..
  */
 				/* push arguments for decompress_kernel: */
-	pushl	$z_run_size	/* size of kernel with .bss and .brk */
 	pushl	$z_output_len	/* decompressed length, end of relocs */
 
 	movl    BP_init_size(%esi), %eax
@@ -226,7 +225,7 @@ relocated:
 	pushl	%eax		/* heap area */
 	pushl	%esi		/* real mode pointer */
 	call	decompress_kernel /* returns kernel location in %eax */
-	addl	$28, %esp
+	addl	$24, %esp
 
 /*
  * Jump to the decompressed kernel.
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 67dd8d3..3691451 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -407,8 +407,6 @@ relocated:
  * Do the decompression, and jump to the new kernel..
  */
 	pushq	%rsi			/* Save the real mode argument */
-	movq	$z_run_size, %r9	/* size of kernel with .bss and .brk */
-	pushq	%r9
 	movq	%rsi, %rdi		/* real mode address */
 	leaq	boot_heap(%rip), %rsi	/* malloc area for uncompression */
 	leaq	input_data(%rip), %rdx  /* input_data */
@@ -416,7 +414,6 @@ relocated:
 	movq	%rbp, %r8		/* output target address */
 	movq	$z_output_len, %r9	/* decompressed length, end of relocs */
 	call	decompress_kernel	/* returns kernel location in %rax */
-	popq	%r9
 	popq	%rsi
 
 /*
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index a88b591..96201aa 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -371,9 +371,9 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 				  unsigned char *input_data,
 				  unsigned long input_len,
 				  unsigned char *output,
-				  unsigned long output_len,
-				  unsigned long run_size)
+				  unsigned long output_len)
 {
+	unsigned long run_size = VO__end - VO__text;
 	unsigned char *output_orig = output;
 
 	real_mode = rmode;
@@ -394,8 +394,6 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 	lines = real_mode->screen_info.orig_video_lines;
 	cols = real_mode->screen_info.orig_video_cols;
 
-	run_size = VO__end - VO__text;
-
 	console_init();
 	debug_putstr("early console in decompress_kernel\n");
 
diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
index 5faad09..c03b009 100644
--- a/arch/x86/boot/compressed/mkpiggy.c
+++ b/arch/x86/boot/compressed/mkpiggy.c
@@ -36,13 +36,11 @@ int main(int argc, char *argv[])
 	uint32_t olen;
 	long ilen;
 	unsigned long offs;
-	unsigned long run_size;
 	FILE *f = NULL;
 	int retval = 1;
 
-	if (argc < 3) {
-		fprintf(stderr, "Usage: %s compressed_file run_size\n",
-				argv[0]);
+	if (argc < 2) {
+		fprintf(stderr, "Usage: %s compressed_file\n", argv[0]);
 		goto bail;
 	}
 
@@ -76,7 +74,6 @@ int main(int argc, char *argv[])
 	offs += olen >> 12;	/* Add 8 bytes for each 32K block */
 	offs += 64*1024 + 128;	/* Add 64K + 128 bytes slack */
 	offs = (offs+4095) & ~4095; /* Round to a 4K boundary */
-	run_size = atoi(argv[2]);
 
 	printf(".section \".rodata..compressed\",\"a\",@progbits\n");
 	printf(".globl z_input_len\n");
@@ -85,8 +82,6 @@ int main(int argc, char *argv[])
 	printf("z_output_len = %lu\n", (unsigned long)olen);
 	printf(".globl z_min_extract_offset\n");
 	printf("z_min_extract_offset = 0x%lx\n", offs);
-	printf(".globl z_run_size\n");
-	printf("z_run_size = %lu\n", run_size);
 
 	printf(".globl input_data, input_data_end\n");
 	printf("input_data:\n");
diff --git a/arch/x86/tools/calc_run_size.sh b/arch/x86/tools/calc_run_size.sh
deleted file mode 100644
index 1a4c17b..0000000
--- a/arch/x86/tools/calc_run_size.sh
+++ /dev/null
@@ -1,42 +0,0 @@
-#!/bin/sh
-#
-# Calculate the amount of space needed to run the kernel, including room for
-# the .bss and .brk sections.
-#
-# Usage:
-# objdump -h a.out | sh calc_run_size.sh
-
-NUM='\([0-9a-fA-F]*[ \t]*\)'
-OUT=$(sed -n 's/^[ \t0-9]*.b[sr][sk][ \t]*'"$NUM$NUM$NUM$NUM"'.*/\1\4/p')
-if [ -z "$OUT" ] ; then
-	echo "Never found .bss or .brk file offset" >&2
-	exit 1
-fi
-
-OUT=$(echo ${OUT# })
-sizeA=$(printf "%d" 0x${OUT%% *})
-OUT=${OUT#* }
-offsetA=$(printf "%d" 0x${OUT%% *})
-OUT=${OUT#* }
-sizeB=$(printf "%d" 0x${OUT%% *})
-OUT=${OUT#* }
-offsetB=$(printf "%d" 0x${OUT%% *})
-
-run_size=$(( $offsetA + $sizeA + $sizeB ))
-
-# BFD linker shows the same file offset in ELF.
-if [ "$offsetA" -ne "$offsetB" ] ; then
-	# Gold linker shows them as consecutive.
-	endB=$(( $offsetB + $sizeB ))
-	if [ "$endB" != "$run_size" ] ; then
-		printf "sizeA: 0x%x\n" $sizeA >&2
-		printf "offsetA: 0x%x\n" $offsetA >&2
-		printf "sizeB: 0x%x\n" $sizeB >&2
-		printf "offsetB: 0x%x\n" $offsetB >&2
-		echo ".bss and .brk are non-contiguous" >&2
-		exit 1
-	fi
-fi
-
-printf "%d\n" $run_size
-exit 0
-- 
1.8.4.5



* [PATCH 05/42] x86, kaslr: rename output_size to output_run_size
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (3 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 04/42] x86, kaslr: Kill not needed and wrong run_size calculation code Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 20:19 ` [PATCH 06/42] x86, kaslr: Consolidate mem_avoid array filling Yinghai Lu
                   ` (38 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

The parameter is named output_size, but the value we actually pass in is
max(output_len, run_size).

Rename it to output_run_size to make that less confusing.
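
For illustration only, a minimal sketch of the value this parameter carries.
The helper name pick_output_run_size is hypothetical; the patch itself
computes the value inline in decompress_kernel().

```c
/*
 * Sketch: the "output run size" is the larger of the decompressed
 * payload size (output_len) and the size the kernel needs to run
 * (run_size). pick_output_run_size is a made-up name for this example.
 */
static unsigned long pick_output_run_size(unsigned long output_len,
					  unsigned long run_size)
{
	return output_len > run_size ? output_len : run_size;
}
```

Whichever of the two sizes is larger determines how big a memory hole kaslr
has to find.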

Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/aslr.c | 10 +++++-----
 arch/x86/boot/compressed/misc.c |  6 ++++--
 arch/x86/boot/compressed/misc.h |  4 ++--
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 71520c4..0e1dac0 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -135,7 +135,7 @@ static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
 }
 
 static void mem_avoid_init(unsigned long input, unsigned long input_size,
-			   unsigned long output, unsigned long output_size)
+			   unsigned long output, unsigned long output_run_size)
 {
 	u64 initrd_start, initrd_size;
 	u64 cmd_line, cmd_line_size;
@@ -146,7 +146,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	 * Avoid the region that is unsafe to overlap during
 	 * decompression (see calculations at top of misc.c).
 	 */
-	unsafe_len = (output_size >> 12) + 32768 + 18;
+	unsafe_len = (output_run_size >> 12) + 32768 + 18;
 	unsafe = (unsigned long)input + input_size - unsafe_len;
 	mem_avoid[0].start = unsafe;
 	mem_avoid[0].size = unsafe_len;
@@ -298,7 +298,7 @@ static unsigned long find_random_addr(unsigned long minimum,
 unsigned char *choose_kernel_location(unsigned char *input,
 				      unsigned long input_size,
 				      unsigned char *output,
-				      unsigned long output_size)
+				      unsigned long output_run_size)
 {
 	unsigned long choice = (unsigned long)output;
 	unsigned long random;
@@ -319,10 +319,10 @@ unsigned char *choose_kernel_location(unsigned char *input,
 
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init((unsigned long)input, input_size,
-		       (unsigned long)output, output_size);
+		       (unsigned long)output, output_run_size);
 
 	/* Walk e820 and find a random address. */
-	random = find_random_addr(choice, output_size);
+	random = find_random_addr(choice, output_run_size);
 	if (!random) {
 		debug_putstr("KASLR could not find suitable E820 region...\n");
 		goto out;
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 96201aa..1c03098 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -375,6 +375,7 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 {
 	unsigned long run_size = VO__end - VO__text;
 	unsigned char *output_orig = output;
+	unsigned long output_run_size;
 
 	real_mode = rmode;
 
@@ -400,14 +401,15 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 	free_mem_ptr     = heap;	/* Heap */
 	free_mem_end_ptr = heap + BOOT_HEAP_SIZE;
 
+	output_run_size = output_len > run_size ? output_len : run_size;
+
 	/*
 	 * The memory hole needed for the kernel is the larger of either
 	 * the entire decompressed kernel plus relocation table, or the
 	 * entire decompressed kernel plus .bss and .brk sections.
 	 */
 	output = choose_kernel_location(input_data, input_len, output,
-					output_len > run_size ? output_len
-							      : run_size);
+					output_run_size);
 
 	/* Validate memory location choices. */
 	if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 8c96cc5..40b4546 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -59,7 +59,7 @@ int cmdline_find_option_bool(const char *option);
 unsigned char *choose_kernel_location(unsigned char *input,
 				      unsigned long input_size,
 				      unsigned char *output,
-				      unsigned long output_size);
+				      unsigned long output_run_size);
 /* cpuflags.c */
 bool has_cpuflag(int flag);
 #else
@@ -67,7 +67,7 @@ static inline
 unsigned char *choose_kernel_location(unsigned char *input,
 				      unsigned long input_size,
 				      unsigned char *output,
-				      unsigned long output_size)
+				      unsigned long output_run_size)
 {
 	return output;
 }
-- 
1.8.4.5



* [PATCH 06/42] x86, kaslr: Consolidate mem_avoid array filling
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (4 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 05/42] x86, kaslr: rename output_size to output_run_size Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 22:36   ` Kees Cook
  2015-07-07 20:19 ` [PATCH 07/42] x86, boot: Move z_extract_offset calculation to header.S Yinghai Lu
                   ` (37 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

We are going to support kaslr above 4G on 64bit, so the new random output
buffer could be anywhere.

The mem_avoid array is used by kaslr when searching for the new output buffer.
Current code only tracks ranges that are after output+output_run_size.

We need to track all ranges instead of just those after
output+output_run_size.

In the current code the first entry is the unsafe_len bytes that end at
input+input_size, with unsafe_len derived from output_run_size. The other
entries are for initrd, cmdline, and the heap/stack that ZO runs with.

First, work out what the first entry in the mem_avoid array should be.

Now that ZO always sits at the end of the buffer, we can find out where the
ZO text, data, bss etc. are.
                                                output+run_size
                                                      |
0   output               input      input+input_size  |     output+init_size
|     |                    |               |          |          |
|-----|-----------------|--|---------------|------|---|----------|
                        |                         |
               output+init_size-ZO_SIZE   output+output_size

[output, output+init_size) is the buffer for decompression.

[output, output+run_size) is the VO run size.
[output, output+output_size) is VO (vmlinux after objcopy) plus relocs.

[output+init_size-ZO_SIZE, output+init_size) is the copied ZO.
[input, input+input_size) is the copied compressed VO (vmlinux after objcopy)
plus relocs, not the ZO.

[input+input_size, output+init_size) is [_text, _end) for ZO. That can be the
first range in mem_avoid.

That new first entry already includes the heap and stack that ZO runs with,
so we don't need to put them into the mem_avoid array separately.

We also need to put [input, input+input_size) into the mem_avoid array. It is
adjacent to the first range, so merge the two.

Finally, we need to put boot_params into mem_avoid too, as a 64bit bootloader
could put it anywhere.

After those changes, the mem_avoid array holds all the ranges that need to be
avoided.
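
As a rough sketch of the merged first entry and how a candidate range is
tested against it: the helper names first_avoid_entry and vec_overlaps are
hypothetical, and the overlap test is only assumed to follow the same idea as
mem_overlaps() in aslr.c.

```c
#include <stdbool.h>

struct mem_vector {
	unsigned long start;
	unsigned long size;
};

/* True when the two half-open ranges share at least one byte. */
static bool vec_overlaps(const struct mem_vector *one,
			 const struct mem_vector *two)
{
	/* "one" ends before "two" starts. */
	if (one->start + one->size <= two->start)
		return false;
	/* "one" starts after "two" ends. */
	if (one->start >= two->start + two->size)
		return false;
	return true;
}

/*
 * Merged first entry: the compressed payload [input, input+input_size)
 * plus ZO [_text, _end), i.e. everything in [input, output+init_size).
 */
static struct mem_vector first_avoid_entry(unsigned long input,
					   unsigned long output,
					   unsigned long init_size)
{
	struct mem_vector v = {
		.start = input,
		.size = (output + init_size) - input,
	};

	return v;
}
```

A candidate output buffer is rejected when vec_overlaps() reports an overlap
with this entry or with any of the initrd, cmdline or boot_params entries.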

Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/aslr.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 0e1dac0..d753fb3 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -109,7 +109,7 @@ struct mem_vector {
 	unsigned long size;
 };
 
-#define MEM_AVOID_MAX 5
+#define MEM_AVOID_MAX 4
 static struct mem_vector mem_avoid[MEM_AVOID_MAX];
 
 static bool mem_contains(struct mem_vector *region, struct mem_vector *item)
@@ -135,21 +135,22 @@ static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
 }
 
 static void mem_avoid_init(unsigned long input, unsigned long input_size,
-			   unsigned long output, unsigned long output_run_size)
+			   unsigned long output)
 {
+	unsigned long init_size = real_mode->hdr.init_size;
 	u64 initrd_start, initrd_size;
 	u64 cmd_line, cmd_line_size;
-	unsigned long unsafe, unsafe_len;
 	char *ptr;
 
 	/*
 	 * Avoid the region that is unsafe to overlap during
-	 * decompression (see calculations at top of misc.c).
+	 * decompression.
+	 * As we already move ZO (arch/x86/boot/compressed/vmlinux)
+	 * to the end of buffer, [input+input_size, output+init_size)
+	 * has [_text, _end) for ZO.
 	 */
-	unsafe_len = (output_run_size >> 12) + 32768 + 18;
-	unsafe = (unsigned long)input + input_size - unsafe_len;
-	mem_avoid[0].start = unsafe;
-	mem_avoid[0].size = unsafe_len;
+	mem_avoid[0].start = input;
+	mem_avoid[0].size = (output + init_size) - input;
 
 	/* Avoid initrd. */
 	initrd_start  = (u64)real_mode->ext_ramdisk_image << 32;
@@ -169,13 +170,9 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	mem_avoid[2].start = cmd_line;
 	mem_avoid[2].size = cmd_line_size;
 
-	/* Avoid heap memory. */
-	mem_avoid[3].start = (unsigned long)free_mem_ptr;
-	mem_avoid[3].size = BOOT_HEAP_SIZE;
-
-	/* Avoid stack memory. */
-	mem_avoid[4].start = (unsigned long)free_mem_end_ptr;
-	mem_avoid[4].size = BOOT_STACK_SIZE;
+	/* Avoid params */
+	mem_avoid[3].start = (unsigned long)real_mode;
+	mem_avoid[3].size = sizeof(*real_mode);
 }
 
 /* Does this memory vector overlap a known avoided area? */
@@ -319,7 +316,7 @@ unsigned char *choose_kernel_location(unsigned char *input,
 
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init((unsigned long)input, input_size,
-		       (unsigned long)output, output_run_size);
+		       (unsigned long)output);
 
 	/* Walk e820 and find a random address. */
 	random = find_random_addr(choice, output_run_size);
-- 
1.8.4.5



* [PATCH 07/42] x86, boot: Move z_extract_offset calculation to header.S
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (5 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 06/42] x86, kaslr: Consolidate mem_avoid array filling Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 20:19 ` [PATCH 08/42] x86, kaslr: Get correct max_addr for relocs pointer Yinghai Lu
                   ` (36 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

The old extract_offset calculation is done without knowing the decompressor
size, so it has to guess one big size.

We can move the calculation to header.S, where we have the exact decompressor
size.

This patch saves 8 pages of init_size.

before patch:
kernel: [13e000000,13fa1dfff]
  input: [0x13f32d3b4-0x13fa01cc7], output: [0x13e000000-0x13f9ef81f], heap: [0x13fa0b680-0x13fa1367f]

after patch:
kernel: [13e000000,13fa15fff]
  input: [0x13f3253b4-0x13f9f9cc7], output: [0x13e000000-0x13f9ef81f], heap: [0x13fa03680-0x13fa0b67f]
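
The arithmetic that moves into header.S can be sketched in C as below. This
only mirrors the preprocessor macros added by this patch; the function names
and DEMO_PAGE_SIZE are hypothetical, and head_size stands for
ZO__ehead - ZO_startup_32.

```c
#define DEMO_PAGE_SIZE 4096UL

/* Round up to the next 4K boundary. */
static unsigned long page_align_up(unsigned long x)
{
	return (x + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/* extra_bytes = (uncompressed_size >> 12) + 32768 + 18, see misc.c */
static unsigned long zo_extra_bytes(unsigned long output_len)
{
	return (output_len >> 12) + 32768 + 18;
}

static unsigned long zo_extract_offset(unsigned long output_len,
				       unsigned long input_len)
{
	unsigned long extra = zo_extra_bytes(output_len);

	if (output_len > input_len)
		return output_len + extra - input_len;
	return extra;
}

/*
 * The offset must not be smaller than the ZO head section, otherwise
 * the head code would overwrite itself while moving ZO to the end of
 * the buffer.
 */
static unsigned long zo_min_extract_offset(unsigned long output_len,
					   unsigned long input_len,
					   unsigned long head_size)
{
	unsigned long offs = zo_extract_offset(output_len, input_len);

	if (head_size > offs)
		return page_align_up(head_size);
	return page_align_up(offs);
}
```

With made-up sizes of 32M uncompressed, 7M compressed and a 1K head section,
zo_min_extract_offset() returns 0x190b000.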

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/Makefile             |  2 +-
 arch/x86/boot/compressed/misc.c    |  5 +----
 arch/x86/boot/compressed/mkpiggy.c | 16 +---------------
 arch/x86/boot/header.S             | 29 +++++++++++++++++++++++++++++
 4 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index 4d27e8b..e7196cf 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -77,7 +77,7 @@ $(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
 
 SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
 
-sed-zoffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|input_data\|_end\|z_.*\)$$/\#define ZO_\2 0x\1/p'
+sed-zoffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|input_data\|_end\|_ehead\|_text\|z_.*\)$$/\#define ZO_\2 0x\1/p'
 
 quiet_cmd_zoffset = ZOFFSET $@
       cmd_zoffset = $(NM) $< | sed -n $(sed-zoffset) > $@
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 1c03098..db97bdf 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -84,13 +84,10 @@
  * To avoid problems with the compressed data's meta information an extra 18
  * bytes are needed.  Leading to the formula:
  *
- * extra_bytes = (uncompressed_size >> 12) + 32768 + 18 + decompressor_size.
+ * extra_bytes = (uncompressed_size >> 12) + 32768 + 18.
  *
  * Adding 8 bytes per 32K is a bit excessive but much easier to calculate.
  * Adding 32768 instead of 32767 just makes for round numbers.
- * Adding the decompressor_size is necessary as it musht live after all
- * of the data as well.  Last I measured the decompressor is about 14K.
- * 10K of actual data and 4K of bss.
  *
  */
 
diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
index c03b009..c5148642 100644
--- a/arch/x86/boot/compressed/mkpiggy.c
+++ b/arch/x86/boot/compressed/mkpiggy.c
@@ -21,8 +21,7 @@
  * ----------------------------------------------------------------------- */
 
 /*
- * Compute the desired load offset from a compressed program; outputs
- * a small assembly wrapper with the appropriate symbols defined.
+ * outputs a small assembly wrapper with the appropriate symbols defined.
  */
 
 #include <stdlib.h>
@@ -35,7 +34,6 @@ int main(int argc, char *argv[])
 {
 	uint32_t olen;
 	long ilen;
-	unsigned long offs;
 	FILE *f = NULL;
 	int retval = 1;
 
@@ -65,23 +63,11 @@ int main(int argc, char *argv[])
 	ilen = ftell(f);
 	olen = get_unaligned_le32(&olen);
 
-	/*
-	 * Now we have the input (compressed) and output (uncompressed)
-	 * sizes, compute the necessary decompression offset...
-	 */
-
-	offs = (olen > ilen) ? olen - ilen : 0;
-	offs += olen >> 12;	/* Add 8 bytes for each 32K block */
-	offs += 64*1024 + 128;	/* Add 64K + 128 bytes slack */
-	offs = (offs+4095) & ~4095; /* Round to a 4K boundary */
-
 	printf(".section \".rodata..compressed\",\"a\",@progbits\n");
 	printf(".globl z_input_len\n");
 	printf("z_input_len = %lu\n", ilen);
 	printf(".globl z_output_len\n");
 	printf("z_output_len = %lu\n", (unsigned long)olen);
-	printf(".globl z_min_extract_offset\n");
-	printf("z_min_extract_offset = 0x%lx\n", offs);
 
 	printf(".globl input_data, input_data_end\n");
 	printf("input_data:\n");
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index 9bfab22..99204e5 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -440,7 +440,36 @@ setup_data:		.quad 0			# 64-bit physical pointer to
 
 pref_address:		.quad LOAD_PHYSICAL_ADDR	# preferred load addr
 
+/* check arch/x86/boot/compressed/misc.c for the formula about extra_bytes.  */
+#define ZO_z_extra_bytes ((ZO_z_output_len >> 12) + 32768 + 18)
+#if ZO_z_output_len > ZO_z_input_len
+#define ZO_z_extract_offset (ZO_z_output_len + ZO_z_extra_bytes - ZO_z_input_len)
+#else
+#define ZO_z_extract_offset ZO_z_extra_bytes
+#endif
+
+/*
+ * extract_offset has to be bigger than ZO head section.
+ * otherwise during head code running to move ZO to end of buffer,
+ * will overwrite head code itself.
+ */
+#if (ZO__ehead - ZO_startup_32) > ZO_z_extract_offset
+#define ZO_z_min_extract_offset ((ZO__ehead - ZO_startup_32 + 4095) & ~4095)
+#else
+#define ZO_z_min_extract_offset ((ZO_z_extract_offset + 4095) & ~4095)
+#endif
+
 #define ZO_INIT_SIZE	(ZO__end - ZO_startup_32 + ZO_z_min_extract_offset)
+
+/*
+ * ZO__end - ZO_startup_32 is (ZO__ehead - ZO_startup_32) + ZO_z_input_len + (ZO__end - ZO__text)
+ * ZO_z_min_extract_offset >= (ZO_z_output_len + ZO_z_extra_bytes - ZO_z_input_len)
+ * then ZO_INIT_SIZE >= (ZO__ehead - ZO_startup_32) + ZO_z_input_len + (ZO__end - ZO__text) + (ZO_z_output_len + ZO_z_extra_bytes - ZO_z_input_len)
+ * so (ZO_INIT_SIZE - ZO_z_output_len) > (ZO__end - ZO__text)
+ * That means during decompressor running, output would not
+ * overwrite the decompressor itself.
+ */
+
 #define VO_INIT_SIZE	(VO__end - VO__text)
 #if ZO_INIT_SIZE > VO_INIT_SIZE
 #define INIT_SIZE ZO_INIT_SIZE
-- 
1.8.4.5



* [PATCH 08/42] x86, kaslr: Get correct max_addr for relocs pointer
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (6 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 07/42] x86, boot: Move z_extract_offset calculation to header.S Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 22:40   ` Kees Cook
  2015-07-07 20:19 ` [PATCH 09/42] x86, boot: Split kernel_ident_mapping_init to another file Yinghai Lu
                   ` (35 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

The kaslr relocation handling does boundary checking on each pointer.

The current code uses output_len for that, which is the VO (vmlinux after
objcopy) file size plus the vmlinux.relocs file size.

That is not right, as we should use the loaded addresses that the kernel
runs from.

By that time parse_elf has already moved the sections according to the ELF
headers.

The valid range should be the loaded physical addresses of VO
[_text, __bss_start).

In this patch, export __bss_start to voffset.h and use it to compute
max_addr.
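
A minimal sketch of the resulting bounds check follows. The helper name
reloc_target_ok is hypothetical, and the inclusive comparison against
max_addr is assumed to match the existing check in handle_relocations().

```c
#include <stdbool.h>

/*
 * Sketch: a relocation target is accepted only when it falls inside
 * the loaded image, i.e. within (vo_bss_start - vo_text) bytes of the
 * output address. vo_text and vo_bss_start stand for VO__text and
 * VO___bss_start from voffset.h.
 */
static bool reloc_target_ok(unsigned long ptr, unsigned long output,
			    unsigned long vo_text,
			    unsigned long vo_bss_start)
{
	unsigned long min_addr = output;
	unsigned long max_addr = min_addr + (vo_bss_start - vo_text);

	return ptr >= min_addr && ptr <= max_addr;
}
```

The bound now depends only on link-time symbols of VO, not on the size of the
file that was decompressed.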

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/Makefile | 2 +-
 arch/x86/boot/compressed/misc.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 50daea7..e12a93c 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -40,7 +40,7 @@ LDFLAGS_vmlinux := -T
 hostprogs-y	:= mkpiggy
 HOST_EXTRACFLAGS += -I$(srctree)/tools/include
 
-sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
+sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
 
 quiet_cmd_voffset = VOFFSET $@
       cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index db97bdf..8fb74ba 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -234,7 +234,7 @@ static void handle_relocations(void *output, unsigned long output_len)
 	int *reloc;
 	unsigned long delta, map, ptr;
 	unsigned long min_addr = (unsigned long)output;
-	unsigned long max_addr = min_addr + output_len;
+	unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
 
 	/*
 	 * Calculate the delta between where vmlinux was linked to load
-- 
1.8.4.5



* [PATCH 09/42] x86, boot: Split kernel_ident_mapping_init to another file
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (7 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 08/42] x86, kaslr: Get correct max_addr for relocs pointer Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 20:19 ` [PATCH 10/42] x86, 64bit: Set ident_mapping for kaslr Yinghai Lu
                   ` (34 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

We need to include kernel_ident_mapping_init in the boot::decompress_kernel
stage to set up the new ident mapping, so split it out into its own file.

Also guard the __pa/__va macro definitions with #ifndef, as we need to
override them in the boot::decompress_kernel stage.
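
The override pattern can be sketched as below, using stand-in names so the
example compiles on its own: demo_pa plays the role of __pa,
DEMO_PAGE_OFFSET is a made-up constant, and demo_table_phys is a hypothetical
user of the macro.

```c
/*
 * Stand-in for the boot-stage file: the decompressor runs on an
 * identity mapping, so a "physical" address is just the pointer value.
 * This definition comes first, before the shared header is included.
 */
#define demo_pa(x)	((unsigned long long)(x))

/*
 * Stand-in for the shared header: the default definition is only used
 * when nothing has been defined earlier.
 */
#define DEMO_PAGE_OFFSET 0xffff880000000000ULL
#ifndef demo_pa
#define demo_pa(x)	((unsigned long long)(x) - DEMO_PAGE_OFFSET)
#endif

/* Shared code picks up whichever definition is in effect. */
static unsigned long long demo_table_phys(unsigned long long table_addr)
{
	return demo_pa(table_addr);
}
```

Because the boot-stage definition is seen first, the shared code compiles
against the identity version without any change to its source.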

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/page.h |  5 +++
 arch/x86/mm/ident_map.c     | 74 +++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/mm/init_64.c       | 74 +--------------------------------------------
 3 files changed, 80 insertions(+), 73 deletions(-)
 create mode 100644 arch/x86/mm/ident_map.c

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index 802dde3..cf8f619 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -37,7 +37,10 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
 	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
+#ifndef __pa
 #define __pa(x)		__phys_addr((unsigned long)(x))
+#endif
+
 #define __pa_nodebug(x)	__phys_addr_nodebug((unsigned long)(x))
 /* __pa_symbol should be used for C visible symbols.
    This seems to be the official gcc blessed way to do such arithmetic. */
@@ -51,7 +54,9 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
 #define __pa_symbol(x) \
 	__phys_addr_symbol(__phys_reloc_hide((unsigned long)(x)))
 
+#ifndef __va
 #define __va(x)			((void *)((unsigned long)(x)+PAGE_OFFSET))
+#endif
 
 #define __boot_va(x)		__va(x)
 #define __boot_pa(x)		__pa(x)
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
new file mode 100644
index 0000000..751ca92
--- /dev/null
+++ b/arch/x86/mm/ident_map.c
@@ -0,0 +1,74 @@
+
+static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page,
+			   unsigned long addr, unsigned long end)
+{
+	addr &= PMD_MASK;
+	for (; addr < end; addr += PMD_SIZE) {
+		pmd_t *pmd = pmd_page + pmd_index(addr);
+
+		if (!pmd_present(*pmd))
+			set_pmd(pmd, __pmd(addr | pmd_flag));
+	}
+}
+static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
+			  unsigned long addr, unsigned long end)
+{
+	unsigned long next;
+
+	for (; addr < end; addr = next) {
+		pud_t *pud = pud_page + pud_index(addr);
+		pmd_t *pmd;
+
+		next = (addr & PUD_MASK) + PUD_SIZE;
+		if (next > end)
+			next = end;
+
+		if (pud_present(*pud)) {
+			pmd = pmd_offset(pud, 0);
+			ident_pmd_init(info->pmd_flag, pmd, addr, next);
+			continue;
+		}
+		pmd = (pmd_t *)info->alloc_pgt_page(info->context);
+		if (!pmd)
+			return -ENOMEM;
+		ident_pmd_init(info->pmd_flag, pmd, addr, next);
+		set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
+	}
+
+	return 0;
+}
+
+int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
+			      unsigned long addr, unsigned long end)
+{
+	unsigned long next;
+	int result;
+	int off = info->kernel_mapping ? pgd_index(__PAGE_OFFSET) : 0;
+
+	for (; addr < end; addr = next) {
+		pgd_t *pgd = pgd_page + pgd_index(addr) + off;
+		pud_t *pud;
+
+		next = (addr & PGDIR_MASK) + PGDIR_SIZE;
+		if (next > end)
+			next = end;
+
+		if (pgd_present(*pgd)) {
+			pud = pud_offset(pgd, 0);
+			result = ident_pud_init(info, pud, addr, next);
+			if (result)
+				return result;
+			continue;
+		}
+
+		pud = (pud_t *)info->alloc_pgt_page(info->context);
+		if (!pud)
+			return -ENOMEM;
+		result = ident_pud_init(info, pud, addr, next);
+		if (result)
+			return result;
+		set_pgd(pgd, __pgd(__pa(pud) | _KERNPG_TABLE));
+	}
+
+	return 0;
+}
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 3fba623..6f457a4 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -56,79 +56,7 @@
 
 #include "mm_internal.h"
 
-static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page,
-			   unsigned long addr, unsigned long end)
-{
-	addr &= PMD_MASK;
-	for (; addr < end; addr += PMD_SIZE) {
-		pmd_t *pmd = pmd_page + pmd_index(addr);
-
-		if (!pmd_present(*pmd))
-			set_pmd(pmd, __pmd(addr | pmd_flag));
-	}
-}
-static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
-			  unsigned long addr, unsigned long end)
-{
-	unsigned long next;
-
-	for (; addr < end; addr = next) {
-		pud_t *pud = pud_page + pud_index(addr);
-		pmd_t *pmd;
-
-		next = (addr & PUD_MASK) + PUD_SIZE;
-		if (next > end)
-			next = end;
-
-		if (pud_present(*pud)) {
-			pmd = pmd_offset(pud, 0);
-			ident_pmd_init(info->pmd_flag, pmd, addr, next);
-			continue;
-		}
-		pmd = (pmd_t *)info->alloc_pgt_page(info->context);
-		if (!pmd)
-			return -ENOMEM;
-		ident_pmd_init(info->pmd_flag, pmd, addr, next);
-		set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
-	}
-
-	return 0;
-}
-
-int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
-			      unsigned long addr, unsigned long end)
-{
-	unsigned long next;
-	int result;
-	int off = info->kernel_mapping ? pgd_index(__PAGE_OFFSET) : 0;
-
-	for (; addr < end; addr = next) {
-		pgd_t *pgd = pgd_page + pgd_index(addr) + off;
-		pud_t *pud;
-
-		next = (addr & PGDIR_MASK) + PGDIR_SIZE;
-		if (next > end)
-			next = end;
-
-		if (pgd_present(*pgd)) {
-			pud = pud_offset(pgd, 0);
-			result = ident_pud_init(info, pud, addr, next);
-			if (result)
-				return result;
-			continue;
-		}
-
-		pud = (pud_t *)info->alloc_pgt_page(info->context);
-		if (!pud)
-			return -ENOMEM;
-		result = ident_pud_init(info, pud, addr, next);
-		if (result)
-			return result;
-		set_pgd(pgd, __pgd(__pa(pud) | _KERNPG_TABLE));
-	}
-
-	return 0;
-}
+#include "ident_map.c"
 
 /*
  * NOTE: pagetable_init alloc all the fixmap pagetables contiguous on the
-- 
1.8.4.5



* [PATCH 10/42] x86, 64bit: Set ident_mapping for kaslr
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (8 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 09/42] x86, boot: Split kernel_ident_mapping_init to another file Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 20:19 ` [PATCH 11/42] x86, boot: Add checking for memcpy Yinghai Lu
                   ` (33 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Jiri Kosina, Matt Fleming

Current aslr only supports randomization within a nearby range, and the new
range still uses the old mapping. It also does not support a new range above
4G.

We need an ident mapping for the new range before we can decompress to the
new output and later run from it.

This patch adds ident mapping for all the ranges that need it.

First, to let aslr put the random VO above 4G, we must set up an ident
mapping for the new range when we come in via the startup_32 path.

Second, when booting from a 64bit bootloader, the bootloader sets up the
ident mapping and boots via ZO (arch/x86/boot/compressed/vmlinux)
startup_64.
The pages used for that pagetable need to be avoided when we select the new
random VO (vmlinux) base. Otherwise the decompressor would overwrite them
while decompressing.
One way would be to walk the pagetable and find every page used by it for
each mem_avoid check, but that needs extra code and may need a bigger
mem_avoid array to hold them.
The other way is to create a new ident mapping instead. The pages for that
pagetable come from the _pgtable section of ZO, and they are already in the
mem_avoid array. This way we can reuse the ident mapping code.

The _pgtable area is shared between the 32bit and 64bit paths to reduce
init_size, as ZO _rodata to _end now contributes to init_size.

We need to increase the pgt buffer size.
When booting via startup_64, we need to cover the old VO, params, cmdline
and the new VO. In the extreme case each of them crosses a 512G boundary,
which needs (2+2)*4 pages with 2M mappings. We need 2 more for the first 2M
for vga ram, plus one for level4. The total is 19 pages.
When booting via startup_32, aslr could move the new VO above 4G, so we need
an extra ident mapping for the new VO. The pgt buffer then starts 6 pages
into _pgtable. That needs at most (2+2) pages, when the VO crosses a 512G
boundary.
So 19 pages are enough for both paths.
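
The page budget above can be written out as a small sketch. All names are
hypothetical, and reading "(2+2)" as two level3 pages plus two level2 pages
per range is an assumption made for this example.

```c
enum {
	DEMO_RANGES		= 4, /* old VO, params, cmdline, new VO */
	DEMO_LEVEL3_PER_RANGE	= 2, /* assumed: range crosses a 512G boundary */
	DEMO_LEVEL2_PER_RANGE	= 2, /* assumed: one level2 page per level3 page */
	DEMO_VGA_PAGES		= 2, /* first 2M for vga ram */
	DEMO_LEVEL4_PAGES	= 1,
	DEMO_STARTUP32_PAGES	= 6, /* pages already used by startup_32 */
};

/* Worst case when booting via startup_64. */
static unsigned int demo_pgt_pages_64(void)
{
	return (DEMO_LEVEL3_PER_RANGE + DEMO_LEVEL2_PER_RANGE) * DEMO_RANGES
		+ DEMO_VGA_PAGES + DEMO_LEVEL4_PAGES;
}

/* Worst case when booting via startup_32: old tables plus the new VO. */
static unsigned int demo_pgt_pages_32(void)
{
	return DEMO_STARTUP32_PAGES
		+ DEMO_LEVEL3_PER_RANGE + DEMO_LEVEL2_PER_RANGE;
}
```

Under these assumptions the startup_64 worst case is the larger of the two,
which is why a single 19 page buffer covers both entry paths.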

Cc: Kees Cook <keescook@chromium.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/Makefile   |  3 ++
 arch/x86/boot/compressed/aslr.c     | 14 ++++++
 arch/x86/boot/compressed/head_64.S  |  4 +-
 arch/x86/boot/compressed/misc.h     | 11 +++++
 arch/x86/boot/compressed/misc_pgt.c | 91 +++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/boot.h         | 19 ++++++++
 6 files changed, 140 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/boot/compressed/misc_pgt.c

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index e12a93c..66461b4 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -58,6 +58,9 @@ vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
 
 vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
 vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/aslr.o
+ifdef CONFIG_X86_64
+	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/misc_pgt.o
+endif
 
 $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
 
diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index d753fb3..0990c78 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -151,6 +151,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	 */
 	mem_avoid[0].start = input;
 	mem_avoid[0].size = (output + init_size) - input;
+	fill_pagetable(input, (output + init_size) - input);
 
 	/* Avoid initrd. */
 	initrd_start  = (u64)real_mode->ext_ramdisk_image << 32;
@@ -159,6 +160,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	initrd_size |= real_mode->hdr.ramdisk_size;
 	mem_avoid[1].start = initrd_start;
 	mem_avoid[1].size = initrd_size;
+	/* don't need to set mapping for initrd */
 
 	/* Avoid kernel command line. */
 	cmd_line  = (u64)real_mode->ext_cmd_line_ptr << 32;
@@ -169,10 +171,19 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 		;
 	mem_avoid[2].start = cmd_line;
 	mem_avoid[2].size = cmd_line_size;
+	fill_pagetable(cmd_line, cmd_line_size);
 
 	/* Avoid params */
 	mem_avoid[3].start = (unsigned long)real_mode;
 	mem_avoid[3].size = sizeof(*real_mode);
+	fill_pagetable((unsigned long)real_mode, sizeof(*real_mode));
+
+	/* don't need to set mapping for setup_data */
+
+#ifdef CONFIG_X86_VERBOSE_BOOTUP
+	/* for video ram */
+	fill_pagetable(0, PMD_SIZE);
+#endif
 }
 
 /* Does this memory vector overlap a known avoided area? */
@@ -330,6 +341,9 @@ unsigned char *choose_kernel_location(unsigned char *input,
 		goto out;
 
 	choice = random;
+
+	fill_pagetable(choice, output_run_size);
+	switch_pagetable();
 out:
 	return (unsigned char *)choice;
 }
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 3691451..075bb15 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -126,7 +126,7 @@ ENTRY(startup_32)
 	/* Initialize Page tables to 0 */
 	leal	pgtable(%ebx), %edi
 	xorl	%eax, %eax
-	movl	$((4096*6)/4), %ecx
+	movl	$(BOOT_INIT_PGT_SIZE/4), %ecx
 	rep	stosl
 
 	/* Build Level 4 */
@@ -478,4 +478,4 @@ boot_stack_end:
 	.section ".pgtable","a",@nobits
 	.balign 4096
 pgtable:
-	.fill 6*4096, 1, 0
+	.fill BOOT_PGT_SIZE, 1, 0
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 40b4546..0104c0be 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -73,6 +73,17 @@ unsigned char *choose_kernel_location(unsigned char *input,
 }
 #endif
 
+#ifdef CONFIG_X86_64
+void fill_pagetable(unsigned long start, unsigned long size);
+void switch_pagetable(void);
+extern unsigned char _pgtable[];
+#else
+static inline void fill_pagetable(unsigned long start, unsigned long size)
+{ }
+static inline void switch_pagetable(void)
+{ }
+#endif
+
 #ifdef CONFIG_EARLY_PRINTK
 /* early_serial_console.c */
 extern int early_serial_base;
diff --git a/arch/x86/boot/compressed/misc_pgt.c b/arch/x86/boot/compressed/misc_pgt.c
new file mode 100644
index 0000000..954811e
--- /dev/null
+++ b/arch/x86/boot/compressed/misc_pgt.c
@@ -0,0 +1,91 @@
+#define __pa(x)  ((unsigned long)(x))
+#define __va(x)  ((void *)((unsigned long)(x)))
+
+#include "misc.h"
+
+#include <asm/init.h>
+#include <asm/pgtable.h>
+
+#include "../../mm/ident_map.c"
+#include "../string.h"
+
+struct alloc_pgt_data {
+	unsigned char *pgt_buf;
+	unsigned long pgt_buf_size;
+	unsigned long pgt_buf_offset;
+};
+
+static void *alloc_pgt_page(void *context)
+{
+	struct alloc_pgt_data *d = (struct alloc_pgt_data *)context;
+	unsigned char *p = (unsigned char *)d->pgt_buf;
+
+	if (d->pgt_buf_offset >= d->pgt_buf_size) {
+		debug_putstr("out of pgt_buf in misc.c\n");
+		return NULL;
+	}
+
+	p += d->pgt_buf_offset;
+	d->pgt_buf_offset += PAGE_SIZE;
+
+	return p;
+}
+
+/*
+ * Use a normal definition of memset() from string.c. There are already
+ * included header files which expect a definition of memset() and by
+ * the time we define memset macro, it is too late.
+ */
+#undef memset
+
+unsigned long __force_order;
+static struct alloc_pgt_data pgt_data;
+static struct x86_mapping_info mapping_info;
+static pgd_t *level4p;
+
+void fill_pagetable(unsigned long start, unsigned long size)
+{
+	unsigned long end = start + size;
+
+	if (!level4p) {
+		pgt_data.pgt_buf_offset = 0;
+		mapping_info.alloc_pgt_page = alloc_pgt_page;
+		mapping_info.context = &pgt_data;
+		mapping_info.pmd_flag = __PAGE_KERNEL_LARGE_EXEC;
+
+		/*
+		 * come from startup_32 ?
+		 * then cr3 is _pgtable, we can reuse it.
+		 */
+		level4p = (pgd_t *)read_cr3();
+		if ((unsigned long)level4p == (unsigned long)_pgtable) {
+			pgt_data.pgt_buf = (unsigned char *)_pgtable +
+						 BOOT_INIT_PGT_SIZE;
+			pgt_data.pgt_buf_size = BOOT_PGT_SIZE -
+						 BOOT_INIT_PGT_SIZE;
+			memset((unsigned char *)pgt_data.pgt_buf, 0,
+				pgt_data.pgt_buf_size);
+			debug_putstr("boot via startup_32\n");
+		} else {
+			pgt_data.pgt_buf = (unsigned char *)_pgtable;
+			pgt_data.pgt_buf_size = BOOT_PGT_SIZE;
+			memset((unsigned char *)pgt_data.pgt_buf, 0,
+				pgt_data.pgt_buf_size);
+			debug_putstr("boot via startup_64\n");
+			level4p = (pgd_t *)alloc_pgt_page(&pgt_data);
+		}
+	}
+
+	/* align boundary to 2M */
+	start = round_down(start, PMD_SIZE);
+	end = round_up(end, PMD_SIZE);
+	if (start >= end)
+		return;
+
+	kernel_ident_mapping_init(&mapping_info, level4p, start, end);
+}
+
+void switch_pagetable(void)
+{
+	write_cr3((unsigned long)level4p);
+}
diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h
index 4fa687a..7b23908 100644
--- a/arch/x86/include/asm/boot.h
+++ b/arch/x86/include/asm/boot.h
@@ -32,7 +32,26 @@
 #endif /* !CONFIG_KERNEL_BZIP2 */
 
 #ifdef CONFIG_X86_64
+
 #define BOOT_STACK_SIZE	0x4000
+
+#define BOOT_INIT_PGT_SIZE (6*4096)
+#ifdef CONFIG_RANDOMIZE_BASE
+/*
+ * 1 page for level4, 2 pages for first 2M.
+ * (2+2)*4 pages for kernel, param, cmd_line, random kernel
+ * if all cross 512G boundary.
+ * So total will be 19 pages.
+ */
+#ifdef CONFIG_X86_VERBOSE_BOOTUP
+#define BOOT_PGT_SIZE (19*4096)
+#else
+#define BOOT_PGT_SIZE (17*4096)
+#endif
+#else
+#define BOOT_PGT_SIZE BOOT_INIT_PGT_SIZE
+#endif
+
 #else
 #define BOOT_STACK_SIZE	0x1000
 #endif
-- 
1.8.4.5



* [PATCH 11/42] x86, boot: Add checking for memcpy
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (9 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 10/42] x86, 64bit: Set ident_mapping for kaslr Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 20:19 ` [PATCH 12/42] x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G Yinghai Lu
                   ` (32 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

parse_elf() uses the local memcpy() to move each section to its running
position.

That memcpy() only works when the regions do not overlap or when dest < src.

Add a check in memcpy() to catch incorrect use in the future; at that point
we will need a backward-copying memcpy() to handle it.

Also add a comment in parse_elf() about this restriction.
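
As a rough user-space illustration (not part of this patch), the condition
under which a simple forward copy is safe can be written as a predicate.
The helper name forward_copy_is_safe() is hypothetical; it mirrors the
max_start/min_end comparison in the diff below, including the fact that
dest == src with n > 0 is treated as unsupported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when a simple low-to-high copy of
 * n bytes from src to dest is accepted by the check described above.
 */
bool forward_copy_is_safe(uintptr_t dest, uintptr_t src, size_t n)
{
	uintptr_t max_start, min_end;

	/* dest below src: each source byte is read before it is overwritten. */
	if (dest < src)
		return true;

	/* Otherwise the copy is only accepted if the ranges do not intersect. */
	max_start = dest > src ? dest : src;
	min_end = (dest + n) < (src + n) ? dest + n : src + n;

	return max_start >= min_end;
}
```

A caller that gets false back would need a backward (high-to-low) copy
instead, which is the case the new check reports through error().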

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/misc.c   | 14 +++++++-------
 arch/x86/boot/compressed/misc.h   |  2 ++
 arch/x86/boot/compressed/string.c | 28 ++++++++++++++++++++++++++--
 3 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 8fb74ba..83f98a5 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -106,9 +106,6 @@
 #undef memset
 #define memzero(s, n)	memset((s), 0, (n))
 
-
-static void error(char *m);
-
 /*
  * This is set up by the setup-routine at boot-time
  */
@@ -218,7 +215,7 @@ void __putstr(const char *s)
 	outb(0xff & (pos >> 1), vidport+1);
 }
 
-static void error(char *x)
+void error(char *x)
 {
 	error_putstr("\n\n");
 	error_putstr(x);
@@ -353,9 +350,12 @@ static void parse_elf(void *output)
 #else
 			dest = (void *)(phdr->p_paddr);
 #endif
-			memcpy(dest,
-			       output + phdr->p_offset,
-			       phdr->p_filesz);
+			/*
+			 * The simple memcpy() only works when dest is
+			 *   below src or the regions do not overlap.
+			 * Here dest is always below src.
+			 */
+			memcpy(dest, output + phdr->p_offset, phdr->p_filesz);
 			break;
 		default: /* Ignore other PT_* */ break;
 		}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 0104c0be..af135b7 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -36,6 +36,8 @@ extern struct boot_params *real_mode;		/* Pointer to real-mode data */
 void __putstr(const char *s);
 #define error_putstr(__x)  __putstr(__x)
 
+void error(char *x);
+
 #ifdef CONFIG_X86_VERBOSE_BOOTUP
 
 #define debug_putstr(__x)  __putstr(__x)
diff --git a/arch/x86/boot/compressed/string.c b/arch/x86/boot/compressed/string.c
index 00e788b..03805a4 100644
--- a/arch/x86/boot/compressed/string.c
+++ b/arch/x86/boot/compressed/string.c
@@ -1,7 +1,7 @@
 #include "../string.c"
 
 #ifdef CONFIG_X86_32
-void *memcpy(void *dest, const void *src, size_t n)
+void *__memcpy(void *dest, const void *src, size_t n)
 {
 	int d0, d1, d2;
 	asm volatile(
@@ -15,7 +15,7 @@ void *memcpy(void *dest, const void *src, size_t n)
 	return dest;
 }
 #else
-void *memcpy(void *dest, const void *src, size_t n)
+void *__memcpy(void *dest, const void *src, size_t n)
 {
 	long d0, d1, d2;
 	asm volatile(
@@ -30,6 +30,30 @@ void *memcpy(void *dest, const void *src, size_t n)
 }
 #endif
 
+void *memcpy(void *dest, const void *src, size_t n)
+{
+	unsigned long start_dest, end_dest;
+	unsigned long start_src, end_src;
+	unsigned long max_start, min_end;
+
+	if (dest < src)
+		return __memcpy(dest, src, n);
+
+	start_dest = (unsigned long)dest;
+	end_dest = (unsigned long)dest + n;
+	start_src = (unsigned long)src;
+	end_src = (unsigned long)src + n;
+	max_start = (start_dest > start_src) ?  start_dest : start_src;
+	min_end = (end_dest < end_src) ? end_dest : end_src;
+
+	if (max_start >= min_end)
+		return __memcpy(dest, src, n);
+
+	error("memcpy does not support overlapping with dest > src!\n");
+
+	return dest;
+}
+
 void *memset(void *s, int c, size_t n)
 {
 	int i;
-- 
1.8.4.5



* [PATCH 12/42] x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (10 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 11/42] x86, boot: Add checking for memcpy Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 22:42   ` Kees Cook
  2015-07-07 20:19 ` [PATCH 13/42] x86, kaslr: Introduce struct slot_area to manage randomization slot info Yinghai Lu
                   ` (31 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel

From: Baoquan He <bhe@redhat.com>

When processing the 32-bit relocation table, a local variable "extended"
is used to calculate the physical address of each relocs entry. However,
its type is int, which is enough for i386 but not for x86_64. That is
why relocations can only be handled when the kernel is loaded below 2G;
otherwise an overflow happens and causes a system hang.

Change it to long, as the 32-bit inverse relocation processing already
does. This change is safe for i386 relocation handling too.
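
A small user-space model (not part of this patch) may make the overflow
concrete. It assumes the common x86_64 values __START_KERNEL_map =
0xffffffff80000000 and LOAD_PHYSICAL_ADDR = 0x1000000; both depend on the
kernel configuration, and all function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed typical x86_64 values; both depend on the kernel configuration. */
#define START_KERNEL_MAP	0xffffffff80000000ULL
#define LOAD_PHYS_ADDR		0x1000000ULL

/* Sign-extend a 32-bit pattern to 64 bits using only well-defined arithmetic. */
int64_t sign_extend32(uint32_t v)
{
	return (v & 0x80000000u) ? (int64_t)v - 0x100000000LL : (int64_t)v;
}

/* map = delta - __START_KERNEL_map, with delta = load address - link address. */
uint64_t reloc_map(uint64_t output)
{
	return output - LOAD_PHYS_ADDR - START_KERNEL_MAP;
}

/* Models "long extended": the full 64-bit sum is kept. */
uint64_t target_with_long(uint32_t reloc, uint64_t map)
{
	return (uint64_t)sign_extend32(reloc) + map;
}

/* Models "int extended": the sum is cut to 32 bits, then sign-extended. */
uint64_t target_with_int(uint32_t reloc, uint64_t map)
{
	uint64_t sum = (uint64_t)sign_extend32(reloc) + map;

	return (uint64_t)sign_extend32((uint32_t)sum);
}
```

With a reloc entry of 0x81000000 (the low 32 bits of the link-time virtual
address), both variants agree while the kernel sits at 16M, but only the
long variant still yields the load address once it is at 2G or above.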

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/boot/compressed/misc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 83f98a5..bfa4f0a 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -273,7 +273,7 @@ static void handle_relocations(void *output, unsigned long output_len)
 	 * So we work backwards from the end of the decompressed image.
 	 */
 	for (reloc = output + output_len - sizeof(*reloc); *reloc; reloc--) {
-		int extended = *reloc;
+		long extended = *reloc;
 		extended += map;
 
 		ptr = (unsigned long)extended;
-- 
1.8.4.5



* [PATCH 13/42] x86, kaslr: Introduce struct slot_area to manage randomization slot info
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (11 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 12/42] x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G Yinghai Lu
@ 2015-07-07 20:19 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 14/42] x86, kaslr: Add two functions which will be used later Yinghai Lu
                   ` (30 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:19 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel

From: Baoquan He <bhe@redhat.com>

The kernel is expected to be randomly relocated anywhere in the whole
physical memory area, which could be close to 64T at most. In that case
there could be about 4*1024*1024 randomization slots. The old slot
array would cost too much memory, and storing the slot information
one by one in that array is also inefficient.

Introduce struct slot_area to manage the randomization slot info of
one contiguous memory area that excludes the avoided areas. slot_areas
stores all the slot area info. Since setup_data is a linked list that
can contain many entries chained by pointer, excluding them can split
RAM into many smaller areas; if there are too many, only the first
100 slot areas are used.
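
The numbers above can be checked with a short user-space calculation (not
part of this patch). It assumes a 16M slot alignment, which is what the
4*1024*1024 figure for 64T implies; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SLOT_AREA 100

/* Same shape as the struct added by this patch. */
struct slot_area {
	unsigned long addr;
	int num;
};

/* Number of aligned slots in a physical range of the given size. */
uint64_t slot_count(uint64_t mem_size, uint64_t align)
{
	return mem_size / align;
}

/* Bytes needed by a flat array holding one unsigned long per slot. */
uint64_t flat_array_bytes(uint64_t slots)
{
	return slots * sizeof(unsigned long);
}

/* Bytes needed by the fixed-size slot_areas[] table. */
uint64_t slot_area_table_bytes(void)
{
	return MAX_SLOT_AREA * sizeof(struct slot_area);
}
```

The flat array grows with the amount of memory covered, while the
slot_areas[] table stays at a fixed, small size.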

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/boot/compressed/aslr.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 0990c78..e3995f1 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -216,8 +216,20 @@ static bool mem_avoid_overlap(struct mem_vector *img)
 
 static unsigned long slots[CONFIG_RANDOMIZE_BASE_MAX_OFFSET /
 			   CONFIG_PHYSICAL_ALIGN];
+
+struct slot_area {
+	unsigned long addr;
+	int num;
+};
+
+#define MAX_SLOT_AREA 100
+
+static struct slot_area slot_areas[MAX_SLOT_AREA];
+
 static unsigned long slot_max;
 
+static unsigned long slot_area_index;
+
 static void slots_append(unsigned long addr)
 {
 	/* Overflowing the slots list should be impossible. */
-- 
1.8.4.5



* [PATCH 14/42] x86, kaslr: Add two functions which will be used later
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (12 preceding siblings ...)
  2015-07-07 20:19 ` [PATCH 13/42] x86, kaslr: Introduce struct slot_area to manage randomization slot info Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 15/42] x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address Yinghai Lu
                   ` (29 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel

From: Baoquan He <bhe@redhat.com>

Add two functions, mem_min_overlap() and store_slot_info(), which will be
used later.

Given a memory region, mem_min_overlap() iterates over all avoided regions
to find the first one (the one with the lowest start address) that
overlaps with it.

store_slot_info() calculates the slot info of the passed-in region and
stores it into slot_areas[].
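
The slot arithmetic in store_slot_info() can be sketched in user space as
follows (not part of this patch). The function name slots_in_region() is
hypothetical, the alignment is passed in instead of using
CONFIG_PHYSICAL_ALIGN, and the guard for a region smaller than the image
is an addition of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of aligned start positions for an image of image_size bytes
 * inside a region of region_size bytes whose start is already aligned.
 */
uint64_t slots_in_region(uint64_t region_size, uint64_t image_size,
			 uint64_t align)
{
	/* Guard added for this sketch: the image does not fit at all. */
	if (region_size < image_size)
		return 0;

	/* An image no larger than one slot fits into every full slot. */
	if (image_size <= align)
		return region_size / align;

	/* Otherwise the last usable start is image_size before the end. */
	return (region_size - image_size) / align + 1;
}
```

For example, a 64M region with a 32M image and 16M alignment has three
possible start positions: offsets 0, 16M and 32M.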

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/boot/compressed/aslr.c | 51 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index e3995f1..81070e9 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -214,6 +214,40 @@ static bool mem_avoid_overlap(struct mem_vector *img)
 	return false;
 }
 
+static unsigned long
+mem_min_overlap(struct mem_vector *img, struct mem_vector *out)
+{
+	int i;
+	struct setup_data *ptr;
+	unsigned long min = img->start + img->size;
+
+	for (i = 0; i < MEM_AVOID_MAX; i++) {
+		if (mem_overlaps(img, &mem_avoid[i]) &&
+			(mem_avoid[i].start < min)) {
+			*out = mem_avoid[i];
+			min = mem_avoid[i].start;
+		}
+	}
+
+	/* Check all entries in the setup_data linked list. */
+	ptr = (struct setup_data *)(unsigned long)real_mode->hdr.setup_data;
+	while (ptr) {
+		struct mem_vector avoid;
+
+		avoid.start = (unsigned long)ptr;
+		avoid.size = sizeof(*ptr) + ptr->len;
+
+		if (mem_overlaps(img, &avoid) && (avoid.start < min)) {
+			*out = avoid;
+			min = avoid.start;
+		}
+
+		ptr = (struct setup_data *)(unsigned long)ptr->next;
+	}
+
+	return min;
+}
+
 static unsigned long slots[CONFIG_RANDOMIZE_BASE_MAX_OFFSET /
 			   CONFIG_PHYSICAL_ALIGN];
 
@@ -230,6 +264,23 @@ static unsigned long slot_max;
 
 static unsigned long slot_area_index;
 
+static void store_slot_info(struct mem_vector *region, unsigned long image_size)
+{
+	struct slot_area slot_area;
+
+	slot_area.addr = region->start;
+	if (image_size <= CONFIG_PHYSICAL_ALIGN)
+		slot_area.num = region->size / CONFIG_PHYSICAL_ALIGN;
+	else
+		slot_area.num = (region->size - image_size) /
+				CONFIG_PHYSICAL_ALIGN + 1;
+
+	if (slot_area.num > 0) {
+		slot_areas[slot_area_index++] = slot_area;
+		slot_max += slot_area.num;
+	}
+}
+
 static void slots_append(unsigned long addr)
 {
 	/* Overflowing the slots list should be impossible. */
-- 
1.8.4.5



* [PATCH 15/42] x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (13 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 14/42] x86, kaslr: Add two functions which will be used later Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 16/42] x86, kaslr: Randomize physical and virtual address of kernel separately Yinghai Lu
                   ` (28 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel

From: Baoquan He <bhe@redhat.com>

Kaslr extended the kernel text mapping region size from 512M to 1G,
namely CONFIG_RANDOMIZE_BASE_MAX_OFFSET. This means the kernel text
can be mapped to the region below:

[__START_KERNEL_map + LOAD_PHYSICAL_ADDR, __START_KERNEL_map + 1G]

Introduce a function find_random_virt_offset() to get a random value
between LOAD_PHYSICAL_ADDR and CONFIG_RANDOMIZE_BASE_MAX_OFFSET.
This random value is added to __START_KERNEL_map to get the
starting address from which the kernel text is mapped. Since a slot can
be anywhere in this region, the region is a single independent slot_area,
and it is simple to get a slot from the random value.
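
A deterministic user-space sketch of this calculation (not part of this
patch) is shown below. The name virt_offset_for() is hypothetical; the
random value is passed in as rnd instead of coming from get_random_long(),
and the limits are parameters rather than Kconfig constants. It assumes
the limits leave room for at least one slot.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of a, where a is a power of two. */
#define ALIGN_UP(x, a)	(((x) + (a) - 1) & ~((a) - 1))

uint64_t virt_offset_for(uint64_t rnd, uint64_t minimum, uint64_t image_size,
			 uint64_t max_offset, uint64_t align)
{
	uint64_t slot_num;

	/* Make sure minimum is aligned. */
	minimum = ALIGN_UP(minimum, align);

	/* The whole [minimum, max_offset) range is one slot area. */
	if (image_size <= align)
		slot_num = (max_offset - minimum) / align;
	else
		slot_num = (max_offset - minimum - image_size) / align + 1;

	/* Map the random value onto one of the slots. */
	return (rnd % slot_num) * align + minimum;
}
```

With minimum = 16M, a 32M image, a 1G limit and 16M alignment there are
62 slots, so the highest offset is 992M and the image still ends at 1G.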

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/boot/compressed/aslr.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 81070e9..775c6f9 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -366,6 +366,27 @@ static unsigned long find_random_addr(unsigned long minimum,
 	return slots_fetch_random();
 }
 
+static unsigned long find_random_virt_offset(unsigned long minimum,
+				  unsigned long image_size)
+{
+	unsigned long slot_num, random;
+
+	/* Make sure minimum is aligned. */
+	minimum = ALIGN(minimum, CONFIG_PHYSICAL_ALIGN);
+
+	if (image_size <= CONFIG_PHYSICAL_ALIGN)
+		slot_num = (CONFIG_RANDOMIZE_BASE_MAX_OFFSET - minimum) /
+				CONFIG_PHYSICAL_ALIGN;
+	else
+		slot_num = (CONFIG_RANDOMIZE_BASE_MAX_OFFSET -
+				minimum - image_size) /
+				CONFIG_PHYSICAL_ALIGN + 1;
+
+	random = get_random_long() % slot_num;
+
+	return random * CONFIG_PHYSICAL_ALIGN + minimum;
+}
+
 unsigned char *choose_kernel_location(unsigned char *input,
 				      unsigned long input_size,
 				      unsigned char *output,
-- 
1.8.4.5



* [PATCH 16/42] x86, kaslr: Randomize physical and virtual address of kernel separately
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (14 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 15/42] x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 17/42] x86, kaslr: Add support of kernel physical address randomization above 4G Yinghai Lu
                   ` (27 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel

From: Baoquan He <bhe@redhat.com>

On x86_64, the old kaslr implementation randomizes only the physical
address where the kernel is loaded. It then calculates the delta between
the physical address vmlinux was linked to load at and the address where
it is finally loaded. If the delta is not 0, namely the kernel is actually
decompressed at a new physical address, relocation handling needs to be
done. The delta is then added to the offset of each kernel symbol
relocation, which moves the kernel text mapping address by delta.

Here the behavior is changed. Randomize both the physical address
where the kernel is decompressed and the virtual address where the kernel
text is mapped. Relocation handling depends only on the virtual address
randomization: if and only if the virtual address is randomized to
a different value do we add the delta to the offset of kernel relocs.

Note that, up to now, neither the virtual offset nor the physical address
randomization can exceed CONFIG_RANDOMIZE_BASE_MAX_OFFSET, namely 1G.
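
The new rule for when relocations are applied can be modelled in user
space like this (not part of this patch). The function names are
hypothetical, and LOAD_PHYSICAL_ADDR is assumed to be 0x1000000, which
depends on the kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed link-time load address; depends on the kernel configuration. */
#define LOAD_PHYS_ADDR	0x1000000ULL

/*
 * Delta applied to the relocs: on x86_64 it now follows the randomized
 * virtual offset, on 32-bit it still follows the physical load address.
 */
uint64_t relocation_delta(bool is_x86_64, uint64_t phys_addr,
			  uint64_t virt_offset)
{
	if (is_x86_64)
		return virt_offset - LOAD_PHYS_ADDR;

	return phys_addr - LOAD_PHYS_ADDR;
}

/* Relocations are only processed when the delta is non-zero. */
bool needs_relocation(bool is_x86_64, uint64_t phys_addr,
		      uint64_t virt_offset)
{
	return relocation_delta(is_x86_64, phys_addr, virt_offset) != 0;
}
```

So a 64-bit kernel decompressed at 4G but mapped at the default virtual
offset needs no relocation processing, while one left at the default
physical address but mapped 32M higher does.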

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/boot/compressed/aslr.c | 46 +++++++++++++++++++++--------------------
 arch/x86/boot/compressed/misc.c | 39 ++++++++++++++++++++--------------
 arch/x86/boot/compressed/misc.h | 19 +++++++++--------
 3 files changed, 58 insertions(+), 46 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 775c6f9..554b637 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -349,7 +349,7 @@ static void process_e820_entry(struct e820entry *entry,
 	}
 }
 
-static unsigned long find_random_addr(unsigned long minimum,
+static unsigned long find_random_phy_addr(unsigned long minimum,
 				      unsigned long size)
 {
 	int i;
@@ -387,23 +387,24 @@ static unsigned long find_random_virt_offset(unsigned long minimum,
 	return random * CONFIG_PHYSICAL_ALIGN + minimum;
 }
 
-unsigned char *choose_kernel_location(unsigned char *input,
-				      unsigned long input_size,
-				      unsigned char *output,
-				      unsigned long output_run_size)
+void choose_kernel_location(unsigned char *input,
+				unsigned long input_size,
+				unsigned char **output,
+				unsigned long output_run_size,
+				unsigned char **virt_offset)
 {
-	unsigned long choice = (unsigned long)output;
 	unsigned long random;
+	*virt_offset = (unsigned char *)LOAD_PHYSICAL_ADDR;
 
 #ifdef CONFIG_HIBERNATION
 	if (!cmdline_find_option_bool("kaslr")) {
 		debug_putstr("KASLR disabled by default...\n");
-		goto out;
+		return;
 	}
 #else
 	if (cmdline_find_option_bool("nokaslr")) {
 		debug_putstr("KASLR disabled by cmdline...\n");
-		goto out;
+		return;
 	}
 #endif
 
@@ -411,23 +412,24 @@ unsigned char *choose_kernel_location(unsigned char *input,
 
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init((unsigned long)input, input_size,
-		       (unsigned long)output);
+		       (unsigned long)*output);
 
 	/* Walk e820 and find a random address. */
-	random = find_random_addr(choice, output_run_size);
-	if (!random) {
+	random = find_random_phy_addr((unsigned long)*output, output_run_size);
+	if (!random)
 		debug_putstr("KASLR could not find suitable E820 region...\n");
-		goto out;
+	else {
+		if ((unsigned long)*output != random) {
+			fill_pagetable(random, output_run_size);
+			switch_pagetable();
+			*output = (unsigned char *)random;
+		}
 	}
 
-	/* Always enforce the minimum. */
-	if (random < choice)
-		goto out;
-
-	choice = random;
-
-	fill_pagetable(choice, output_run_size);
-	switch_pagetable();
-out:
-	return (unsigned char *)choice;
+	/*
+	 * Get a random address between LOAD_PHYSICAL_ADDR and
+	 * CONFIG_RANDOMIZE_BASE_MAX_OFFSET
+	 */
+	random = find_random_virt_offset(LOAD_PHYSICAL_ADDR, output_run_size);
+	*virt_offset = (unsigned char *)random;
 }
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index bfa4f0a..6b2a308 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -226,7 +226,8 @@ void error(char *x)
 }
 
 #if CONFIG_X86_NEED_RELOCS
-static void handle_relocations(void *output, unsigned long output_len)
+static void handle_relocations(void *output, unsigned long output_len,
+			       void *virt_offset)
 {
 	int *reloc;
 	unsigned long delta, map, ptr;
@@ -238,11 +239,6 @@ static void handle_relocations(void *output, unsigned long output_len)
 	 * and where it was actually loaded.
 	 */
 	delta = min_addr - LOAD_PHYSICAL_ADDR;
-	if (!delta) {
-		debug_putstr("No relocation needed... ");
-		return;
-	}
-	debug_putstr("Performing relocations... ");
 
 	/*
 	 * The kernel contains a table of relocation addresses. Those
@@ -253,6 +249,22 @@ static void handle_relocations(void *output, unsigned long output_len)
 	 */
 	map = delta - __START_KERNEL_map;
 
+
+
+	/*
+	 * 32-bit always performs relocations. 64-bit relocations are only
+	 * needed if kASLR has chosen a different starting address offset
+	 * from __START_KERNEL_map.
+	 */
+	if (IS_ENABLED(CONFIG_X86_64))
+		delta = (unsigned long)virt_offset - LOAD_PHYSICAL_ADDR;
+
+	if (!delta) {
+		debug_putstr("No relocation needed... ");
+		return;
+	}
+	debug_putstr("Performing relocations... ");
+
 	/*
 	 * Process relocations: 32 bit relocations first then 64 bit after.
 	 * Three sets of binary relocations are added to the end of the kernel
@@ -306,7 +318,8 @@ static void handle_relocations(void *output, unsigned long output_len)
 #endif
 }
 #else
-static inline void handle_relocations(void *output, unsigned long output_len)
+static inline void handle_relocations(void *output, unsigned long output_len,
+				      void *virt_offset)
 { }
 #endif
 
@@ -373,6 +386,7 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 	unsigned long run_size = VO__end - VO__text;
 	unsigned char *output_orig = output;
 	unsigned long output_run_size;
+	unsigned char *virt_offset;
 
 	real_mode = rmode;
 
@@ -405,8 +419,8 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 	 * the entire decompressed kernel plus relocation table, or the
 	 * entire decompressed kernel plus .bss and .brk sections.
 	 */
-	output = choose_kernel_location(input_data, input_len, output,
-					output_run_size);
+	choose_kernel_location(input_data, input_len, &output,
+			       output_run_size, &virt_offset);
 
 	/* Validate memory location choices. */
 	if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
@@ -426,12 +440,7 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 	debug_putstr("\nDecompressing Linux... ");
 	decompress(input_data, input_len, NULL, NULL, output, NULL, error);
 	parse_elf(output);
-	/*
-	 * 32-bit always performs relocations. 64-bit relocations are only
-	 * needed if kASLR has chosen a different load address.
-	 */
-	if (!IS_ENABLED(CONFIG_X86_64) || output != output_orig)
-		handle_relocations(output, output_len);
+	handle_relocations(output, output_len, virt_offset);
 	debug_putstr("done.\nBooting the kernel.\n");
 	return output;
 }
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index af135b7..b44a7c0 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -58,20 +58,21 @@ int cmdline_find_option_bool(const char *option);
 
 #if CONFIG_RANDOMIZE_BASE
 /* aslr.c */
-unsigned char *choose_kernel_location(unsigned char *input,
-				      unsigned long input_size,
-				      unsigned char *output,
-				      unsigned long output_run_size);
+void choose_kernel_location(unsigned char *input,
+				unsigned long input_size,
+				unsigned char **output,
+				unsigned long output_run_size,
+				unsigned char **virt_offset);
 /* cpuflags.c */
 bool has_cpuflag(int flag);
 #else
 static inline
-unsigned char *choose_kernel_location(unsigned char *input,
-				      unsigned long input_size,
-				      unsigned char *output,
-				      unsigned long output_run_size)
+void choose_kernel_location(unsigned char *input,
+				unsigned long input_size,
+				unsigned char **output,
+				unsigned long output_run_size,
+				unsigned char **virt_offset)
 {
-	return output;
 }
 #endif
 
-- 
1.8.4.5



* [PATCH 17/42] x86, kaslr: Add support of kernel physical address randomization above 4G
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (15 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 16/42] x86, kaslr: Randomize physical and virtual address of kernel separately Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 18/42] x86, kaslr: Remove useless codes Yinghai Lu
                   ` (26 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel

From: Baoquan He <bhe@redhat.com>

In the kaslr implementation, process_e820_entry() and
slots_fetch_random() do most of the work. process_e820_entry() is
responsible for storing the slot information, and slots_fetch_random()
takes care of fetching it. To add support for kernel physical address
randomization above 4G, this patch changes both functions to use the
new slot_area data structure.

Now the kernel can be relocated and decompressed anywhere in the whole
physical memory, even close to 64T at most.
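
The lookup that slots_fetch_random() performs over slot_areas[] can be
sketched in user space as below (not part of this patch). The name
pick_slot() is hypothetical, the slot index is passed in instead of being
derived from get_random_long(), and addr is widened to uint64_t so the
sketch behaves the same on 32-bit hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch version of the slot area: start address and number of slots. */
struct slot_area {
	uint64_t addr;
	int num;
};

/*
 * Translate a global slot index into a physical address by walking the
 * areas and subtracting each area's slot count until the index fits.
 * Returns 0 when the index is beyond the last slot.
 */
uint64_t pick_slot(const struct slot_area *areas, size_t count,
		   uint64_t index, uint64_t align)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (index >= (uint64_t)areas[i].num) {
			index -= (uint64_t)areas[i].num;
			continue;
		}
		return areas[i].addr + index * align;
	}

	return 0;
}
```

With one area of three slots at 16M and one area of two slots at 4G,
index 3 lands on the first slot above 4G, which the old flat slots[]
array limited to 1G could not represent.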

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/boot/compressed/aslr.c | 68 ++++++++++++++++++++++++++++++-----------
 1 file changed, 51 insertions(+), 17 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 554b637..9158882 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -293,27 +293,40 @@ static void slots_append(unsigned long addr)
 
 static unsigned long slots_fetch_random(void)
 {
+	unsigned long random;
+	int i;
+
 	/* Handle case of no slots stored. */
 	if (slot_max == 0)
 		return 0;
 
-	return slots[get_random_long() % slot_max];
+	random = get_random_long() % slot_max;
+
+	for (i = 0; i < slot_area_index; i++) {
+		if (random >= slot_areas[i].num) {
+			random -= slot_areas[i].num;
+			continue;
+		}
+		return slot_areas[i].addr + random * CONFIG_PHYSICAL_ALIGN;
+	}
+
+	if (i == slot_area_index)
+		debug_putstr("Something wrong happened in slots_fetch_random()...\n");
+	return 0;
 }
 
 static void process_e820_entry(struct e820entry *entry,
 			       unsigned long minimum,
 			       unsigned long image_size)
 {
-	struct mem_vector region, img;
+	struct mem_vector region, out;
+	struct slot_area slot_area;
+	unsigned long min, start_orig;
 
 	/* Skip non-RAM entries. */
 	if (entry->type != E820_RAM)
 		return;
 
-	/* Ignore entries entirely above our maximum. */
-	if (entry->addr >= CONFIG_RANDOMIZE_BASE_MAX_OFFSET)
-		return;
-
 	/* Ignore entries entirely below our minimum. */
 	if (entry->addr + entry->size < minimum)
 		return;
@@ -321,10 +334,17 @@ static void process_e820_entry(struct e820entry *entry,
 	region.start = entry->addr;
 	region.size = entry->size;
 
+repeat:
+	start_orig = region.start;
+
 	/* Potentially raise address to minimum location. */
 	if (region.start < minimum)
 		region.start = minimum;
 
+	/* Return if slot area array is full */
+	if (slot_area_index == MAX_SLOT_AREA)
+		return;
+
 	/* Potentially raise address to meet alignment requirements. */
 	region.start = ALIGN(region.start, CONFIG_PHYSICAL_ALIGN);
 
@@ -333,20 +353,30 @@ static void process_e820_entry(struct e820entry *entry,
 		return;
 
 	/* Reduce size by any delta from the original address. */
-	region.size -= region.start - entry->addr;
+	region.size -= region.start - start_orig;
 
-	/* Reduce maximum size to fit end of image within maximum limit. */
-	if (region.start + region.size > CONFIG_RANDOMIZE_BASE_MAX_OFFSET)
-		region.size = CONFIG_RANDOMIZE_BASE_MAX_OFFSET - region.start;
+	/* Return if region can't contain decompressed kernel */
+	if (region.size < image_size)
+		return;
 
-	/* Walk each aligned slot and check for avoided areas. */
-	for (img.start = region.start, img.size = image_size ;
-	     mem_contains(&region, &img) ;
-	     img.start += CONFIG_PHYSICAL_ALIGN) {
-		if (mem_avoid_overlap(&img))
-			continue;
-		slots_append(img.start);
+	if (!mem_avoid_overlap(&region)) {
+		store_slot_info(&region, image_size);
+		return;
 	}
+
+	min = mem_min_overlap(&region, &out);
+
+	if (min > region.start + image_size) {
+		struct mem_vector tmp;
+
+		tmp.start = region.start;
+		tmp.size = min - region.start;
+		store_slot_info(&tmp, image_size);
+	}
+
+	region.size -= out.start - region.start + out.size;
+	region.start = out.start + out.size;
+	goto repeat;
 }
 
 static unsigned long find_random_phy_addr(unsigned long minimum,
@@ -361,6 +391,10 @@ static unsigned long find_random_phy_addr(unsigned long minimum,
 	/* Verify potential e820 positions, appending to slots list. */
 	for (i = 0; i < real_mode->e820_entries; i++) {
 		process_e820_entry(&real_mode->e820_map[i], minimum, size);
+		if (slot_area_index == MAX_SLOT_AREA) {
+			debug_putstr("Stop processing e820 since slot_areas is full...\n");
+			break;
+		}
 	}
 
 	return slots_fetch_random();
-- 
1.8.4.5



* [PATCH 18/42] x86, kaslr: Remove useless codes
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (16 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 17/42] x86, kaslr: Add support of kernel physical address randomization above 4G Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 19/42] x86, kaslr: Allow random address could be below loaded address Yinghai Lu
                   ` (25 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel

From: Baoquan He <bhe@redhat.com>

Several auxiliary functions and slots[] are not needed any more since
struct slot_area is used to store the slot info of kaslr now. Hence
remove them in this patch.
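
For illustration only, a minimal sketch of why a per-region (addr, num)
record can stand in for a flat slots[] array. The names sketch_slot_area,
sketch_count_slots() and sketch_slot_addr() are hypothetical, and the
counting formula is an assumption: the body of store_slot_info() is not
part of this patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct slot_area: one record per usable region. */
struct sketch_slot_area {
	uint64_t addr;	/* first aligned candidate address in the region */
	uint64_t num;	/* number of aligned slots that can hold the image */
};

/*
 * Assumed formula: with an aligned region start, an image of image_size
 * fits at start, start + align, ... for as long as it ends inside the region.
 */
uint64_t sketch_count_slots(uint64_t size, uint64_t image_size, uint64_t align)
{
	if (size < image_size)
		return 0;
	return (size - image_size) / align + 1;
}

/* Address of the idx-th slot; no per-slot storage is needed. */
uint64_t sketch_slot_addr(const struct sketch_slot_area *area, uint64_t idx,
			  uint64_t align)
{
	return area->addr + idx * align;
}
```

Under these assumptions a 64M region with a 16M image and 2M alignment
needs one record instead of 25 array entries.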

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/boot/compressed/aslr.c | 24 ------------------------
 1 file changed, 24 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 9158882..7c0e1da 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -112,17 +112,6 @@ struct mem_vector {
 #define MEM_AVOID_MAX 4
 static struct mem_vector mem_avoid[MEM_AVOID_MAX];
 
-static bool mem_contains(struct mem_vector *region, struct mem_vector *item)
-{
-	/* Item at least partially before region. */
-	if (item->start < region->start)
-		return false;
-	/* Item at least partially after region. */
-	if (item->start + item->size > region->start + region->size)
-		return false;
-	return true;
-}
-
 static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
 {
 	/* Item one is entirely before item two. */
@@ -248,9 +237,6 @@ mem_min_overlap(struct mem_vector *img, struct mem_vector *out)
 	return min;
 }
 
-static unsigned long slots[CONFIG_RANDOMIZE_BASE_MAX_OFFSET /
-			   CONFIG_PHYSICAL_ALIGN];
-
 struct slot_area {
 	unsigned long addr;
 	int num;
@@ -281,16 +267,6 @@ static void store_slot_info(struct mem_vector *region, unsigned long image_size)
 	}
 }
 
-static void slots_append(unsigned long addr)
-{
-	/* Overflowing the slots list should be impossible. */
-	if (slot_max >= CONFIG_RANDOMIZE_BASE_MAX_OFFSET /
-			CONFIG_PHYSICAL_ALIGN)
-		return;
-
-	slots[slot_max++] = addr;
-}
-
 static unsigned long slots_fetch_random(void)
 {
 	unsigned long random;
-- 
1.8.4.5



* [PATCH 19/42] x86, kaslr: Allow random address could be below loaded address
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (17 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 18/42] x86, kaslr: Remove useless codes Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 20/42] x86, boot: Add printf support for early console in compressed/misc.c Yinghai Lu
                   ` (24 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

Currently the new (randomized) output buffer is always placed above the
current one.

With correct tracking in mem_avoid, we can also place the buffer below it.

This ensures that when a bootloader such as patched grub2 or kexec has
put the output near the end of RAM, we can still get a random base
below the original output.

For now, use 512M as min_addr whenever the original output is above 512M.
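
The choice of min_addr amounts to taking the smaller of the loaded output
address and 512M. A minimal sketch, where sketch_min_addr() and
SKETCH_MIN_CAP are hypothetical names used only for illustration:

```c
#include <stdint.h>

#define SKETCH_MIN_CAP (512ULL << 20)	/* 512M, the cap used by this patch */

/* Lower bound handed to the e820 walk: never above 512M, never above output. */
uint64_t sketch_min_addr(uint64_t output)
{
	return output < SKETCH_MIN_CAP ? output : SKETCH_MIN_CAP;
}
```

With the output at 0x13c000000, as in the log, the search would start
from 0x20000000.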

With this patch, the boot log looks like:

early console in decompress_kernel
decompress_kernel:
  input: [0x13e9ee3b4-0x13f36b9df], output: [0x13c000000-0x13f394fff], heap: [0x13f376ac0-0x13f37eabf]
boot via startup_64
KASLR using RDTSC...
KASLR using RDTSC...
                     new output: [0x6f000000-0x72394fff]

Decompressing Linux... xz... Parsing ELF... Performing relocations... done.
Booting the kernel.
[    0.000000] bootconsole [uart0] enabled
[    0.000000] Kernel Layout:
[    0.000000]   .text: [0x6f000000-0x70096a9c]
[    0.000000] .rodata: [0x70200000-0x70a4efff]
[    0.000000]   .data: [0x70c00000-0x70e4e9bf]
[    0.000000]   .init: [0x70e50000-0x7120bfff]
[    0.000000]    .bss: [0x71219000-0x7234efff]
[    0.000000]    .brk: [0x7234f000-0x72374fff]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/aslr.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 7c0e1da..a1535c1 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -403,7 +403,8 @@ void choose_kernel_location(unsigned char *input,
 				unsigned long output_run_size,
 				unsigned char **virt_offset)
 {
-	unsigned long random;
+	unsigned long random, min_addr;
+
 	*virt_offset = (unsigned char *)LOAD_PHYSICAL_ADDR;
 
 #ifdef CONFIG_HIBERNATION
@@ -424,8 +425,13 @@ void choose_kernel_location(unsigned char *input,
 	mem_avoid_init((unsigned long)input, input_size,
 		       (unsigned long)*output);
 
+	/* Search from min(output, 512M) so the base may land below output */
+	min_addr = (unsigned long)*output;
+	if (min_addr > (512UL<<20))
+		min_addr = 512UL<<20;
+
 	/* Walk e820 and find a random address. */
-	random = find_random_phy_addr((unsigned long)*output, output_run_size);
+	random = find_random_phy_addr(min_addr, output_run_size);
 	if (!random)
 		debug_putstr("KASLR could not find suitable E820 region...\n");
 	else {
-- 
1.8.4.5



* [PATCH 20/42] x86, boot: Add printf support for early console in compressed/misc.c
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (18 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 19/42] x86, kaslr: Allow random address could be below loaded address Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 21/42] x86, boot: Add more debug printout " Yinghai Lu
                   ` (23 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

Reuse printf.c from the x86 setup code in the decompressor, and print
the decompress_kernel input and output ranges.

Later decompressor code can use it to print more debug information.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/Makefile |  2 +-
 arch/x86/boot/compressed/misc.c   | 38 ++++++++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/misc.h   |  7 +++++++
 arch/x86/boot/compressed/printf.c |  5 +++++
 4 files changed, 51 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/boot/compressed/printf.c

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 66461b4..8fc7dd9 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -54,7 +54,7 @@ $(obj)/misc.o: $(obj)/../voffset.h
 
 vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
 	$(obj)/string.o $(obj)/cmdline.o \
-	$(obj)/piggy.o $(obj)/cpuflags.o
+	$(obj)/printf.o $(obj)/piggy.o $(obj)/cpuflags.o
 
 vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
 vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/aslr.o
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 6b2a308..ee73b7b 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -387,6 +387,7 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 	unsigned char *output_orig = output;
 	unsigned long output_run_size;
 	unsigned char *virt_offset;
+	unsigned long init_size;
 
 	real_mode = rmode;
 
@@ -414,6 +415,37 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 
 	output_run_size = output_len > run_size ? output_len : run_size;
 
+	init_size = real_mode->hdr.init_size;
+	debug_putstr("decompress_kernel:\n");
+	debug_printf("       input: [0x%010lx-0x%010lx]\n",
+		 (unsigned long)input_data,
+		 (unsigned long)input_data + input_len - 1);
+	debug_printf("      output: [0x%010lx-0x%010lx] 0x%08lx: output_len\n",
+		 (unsigned long)output,
+		 (unsigned long)output + output_len - 1,
+		 (unsigned long)output_len);
+	debug_printf("              [0x%010lx-0x%010lx] 0x%08lx: run_size\n",
+		 (unsigned long)output,
+		 (unsigned long)output + run_size - 1,
+		 (unsigned long)run_size);
+	debug_printf("              [0x%010lx-0x%010lx] 0x%08lx: output_run_size\n",
+		 (unsigned long)output,
+		 (unsigned long)output + output_run_size - 1,
+		 (unsigned long)output_run_size);
+	debug_printf("              [0x%010lx-0x%010lx] 0x%08lx: init_size\n",
+		 (unsigned long)output,
+		 (unsigned long)output + init_size - 1,
+		 (unsigned long)init_size);
+	debug_printf("ZO text/data: [0x%010lx-0x%010lx]\n",
+		 (unsigned long)input_data + input_len,
+		 (unsigned long)output + init_size - 1);
+	debug_printf("     ZO heap: [0x%010lx-0x%010lx]\n",
+		 (unsigned long)heap,
+		 (unsigned long)heap + BOOT_HEAP_SIZE - 1);
+	debug_printf("  VO bss/brk: [0x%010lx-0x%010lx]\n",
+		 (unsigned long)output + (VO___bss_start - VO__text),
+		 (unsigned long)output + run_size - 1);
+
 	/*
 	 * The memory hole needed for the kernel is the larger of either
 	 * the entire decompressed kernel plus relocation table, or the
@@ -422,6 +454,12 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 	choose_kernel_location(input_data, input_len, &output,
 			       output_run_size, &virt_offset);
 
+	if (output != output_orig)
+		debug_printf("  new output: [0x%010lx-0x%010lx] 0x%08lx: output_run_size\n",
+			 (unsigned long)output,
+			 (unsigned long)output + output_run_size - 1,
+			 (unsigned long)output_run_size);
+
 	/* Validate memory location choices. */
 	if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
 		error("Destination address inappropriately aligned");
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index b44a7c0..410e5d3 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -38,14 +38,21 @@ void __putstr(const char *s);
 
 void error(char *x);
 
+/* printf.c */
+int sprintf(char *buf, const char *fmt, ...);
+int printf(const char *fmt, ...);
+
 #ifdef CONFIG_X86_VERBOSE_BOOTUP
 
 #define debug_putstr(__x)  __putstr(__x)
+#define debug_printf printf
 
 #else
 
 static inline void debug_putstr(const char *s)
 { }
+static inline int debug_printf(const char *fmt, ...)
+{ return 0; }
 
 #endif
 
diff --git a/arch/x86/boot/compressed/printf.c b/arch/x86/boot/compressed/printf.c
new file mode 100644
index 0000000..a3a080d
--- /dev/null
+++ b/arch/x86/boot/compressed/printf.c
@@ -0,0 +1,5 @@
+#include "misc.h"
+
+#define puts(__x)  __putstr(__x)
+
+#include "../printf.c"
-- 
1.8.4.5



* [PATCH 21/42] x86, boot: Add more debug printout in compressed/misc.c
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (19 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 20/42] x86, boot: Add printf support for early console in compressed/misc.c Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 22/42] x86, setup: Check early serial console per string instead of one char Yinghai Lu
                   ` (22 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

Now that printf.c from the x86 setup code is available in the
decompressor, print more debug information.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/misc.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index ee73b7b..a428c03 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -344,7 +344,7 @@ static void parse_elf(void *output)
 		return;
 	}
 
-	debug_putstr("Parsing ELF... ");
+	debug_putstr("Parsing ELF...\n");
 
 	phdrs = malloc(sizeof(*phdrs) * ehdr.e_phnum);
 	if (!phdrs)
@@ -369,6 +369,11 @@ static void parse_elf(void *output)
 			 * Here dest is smaller than src always.
 			 */
 			memcpy(dest, output + phdr->p_offset, phdr->p_filesz);
+			debug_printf("   parse_elf: [0x%010lx-0x%010lx] <=== [0x%010lx-0x%010lx]\n",
+				(unsigned long)dest,
+				(unsigned long)dest + phdr->p_filesz - 1,
+				(unsigned long)output + phdr->p_offset,
+				(unsigned long)output + phdr->p_offset + phdr->p_filesz - 1);
 			break;
 		default: /* Ignore other PT_* */ break;
 		}
@@ -475,6 +480,11 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 		error("Wrong destination address");
 #endif
 
+	debug_printf("  decompress: [0x%010lx-0x%010lx] <=== [0x%010lx-0x%010lx]\n",
+		(unsigned long)output,
+		(unsigned long)output + output_len - 1,
+		(unsigned long)input_data,
+		(unsigned long)input_data + input_len - 1);
 	debug_putstr("\nDecompressing Linux... ");
 	decompress(input_data, input_len, NULL, NULL, output, NULL, error);
 	parse_elf(output);
-- 
1.8.4.5



* [PATCH 22/42] x86, setup: Check early serial console per string instead of one char
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (20 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 21/42] x86, boot: Add more debug printout " Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 22:59   ` Kees Cook
  2015-07-07 20:20 ` [PATCH 23/42] x86, setup: Use puts() instead of printf() in edd code Yinghai Lu
                   ` (21 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

Move the serial_putchar() call out of putchar() and let puts() call
serial_putchar() directly.

This way early_serial_base only needs to be checked once per string.
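
A minimal user-space sketch of the per-string check. The output buffers
and the sketch_* names are stand-ins (assumptions) for the real port
writers in tty.c; only the control flow mirrors the patch.

```c
#include <stddef.h>

/* Stand-ins (assumptions) for the setup-code global and the two writers. */
int early_serial_base;			/* 0 means no early serial console */
char sketch_serial_out[64];
char sketch_bios_out[64];
size_t sketch_serial_len, sketch_bios_len;

static void sketch_serial_putchar(int ch)
{
	if (sketch_serial_len < sizeof(sketch_serial_out) - 1)
		sketch_serial_out[sketch_serial_len++] = (char)ch;
}

static void sketch_bios_putchar(int ch)
{
	if (sketch_bios_len < sizeof(sketch_bios_out) - 1)
		sketch_bios_out[sketch_bios_len++] = (char)ch;
}

void sketch_puts(const char *str)
{
	const char *s;

	/* The serial test happens once per string, not once per character. */
	if (early_serial_base) {
		for (s = str; *s; s++) {
			if (*s == '\n')
				sketch_serial_putchar('\r');	/* \n -> \r\n */
			sketch_serial_putchar(*s);
		}
	}

	for (s = str; *s; s++) {
		if (*s == '\n')
			sketch_bios_putchar('\r');	/* \n -> \r\n */
		sketch_bios_putchar(*s);
	}
}
```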

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/tty.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/x86/boot/tty.c b/arch/x86/boot/tty.c
index def2451..114caea 100644
--- a/arch/x86/boot/tty.c
+++ b/arch/x86/boot/tty.c
@@ -52,16 +52,22 @@ static void __attribute__((section(".inittext"))) bios_putchar(int ch)
 void __attribute__((section(".inittext"))) putchar(int ch)
 {
 	if (ch == '\n')
-		putchar('\r');	/* \n -> \r\n */
+		bios_putchar('\r');	/* \n -> \r\n */
 
 	bios_putchar(ch);
-
-	if (early_serial_base != 0)
-		serial_putchar(ch);
 }
 
 void __attribute__((section(".inittext"))) puts(const char *str)
 {
+	if (early_serial_base) {
+		const char *s = str;
+		while (*s) {
+			if (*s == '\n')
+				serial_putchar('\r');
+			serial_putchar(*s++);
+		}
+	}
+
 	while (*str)
 		putchar(*str++);
 }
-- 
1.8.4.5



* [PATCH 23/42] x86, setup: Use puts() instead of printf() in edd code
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (21 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 22/42] x86, setup: Check early serial console per string instead of one char Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 24/42] x86: Setup early console as early as possible in x86_start_kernel() Yinghai Lu
                   ` (20 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

These messages take no format arguments, so printf() is not needed there.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/edd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/edd.c b/arch/x86/boot/edd.c
index 223e425..88d7c7f 100644
--- a/arch/x86/boot/edd.c
+++ b/arch/x86/boot/edd.c
@@ -157,7 +157,7 @@ void query_edd(void)
 	 */
 
 	if (!be_quiet)
-		printf("Probing EDD (edd=off to disable)... ");
+		puts("Probing EDD (edd=off to disable)... ");
 
 	for (devno = 0x80; devno < 0x80+EDD_MBR_SIG_MAX; devno++) {
 		/*
@@ -176,7 +176,7 @@ void query_edd(void)
 	}
 
 	if (!be_quiet)
-		printf("ok\n");
+		puts("ok\n");
 }
 
 #endif
-- 
1.8.4.5



* [PATCH 24/42] x86: Setup early console as early as possible in x86_start_kernel()
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (22 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 23/42] x86, setup: Use puts() instead of printf() in edd code Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 25/42] x86, boot: print compression suffix in decompress stage Yinghai Lu
                   ` (19 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

Parse "console=uart8250,io,0x3f8,115200n8" in i386_start_kernel()/x86_64_start_kernel()
and call setup_early_serial8250_console() to initialize the early serial console.

Only I/O-port 8250 UARTs can be handled, because MMIO needs ioremap, which
is not available this early.

Use boot_params.hdr.version to tell whether the boot data has already been
copied instead of adding another variable, as suggested by hpa.
This patch needs to be applied after the x86 memblock patchset.
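
A user-space sketch of the command-line scan, for illustration. The
function name sketch_early_uart_option() is hypothetical, and strchr()
replaces the kernel's strchrnul() so that the example builds with a plain
C library.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the value of an I/O-port 8250 "console=" option, or NULL if
 * there is none or it does not fit the 64-byte buffer.
 */
const char *sketch_early_uart_option(const char *cmdline)
{
	static char constr[64];
	const char *p, *q;
	size_t len;

	p = strstr(cmdline, "console=uart8250,io,");
	if (!p)
		p = strstr(cmdline, "console=uart,io,");
	if (!p)
		return NULL;

	p += strlen("console=");	/* skip the key, keep "uart...,io,..." */
	q = strchr(p, ' ');		/* the option ends at the next space */
	len = q ? (size_t)(q - p) : strlen(p);
	if (len >= sizeof(constr))
		return NULL;

	memcpy(constr, p, len);
	constr[len] = '\0';
	return constr;
}
```

MMIO variants such as "console=uart8250,mmio,..." deliberately do not
match, in line with the I/O-port-only restriction.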

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/setup.h         |  2 ++
 arch/x86/kernel/head.c               | 26 ++++++++++++++++++++++++++
 arch/x86/kernel/head32.c             |  1 +
 arch/x86/kernel/head64.c             |  5 ++++-
 drivers/tty/serial/8250/8250_early.c | 17 +++++++++++++++++
 kernel/printk/printk.c               | 11 +++++++----
 6 files changed, 57 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 11af24e..3e5aa41 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -40,6 +40,8 @@ static inline void vsmp_init(void) { }
 void setup_bios_corruption_check(void);
 
 extern unsigned long saved_video_mode;
+int setup_early_serial8250_console(char *cmdline);
+void setup_early_console(void);
 
 extern void reserve_standard_io_resources(void);
 extern void i386_reserve_resources(void);
diff --git a/arch/x86/kernel/head.c b/arch/x86/kernel/head.c
index 992f442..cc0cd83 100644
--- a/arch/x86/kernel/head.c
+++ b/arch/x86/kernel/head.c
@@ -69,3 +69,29 @@ void __init reserve_ebda_region(void)
 	/* reserve all memory between lowmem and the 1MB mark */
 	memblock_reserve(lowmem, 0x100000 - lowmem);
 }
+
+void __init setup_early_console(void)
+{
+#ifdef CONFIG_SERIAL_8250_CONSOLE
+	char constr[64], *p, *q;
+
+	/* Can not handle mmio type 8250 uart yet, too early */
+	p = strstr(boot_command_line, "console=uart8250,io,");
+	if (!p)
+		p = strstr(boot_command_line, "console=uart,io,");
+	if (!p)
+		return;
+
+	p += 8;	/* strlen("console=") */
+	q = strchrnul(p, ' ');
+	if ((q - p) >= sizeof(constr))
+		return;
+
+	memset(constr, 0, sizeof(constr));
+	memcpy(constr, p, q - p);
+
+	lockdep_init();
+
+	setup_early_serial8250_console(constr);
+#endif
+}
diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index 2911ef3..87ddca1 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -33,6 +33,7 @@ asmlinkage __visible void __init i386_start_kernel(void)
 {
 	cr4_init_shadow();
 	sanitize_boot_params(&boot_params);
+	setup_early_console();
 
 	/* Call the subarch specific early setup function */
 	switch (boot_params.hdr.hardware_subarch) {
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 5a46681..44dc63b 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -171,6 +171,7 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 	load_idt((const struct desc_ptr *)&idt_descr);
 
 	copy_bootdata(__va(real_mode_data));
+	setup_early_console();
 
 	/*
 	 * Load microcode early on BSP.
@@ -189,8 +190,10 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 void __init x86_64_start_reservations(char *real_mode_data)
 {
 	/* version is always not zero if it is copied */
-	if (!boot_params.hdr.version)
+	if (!boot_params.hdr.version) {
 		copy_bootdata(__va(real_mode_data));
+		setup_early_console();
+	}
 
 	reserve_ebda_region();
 
diff --git a/drivers/tty/serial/8250/8250_early.c b/drivers/tty/serial/8250/8250_early.c
index 771dda2..8a7fe75 100644
--- a/drivers/tty/serial/8250/8250_early.c
+++ b/drivers/tty/serial/8250/8250_early.c
@@ -152,3 +152,20 @@ int __init early_serial8250_setup(struct earlycon_device *device,
 }
 EARLYCON_DECLARE(uart8250, early_serial8250_setup);
 EARLYCON_DECLARE(uart, early_serial8250_setup);
+
+/* for x86 early early console */
+int __init setup_early_serial8250_console(char *cmdline)
+{
+	char *options;
+
+	options = strstr(cmdline, "uart8250,");
+	if (options)
+		return setup_earlycon(options);
+
+	options = strstr(cmdline, "uart,");
+	if (options)
+		return setup_earlycon(options);
+
+	return 0;
+}
+
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index cf8c242..f554c5f 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2454,11 +2454,14 @@ void register_console(struct console *newcon)
 	struct console_cmdline *c;
 
 	if (console_drivers)
-		for_each_console(bcon)
-			if (WARN(bcon == newcon,
-					"console '%s%d' already registered\n",
-					bcon->name, bcon->index))
+		for_each_console(bcon) {
+			/* not again */
+			if (bcon == newcon) {
+				printk(KERN_INFO "console '%s%d' already registered\n",
+					bcon->name, bcon->index);
 				return;
+			}
+		}
 
 	/*
 	 * before we register a new CON_BOOT console, make sure we don't
-- 
1.8.4.5



* [PATCH 25/42] x86, boot: print compression suffix in decompress stage
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (23 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 24/42] x86: Setup early console as early as possible in x86_start_kernel() Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 23:13   ` Kees Cook
  2015-07-07 20:20 ` [PATCH 26/42] x86: remove not needed clear_page calling Yinghai Lu
                   ` (18 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

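Print the compression suffix (e.g. "xz") after "Decompressing Linux..."
so the boot log shows which decompressor is in use.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>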
---
 arch/x86/boot/compressed/misc.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index a428c03..9266f78 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -120,26 +120,32 @@ static int lines, cols;
 
 #ifdef CONFIG_KERNEL_GZIP
 #include "../../../../lib/decompress_inflate.c"
+static char *suffix_str = "gz";
 #endif
 
 #ifdef CONFIG_KERNEL_BZIP2
 #include "../../../../lib/decompress_bunzip2.c"
+static char *suffix_str = "bz2";
 #endif
 
 #ifdef CONFIG_KERNEL_LZMA
 #include "../../../../lib/decompress_unlzma.c"
+static char *suffix_str = "lzma";
 #endif
 
 #ifdef CONFIG_KERNEL_XZ
 #include "../../../../lib/decompress_unxz.c"
+static char *suffix_str = "xz";
 #endif
 
 #ifdef CONFIG_KERNEL_LZO
 #include "../../../../lib/decompress_unlzo.c"
+static char *suffix_str = "lzo";
 #endif
 
 #ifdef CONFIG_KERNEL_LZ4
 #include "../../../../lib/decompress_unlz4.c"
+static char *suffix_str = "lz4";
 #endif
 
 static void scroll(void)
@@ -486,6 +492,8 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
 		(unsigned long)input_data,
 		(unsigned long)input_data + input_len - 1);
 	debug_putstr("\nDecompressing Linux... ");
+	debug_putstr(suffix_str);
+	debug_putstr("... ");
 	decompress(input_data, input_len, NULL, NULL, output, NULL, error);
 	parse_elf(output);
 	handle_relocations(output, output_len, virt_offset);
-- 
1.8.4.5



* [PATCH 26/42] x86: remove not needed clear_page calling
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (24 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 25/42] x86, boot: print compression suffix in decompress stage Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 23:14   ` Kees Cook
  2015-07-07 20:20 ` [PATCH 27/42] x86: restore end_of_ram to E820_RAM Yinghai Lu
                   ` (17 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

Remove the unneeded clear_page() call for init_level4_pgt in
x86_64_start_kernel(); the table is already zero-filled by
".fill 512,8,0" in head_64.S.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/head64.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 44dc63b..a9f0299 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -178,7 +178,6 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 	 */
 	load_ucode_bsp();
 
-	clear_page(init_level4_pgt);
 	/* set init_level4_pgt kernel high mapping*/
 	init_level4_pgt[511] = early_level4_pgt[511];
 
-- 
1.8.4.5



* [PATCH 27/42] x86: restore end_of_ram to E820_RAM
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (25 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 26/42] x86: remove not needed clear_page calling Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-08 17:44   ` Matt Fleming
  2015-07-07 20:20 ` [PATCH 28/42] x86, boot: Allow 64bit EFI kernel to be loaded above 4G Yinghai Lu
                   ` (16 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

We don't need to create a mapping for E820_PRAM, so make e820_end_pfn()
take the entry type and have the end-of-RAM helpers count only E820_RAM.
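
A simplified, self-contained sketch of a type-filtered "highest pfn"
walk. All sketch_* names, the example map and the type tag values are
illustrative assumptions; the real code walks the global e820 table.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12

/* Illustrative type tags; the real values live in the e820 uapi header. */
enum { SKETCH_E820_RAM = 1, SKETCH_E820_PRAM = 12 };

struct sketch_e820entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/* Example map: 2G of RAM at 0, then 1G of persistent memory at 4G. */
const struct sketch_e820entry sketch_map[] = {
	{ 0x000000000ULL, 0x80000000ULL, SKETCH_E820_RAM },
	{ 0x100000000ULL, 0x40000000ULL, SKETCH_E820_PRAM },
};

/* Highest page frame number among entries of the requested type. */
uint64_t sketch_end_pfn(const struct sketch_e820entry *map, size_t n,
			uint64_t limit_pfn, uint32_t type)
{
	uint64_t last_pfn = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t start_pfn, end_pfn;

		if (map[i].type != type)
			continue;	/* only the requested type counts */

		start_pfn = map[i].addr >> SKETCH_PAGE_SHIFT;
		end_pfn = (map[i].addr + map[i].size) >> SKETCH_PAGE_SHIFT;

		if (start_pfn >= limit_pfn)
			continue;
		if (end_pfn > limit_pfn) {
			last_pfn = limit_pfn;
			break;
		}
		if (end_pfn > last_pfn)
			last_pfn = end_pfn;
	}
	return last_pfn;
}
```

With this example map, asking for RAM stops at the 2G boundary, so the
persistent-memory range at 4G no longer raises the end-of-RAM pfn.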

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/e820.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index a102564..46ec08d 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -753,7 +753,7 @@ u64 __init early_reserve_e820(u64 size, u64 align)
 /*
  * Find the highest page frame number we have available
  */
-static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
+static unsigned long __init e820_end_pfn(unsigned long limit_pfn, unsigned type)
 {
 	int i;
 	unsigned long last_pfn = 0;
@@ -764,11 +764,7 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
 		unsigned long start_pfn;
 		unsigned long end_pfn;
 
-		/*
-		 * Persistent memory is accounted as ram for purposes of
-		 * establishing max_pfn and mem_map.
-		 */
-		if (ei->type != E820_RAM && ei->type != E820_PRAM)
+		if (ei->type != type)
 			continue;
 
 		start_pfn = ei->addr >> PAGE_SHIFT;
@@ -793,12 +789,12 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
 }
 unsigned long __init e820_end_of_ram_pfn(void)
 {
-	return e820_end_pfn(MAX_ARCH_PFN);
+	return e820_end_pfn(MAX_ARCH_PFN, E820_RAM);
 }
 
 unsigned long __init e820_end_of_low_ram_pfn(void)
 {
-	return e820_end_pfn(1UL << (32-PAGE_SHIFT));
+	return e820_end_pfn(1UL<<(32 - PAGE_SHIFT), E820_RAM);
 }
 
 static void early_panic(char *msg)
-- 
1.8.4.5



* [PATCH 28/42] x86, boot: Allow 64bit EFI kernel to be loaded above 4G
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (26 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 27/42] x86: restore end_of_ram to E820_RAM Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 23:12   ` Kees Cook
  2015-07-07 20:20 ` [PATCH 29/42] x86: Find correct 64 bit ramdisk address for microcode early update Yinghai Lu
                   ` (15 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

kexec can already place kernel/boot_params/cmd_line/initrd above 4G,
but only through the legacy interface that enters at startup_64 directly.

This patch allows a 64-bit EFI kernel to be loaded above 4G
and started via the EFI HANDOVER PROTOCOL.

The 32-bit code32_start field is currently used to pass the load address
around, so it overflows when the kernel is loaded above 4G.

The patch mainly adds ext_code32_start to hold the upper 32 bits of the
load address.

With this patch, a patched grub2-x86_64.efi can place
kernel/boot_params/cmd_line/initrd all above 4G and execute the kernel
above 4G.
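
A minimal sketch of how the two 32-bit fields could combine into one
64-bit load address. The helper names are hypothetical; only the field
names code32_start and ext_code32_start come from the patch.

```c
#include <stdint.h>

/* Rebuild the 64-bit load address from the two header fields. */
uint64_t sketch_load_address(uint32_t code32_start, uint32_t ext_code32_start)
{
	return ((uint64_t)ext_code32_start << 32) | code32_start;
}

/* Split a 64-bit load address back into the two fields. */
uint32_t sketch_addr_low(uint64_t addr)
{
	return (uint32_t)addr;
}

uint32_t sketch_addr_high(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}
```

For the kernel address 0x15e000000 shown in the log, the low part would
be 0x5e000000 and the high part 1.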

The boot log looks like:

kernel: done                           [ linux  9.25MiB  100%  6.66MiB/s ]
params: [1618fc000,1618fffff]
cmdline: [1618fb000,1618fb7fe]
kernel: [15e000000,161385fff]
initrd: [15bcbe000,15dffffbb]
initrd: 1 file done             [ initrd.img  35.26MiB  100%  11.93MiB/s ]
early console in decompress_kernel
decompress_kernel:
  input: [0x15fd0b3b4-0x16063c803], output: 0x15e000000, heap: [0x160645b00-0x16064daff]

Decompressing Linux... xz... Parsing ELF... done.
Booting the kernel.
[    0.000000] bootconsole [uart0] enabled
[    0.000000]    real_mode_data :      phys 00000001618fc000
[    0.000000]    real_mode_data :      virt ffff8801618fc000
[    0.000000] Kernel Layout:
[    0.000000]   .text: [0x15e000000-0x15f08f72c]
[    0.000000] .rodata: [0x15f200000-0x15fa44fff]
[    0.000000]   .data: [0x15fc00000-0x15fe545ff]
[    0.000000]   .init: [0x15fe56000-0x16021afff]
[    0.000000]    .bss: [0x160229000-0x16135ffff]
[    0.000000]    .brk: [0x161360000-0x161385fff]
[    0.000000] memblock_reserve: [0x0000000009f000-0x000000000fffff] flags 0x0 * BIOS reserved
...
[    0.000000] memblock_reserve: [0x0000015e000000-0x0000016135ffff] flags 0x0 TEXT DATA BSS
[    0.000000] memblock_reserve: [0x0000015bcbe000-0x0000015dffffff] flags 0x0 RAMDISK

-v2: add cast to avoid warning with 32bit, also update description for
     ext_code32_start in boot.txt
-v3: change to 4.0 from 3.20.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 Documentation/x86/boot.txt            | 19 +++++++++++++++++++
 arch/x86/boot/compressed/eboot.c      | 15 ++++++++++-----
 arch/x86/boot/compressed/head_64.S    |  7 ++++++-
 arch/x86/boot/header.S                |  3 ++-
 arch/x86/include/uapi/asm/bootparam.h |  1 +
 arch/x86/kernel/asm-offsets.c         |  1 +
 6 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
index 9da6f35..90efaa2 100644
--- a/Documentation/x86/boot.txt
+++ b/Documentation/x86/boot.txt
@@ -61,6 +61,9 @@ Protocol 2.12:	(Kernel 3.8) Added the xloadflags field and extension fields
 	 	to struct boot_params for loading bzImage and ramdisk
 		above 4G in 64bit.
 
+Protocol 2.14:	(Kernel 4.0) Added the ext_code32_start to support 64bit
+		EFI kernel to be loaded above 4G.
+
 **** MEMORY LAYOUT
 
 The traditional memory map for the kernel loader, used for Image or
@@ -197,6 +200,7 @@ Offset	Proto	Name		Meaning
 0258/8	2.10+	pref_address	Preferred loading address
 0260/4	2.10+	init_size	Linear memory required during initialization
 0264/4	2.11+	handover_offset	Offset of handover entry point
+0268/4	2.14+	ext_code32_start	Extended part for code32_start
 
 (1) For backwards compatibility, if the setup_sects field contains 0, the
     real value is 4.
@@ -744,6 +748,14 @@ Offset/size:	0x264/4
 
   See EFI HANDOVER PROTOCOL below for more details.
 
+Field name:	ext_code32_start
+Type:		modify (optional, reloc)
+Offset/size:	0x268/4
+Protocol:	2.14+
+
+  This field holds the upper 32 bits of the load address when a 64bit EFI
+  kernel is loaded above 4G. Together with code32_start it is compared to
+  pref_address to decide whether the kernel needs to be relocated further.
 
 **** THE IMAGE CHECKSUM
 
@@ -1127,4 +1139,11 @@ The boot loader *must* fill out the following fields in bp,
     o hdr.ramdisk_image (if applicable)
     o hdr.ramdisk_size  (if applicable)
 
+for 64bit, when loading above 4G, *must* fill out the following fields,
+
+    o hdr.ext_code32_start
+    o ext_cmd_line_ptr
+    o ext_ramdisk_image (if applicable)
+    o ext_ramdisk_size  (if applicable)
+
 All other fields should be zero.
diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index 2c82bd1..05d77a5 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -1394,6 +1394,7 @@ struct boot_params *efi_main(struct efi_config *c,
 	void *handle;
 	efi_system_table_t *_table;
 	bool is64;
+	unsigned long loaded_addr;
 
 	efi_early = c;
 
@@ -1435,9 +1436,12 @@ struct boot_params *efi_main(struct efi_config *c,
 	 * If the kernel isn't already loaded at the preferred load
 	 * address, relocate it.
 	 */
-	if (hdr->pref_address != hdr->code32_start) {
-		unsigned long bzimage_addr = hdr->code32_start;
-		status = efi_relocate_kernel(sys_table, &bzimage_addr,
+	loaded_addr = hdr->code32_start;
+	loaded_addr |= (unsigned long)((u64)hdr->ext_code32_start << 32);
+	if (hdr->pref_address != loaded_addr) {
+		unsigned long loaded_addr_orig = loaded_addr;
+
+		status = efi_relocate_kernel(sys_table, &loaded_addr,
 					     hdr->init_size, hdr->init_size,
 					     hdr->pref_address,
 					     hdr->kernel_alignment);
@@ -1446,8 +1450,9 @@ struct boot_params *efi_main(struct efi_config *c,
 			goto fail;
 		}
 
-		hdr->pref_address = hdr->code32_start;
-		hdr->code32_start = bzimage_addr;
+		hdr->pref_address = loaded_addr_orig;
+		hdr->code32_start = loaded_addr & 0xffffffff;
+		hdr->ext_code32_start = (unsigned long)((u64)loaded_addr >> 32);
 	}
 
 	status = exit_boot(boot_params, handle, is64);
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 075bb15..ab52d2c 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -266,6 +266,8 @@ ENTRY(efi_pe_entry)
 	mov	%rax, %rsi
 	leaq	startup_32(%rip), %rax
 	movl	%eax, BP_code32_start(%rsi)
+	shr	$32, %rax
+	movl	%eax, BP_ext_code32_start(%rsi)
 	jmp	2f		/* Skip the relocation */
 
 handover_entry:
@@ -289,7 +291,10 @@ fail:
 	hlt
 	jmp	fail
 2:
-	movl	BP_code32_start(%esi), %eax
+	movl	BP_code32_start(%rsi), %eax
+	movl	BP_ext_code32_start(%rsi), %ebx
+	shl	$32, %rbx
+	orq	%rbx, %rax
 	leaq	preferred_addr(%rax), %rax
 	jmp	*%rax
 
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index 99204e5..09e7c69 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -301,7 +301,7 @@ _start:
 	# Part 2 of the header, from the old setup.S
 
 		.ascii	"HdrS"		# header signature
-		.word	0x020d		# header version number (>= 0x0105)
+		.word	0x020e		# header version number (>= 0x0105)
 					# or else old loadlin-1.5 will fail)
 		.globl realmode_swtch
 realmode_swtch:	.word	0, 0		# default_switch, SETUPSEG
@@ -478,6 +478,7 @@ pref_address:		.quad LOAD_PHYSICAL_ADDR	# preferred load addr
 #endif
 init_size:		.long INIT_SIZE		# kernel initialization size
 handover_offset:	.long 0			# Filled in by build.c
+ext_code32_start:	.long 0			# weird one!
 
 # End of setup header #####################################################
 
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index ab456dc..bb9973d 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -84,6 +84,7 @@ struct setup_header {
 	__u64	pref_address;
 	__u32	init_size;
 	__u32	handover_offset;
+	__u32	ext_code32_start;
 } __attribute__((packed));
 
 struct sys_desc_table {
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index d2e00bc..3f9789f 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -90,6 +90,7 @@ void common(void) {
 	OFFSET(BP_init_size, boot_params, hdr.init_size);
 	OFFSET(BP_pref_address, boot_params, hdr.pref_address);
 	OFFSET(BP_code32_start, boot_params, hdr.code32_start);
+	OFFSET(BP_ext_code32_start, boot_params, hdr.ext_code32_start);
 
 	BLANK();
 	DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
-- 
1.8.4.5



* [PATCH 29/42] x86: Find correct 64 bit ramdisk address for microcode early update
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (27 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 28/42] x86, boot: Allow 64bit EFI kernel to be loaded above 4G Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 23:08   ` Kees Cook
  2015-07-07 20:20 ` [PATCH 30/42] x86: Kill E820_RESERVED_KERN Yinghai Lu
                   ` (14 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

When kexec is used with a 64bit kernel, the bzImage and ramdisk can be
loaded above 4G. The early microcode update code needs to take that into
account to get the correct ramdisk address.

Make get_ramdisk_image() and get_ramdisk_size() global and use them for
the early microcode update.
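
The sketch below shows, under simplifying assumptions, why reading only
hdr.ramdisk_image is not enough once the ramdisk sits above 4G. The
struct is a trimmed, hypothetical mirror of the fields involved (the real
struct boot_params is much larger and has a fixed layout), and the helper
names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed, hypothetical mirror of the fields used here. */
struct hdr_sketch {
	uint32_t ramdisk_image; /* low 32 bits of the ramdisk address */
	uint32_t ramdisk_size;  /* low 32 bits of the ramdisk size */
};

struct boot_params_sketch {
	struct hdr_sketch hdr;
	uint32_t ext_ramdisk_image; /* high 32 bits of the address */
	uint32_t ext_ramdisk_size;  /* high 32 bits of the size */
};

/* Combine the low and high halves into the full 64-bit address. */
static uint64_t ramdisk_image_of(const struct boot_params_sketch *bp)
{
	assert(bp != NULL);
	return (uint64_t)bp->hdr.ramdisk_image |
	       ((uint64_t)bp->ext_ramdisk_image << 32);
}

/* Same combination for the size. */
static uint64_t ramdisk_size_of(const struct boot_params_sketch *bp)
{
	assert(bp != NULL);
	return (uint64_t)bp->hdr.ramdisk_size |
	       ((uint64_t)bp->ext_ramdisk_size << 32);
}
```

Taking the boot_params pointer as an argument, as the patch does, lets
the same helper serve callers that access boot_params through its
physical address as well as callers that use the normal virtual address.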

-v2: update changelog.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/setup.h                |  3 +++
 arch/x86/kernel/cpu/microcode/amd_early.c   | 10 +++++-----
 arch/x86/kernel/cpu/microcode/intel_early.c |  8 ++++----
 arch/x86/kernel/setup.c                     | 28 ++++++++++++++--------------
 4 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 3e5aa41..496515b 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -119,6 +119,9 @@ void *extend_brk(size_t size, size_t align);
 	RESERVE_BRK(name, sizeof(type) * entries)
 
 extern void probe_roms(void);
+u64 get_ramdisk_image(struct boot_params *bp);
+u64 get_ramdisk_size(struct boot_params *bp);
+
 #ifdef __i386__
 
 asmlinkage void __init i386_start_kernel(void);
diff --git a/arch/x86/kernel/cpu/microcode/amd_early.c b/arch/x86/kernel/cpu/microcode/amd_early.c
index e8a215a..4c579c7 100644
--- a/arch/x86/kernel/cpu/microcode/amd_early.c
+++ b/arch/x86/kernel/cpu/microcode/amd_early.c
@@ -51,12 +51,12 @@ static struct cpio_data __init find_ucode_in_initrd(void)
 	 */
 	p       = (struct boot_params *)__pa_nodebug(&boot_params);
 	path    = (char *)__pa_nodebug(ucode_path);
-	start   = (void *)p->hdr.ramdisk_image;
-	size    = p->hdr.ramdisk_size;
+	start   = (void *)(unsigned long)get_ramdisk_image(p);
+	size    = get_ramdisk_size(p);
 #else
 	path    = ucode_path;
-	start   = (void *)(boot_params.hdr.ramdisk_image + PAGE_OFFSET);
-	size    = boot_params.hdr.ramdisk_size;
+	start   = (void *)(get_ramdisk_image(&boot_params) + PAGE_OFFSET);
+	size    = get_ramdisk_size(&boot_params);
 #endif
 
 	return find_cpio_data(path, start, size, &offset);
@@ -396,7 +396,7 @@ int __init save_microcode_in_initrd_amd(void)
 	 */
 	if (relocated_ramdisk)
 		container = (u8 *)(__va(relocated_ramdisk) +
-			     (cont - boot_params.hdr.ramdisk_image));
+			     (cont - get_ramdisk_image(&boot_params)));
 	else
 		container = cont_va;
 
diff --git a/arch/x86/kernel/cpu/microcode/intel_early.c b/arch/x86/kernel/cpu/microcode/intel_early.c
index 8187b72..c85dcb2 100644
--- a/arch/x86/kernel/cpu/microcode/intel_early.c
+++ b/arch/x86/kernel/cpu/microcode/intel_early.c
@@ -736,16 +736,16 @@ void __init load_ucode_intel_bsp(void)
 	struct boot_params *p;
 
 	p	= (struct boot_params *)__pa_nodebug(&boot_params);
-	start	= p->hdr.ramdisk_image;
-	size	= p->hdr.ramdisk_size;
+	start	= get_ramdisk_image(p);
+	size	= get_ramdisk_size(p);
 
 	_load_ucode_intel_bsp(
 			(struct mc_saved_data *)__pa_nodebug(&mc_saved_data),
 			(unsigned long *)__pa_nodebug(&mc_saved_in_initrd),
 			start, size);
 #else
-	start	= boot_params.hdr.ramdisk_image + PAGE_OFFSET;
-	size	= boot_params.hdr.ramdisk_size;
+	start	= get_ramdisk_image(&boot_params) + PAGE_OFFSET;
+	size	= get_ramdisk_size(&boot_params);
 
 	_load_ucode_intel_bsp(&mc_saved_data, mc_saved_in_initrd, start, size);
 #endif
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 80f874b..2d808e6 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -300,19 +300,19 @@ u64 relocated_ramdisk;
 
 #ifdef CONFIG_BLK_DEV_INITRD
 
-static u64 __init get_ramdisk_image(void)
+u64 __init get_ramdisk_image(struct boot_params *bp)
 {
-	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
+	u64 ramdisk_image = bp->hdr.ramdisk_image;
 
-	ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
+	ramdisk_image |= (u64)bp->ext_ramdisk_image << 32;
 
 	return ramdisk_image;
 }
-static u64 __init get_ramdisk_size(void)
+u64 __init get_ramdisk_size(struct boot_params *bp)
 {
-	u64 ramdisk_size = boot_params.hdr.ramdisk_size;
+	u64 ramdisk_size = bp->hdr.ramdisk_size;
 
-	ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
+	ramdisk_size |= (u64)bp->ext_ramdisk_size << 32;
 
 	return ramdisk_size;
 }
@@ -321,8 +321,8 @@ static u64 __init get_ramdisk_size(void)
 static void __init relocate_initrd(void)
 {
 	/* Assume only end is not page aligned */
-	u64 ramdisk_image = get_ramdisk_image();
-	u64 ramdisk_size  = get_ramdisk_size();
+	u64 ramdisk_image = get_ramdisk_image(&boot_params);
+	u64 ramdisk_size  = get_ramdisk_size(&boot_params);
 	u64 area_size     = PAGE_ALIGN(ramdisk_size);
 	unsigned long slop, clen, mapaddr;
 	char *p, *q;
@@ -360,8 +360,8 @@ static void __init relocate_initrd(void)
 		ramdisk_size  -= clen;
 	}
 
-	ramdisk_image = get_ramdisk_image();
-	ramdisk_size  = get_ramdisk_size();
+	ramdisk_image = get_ramdisk_image(&boot_params);
+	ramdisk_size  = get_ramdisk_size(&boot_params);
 	printk(KERN_INFO "Move RAMDISK from [mem %#010llx-%#010llx] to"
 		" [mem %#010llx-%#010llx]\n",
 		ramdisk_image, ramdisk_image + ramdisk_size - 1,
@@ -371,8 +371,8 @@ static void __init relocate_initrd(void)
 static void __init early_reserve_initrd(void)
 {
 	/* Assume only end is not page aligned */
-	u64 ramdisk_image = get_ramdisk_image();
-	u64 ramdisk_size  = get_ramdisk_size();
+	u64 ramdisk_image = get_ramdisk_image(&boot_params);
+	u64 ramdisk_size  = get_ramdisk_size(&boot_params);
 	u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 
 	if (!boot_params.hdr.type_of_loader ||
@@ -384,8 +384,8 @@ static void __init early_reserve_initrd(void)
 static void __init reserve_initrd(void)
 {
 	/* Assume only end is not page aligned */
-	u64 ramdisk_image = get_ramdisk_image();
-	u64 ramdisk_size  = get_ramdisk_size();
+	u64 ramdisk_image = get_ramdisk_image(&boot_params);
+	u64 ramdisk_size  = get_ramdisk_size(&boot_params);
 	u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 	u64 mapped_size;
 
-- 
1.8.4.5



* [PATCH 30/42] x86: Kill E820_RESERVED_KERN
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (28 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 29/42] x86: Find correct 64 bit ramdisk address for microcode early update Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 31/42] x86, efi: Copy SETUP_EFI data and access directly Yinghai Lu
                   ` (13 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Lee, Chun-Yi, stable

We now use memblock for early resource reservation/allocation instead
of using the e820 map directly, and setup_data is already reserved in
memblock early.
Also, kexec generates setup_data and passes the pointer to the second
kernel, so the second kernel reserves setup_data on its own.
(kexec-tools currently creates SETUP_EFI and SETUP_E820_EXT.)

So we can kill E820_RESERVED_KERN and not touch the e820 map at all.

That fixes a bug in e820_mark_nosave_regions(), which cannot handle the
case where E820_RAM and E820_RESERVED_KERN ranges are contiguous and the
boundary between them is not page aligned.
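
The failure mode can be illustrated with a simplified model of the
nosave walk. This is only a sketch: the struct, the type constants and
count_nosave_pfns() are hypothetical, and the loop counts page frames
instead of registering regions. It assumes the walk rounds each entry's
start up and its end down to a page frame number, which is how the loop
touched by this patch is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define PAGE_SIZE   ((uint64_t)1 << PAGE_SHIFT)
#define PFN_UP(x)   (((uint64_t)(x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x) ((uint64_t)(x) >> PAGE_SHIFT)

enum { RAM = 1, RESERVED_KERN = 128 }; /* illustrative type values */

struct range {
	uint64_t addr;
	uint64_t size;
	int type;
};

/* Count page frames that the modelled walk would mark as nosave. */
static uint64_t count_nosave_pfns(const struct range *map, int n)
{
	uint64_t pfn = 0, nosave = 0;

	assert(map != NULL || n == 0);
	for (int i = 0; i < n; i++) {
		uint64_t start = PFN_UP(map[i].addr);

		/* Gap between the previous entry's end and this start. */
		if (pfn < start)
			nosave += start - pfn;

		pfn = PFN_DOWN(map[i].addr + map[i].size);

		/* Entries of any other type are nosave as a whole. */
		if (map[i].type != RAM && map[i].type != RESERVED_KERN &&
		    pfn > start)
			nosave += pfn - start;
	}
	return nosave;
}
```

With one RAM entry covering [0x0, 0x4000) nothing is marked. If the same
memory is split at 0x2800 into a RAM entry and a RESERVED_KERN entry, the
first entry ends at pfn 2 (rounded down) and the second starts at pfn 3
(rounded up), so page 2 is counted as nosave although it is ordinary RAM.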

Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=913885
Reported-by: "Lee, Chun-Yi" <jlee@suse.com>
Tested-by: "Lee, Chun-Yi" <jlee@suse.com>
Cc: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@vger.kernel.org
---
 arch/x86/include/uapi/asm/e820.h |  8 --------
 arch/x86/kernel/e820.c           |  6 ++----
 arch/x86/kernel/setup.c          | 25 -------------------------
 arch/x86/kernel/tboot.c          |  3 +--
 arch/x86/mm/init_64.c            | 11 ++++-------
 5 files changed, 7 insertions(+), 46 deletions(-)

diff --git a/arch/x86/include/uapi/asm/e820.h b/arch/x86/include/uapi/asm/e820.h
index 0f457e6..a9216a1 100644
--- a/arch/x86/include/uapi/asm/e820.h
+++ b/arch/x86/include/uapi/asm/e820.h
@@ -45,14 +45,6 @@
  */
 #define E820_PRAM	12
 
-/*
- * reserved RAM used by kernel itself
- * if CONFIG_INTEL_TXT is enabled, memory of this type will be
- * included in the S3 integrity calculation and so should not include
- * any memory that BIOS might alter over the S3 transition
- */
-#define E820_RESERVED_KERN        128
-
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
 struct e820entry {
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 46ec08d..49d8c50 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -134,7 +134,6 @@ static void __init e820_print_type(u32 type)
 {
 	switch (type) {
 	case E820_RAM:
-	case E820_RESERVED_KERN:
 		printk(KERN_CONT "usable");
 		break;
 	case E820_RESERVED:
@@ -693,7 +692,7 @@ void __init e820_mark_nosave_regions(unsigned long limit_pfn)
 
 		pfn = PFN_DOWN(ei->addr + ei->size);
 
-		if (ei->type != E820_RAM && ei->type != E820_RESERVED_KERN)
+		if (ei->type != E820_RAM)
 			register_nosave_region(PFN_UP(ei->addr), pfn);
 
 		if (pfn >= limit_pfn)
@@ -910,7 +909,6 @@ void __init finish_e820_parsing(void)
 static inline const char *e820_type_to_string(int e820_type)
 {
 	switch (e820_type) {
-	case E820_RESERVED_KERN:
 	case E820_RAM:	return "System RAM";
 	case E820_ACPI:	return "ACPI Tables";
 	case E820_NVS:	return "ACPI Non-volatile Storage";
@@ -1107,7 +1105,7 @@ void __init memblock_x86_fill(void)
 		if (end != (resource_size_t)end)
 			continue;
 
-		if (ei->type != E820_RAM && ei->type != E820_RESERVED_KERN)
+		if (ei->type != E820_RAM)
 			continue;
 
 		memblock_add(ei->addr, ei->size);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 2d808e6..a3b65f1 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -457,29 +457,6 @@ static void __init parse_setup_data(void)
 	}
 }
 
-static void __init e820_reserve_setup_data(void)
-{
-	struct setup_data *data;
-	u64 pa_data;
-
-	pa_data = boot_params.hdr.setup_data;
-	if (!pa_data)
-		return;
-
-	while (pa_data) {
-		data = early_memremap(pa_data, sizeof(*data));
-		e820_update_range(pa_data, sizeof(*data)+data->len,
-			 E820_RAM, E820_RESERVED_KERN);
-		pa_data = data->next;
-		early_memunmap(data, sizeof(*data));
-	}
-
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
-	memcpy(&e820_saved, &e820, sizeof(struct e820map));
-	printk(KERN_INFO "extended physical RAM map:\n");
-	e820_print_map("reserve setup_data");
-}
-
 static void __init memblock_x86_reserve_range_setup_data(void)
 {
 	struct setup_data *data;
@@ -1018,8 +995,6 @@ void __init setup_arch(char **cmdline_p)
 		early_dump_pci_devices();
 #endif
 
-	/* update the e820_saved too */
-	e820_reserve_setup_data();
 	finish_e820_parsing();
 
 	if (efi_enabled(EFI_BOOT))
diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 91a4496..3c2752a 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -195,8 +195,7 @@ static int tboot_setup_sleep(void)
 	tboot->num_mac_regions = 0;
 
 	for (i = 0; i < e820.nr_map; i++) {
-		if ((e820.map[i].type != E820_RAM)
-		 && (e820.map[i].type != E820_RESERVED_KERN))
+		if (e820.map[i].type != E820_RAM)
 			continue;
 
 		add_mac_region(e820.map[i].addr, e820.map[i].size);
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 6f457a4..257ba4b 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -340,8 +340,7 @@ phys_pte_init(pte_t *pte_page, unsigned long addr, unsigned long end,
 		next = (addr & PAGE_MASK) + PAGE_SIZE;
 		if (addr >= end) {
 			if (!after_bootmem &&
-			    !e820_any_mapped(addr & PAGE_MASK, next, E820_RAM) &&
-			    !e820_any_mapped(addr & PAGE_MASK, next, E820_RESERVED_KERN))
+			    !e820_any_mapped(addr & PAGE_MASK, next, E820_RAM))
 				set_pte(pte, __pte(0));
 			continue;
 		}
@@ -387,9 +386,8 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end,
 
 		next = (address & PMD_MASK) + PMD_SIZE;
 		if (address >= end) {
-			if (!after_bootmem &&
-			    !e820_any_mapped(address & PMD_MASK, next, E820_RAM) &&
-			    !e820_any_mapped(address & PMD_MASK, next, E820_RESERVED_KERN))
+			if (!after_bootmem && !e820_any_mapped(
+					address & PMD_MASK, next, E820_RAM))
 				set_pmd(pmd, __pmd(0));
 			continue;
 		}
@@ -462,8 +460,7 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
 		next = (addr & PUD_MASK) + PUD_SIZE;
 		if (addr >= end) {
 			if (!after_bootmem &&
-			    !e820_any_mapped(addr & PUD_MASK, next, E820_RAM) &&
-			    !e820_any_mapped(addr & PUD_MASK, next, E820_RESERVED_KERN))
+			    !e820_any_mapped(addr & PUD_MASK, next, E820_RAM))
 				set_pud(pud, __pud(0));
 			continue;
 		}
-- 
1.8.4.5



* [PATCH 31/42] x86, efi: Copy SETUP_EFI data and access directly
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (29 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 30/42] x86: Kill E820_RESERVED_KERN Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-22 10:58     ` Matt Fleming
  2015-07-24  2:07     ` Dave Young
  2015-07-07 20:20 ` [PATCH 32/42] x86, of: Let add_dtb reserve setup_data locally Yinghai Lu
                   ` (12 subsequent siblings)
  43 siblings, 2 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Matt Fleming, linux-efi

Copy the SETUP_EFI data into an __initdata variable; it is small.

We can then access the setup_data through a pointer instead of calling
early_memremap() everywhere.
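
The pattern is "map once, copy, unmap, then use the cached copy". A
minimal user-space sketch follows; the payload struct, fake_map() and
fake_unmap() are hypothetical stand-ins (physical memory is modelled as
a plain byte buffer), not the kernel's early_memremap() interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload; the real struct efi_setup_data differs. */
struct payload {
	uint64_t tables;
	uint64_t smbios;
};

static struct payload cached_payload; /* would be __initdata in the kernel */
static struct payload *cached;        /* NULL until parse_payload() runs */
static int live_mappings;             /* mappings currently outstanding */

/* Stand-in for a temporary early mapping of physical memory. */
static void *fake_map(unsigned char *phys_mem, uint64_t pa, size_t len)
{
	(void)len;
	live_mappings++;
	return phys_mem + pa;
}

/* Stand-in for tearing the temporary mapping down again. */
static void fake_unmap(void *va, size_t len)
{
	(void)va;
	(void)len;
	live_mappings--;
}

/* Map the payload once, copy it out, and drop the mapping. */
static void parse_payload(unsigned char *phys_mem, uint64_t pa)
{
	void *p = fake_map(phys_mem, pa, sizeof(cached_payload));

	assert(p != NULL);
	memcpy(&cached_payload, p, sizeof(cached_payload));
	fake_unmap(p, sizeof(cached_payload));
	cached = &cached_payload;
}
```

Later users only test the pointer and dereference it, so they no longer
need their own map/unmap pairs or the error paths that go with them.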

Cc: Matt Fleming <matt.fleming@intel.com>
Cc: linux-efi@vger.kernel.org
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/efi.h     |  2 +-
 arch/x86/platform/efi/efi.c    | 13 ++-----------
 arch/x86/platform/efi/efi_64.c | 10 +++++++++-
 arch/x86/platform/efi/quirks.c | 23 ++++++-----------------
 4 files changed, 18 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 155162e..a3e3aee 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -116,7 +116,7 @@ struct efi_setup_data {
 	u64 reserved[8];
 };
 
-extern u64 efi_setup;
+extern struct efi_setup_data *efi_setup;
 
 #ifdef CONFIG_EFI
 
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index cfba30f..33036ce 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -68,7 +68,7 @@ static efi_config_table_type_t arch_tables[] __initdata = {
 	{NULL_GUID, NULL, NULL},
 };
 
-u64 efi_setup;		/* efi setup_data physical address */
+struct efi_setup_data *efi_setup __initdata; /* cached efi setup_data pointer */
 
 static int add_efi_memmap __initdata;
 static int __init setup_add_efi_memmap(char *arg)
@@ -257,20 +257,13 @@ static int __init efi_systab_init(void *phys)
 {
 	if (efi_enabled(EFI_64BIT)) {
 		efi_system_table_64_t *systab64;
-		struct efi_setup_data *data = NULL;
+		struct efi_setup_data *data = efi_setup;
 		u64 tmp = 0;
 
-		if (efi_setup) {
-			data = early_memremap(efi_setup, sizeof(*data));
-			if (!data)
-				return -ENOMEM;
-		}
 		systab64 = early_memremap((unsigned long)phys,
 					 sizeof(*systab64));
 		if (systab64 == NULL) {
 			pr_err("Couldn't map the system table!\n");
-			if (data)
-				early_memunmap(data, sizeof(*data));
 			return -ENOMEM;
 		}
 
@@ -303,8 +296,6 @@ static int __init efi_systab_init(void *phys)
 		tmp |= data ? data->tables : systab64->tables;
 
 		early_memunmap(systab64, sizeof(*systab64));
-		if (data)
-			early_memunmap(data, sizeof(*data));
 #ifdef CONFIG_X86_32
 		if (tmp >> 32) {
 			pr_err("EFI data located above 4GB, disabling EFI.\n");
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index a0ac0f9..a255491 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -295,9 +295,17 @@ void __iomem *__init efi_ioremap(unsigned long phys_addr, unsigned long size,
 	return (void __iomem *)__va(phys_addr);
 }
 
+static struct efi_setup_data efi_setup_data __initdata;
+
 void __init parse_efi_setup(u64 phys_addr, u32 data_len)
 {
-	efi_setup = phys_addr + sizeof(struct setup_data);
+	struct efi_setup_data *data;
+
+	data = early_memremap(phys_addr + sizeof(struct setup_data),
+			      sizeof(*data));
+	efi_setup_data = *data;
+	early_memunmap(data, sizeof(*data));
+	efi_setup = &efi_setup_data;
 }
 
 void __init efi_runtime_mkexec(void)
diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 1c7380d..45fec7d 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -203,9 +203,8 @@ void __init efi_free_boot_services(void)
  */
 int __init efi_reuse_config(u64 tables, int nr_tables)
 {
-	int i, sz, ret = 0;
+	int i, sz;
 	void *p, *tablep;
-	struct efi_setup_data *data;
 
 	if (!efi_setup)
 		return 0;
@@ -213,22 +212,15 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
 	if (!efi_enabled(EFI_64BIT))
 		return 0;
 
-	data = early_memremap(efi_setup, sizeof(*data));
-	if (!data) {
-		ret = -ENOMEM;
-		goto out;
-	}
-
-	if (!data->smbios)
-		goto out_memremap;
+	if (!efi_setup->smbios)
+		return 0;
 
 	sz = sizeof(efi_config_table_64_t);
 
 	p = tablep = early_memremap(tables, nr_tables * sz);
 	if (!p) {
 		pr_err("Could not map Configuration table!\n");
-		ret = -ENOMEM;
-		goto out_memremap;
+		return -ENOMEM;
 	}
 
 	for (i = 0; i < efi.systab->nr_tables; i++) {
@@ -237,15 +229,12 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
 		guid = ((efi_config_table_64_t *)p)->guid;
 
 		if (!efi_guidcmp(guid, SMBIOS_TABLE_GUID))
-			((efi_config_table_64_t *)p)->table = data->smbios;
+			((efi_config_table_64_t *)p)->table = efi_setup->smbios;
 		p += sz;
 	}
 	early_memunmap(tablep, nr_tables * sz);
 
-out_memremap:
-	early_memunmap(data, sizeof(*data));
-out:
-	return ret;
+	return 0;
 }
 
 void __init efi_apply_memmap_quirks(void)
-- 
1.8.4.5



* [PATCH 32/42] x86, of: Let add_dtb reserve setup_data locally
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (30 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 31/42] x86, efi: Copy SETUP_EFI data and access directly Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 33/42] x86, boot: Add add_pci handler for SETUP_PCI Yinghai Lu
                   ` (11 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Rob Herring, David Vrabel

We will no longer reserve setup_data in generic code. Every handler needs
to reserve and copy its setup_data locally.

The current dtb handling already has code for copying, so just add the
reserve code.

Also simplify the code a bit by storing the real dtb size.
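
To make the "real dtb size" step concrete: the handler first maps only a
small window that is sure to contain the flattened device tree header and
reads the total size from it. The sketch below models both steps in plain
C. The helper names are hypothetical, a 4096-byte page is assumed, and
the header layout used is the standard FDT one (totalsize is a big-endian
32-bit value at byte offset 4).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define PAGE_MASK (~(uint64_t)(PAGE_SIZE - 1))

/* Bytes from pa to the end of its page, but never less than 128. */
static uint32_t header_map_len(uint64_t pa)
{
	uint64_t to_page_end = PAGE_SIZE - (pa & ~PAGE_MASK);

	return (uint32_t)(to_page_end > 128 ? to_page_end : 128);
}

/* Read the big-endian totalsize field at byte offset 4 of the header. */
static uint32_t fdt_totalsize_of(const unsigned char *hdr)
{
	assert(hdr != NULL);
	return ((uint32_t)hdr[4] << 24) | ((uint32_t)hdr[5] << 16) |
	       ((uint32_t)hdr[6] << 8) | (uint32_t)hdr[7];
}
```

Once the size is known, the later unflatten step can map exactly that
many bytes in one go instead of mapping, checking and remapping.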

Cc: Rob Herring <robh@kernel.org>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/prom.h  |  9 ++++++---
 arch/x86/kernel/devicetree.c | 39 +++++++++++++++++++++------------------
 2 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/prom.h b/arch/x86/include/asm/prom.h
index 1d081ac..fb716eddc 100644
--- a/arch/x86/include/asm/prom.h
+++ b/arch/x86/include/asm/prom.h
@@ -24,17 +24,20 @@
 
 #ifdef CONFIG_OF
 extern int of_ioapic;
-extern u64 initial_dtb;
-extern void add_dtb(u64 data);
 void x86_of_pci_init(void);
 void x86_dtb_init(void);
 #else
-static inline void add_dtb(u64 data) { }
 static inline void x86_of_pci_init(void) { }
 static inline void x86_dtb_init(void) { }
 #define of_ioapic 0
 #endif
 
+#ifdef CONFIG_OF_FLATTREE
+extern void add_dtb(u64 data);
+#else
+static inline void add_dtb(u64 data) { }
+#endif
+
 extern char cmd_line[COMMAND_LINE_SIZE];
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/devicetree.c b/arch/x86/kernel/devicetree.c
index 1f4acd6..19fb3cf 100644
--- a/arch/x86/kernel/devicetree.c
+++ b/arch/x86/kernel/devicetree.c
@@ -2,6 +2,7 @@
  * Architecture specific OF callbacks.
  */
 #include <linux/bootmem.h>
+#include <linux/memblock.h>
 #include <linux/export.h>
 #include <linux/io.h>
 #include <linux/interrupt.h>
@@ -23,7 +24,6 @@
 #include <asm/setup.h>
 #include <asm/i8259.h>
 
-__initdata u64 initial_dtb;
 char __initdata cmd_line[COMMAND_LINE_SIZE];
 
 int __initdata of_ioapic;
@@ -43,11 +43,23 @@ void * __init early_init_dt_alloc_memory_arch(u64 size, u64 align)
 	return __alloc_bootmem(size, align, __pa(MAX_DMA_ADDRESS));
 }
 
+#ifdef CONFIG_OF_FLATTREE
+static u64 initial_dtb __initdata;
+static u32 initial_dtb_size __initdata;
 void __init add_dtb(u64 data)
 {
+	u32 map_len;
+
 	initial_dtb = data + offsetof(struct setup_data, data);
-}
 
+	map_len = max(PAGE_SIZE - (initial_dtb & ~PAGE_MASK), (u64)128);
+	initial_boot_params = early_memremap(initial_dtb, map_len);
+	initial_dtb_size = of_get_flat_dt_size();
+	early_memunmap(initial_boot_params, map_len);
+	initial_boot_params = NULL;
+	memblock_reserve(initial_dtb, initial_dtb_size);
+}
+#endif
 /*
  * CE4100 ids. Will be moved to machine_device_initcall() once we have it.
  */
@@ -265,31 +277,22 @@ static void __init dtb_apic_setup(void)
 	dtb_ioapic_setup();
 }
 
-#ifdef CONFIG_OF_FLATTREE
 static void __init x86_flattree_get_config(void)
 {
-	u32 size, map_len;
+#ifdef CONFIG_OF_FLATTREE
 	void *dt;
 
 	if (!initial_dtb)
 		return;
 
-	map_len = max(PAGE_SIZE - (initial_dtb & ~PAGE_MASK), (u64)128);
-
-	initial_boot_params = dt = early_memremap(initial_dtb, map_len);
-	size = of_get_flat_dt_size();
-	if (map_len < size) {
-		early_memunmap(dt, map_len);
-		initial_boot_params = dt = early_memremap(initial_dtb, size);
-		map_len = size;
-	}
-
+	initial_boot_params = dt = early_memremap(initial_dtb,
+						  initial_dtb_size);
 	unflatten_and_copy_device_tree();
-	early_memunmap(dt, map_len);
-}
-#else
-static inline void x86_flattree_get_config(void) { }
+	early_memunmap(dt, initial_dtb_size);
+
+	memblock_free(initial_dtb, initial_dtb_size);
 #endif
+}
 
 void __init x86_dtb_init(void)
 {
-- 
1.8.4.5



* [PATCH 33/42] x86, boot: Add add_pci handler for SETUP_PCI
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (31 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 32/42] x86, of: Let add_dtb reserve setup_data locally Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-14 22:30   ` Bjorn Helgaas
  2015-07-07 20:20 ` [PATCH 34/42] x86: Kill not used setup_data handling code Yinghai Lu
                   ` (10 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Bjorn Helgaas, Matt Fleming, linux-pci

Let add_pci() reserve its setup_data and keep its own list.

Also clear hdr.setup_data, as all handlers now handle or reserve
setup_data locally.
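
The private list is built by re-linking each SETUP_PCI node through its
own next field. A small model of that idea, with physical memory
represented as a byte buffer, is sketched below; node_hdr, private_head,
add_node() and count_nodes() are hypothetical names for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical node header modelled on a next/type/len prefix. */
struct node_hdr {
	uint64_t next; /* physical address of the next node, 0 ends the list */
	uint32_t type;
	uint32_t len;
};

static uint64_t private_head; /* head of the handler's own list */

/* Prepend the node at pa, overwriting its next field in place. */
static void add_node(unsigned char *mem, uint64_t pa)
{
	struct node_hdr h;

	assert(mem != NULL && pa != 0);
	memcpy(&h, mem + pa, sizeof(h));
	h.next = private_head;
	memcpy(mem + pa, &h, sizeof(h));
	private_head = pa;
}

/* Walk the private list and count its nodes. */
static int count_nodes(const unsigned char *mem)
{
	int n = 0;
	uint64_t pa = private_head;

	while (pa) {
		struct node_hdr h;

		memcpy(&h, mem + pa, sizeof(h));
		n++;
		pa = h.next;
	}
	return n;
}
```

Because the node's next field is overwritten, the caller has to read the
original next pointer before handing the node over, which is what
parse_setup_data() does by saving pa_next ahead of the switch.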

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/pci.h |  2 ++
 arch/x86/kernel/setup.c    |  8 ++++++++
 arch/x86/pci/common.c      | 42 ++++++++++++++++++++++++++++--------------
 3 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 4625943..7d2468c 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -80,8 +80,10 @@ extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 
 #ifdef CONFIG_PCI
 extern void early_quirks(void);
+void add_pci(u64 pa_data);
 #else
 static inline void early_quirks(void) { }
+static inline void add_pci(u64 pa_data) { }
 #endif
 
 extern void pci_iommu_alloc(void);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index a3b65f1..de0f830 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -440,6 +440,8 @@ static void __init parse_setup_data(void)
 		pa_next = data->next;
 		early_memunmap(data, sizeof(*data));
 
+		printk(KERN_DEBUG "setup_data type: %d @ %#010llx\n",
+				data_type, pa_data);
 		switch (data_type) {
 		case SETUP_E820_EXT:
 			parse_e820_ext(pa_data, data_len);
@@ -447,14 +449,20 @@ static void __init parse_setup_data(void)
 		case SETUP_DTB:
 			add_dtb(pa_data);
 			break;
+		case SETUP_PCI:
+			add_pci(pa_data);
+			break;
 		case SETUP_EFI:
 			parse_efi_setup(pa_data, data_len);
 			break;
 		default:
+			pr_warn("Unknown setup_data type: %d @ %#010llx ignored!\n",
+				data_type, pa_data);
 			break;
 		}
 		pa_data = pa_next;
 	}
+	boot_params.hdr.setup_data = 0; /* all done */
 }
 
 static void __init memblock_x86_reserve_range_setup_data(void)
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 8fd6f44..16ace12 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -9,6 +9,7 @@
 #include <linux/pci-acpi.h>
 #include <linux/ioport.h>
 #include <linux/init.h>
+#include <linux/memblock.h>
 #include <linux/dmi.h>
 #include <linux/slab.h>
 
@@ -641,31 +642,44 @@ unsigned int pcibios_assign_all_busses(void)
 	return (pci_probe & PCI_ASSIGN_ALL_BUSSES) ? 1 : 0;
 }
 
+static u64 pci_setup_data;
+void __init add_pci(u64 pa_data)
+{
+	struct setup_data *data;
+
+	data = early_memremap(pa_data, sizeof(*data));
+	memblock_reserve(pa_data, sizeof(*data) + data->len);
+	data->next = pci_setup_data;
+	pci_setup_data = pa_data;
+	early_memunmap(data, sizeof(*data));
+}
+
 int pcibios_add_device(struct pci_dev *dev)
 {
 	struct setup_data *data;
 	struct pci_setup_rom *rom;
 	u64 pa_data;
 
-	pa_data = boot_params.hdr.setup_data;
+	pa_data = pci_setup_data;
 	while (pa_data) {
 		data = ioremap(pa_data, sizeof(*rom));
 		if (!data)
 			return -ENOMEM;
 
-		if (data->type == SETUP_PCI) {
-			rom = (struct pci_setup_rom *)data;
-
-			if ((pci_domain_nr(dev->bus) == rom->segment) &&
-			    (dev->bus->number == rom->bus) &&
-			    (PCI_SLOT(dev->devfn) == rom->device) &&
-			    (PCI_FUNC(dev->devfn) == rom->function) &&
-			    (dev->vendor == rom->vendor) &&
-			    (dev->device == rom->devid)) {
-				dev->rom = pa_data +
-				      offsetof(struct pci_setup_rom, romdata);
-				dev->romlen = rom->pcilen;
-			}
+		rom = (struct pci_setup_rom *)data;
+
+		if ((pci_domain_nr(dev->bus) == rom->segment) &&
+		    (dev->bus->number == rom->bus) &&
+		    (PCI_SLOT(dev->devfn) == rom->device) &&
+		    (PCI_FUNC(dev->devfn) == rom->function) &&
+		    (dev->vendor == rom->vendor) &&
+		    (dev->device == rom->devid)) {
+			dev->rom = pa_data +
+			      offsetof(struct pci_setup_rom, romdata);
+			dev->romlen = rom->pcilen;
+			dev_printk(KERN_DEBUG, &dev->dev, "set rom to [%#010lx, %#010lx] via SETUP_PCI\n",
+				   (unsigned long)dev->rom,
+				   (unsigned long)(dev->rom + dev->romlen - 1));
 		}
 		pa_data = data->next;
 		iounmap(data);
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 34/42] x86: Kill not used setup_data handling code
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (32 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 33/42] x86, boot: Add add_pci handler for SETUP_PCI Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 35/42] x86, boot, PCI: Convert SETUP_PCI data to list Yinghai Lu
                   ` (9 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Matt Fleming

Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/kdebugfs.c | 142 ---------------------------------------------
 arch/x86/kernel/setup.c    |  17 ------
 2 files changed, 159 deletions(-)

diff --git a/arch/x86/kernel/kdebugfs.c b/arch/x86/kernel/kdebugfs.c
index dc1404b..c8ca86c 100644
--- a/arch/x86/kernel/kdebugfs.c
+++ b/arch/x86/kernel/kdebugfs.c
@@ -21,142 +21,6 @@ struct dentry *arch_debugfs_dir;
 EXPORT_SYMBOL(arch_debugfs_dir);
 
 #ifdef CONFIG_DEBUG_BOOT_PARAMS
-struct setup_data_node {
-	u64 paddr;
-	u32 type;
-	u32 len;
-};
-
-static ssize_t setup_data_read(struct file *file, char __user *user_buf,
-			       size_t count, loff_t *ppos)
-{
-	struct setup_data_node *node = file->private_data;
-	unsigned long remain;
-	loff_t pos = *ppos;
-	struct page *pg;
-	void *p;
-	u64 pa;
-
-	if (pos < 0)
-		return -EINVAL;
-
-	if (pos >= node->len)
-		return 0;
-
-	if (count > node->len - pos)
-		count = node->len - pos;
-
-	pa = node->paddr + sizeof(struct setup_data) + pos;
-	pg = pfn_to_page((pa + count - 1) >> PAGE_SHIFT);
-	if (PageHighMem(pg)) {
-		p = ioremap_cache(pa, count);
-		if (!p)
-			return -ENXIO;
-	} else
-		p = __va(pa);
-
-	remain = copy_to_user(user_buf, p, count);
-
-	if (PageHighMem(pg))
-		iounmap(p);
-
-	if (remain)
-		return -EFAULT;
-
-	*ppos = pos + count;
-
-	return count;
-}
-
-static const struct file_operations fops_setup_data = {
-	.read		= setup_data_read,
-	.open		= simple_open,
-	.llseek		= default_llseek,
-};
-
-static int __init
-create_setup_data_node(struct dentry *parent, int no,
-		       struct setup_data_node *node)
-{
-	struct dentry *d, *type, *data;
-	char buf[16];
-
-	sprintf(buf, "%d", no);
-	d = debugfs_create_dir(buf, parent);
-	if (!d)
-		return -ENOMEM;
-
-	type = debugfs_create_x32("type", S_IRUGO, d, &node->type);
-	if (!type)
-		goto err_dir;
-
-	data = debugfs_create_file("data", S_IRUGO, d, node, &fops_setup_data);
-	if (!data)
-		goto err_type;
-
-	return 0;
-
-err_type:
-	debugfs_remove(type);
-err_dir:
-	debugfs_remove(d);
-	return -ENOMEM;
-}
-
-static int __init create_setup_data_nodes(struct dentry *parent)
-{
-	struct setup_data_node *node;
-	struct setup_data *data;
-	int error;
-	struct dentry *d;
-	struct page *pg;
-	u64 pa_data;
-	int no = 0;
-
-	d = debugfs_create_dir("setup_data", parent);
-	if (!d)
-		return -ENOMEM;
-
-	pa_data = boot_params.hdr.setup_data;
-
-	while (pa_data) {
-		node = kmalloc(sizeof(*node), GFP_KERNEL);
-		if (!node) {
-			error = -ENOMEM;
-			goto err_dir;
-		}
-
-		pg = pfn_to_page((pa_data+sizeof(*data)-1) >> PAGE_SHIFT);
-		if (PageHighMem(pg)) {
-			data = ioremap_cache(pa_data, sizeof(*data));
-			if (!data) {
-				kfree(node);
-				error = -ENXIO;
-				goto err_dir;
-			}
-		} else
-			data = __va(pa_data);
-
-		node->paddr = pa_data;
-		node->type = data->type;
-		node->len = data->len;
-		error = create_setup_data_node(d, no, node);
-		pa_data = data->next;
-
-		if (PageHighMem(pg))
-			iounmap(data);
-		if (error)
-			goto err_dir;
-		no++;
-	}
-
-	return 0;
-
-err_dir:
-	debugfs_remove(d);
-	return error;
-}
-
 static struct debugfs_blob_wrapper boot_params_blob = {
 	.data		= &boot_params,
 	.size		= sizeof(boot_params),
@@ -181,14 +45,8 @@ static int __init boot_params_kdebugfs_init(void)
 	if (!data)
 		goto err_version;
 
-	error = create_setup_data_nodes(dbp);
-	if (error)
-		goto err_data;
-
 	return 0;
 
-err_data:
-	debugfs_remove(data);
 err_version:
 	debugfs_remove(version);
 err_dir:
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index de0f830..35d9ff5 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -465,20 +465,6 @@ static void __init parse_setup_data(void)
 	boot_params.hdr.setup_data = 0; /* all done */
 }
 
-static void __init memblock_x86_reserve_range_setup_data(void)
-{
-	struct setup_data *data;
-	u64 pa_data;
-
-	pa_data = boot_params.hdr.setup_data;
-	while (pa_data) {
-		data = early_memremap(pa_data, sizeof(*data));
-		memblock_reserve(pa_data, sizeof(*data) + data->len);
-		pa_data = data->next;
-		early_memunmap(data, sizeof(*data));
-	}
-}
-
 /*
  * --------- Crashkernel reservation ------------------------------
  */
@@ -988,9 +974,6 @@ void __init setup_arch(char **cmdline_p)
 
 	x86_report_nx();
 
-	/* after early param, so could get panic from serial */
-	memblock_x86_reserve_range_setup_data();
-
 	if (acpi_mps_check()) {
 #ifdef CONFIG_X86_LOCAL_APIC
 		disable_apic = 1;
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 35/42] x86, boot, PCI: Convert SETUP_PCI data to list
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (33 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 34/42] x86: Kill not used setup_data handling code Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-14 22:35   ` Bjorn Helgaas
  2015-07-07 20:20 ` [PATCH 36/42] x86, boot, PCI: Copy SETUP_PCI rom to kernel space Yinghai Lu
                   ` (8 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Bjorn Helgaas, linux-pci

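Copy the SETUP_PCI records into a list at boot time, so that
pcibios_add_device() can walk the list instead of calling ioremap()
on every record for every device.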
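
For illustration only, here is a minimal userspace sketch of the cached
lookup. It is not the kernel code: struct rom_entry, struct dev_id,
rom_entry_add() and rom_entry_find() are hypothetical names, and a plain
singly linked list stands in for the kernel's list_head.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one cached SETUP_PCI record. */
struct rom_entry {
	struct rom_entry *next;
	uint16_t vendor, devid;
	unsigned long segment, bus, device, function;
	uint64_t romdata;	/* physical address of the ROM image */
	uint64_t pcilen;	/* length of the ROM image in bytes */
};

/* Hypothetical identity of the device being added. */
struct dev_id {
	uint16_t vendor, devid;
	unsigned long segment, bus, device, function;
};

/* Push a record on the front of the list, much as list_add() would. */
static void rom_entry_add(struct rom_entry **head, struct rom_entry *e)
{
	e->next = *head;
	*head = e;
}

/* Return the first cached record matching the device, or NULL. */
static const struct rom_entry *rom_entry_find(const struct rom_entry *head,
					      const struct dev_id *id)
{
	for (; head; head = head->next) {
		if (head->segment == id->segment &&
		    head->bus == id->bus &&
		    head->device == id->device &&
		    head->function == id->function &&
		    head->vendor == id->vendor &&
		    head->devid == id->devid)
			return head;
	}
	return NULL;
}
```

Unlike this sketch, the loop in the patch does not stop at the first
match. The point is only that the comparison runs on plain memory and
needs no mapping call per record.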

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/pci.h |  2 ++
 arch/x86/kernel/setup.c    |  1 +
 arch/x86/pci/common.c      | 77 +++++++++++++++++++++++++++++++++++++---------
 3 files changed, 65 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 7d2468c..1c905a8 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -81,9 +81,11 @@ extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 #ifdef CONFIG_PCI
 extern void early_quirks(void);
 void add_pci(u64 pa_data);
+int fill_setup_pci_entries(void);
 #else
 static inline void early_quirks(void) { }
 static inline void add_pci(u64 pa_data) { }
+static inline int fill_setup_pci_entries(void) { return 0; }
 #endif
 
 extern void pci_iommu_alloc(void);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 35d9ff5..6badf66 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1180,6 +1180,7 @@ void __init setup_arch(char **cmdline_p)
 	acpi_boot_init();
 	sfi_init();
 	x86_dtb_init();
+	fill_setup_pci_entries();
 
 	/*
 	 * get boot-time SMP configuration:
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 16ace12..32d4f21 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -642,7 +642,7 @@ unsigned int pcibios_assign_all_busses(void)
 	return (pci_probe & PCI_ASSIGN_ALL_BUSSES) ? 1 : 0;
 }
 
-static u64 pci_setup_data;
+static u64 pci_setup_data __initdata;
 void __init add_pci(u64 pa_data)
 {
 	struct setup_data *data;
@@ -654,36 +654,83 @@ void __init add_pci(u64 pa_data)
 	early_memunmap(data, sizeof(*data));
 }
 
-int pcibios_add_device(struct pci_dev *dev)
+struct firmware_setup_pci_entry {
+	struct list_head list;
+	uint16_t vendor;
+	uint16_t devid;
+	uint64_t pcilen;
+	unsigned long segment;
+	unsigned long bus;
+	unsigned long device;
+	unsigned long function;
+	phys_addr_t romdata;
+};
+
+static LIST_HEAD(setup_pci_entries);
+
+int __init fill_setup_pci_entries(void)
 {
 	struct setup_data *data;
 	struct pci_setup_rom *rom;
+	struct firmware_setup_pci_entry *entry;
+	phys_addr_t pa_entry;
 	u64 pa_data;
 
 	pa_data = pci_setup_data;
 	while (pa_data) {
-		data = ioremap(pa_data, sizeof(*rom));
+		data  = early_memremap(pa_data, sizeof(*rom));
 		if (!data)
 			return -ENOMEM;
-
 		rom = (struct pci_setup_rom *)data;
 
-		if ((pci_domain_nr(dev->bus) == rom->segment) &&
-		    (dev->bus->number == rom->bus) &&
-		    (PCI_SLOT(dev->devfn) == rom->device) &&
-		    (PCI_FUNC(dev->devfn) == rom->function) &&
-		    (dev->vendor == rom->vendor) &&
-		    (dev->device == rom->devid)) {
-			dev->rom = pa_data +
-			      offsetof(struct pci_setup_rom, romdata);
-			dev->romlen = rom->pcilen;
+		pa_entry = memblock_alloc(sizeof(*entry), sizeof(long));
+		if (!pa_entry) {
+			early_memunmap(data, sizeof(*rom));
+			return -ENOMEM;
+		}
+
+		entry = phys_to_virt(pa_entry);
+		entry->segment = rom->segment;
+		entry->bus = rom->bus;
+		entry->device = rom->device;
+		entry->function = rom->function;
+		entry->vendor = rom->vendor;
+		entry->devid = rom->devid;
+		entry->pcilen = rom->pcilen;
+		entry->romdata = pa_data +
+				 offsetof(struct pci_setup_rom, romdata);
+
+		list_add(&entry->list, &setup_pci_entries);
+
+		memblock_free(pa_data, sizeof(*rom));
+		pa_data = data->next;
+		early_memunmap(data, sizeof(*rom));
+	}
+
+	pci_setup_data = 0;
+
+	return 0;
+}
+
+int pcibios_add_device(struct pci_dev *dev)
+{
+	struct firmware_setup_pci_entry *entry;
+
+	list_for_each_entry(entry, &setup_pci_entries, list) {
+		if ((pci_domain_nr(dev->bus) == entry->segment) &&
+		    (dev->bus->number == entry->bus) &&
+		    (PCI_SLOT(dev->devfn) == entry->device) &&
+		    (PCI_FUNC(dev->devfn) == entry->function) &&
+		    (dev->vendor == entry->vendor) &&
+		    (dev->device == entry->devid)) {
+			dev->rom = entry->romdata;
+			dev->romlen = entry->pcilen;
 			dev_printk(KERN_DEBUG, &dev->dev, "set rom to [%#010lx, %#010lx] via SETUP_PCI\n",
 				   (unsigned long)dev->rom,
 				   (unsigned long)(dev->rom + dev->romlen - 1));
 		}
-		pa_data = data->next;
-		iounmap(data);
 	}
+
 	return 0;
 }
 
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 36/42] x86, boot, PCI: Copy SETUP_PCI rom to kernel space
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (34 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 35/42] x86, boot, PCI: Convert SETUP_PCI data to list Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 37/42] x86, boot, PCI: Export SETUP_PCI data via sysfs Yinghai Lu
                   ` (7 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

The EFI stub code may put the ROM data high, outside the mapped range,
on 32-bit, or on 64-bit when exactmap= is used.

Check whether the range is mapped. If it is not, allocate a new buffer
and copy the ROM data into it, so that the data can be accessed directly.
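
The copy step can be sketched in userspace as below. This is only an
illustration under assumptions: copy_in_chunks() and CHUNK_SIZE are
hypothetical names, and plain pointers stand in for the physical
addresses that the patch maps with early_memremap().

```c
#include <stddef.h>
#include <string.h>

/* 64 KiB, the same chunk size that check_copy() uses. */
#define CHUNK_SIZE ((size_t)64 << 10)

/*
 * Copy size bytes from src to dst, at most CHUNK_SIZE bytes per step.
 * Returns the number of chunks that were copied.
 */
static size_t copy_in_chunks(unsigned char *dst, const unsigned char *src,
			     size_t size)
{
	size_t chunks = 0;

	while (size > 0) {
		size_t n = size < CHUNK_SIZE ? size : CHUNK_SIZE;

		/* The kernel code would map the source chunk here ... */
		memcpy(dst, src, n);
		/* ... and unmap it again here. */

		dst += n;
		src += n;
		size -= n;
		chunks++;
	}

	return chunks;
}
```

Copying in bounded chunks keeps each temporary mapping small, which
matters because the early mapping window is limited.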

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/pci/common.c | 47 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 32d4f21..4d6b128 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -668,6 +668,48 @@ struct firmware_setup_pci_entry {
 
 static LIST_HEAD(setup_pci_entries);
 
+static phys_addr_t check_copy(phys_addr_t start, unsigned long size)
+{
+	unsigned long start_pfn = PFN_DOWN(start);
+	unsigned long end_pfn = PFN_UP(start + size);
+	unsigned char *p, *q;
+	phys_addr_t pa_p, pa_q;
+	long sz = size;
+
+	if (pfn_range_is_mapped(start_pfn, end_pfn))
+		return start;
+
+	/* allocate and copy */
+	pa_p = memblock_alloc(size, PAGE_SIZE);
+	if (!pa_p)
+		return start;
+
+	p = phys_to_virt(pa_p);
+
+	pa_q = start;
+	while (sz > 0) {
+		long chunk_size = 64<<10;
+
+		if (chunk_size > sz)
+			chunk_size = sz;
+
+		q = early_memremap(pa_q, chunk_size);
+		if (!q) {
+			memblock_free(pa_p, size);
+			return start;
+		}
+		memcpy(p, q, chunk_size);
+		early_memunmap(q, chunk_size);
+		p += chunk_size;
+		pa_q += chunk_size;
+		sz -= chunk_size;
+	}
+
+	memblock_free(start, size);
+
+	return pa_p;
+}
+
 int __init fill_setup_pci_entries(void)
 {
 	struct setup_data *data;
@@ -697,8 +739,9 @@ int __init fill_setup_pci_entries(void)
 		entry->vendor = rom->vendor;
 		entry->devid = rom->devid;
 		entry->pcilen = rom->pcilen;
-		entry->romdata = pa_data +
-				 offsetof(struct pci_setup_rom, romdata);
+		entry->romdata = check_copy(pa_data +
+				      offsetof(struct pci_setup_rom, romdata),
+				      rom->pcilen);
 
 		list_add(&entry->list, &setup_pci_entries);
 
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 37/42] x86, boot, PCI: Export SETUP_PCI data via sysfs
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (35 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 36/42] x86, boot, PCI: Copy SETUP_PCI rom to kernel space Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 38/42] x86: Fix typo in mark_rodata_ro Yinghai Lu
                   ` (6 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He
  Cc: linux-kernel, Yinghai Lu, Bjorn Helgaas, linux-pci

Export the SETUP_PCI data via sysfs so that kexec-tools can rebuild
SETUP_PCI and pass it to the second kernel if needed.

kexec-tools already builds SETUP_EFI and SETUP_E820_EXT.

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/pci/common.c | 175 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 175 insertions(+)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 4d6b128..9112d92 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -656,6 +656,8 @@ void __init add_pci(u64 pa_data)
 
 struct firmware_setup_pci_entry {
 	struct list_head list;
+	struct kobject kobj;
+	struct bin_attribute *rom_attr;
 	uint16_t vendor;
 	uint16_t devid;
 	uint64_t pcilen;
@@ -777,6 +779,179 @@ int pcibios_add_device(struct pci_dev *dev)
 	return 0;
 }
 
+#ifdef CONFIG_SYSFS
+static inline struct firmware_setup_pci_entry *
+to_setup_pci_entry(struct kobject *kobj)
+{
+	return container_of(kobj, struct firmware_setup_pci_entry, kobj);
+}
+
+static ssize_t vendor_show(struct firmware_setup_pci_entry *entry, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "0x%04llx\n",
+			(unsigned long long)entry->vendor);
+}
+
+static ssize_t devid_show(struct firmware_setup_pci_entry *entry, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "0x%04llx\n",
+			(unsigned long long)entry->devid);
+}
+
+static ssize_t pcilen_show(struct firmware_setup_pci_entry *entry, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "0x%llx\n",
+			(unsigned long long)entry->pcilen);
+}
+
+static ssize_t segment_show(struct firmware_setup_pci_entry *entry, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "0x%04llx\n",
+			(unsigned long long)entry->segment);
+}
+
+static ssize_t bus_show(struct firmware_setup_pci_entry *entry, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "0x%02llx\n",
+			(unsigned long long)entry->bus);
+}
+
+static ssize_t device_show(struct firmware_setup_pci_entry *entry, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "0x%02llx\n",
+			(unsigned long long)entry->device);
+}
+
+static ssize_t function_show(struct firmware_setup_pci_entry *entry, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "0x%1llx\n",
+			(unsigned long long)entry->function);
+}
+
+struct setup_pci_attribute {
+	struct attribute attr;
+	ssize_t (*show)(struct firmware_setup_pci_entry *entry, char *buf);
+};
+
+static inline struct setup_pci_attribute *to_setup_pci_attr(
+							struct attribute *attr)
+{
+	return container_of(attr, struct setup_pci_attribute, attr);
+}
+
+static ssize_t setup_pci_attr_show(struct kobject *kobj,
+				   struct attribute *attr, char *buf)
+{
+	struct firmware_setup_pci_entry *entry = to_setup_pci_entry(kobj);
+	struct setup_pci_attribute *setup_pci_attr = to_setup_pci_attr(attr);
+
+	return setup_pci_attr->show(entry, buf);
+}
+
+static struct setup_pci_attribute setup_pci_vendor_attr = __ATTR_RO(vendor);
+static struct setup_pci_attribute setup_pci_devid_attr = __ATTR_RO(devid);
+static struct setup_pci_attribute setup_pci_pcilen_attr = __ATTR_RO(pcilen);
+static struct setup_pci_attribute setup_pci_segment_attr = __ATTR_RO(segment);
+static struct setup_pci_attribute setup_pci_bus_attr = __ATTR_RO(bus);
+static struct setup_pci_attribute setup_pci_device_attr = __ATTR_RO(device);
+static struct setup_pci_attribute setup_pci_function_attr = __ATTR_RO(function);
+
+/*
+ * These are default attributes that are added for every setup_pci entry.
+ */
+static struct attribute *def_attrs[] = {
+	&setup_pci_vendor_attr.attr,
+	&setup_pci_devid_attr.attr,
+	&setup_pci_pcilen_attr.attr,
+	&setup_pci_segment_attr.attr,
+	&setup_pci_bus_attr.attr,
+	&setup_pci_device_attr.attr,
+	&setup_pci_function_attr.attr,
+	NULL
+};
+
+static const struct sysfs_ops setup_pci_attr_ops = {
+	.show = setup_pci_attr_show,
+};
+
+static struct kobj_type __refdata setup_pci_ktype = {
+	.sysfs_ops      = &setup_pci_attr_ops,
+	.default_attrs  = def_attrs,
+};
+
+static ssize_t setup_pci_rom_read(struct file *filp, struct kobject *kobj,
+				  struct bin_attribute *bin_attr, char *buf,
+				  loff_t off, size_t count)
+{
+	struct firmware_setup_pci_entry *entry = to_setup_pci_entry(kobj);
+
+	if (off >= entry->pcilen)
+		count = 0;
+	else {
+		unsigned char *rom = phys_to_virt(entry->romdata);
+
+		if (off + count > entry->pcilen)
+			count = entry->pcilen - off;
+
+		memcpy(buf, rom + off, count);
+	}
+
+	return count;
+}
+
+static int __init add_sysfs_fw_setup_pci_entry(
+					struct firmware_setup_pci_entry *entry)
+{
+	int retval = 0;
+	static int setup_pci_entries_nr;
+	static struct kset *setup_pci_kset;
+	struct bin_attribute *attr;
+
+	kobject_init(&entry->kobj, &setup_pci_ktype);
+
+	if (!setup_pci_kset) {
+		setup_pci_kset = kset_create_and_add("setup_pci", NULL,
+						     firmware_kobj);
+		if (!setup_pci_kset)
+			return -ENOMEM;
+	}
+
+	entry->kobj.kset = setup_pci_kset;
+	retval = kobject_add(&entry->kobj, NULL, "%d", setup_pci_entries_nr++);
+	if (retval) {
+		kobject_put(&entry->kobj);
+		return retval;
+	}
+
+	attr = kzalloc(sizeof(*attr), GFP_ATOMIC);
+	if (!attr)
+		return -ENOMEM;
+
+	sysfs_bin_attr_init(attr);
+	attr->size = entry->pcilen;
+	attr->attr.name = "rom";
+	attr->attr.mode = S_IRUSR;
+	attr->read = setup_pci_rom_read;
+	retval = sysfs_create_bin_file(&entry->kobj, attr);
+	if (retval)
+		kfree(attr);
+	entry->rom_attr = attr;
+
+	return retval;
+}
+
+static int __init firmware_setup_pci_init(void)
+{
+	struct firmware_setup_pci_entry *entry;
+
+	list_for_each_entry(entry, &setup_pci_entries, list)
+		add_sysfs_fw_setup_pci_entry(entry);
+
+	return 0;
+}
+late_initcall(firmware_setup_pci_init);
+#endif
+
 int pcibios_enable_device(struct pci_dev *dev, int mask)
 {
 	int err;
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 38/42] x86: Fix typo in mark_rodata_ro
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (36 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 37/42] x86, boot, PCI: Export SETUP_PCI data via sysfs Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 23:05   ` Kees Cook
  2015-07-07 20:20 ` [PATCH 39/42] x86, 64bit: add pfn_range_is_highmapped() Yinghai Lu
                   ` (5 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

The comment should refer to cleanup_highmap(), not cleanup_highmem().
Also remove the unneeded cast of _brk_end, as it is already an
unsigned long.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/mm/init_64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 257ba4b..3b7453a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1054,9 +1054,9 @@ void mark_rodata_ro(void)
 	 * of the PMD will remain mapped executable.
 	 *
 	 * Any PMD which was setup after the one which covers _brk_end
-	 * has been zapped already via cleanup_highmem().
+	 * has been zapped already via cleanup_highmap().
 	 */
-	all_end = roundup((unsigned long)_brk_end, PMD_SIZE);
+	all_end = roundup(_brk_end, PMD_SIZE);
 	set_memory_nx(rodata_start, (all_end - rodata_start) >> PAGE_SHIFT);
 
 	rodata_test();
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 39/42] x86, 64bit: add pfn_range_is_highmapped()
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (37 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 38/42] x86: Fix typo in mark_rodata_ro Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 40/42] x86, 64bit: remove highmap for not needed ranges Yinghai Lu
                   ` (4 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

Add pfn_range_is_highmapped() so that the highmap can have holes once
unused ranges are removed from it.
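
A minimal userspace sketch of the containment check, assuming half-open
[start, end) pfn ranges. struct pfn_range and pfn_range_is_covered() are
hypothetical names used only for this illustration.

```c
#include <stddef.h>

/* Half-open pfn range [start, end). */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Return 1 if [start_pfn, end_pfn) lies entirely inside one of the
 * nr ranges, 0 otherwise. A query that spans a hole is not covered.
 */
static int pfn_range_is_covered(const struct pfn_range *r, size_t nr,
				unsigned long start_pfn,
				unsigned long end_pfn)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (start_pfn >= r[i].start && end_pfn <= r[i].end)
			return 1;

	return 0;
}
```

With half-open ranges, a single page is queried as [pfn, pfn + 1).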

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/pgtable_64.h |  2 ++
 arch/x86/mm/init_64.c             | 22 ++++++++++++++++++++++
 arch/x86/mm/pageattr.c            | 16 +---------------
 3 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 2ee7811..6b2aae2 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -158,6 +158,8 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
 extern int kern_addr_valid(unsigned long addr);
 extern void cleanup_highmap(void);
 
+int pfn_range_is_highmapped(unsigned long start_pfn, unsigned long end_pfn);
+
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
 
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 3b7453a..2507b98 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -290,6 +290,23 @@ void __init init_extra_mapping_uc(unsigned long phys, unsigned long size)
 	__init_extra_mapping(phys, size, _PAGE_CACHE_MODE_UC);
 }
 
+/* three holes at most*/
+#define NR_RANGE 4
+static struct range pfn_highmapped[NR_RANGE];
+static int nr_pfn_highmapped;
+
+int pfn_range_is_highmapped(unsigned long start_pfn, unsigned long end_pfn)
+{
+	int i;
+
+	for (i = 0; i < nr_pfn_highmapped; i++)
+		if ((start_pfn >= pfn_highmapped[i].start) &&
+		    (end_pfn <= pfn_highmapped[i].end))
+			return 1;
+
+	return 0;
+}
+
 /*
  * The head.S code sets up the kernel high mapping:
  *
@@ -324,6 +341,11 @@ void __init cleanup_highmap(void)
 		if (vaddr < (unsigned long) _text || vaddr > end)
 			set_pmd(pmd, __pmd(0));
 	}
+
+	nr_pfn_highmapped = add_range(pfn_highmapped, NR_RANGE,
+			nr_pfn_highmapped,
+			__pa_symbol(_text) >> PAGE_SHIFT,
+			__pa_symbol(roundup(_brk_end, PMD_SIZE)) >> PAGE_SHIFT);
 }
 
 static unsigned long __meminit
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 727158c..06a0116 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -90,20 +90,6 @@ void arch_report_meminfo(struct seq_file *m)
 static inline void split_page_count(int level) { }
 #endif
 
-#ifdef CONFIG_X86_64
-
-static inline unsigned long highmap_start_pfn(void)
-{
-	return __pa_symbol(_text) >> PAGE_SHIFT;
-}
-
-static inline unsigned long highmap_end_pfn(void)
-{
-	return __pa_symbol(roundup(_brk_end, PMD_SIZE)) >> PAGE_SHIFT;
-}
-
-#endif
-
 #ifdef CONFIG_DEBUG_PAGEALLOC
 # define debug_pagealloc 1
 #else
@@ -1271,7 +1257,7 @@ static int cpa_process_alias(struct cpa_data *cpa)
 	 * to touch the high mapped kernel as well:
 	 */
 	if (!within(vaddr, (unsigned long)_text, _brk_end) &&
-	    within(cpa->pfn, highmap_start_pfn(), highmap_end_pfn())) {
+	    pfn_range_is_highmapped(cpa->pfn, cpa->pfn + 1)) {
 		unsigned long temp_cpa_vaddr = (cpa->pfn << PAGE_SHIFT) +
 					       __START_KERNEL_map - phys_base;
 		alias_cpa = *cpa;
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 40/42] x86, 64bit: remove highmap for not needed ranges
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (38 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 39/42] x86, 64bit: add pfn_range_is_highmapped() Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 23:17   ` Kees Cook
  2015-07-07 20:20 ` [PATCH 41/42] x86, 64bit: Add __pa_high/__va_high Yinghai Lu
                   ` (3 subsequent siblings)
  43 siblings, 1 reply; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

Add cleanup_highmap_late() to remove the highmap for initmem, for the
gaps around rodata, and for [_brk_end, all_end).

Kernel Layout:

[    0.000000]   .text: [0x01000000-0x0200df88]
[    0.000000] .rodata: [0x02200000-0x02a1dfff]
[    0.000000]   .data: [0x02c00000-0x02e510ff]
[    0.000000]   .init: [0x02e53000-0x03213fff]
[    0.000000]    .bss: [0x03222000-0x0437cfff]
[    0.000000]    .brk: [0x0437d000-0x043a2fff]

Actually used brk:
[    0.270365] memblock_reserve: [0x0000000437d000-0x00000004383fff] flags 0x0 BRK

Before patch:
---[ High Kernel Mapping ]---
0xffffffff80000000-0xffffffff81000000          16M                           pmd
0xffffffff81000000-0xffffffff82000000          16M     ro         PSE GLB x  pmd
0xffffffff82000000-0xffffffff82011000          68K     ro             GLB x  pte
0xffffffff82011000-0xffffffff82200000        1980K     RW             GLB x  pte
0xffffffff82200000-0xffffffff82a00000           8M     ro         PSE GLB NX pmd
0xffffffff82a00000-0xffffffff82a1e000         120K     ro             GLB NX pte
0xffffffff82a1e000-0xffffffff82c00000        1928K     RW             GLB NX pte
0xffffffff82c00000-0xffffffff82e00000           2M     RW         PSE GLB NX pmd
0xffffffff82e00000-0xffffffff83000000           2M     RW             GLB NX pte
0xffffffff83000000-0xffffffff83200000           2M     RW         PSE GLB NX pmd
0xffffffff83200000-0xffffffff83400000           2M     RW             GLB NX pte
0xffffffff83400000-0xffffffff84400000          16M     RW         PSE GLB NX pmd
0xffffffff84400000-0xffffffffa0000000         444M                           pmd

After patch:
---[ High Kernel Mapping ]---
0xffffffff80000000-0xffffffff81000000          16M                           pmd
0xffffffff81000000-0xffffffff82000000          16M     ro         PSE GLB x  pmd
0xffffffff82000000-0xffffffff82012000          72K     ro             GLB x  pte
0xffffffff82012000-0xffffffff82200000        1976K                           pte
0xffffffff82200000-0xffffffff82a00000           8M     ro         PSE GLB NX pmd
0xffffffff82a00000-0xffffffff82a1e000         120K     ro             GLB NX pte
0xffffffff82a1e000-0xffffffff82c00000        1928K                           pte
0xffffffff82c00000-0xffffffff82e00000           2M     RW         PSE GLB NX pmd
0xffffffff82e00000-0xffffffff82e53000         332K     RW             GLB NX pte
0xffffffff82e53000-0xffffffff83000000        1716K                           pte
0xffffffff83000000-0xffffffff83200000           2M                           pmd
0xffffffff83200000-0xffffffff83214000          80K                           pte
0xffffffff83214000-0xffffffff83400000        1968K     RW             GLB NX pte
0xffffffff83400000-0xffffffff84200000          14M     RW         PSE GLB NX pmd
0xffffffff84200000-0xffffffff84384000        1552K     RW             GLB NX pte
0xffffffff84384000-0xffffffff84400000         496K                           pte
0xffffffff84400000-0xffffffffa0000000         444M                           pmd

So the unused ranges around rodata, the init memory, and the tail after
the used brk are no longer mapped.
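
The way a range is carved into a 4K-granular head, whole 2M entries,
and a 4K-granular tail can be sketched in userspace as follows. This is
an illustration, not the kernel code: struct split, split_range(),
PAGE_SZ and PMD_SZ are hypothetical names, and an empty piece is
reported as start == end.

```c
#include <stdint.h>

#define PAGE_SZ ((uint64_t)4 << 10)	/* 4 KiB page */
#define PMD_SZ  ((uint64_t)2 << 20)	/* 2 MiB large page */

/* The three pieces of a range; a piece with start == end is empty. */
struct split {
	uint64_t head_start, head_end;	/* cleared pte by pte */
	uint64_t mid_start, mid_end;	/* cleared one pmd at a time */
	uint64_t tail_start, tail_end;	/* cleared pte by pte */
};

static struct split split_range(uint64_t start, uint64_t end)
{
	struct split s = { 0 };
	uint64_t start_2m = (start + PMD_SZ - 1) & ~(PMD_SZ - 1);
	uint64_t end_2m = end & ~(PMD_SZ - 1);

	/* Shrink the range to whole pages. */
	start = (start + PAGE_SZ - 1) & ~(PAGE_SZ - 1);
	end &= ~(PAGE_SZ - 1);
	if (start >= end)
		return s;

	/* Partial 2M region at the front. */
	if (start < start_2m) {
		s.head_start = start;
		s.head_end = start_2m < end ? start_2m : end;
	}

	/* Fully covered 2M regions in the middle. */
	if (start_2m < end_2m) {
		s.mid_start = start_2m;
		s.mid_end = end_2m;
	}

	/* Partial 2M region at the back. */
	if (start <= end_2m && end_2m < end) {
		s.tail_start = end_2m;
		s.tail_end = end;
	}

	return s;
}
```

For the .init range above, 0xffffffff82e53000-0xffffffff83214000, this
gives a 1716K head, one 2M middle entry and an 80K tail, which matches
the three unmapped lines in the "After patch" dump.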

-v4: Adapt to the all_end change.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/mm/init_64.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2507b98..38aa59c 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1010,6 +1010,61 @@ void __init mem_init(void)
 }
 
 #ifdef CONFIG_DEBUG_RODATA
+static void remove_highmap_2m(unsigned long addr)
+{
+	pgd_t *pgd = pgd_offset_k(addr);
+	pud_t *pud = (pud_t *)pgd_page_vaddr(*pgd) + pud_index(addr);
+	pmd_t *pmd = (pmd_t *)pud_page_vaddr(*pud) + pmd_index(addr);
+
+	set_pmd(pmd, __pmd(0));
+}
+
+static void remove_highmap_2m_partial(unsigned long addr, unsigned long end)
+{
+	int i;
+	pgd_t *pgd = pgd_offset_k(addr);
+	pud_t *pud = (pud_t *)pgd_page_vaddr(*pgd) + pud_index(addr);
+	pmd_t *pmd = (pmd_t *)pud_page_vaddr(*pud) + pmd_index(addr);
+	pte_t *pte = (pte_t *)pmd_page_vaddr(*pmd) + pte_index(addr);
+
+	for (i = pte_index(addr); i < pte_index(end - 1) + 1; i++, pte++)
+		set_pte(pte, __pte(0));
+}
+
+static void cleanup_highmap_late(unsigned long start, unsigned long end)
+{
+	unsigned long addr;
+	unsigned long start_2m_aligned = roundup(start, PMD_SIZE);
+	unsigned long end_2m_aligned = rounddown(end, PMD_SIZE);
+
+	start = PFN_ALIGN(start);
+	end &= PAGE_MASK;
+
+	if (start >= end)
+		return;
+
+	if (start < start_2m_aligned) {
+		unsigned long tmp = min(start_2m_aligned, end);
+
+		set_memory_4k(start, (tmp - start) >> PAGE_SHIFT);
+		remove_highmap_2m_partial(start, tmp);
+	}
+
+	for (addr = start_2m_aligned; addr < end_2m_aligned; addr += PMD_SIZE)
+		remove_highmap_2m(addr);
+
+	if (start <= end_2m_aligned && end_2m_aligned < end) {
+		set_memory_4k(end_2m_aligned,
+				(end - end_2m_aligned) >> PAGE_SHIFT);
+		remove_highmap_2m_partial(end_2m_aligned, end);
+	}
+
+	subtract_range(pfn_highmapped, NR_RANGE,
+			__pa_symbol(start) >> PAGE_SHIFT,
+			__pa_symbol(end) >> PAGE_SHIFT);
+	nr_pfn_highmapped = clean_sort_range(pfn_highmapped, NR_RANGE);
+}
+
 const int rodata_test_data = 0xC3;
 EXPORT_SYMBOL_GPL(rodata_test_data);
 
@@ -1058,6 +1113,7 @@ void mark_rodata_ro(void)
 	unsigned long end = (unsigned long) &__end_rodata_hpage_align;
 	unsigned long text_end = PFN_ALIGN(&__stop___ex_table);
 	unsigned long rodata_end = PFN_ALIGN(&__end_rodata);
+	unsigned long data_start = PFN_ALIGN(&_sdata);
 	unsigned long all_end;
 
 	printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
@@ -1081,6 +1137,12 @@ void mark_rodata_ro(void)
 	all_end = roundup(_brk_end, PMD_SIZE);
 	set_memory_nx(rodata_start, (all_end - rodata_start) >> PAGE_SHIFT);
 
+	cleanup_highmap_late(text_end, rodata_start);
+	cleanup_highmap_late(rodata_end, data_start);
+	cleanup_highmap_late(PFN_ALIGN(_brk_end), all_end);
+	cleanup_highmap_late((unsigned long)(&__init_begin),
+				(unsigned long)(&__init_end));
+
 	rodata_test();
 
 #ifdef CONFIG_CPA_DEBUG
-- 
1.8.4.5



* [PATCH 41/42] x86, 64bit: Add __pa_high/__va_high
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (39 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 40/42] x86, 64bit: remove highmap for not needed ranges Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 20:20 ` [PATCH 42/42] x86: fix msr print again Yinghai Lu
                   ` (2 subsequent siblings)
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

Add __pa_high()/__va_high() and use them to make the early page table
setup code more readable, as we are using kernel high mapping addresses
there.
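
As a rough illustration of what the two helpers compute, the sketch below
reproduces the arithmetic in user space. The macro bodies follow the ones
added in this patch. The value of __START_KERNEL_map is the usual x86_64
one and phys_base is a made-up load offset; both are assumptions for the
example. va_high_of(), pa_high_of() and check_round_trip() are
hypothetical wrappers so the macros can be exercised without real kernel
pointers, and a 64-bit unsigned long is assumed.

```c
#include <assert.h>

/* Assumed values, for illustration only. */
#define __START_KERNEL_map 0xffffffff80000000UL
static unsigned long phys_base = 0x1000000UL;	/* hypothetical load offset */

/* Same bodies as the macros added to head64.c by this patch. */
#define __va_high(x) ((void *)((unsigned long)(x) + __START_KERNEL_map - phys_base))
#define __pa_high(x) ((unsigned long)(x) - __START_KERNEL_map + phys_base)

/* Physical address -> address in the kernel high mapping. */
static unsigned long va_high_of(unsigned long pa)
{
	return (unsigned long)__va_high(pa);
}

/* Address in the kernel high mapping -> physical address. */
static unsigned long pa_high_of(unsigned long va)
{
	return __pa_high(va);
}

/* The two conversions are inverses of each other. */
static void check_round_trip(unsigned long pa)
{
	assert(pa_high_of(va_high_of(pa)) == pa);
}
```

With these assumed numbers, physical 0x2000000 maps to
0xffffffff81000000 and back again.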

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/head64.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index a9f0299..cd0a820 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -37,6 +37,9 @@ extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
 static unsigned int __initdata next_early_pgt = 2;
 pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
 
+#define __va_high(x) ((void *)((unsigned long)(x) + __START_KERNEL_map - phys_base))
+#define __pa_high(x) ((unsigned long)(x) - __START_KERNEL_map + phys_base)
+
 /* Wipe all early page tables except for the kernel symbol map */
 static void __init reset_early_page_tables(void)
 {
@@ -47,7 +50,7 @@ static void __init reset_early_page_tables(void)
 
 	next_early_pgt = 0;
 
-	write_cr3(__pa_nodebug(early_level4_pgt));
+	write_cr3(__pa_high(early_level4_pgt));
 }
 
 /* Create a new PMD entry */
@@ -60,7 +63,7 @@ int __init early_make_pgtable(unsigned long address)
 	pmdval_t pmd, *pmd_p;
 
 	/* Invalid address or early pgt is done ?  */
-	if (physaddr >= MAXMEM || read_cr3() != __pa_nodebug(early_level4_pgt))
+	if (physaddr >= MAXMEM || read_cr3() != __pa_high(early_level4_pgt))
 		return -1;
 
 again:
@@ -73,7 +76,7 @@ again:
 	 * range and we might end up looping forever...
 	 */
 	if (pgd)
-		pud_p = (pudval_t *)((pgd & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
+		pud_p = (pudval_t *)__va_high(pgd & PTE_PFN_MASK);
 	else {
 		if (next_early_pgt >= EARLY_DYNAMIC_PAGE_TABLES) {
 			reset_early_page_tables();
@@ -83,13 +86,13 @@ again:
 		pud_p = (pudval_t *)early_dynamic_pgts[next_early_pgt++];
 		for (i = 0; i < PTRS_PER_PUD; i++)
 			pud_p[i] = 0;
-		*pgd_p = (pgdval_t)pud_p - __START_KERNEL_map + phys_base + _KERNPG_TABLE;
+		*pgd_p = __pa_high(pud_p) + _KERNPG_TABLE;
 	}
 	pud_p += pud_index(address);
 	pud = *pud_p;
 
 	if (pud)
-		pmd_p = (pmdval_t *)((pud & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
+		pmd_p = (pmdval_t *)__va_high(pud & PTE_PFN_MASK);
 	else {
 		if (next_early_pgt >= EARLY_DYNAMIC_PAGE_TABLES) {
 			reset_early_page_tables();
@@ -99,7 +102,7 @@ again:
 		pmd_p = (pmdval_t *)early_dynamic_pgts[next_early_pgt++];
 		for (i = 0; i < PTRS_PER_PMD; i++)
 			pmd_p[i] = 0;
-		*pud_p = (pudval_t)pmd_p - __START_KERNEL_map + phys_base + _KERNPG_TABLE;
+		*pud_p = __pa_high(pmd_p) + _KERNPG_TABLE;
 	}
 	pmd = (physaddr & PMD_MASK) + early_pmd_flags;
 	pmd_p[pmd_index(address)] = pmd;
-- 
1.8.4.5



* [PATCH 42/42] x86: fix msr print again
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (40 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 41/42] x86, 64bit: Add __pa_high/__va_high Yinghai Lu
@ 2015-07-07 20:20 ` Yinghai Lu
  2015-07-07 23:21 ` [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Kees Cook
  2015-07-08 10:51 ` Ingo Molnar
  43 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-07 20:20 UTC (permalink / raw)
  To: Kees Cook, H. Peter Anvin, Baoquan He; +Cc: linux-kernel, Yinghai Lu

The early MSR printout got broken again; fix it.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/processor.h |  1 -
 arch/x86/kernel/cpu/common.c     | 61 +++++++++++++++++++++-------------------
 2 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 43e6519..3a7bd35 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -177,7 +177,6 @@ extern void early_cpu_init(void);
 extern void identify_boot_cpu(void);
 extern void identify_secondary_cpu(struct cpuinfo_x86 *);
 extern void print_cpu_info(struct cpuinfo_x86 *);
-void print_cpu_msr(struct cpuinfo_x86 *);
 extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
 extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
 extern void init_amd_cacheinfo(struct cpuinfo_x86 *c);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 922c5e0..3c87e75 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1016,27 +1016,6 @@ out:
 }
 #endif
 
-void __init identify_boot_cpu(void)
-{
-	identify_cpu(&boot_cpu_data);
-	init_amd_e400_c1e_mask();
-#ifdef CONFIG_X86_32
-	sysenter_setup();
-	enable_sep_cpu();
-#endif
-	cpu_detect_tlb(&boot_cpu_data);
-}
-
-void identify_secondary_cpu(struct cpuinfo_x86 *c)
-{
-	BUG_ON(c == &boot_cpu_data);
-	identify_cpu(c);
-#ifdef CONFIG_X86_32
-	enable_sep_cpu();
-#endif
-	mtrr_ap_init();
-}
-
 struct msr_range {
 	unsigned	min;
 	unsigned	max;
@@ -1082,6 +1061,38 @@ static __init int setup_show_msr(char *arg)
 }
 __setup("show_msr=", setup_show_msr);
 
+static void print_cpu_msr(struct cpuinfo_x86 *c)
+{
+	if (c->cpu_index < show_msr)
+		__print_cpu_msr();
+}
+
+void __init identify_boot_cpu(void)
+{
+	identify_cpu(&boot_cpu_data);
+	init_amd_e400_c1e_mask();
+#ifdef CONFIG_X86_32
+	sysenter_setup();
+	enable_sep_cpu();
+#endif
+	cpu_detect_tlb(&boot_cpu_data);
+
+	print_cpu_msr(&boot_cpu_data);
+}
+
+void identify_secondary_cpu(struct cpuinfo_x86 *c)
+{
+	BUG_ON(c == &boot_cpu_data);
+	identify_cpu(c);
+#ifdef CONFIG_X86_32
+	enable_sep_cpu();
+#endif
+
+	print_cpu_msr(c);
+
+	mtrr_ap_init();
+}
+
 static __init int setup_noclflush(char *arg)
 {
 	setup_clear_cpu_cap(X86_FEATURE_CLFLUSH);
@@ -1115,14 +1126,6 @@ void print_cpu_info(struct cpuinfo_x86 *c)
 		printk(KERN_CONT ", stepping: %02x)\n", c->x86_mask);
 	else
 		printk(KERN_CONT ")\n");
-
-	print_cpu_msr(c);
-}
-
-void print_cpu_msr(struct cpuinfo_x86 *c)
-{
-	if (c->cpu_index < show_msr)
-		__print_cpu_msr();
 }
 
 static __init int setup_disablecpuid(char *arg)
-- 
1.8.4.5



* Re: [PATCH 01/42] x86, kasl: Remove not needed parameter for choose_kernel_location
  2015-07-07 20:19 ` [PATCH 01/42] x86, kasl: Remove not needed parameter for choose_kernel_location Yinghai Lu
@ 2015-07-07 20:57   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 20:57 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

The subject has a typo: "kasl" should be "kaslr".

On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> real_mode is global variable, so we do not need to pass it around.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>

Other than that, yeah, there's no good reason to pass that around.

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
>  arch/x86/boot/compressed/aslr.c | 5 ++---
>  arch/x86/boot/compressed/misc.c | 2 +-
>  arch/x86/boot/compressed/misc.h | 6 ++----
>  3 files changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
> index d7b1f65..71520c4 100644
> --- a/arch/x86/boot/compressed/aslr.c
> +++ b/arch/x86/boot/compressed/aslr.c
> @@ -295,8 +295,7 @@ static unsigned long find_random_addr(unsigned long minimum,
>         return slots_fetch_random();
>  }
>
> -unsigned char *choose_kernel_location(struct boot_params *boot_params,
> -                                     unsigned char *input,
> +unsigned char *choose_kernel_location(unsigned char *input,
>                                       unsigned long input_size,
>                                       unsigned char *output,
>                                       unsigned long output_size)
> @@ -316,7 +315,7 @@ unsigned char *choose_kernel_location(struct boot_params *boot_params,
>         }
>  #endif
>
> -       boot_params->hdr.loadflags |= KASLR_FLAG;
> +       real_mode->hdr.loadflags |= KASLR_FLAG;
>
>         /* Record the various known unsafe memory ranges. */
>         mem_avoid_init((unsigned long)input, input_size,
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index a107b93..ebf72ce 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -404,7 +404,7 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
>          * the entire decompressed kernel plus relocation table, or the
>          * entire decompressed kernel plus .bss and .brk sections.
>          */
> -       output = choose_kernel_location(real_mode, input_data, input_len, output,
> +       output = choose_kernel_location(input_data, input_len, output,
>                                         output_len > run_size ? output_len
>                                                               : run_size);
>
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 805d25c..8c96cc5 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -56,8 +56,7 @@ int cmdline_find_option_bool(const char *option);
>
>  #if CONFIG_RANDOMIZE_BASE
>  /* aslr.c */
> -unsigned char *choose_kernel_location(struct boot_params *boot_params,
> -                                     unsigned char *input,
> +unsigned char *choose_kernel_location(unsigned char *input,
>                                       unsigned long input_size,
>                                       unsigned char *output,
>                                       unsigned long output_size);
> @@ -65,8 +64,7 @@ unsigned char *choose_kernel_location(struct boot_params *boot_params,
>  bool has_cpuflag(int flag);
>  #else
>  static inline
> -unsigned char *choose_kernel_location(struct boot_params *boot_params,
> -                                     unsigned char *input,
> +unsigned char *choose_kernel_location(unsigned char *input,
>                                       unsigned long input_size,
>                                       unsigned char *output,
>                                       unsigned long output_size)
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 02/42] x86, boot: Move compressed kernel to end of buffer before decompressing
  2015-07-07 20:19 ` [PATCH 02/42] x86, boot: Move compressed kernel to end of buffer before decompressing Yinghai Lu
@ 2015-07-07 21:22   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 21:22 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> So we can find out ZO position easily during run-time for kasl buffer
> searching.

Can you define VO and ZO for this changelog? I understand "VO" to mean
"uncompressed kernel", and "ZO" to mean "compressed kernel", is that
accurate? Maybe "ZO" should be "compressed kernel blob with
decompression code"? I'm not clear on the elements you're talking
about here.

> Current code is using extract_offset to control copied kernel position, it
> will put the copied kernel in the middle of buffer when kernel run size is
> bigger than decompressed needed buffer size.

Doesn't it unconditionally put the compressed kernel at extract_offset?

> Current layout:
> when init_size is the same as kernel run_size:
>                                         run_size
> 0              extract_offset          init_size
> |------------------|------------------------|
>    VO text/data                   VO bss/brk
>                    input ZO text ZO data

I don't understand this picture. The locations of "VO bss/brk" and
"input ZO text ZO data" don't make sense to me. Are you trying to show
that they are aligned with extract_offset and init_size?

>
> This patch try to:
> move ZO to the end of buffer instead of middle of the buffer.
> When init_size is bigger than kernel run size, will have
>
> 0                            run_size    init_size
> |--------------------------------|----------|
>    VO text/data        VO bss/brk
>                        input ZO text ZO data

Won't run_size always be larger than init_size?

> We already have init_size the buffer size, we can find the end easily
> when copying ZO before decompressing.

Which buffer do you mean?

Another thing I'd like to understand is what problem does this patch
solve? I see that it rearranges things, but why is this useful?
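
For reference, the address arithmetic that the new head_32.S/head_64.S
sequence in the quoted patch performs (load init_size, subtract _end, add
the load base) can be sketched in C. All inputs used with it are made-up
numbers, zo_target() is a hypothetical name, and treating _end as the
linked size of ZO assumes the compressed image is linked at address 0.

```c
#include <assert.h>

/*
 * Where ZO is copied before decompression: the copy is placed so that
 * it ends exactly at base + init_size, i.e. at the end of the buffer.
 */
static unsigned long zo_target(unsigned long base,
			       unsigned long init_size,
			       unsigned long zo_end)
{
	assert(init_size >= zo_end);	/* the buffer must be able to hold ZO */
	return base + (init_size - zo_end);
}
```

With a hypothetical base of 0x1000000, init_size of 0x1800000 and ZO size
of 0x700000, the copy starts at 0x2100000 and ends at 0x2800000, which is
base + init_size.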

Thanks for working on these!

-Kees

>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/boot/compressed/head_32.S     | 11 +++++++++--
>  arch/x86/boot/compressed/head_64.S     |  8 ++++++--
>  arch/x86/boot/compressed/mkpiggy.c     |  7 ++-----
>  arch/x86/boot/compressed/vmlinux.lds.S |  1 +
>  arch/x86/boot/header.S                 |  2 +-
>  arch/x86/kernel/asm-offsets.c          |  1 +
>  arch/x86/kernel/vmlinux.lds.S          |  1 +
>  7 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
> index 8ef964d..0c140f9 100644
> --- a/arch/x86/boot/compressed/head_32.S
> +++ b/arch/x86/boot/compressed/head_32.S
> @@ -148,7 +148,9 @@ preferred_addr:
>  1:
>
>         /* Target address to relocate to for decompression */
> -       addl    $z_extract_offset, %ebx
> +       movl    BP_init_size(%esi), %eax
> +       subl    $_end, %eax
> +       addl    %eax, %ebx
>
>         /* Set up the stack */
>         leal    boot_stack_end(%ebx), %esp
> @@ -210,8 +212,13 @@ relocated:
>                                 /* push arguments for decompress_kernel: */
>         pushl   $z_run_size     /* size of kernel with .bss and .brk */
>         pushl   $z_output_len   /* decompressed length, end of relocs */
> -       leal    z_extract_offset_negative(%ebx), %ebp
> +
> +       movl    BP_init_size(%esi), %eax
> +       subl    $_end, %eax
> +       movl    %ebx, %ebp
> +       subl    %eax, %ebp
>         pushl   %ebp            /* output address */
> +
>         pushl   $z_input_len    /* input_len */
>         leal    input_data(%ebx), %eax
>         pushl   %eax            /* input_data */
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index b0c0d16..67dd8d3 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -102,7 +102,9 @@ ENTRY(startup_32)
>  1:
>
>         /* Target address to relocate to for decompression */
> -       addl    $z_extract_offset, %ebx
> +       movl    BP_init_size(%esi), %eax
> +       subl    $_end, %eax
> +       addl    %eax, %ebx
>
>  /*
>   * Prepare for entering 64 bit mode
> @@ -330,7 +332,9 @@ preferred_addr:
>  1:
>
>         /* Target address to relocate to for decompression */
> -       leaq    z_extract_offset(%rbp), %rbx
> +       movl    BP_init_size(%rsi), %ebx
> +       subl    $_end, %ebx
> +       addq    %rbp, %rbx
>
>         /* Set up the stack */
>         leaq    boot_stack_end(%rbx), %rsp
> diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
> index d8222f2..5faad09 100644
> --- a/arch/x86/boot/compressed/mkpiggy.c
> +++ b/arch/x86/boot/compressed/mkpiggy.c
> @@ -83,11 +83,8 @@ int main(int argc, char *argv[])
>         printf("z_input_len = %lu\n", ilen);
>         printf(".globl z_output_len\n");
>         printf("z_output_len = %lu\n", (unsigned long)olen);
> -       printf(".globl z_extract_offset\n");
> -       printf("z_extract_offset = 0x%lx\n", offs);
> -       /* z_extract_offset_negative allows simplification of head_32.S */
> -       printf(".globl z_extract_offset_negative\n");
> -       printf("z_extract_offset_negative = -0x%lx\n", offs);
> +       printf(".globl z_min_extract_offset\n");
> +       printf("z_min_extract_offset = 0x%lx\n", offs);
>         printf(".globl z_run_size\n");
>         printf("z_run_size = %lu\n", run_size);
>
> diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
> index 34d047c..e24e0a0 100644
> --- a/arch/x86/boot/compressed/vmlinux.lds.S
> +++ b/arch/x86/boot/compressed/vmlinux.lds.S
> @@ -70,5 +70,6 @@ SECTIONS
>                 _epgtable = . ;
>         }
>  #endif
> +       . = ALIGN(PAGE_SIZE);   /* keep ZO size page aligned */
>         _end = .;
>  }
> diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
> index 16ef025..9bfab22 100644
> --- a/arch/x86/boot/header.S
> +++ b/arch/x86/boot/header.S
> @@ -440,7 +440,7 @@ setup_data:         .quad 0                 # 64-bit physical pointer to
>
>  pref_address:          .quad LOAD_PHYSICAL_ADDR        # preferred load addr
>
> -#define ZO_INIT_SIZE   (ZO__end - ZO_startup_32 + ZO_z_extract_offset)
> +#define ZO_INIT_SIZE   (ZO__end - ZO_startup_32 + ZO_z_min_extract_offset)
>  #define VO_INIT_SIZE   (VO__end - VO__text)
>  #if ZO_INIT_SIZE > VO_INIT_SIZE
>  #define INIT_SIZE ZO_INIT_SIZE
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index 8e3d22a1..d2e00bc 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -87,6 +87,7 @@ void common(void) {
>         OFFSET(BP_hardware_subarch, boot_params, hdr.hardware_subarch);
>         OFFSET(BP_version, boot_params, hdr.version);
>         OFFSET(BP_kernel_alignment, boot_params, hdr.kernel_alignment);
> +       OFFSET(BP_init_size, boot_params, hdr.init_size);
>         OFFSET(BP_pref_address, boot_params, hdr.pref_address);
>         OFFSET(BP_code32_start, boot_params, hdr.code32_start);
>
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 00bf300..5816920 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -325,6 +325,7 @@ SECTIONS
>                 __brk_limit = .;
>         }
>
> +       . = ALIGN(PAGE_SIZE);           /* keep VO_INIT_SIZE page aligned */
>         _end = .;
>
>          STABS_DEBUG
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 03/42] x86, boot: Fix run_size calculation
  2015-07-07 20:19 ` [PATCH 03/42] x86, boot: Fix run_size calculation Yinghai Lu
@ 2015-07-07 22:15   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 22:15 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: H. Peter Anvin, Baoquan He, LKML, Junjie Mao, Josh Triplett,
	Matt Fleming, Andrew Morton

On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> While looking at the boot code to add mem mapping for kasl
> with 64bit above 4G support, I found that e6023367d779 ("x86, kaslr: Prevent
> .bss from overlaping initrd") and later introduced way to get kernel run_size
> and pass it around.  At first run_size calculation is via perl and then
> changed to shell scripts.
>
> At first, that calculation is not right in the shell scripts:
> it is using bss offset in the file plus bss/brk section size.
>
>    run_size=$(( $offsetA + $sizeA + $sizeB ))
>
> Idx Name          Size      VMA               LMA               File off  Algn
> ...
>  24 .bss          000a1000  ffffffff825e0000  00000000025e0000  019e0000  2**12
>                   ALLOC
>  25 .brk          00026000  ffffffff82681000  0000000002681000  019e0000  2**0
>                   ALLOC
>
> that run_size will be 27947008.

In hex, this is 0x1aa7000. What should it be?

> it has extra not needed size as
> 1. we have hole between the sections in file to get aligned in file.
> 2. start of text is from 0x200000 in elf file.
>
>   [Nr] Name              Type             Address           Offset
>        Size              EntSize          Flags  Link  Info  Align
>   ...
>   [25] .bss              NOBITS           ffffffff825e0000  019e0000
>        00000000000a1000  0000000000000000  WA       0     0     4096
>   [26] .brk              NOBITS           ffffffff82681000  019e0000
>        0000000000026000  0000000000000000  WA       0     0     1
>
> Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
>   LOAD           0x0000000000200000 0xffffffff81000000 0x0000000001000000
>                  0x00000000013a9000 0x00000000013a9000  R E    200000
>   LOAD           0x0000000001600000 0xffffffff82400000 0x0000000002400000
>                  0x00000000000ed000 0x00000000000ed000  RW     200000
>   LOAD           0x0000000001800000 0x0000000000000000 0x00000000024ed000
>                  0x0000000000013698 0x0000000000013698  RW     200000
>   LOAD           0x0000000001901000 0xffffffff82501000 0x0000000002501000
>                  0x00000000000df000 0x00000000001a6000  RWE    200000
>   NOTE           0x0000000000e9d7dc 0xffffffff81c9d7dc 0x0000000001c9d7dc
>                  0x0000000000000024 0x0000000000000024         4
>
>  Section to Segment mapping:
>   Segment Sections...
>    00     .text .notes ..
>    01     .data .vvar
>    02     .data..percpu
>    03     .init.text ... .bss .brk
>    04     .notes
>
> During decompress_kernel, parse_elf will move forward section to run time position.
>
>    parse_elf: [0x009a000000-0x009b3a8fff] <=== [0x009a200000-0x009b5a8fff]
>    parse_elf: [0x009b400000-0x009b4ecfff] <=== [0x009b600000-0x009b6ecfff]
>    parse_elf: [0x009b4ed000-0x009b500697] <=== [0x009b800000-0x009b813697]
>    parse_elf: [0x009b501000-0x009b5dffff] <=== [0x009b901000-0x009b9dffff]

Where is this output from? The first LOAD is at 0x1000000, so I assume
the extra 0x99000000 came from kASLR as the base address?

Looking at Program Header loading, .brk ends at 0x009b9dffff (offset
from 0x009a200000) this is a run_size of 0x17e0000, not 0x1aa7000 as
calculated by the shell script?

I'm having a hard time seeing the holes, though. It looks like the 03
Segment starts at 0x2501000 with a file size of 0xdf000, which matches
the .bss LMA start: 0x25e0000. And the 03 Segment has a MemSiz of
0x1a6000, getting us from 0x2501000 to 0x26a7000, which exactly
matches the end of .brk, which is 0x2681000 + 0x26000 = 0x26a7000.
(And offset from the .text start of 0x1000000, means a run_size of
0x16a7000.)

The missing 0x2c7000 seems to be from various Sections being laid out
differently? Perhaps the shell script should be using LMA (or VMA),
not File offset? .brk LMA 0x2681000 - .text LMA 0x1000000 + .brk size
0x26000 = 0x16a7000, as above.

So, I'm convinced the shell script is calculating something wrong (it
gets 0x1aa7000, but it should be 0x16a7000, off by 0x400000), but I
don't see how this relates to the parse_elf output that suggests the
run_size should be 0x009b9dffff - 0x9a200000 = 0x17e0000. Is 0x16a7000
too low, and if so, why?
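
To keep the three candidate values apart, the arithmetic above can be
replayed in a few lines of C. The constants are copied from the objdump,
readelf and parse_elf output quoted in this mail; the function names are
hypothetical and exist only for this example.

```c
#include <assert.h>

/* .bss and .brk, from the quoted objdump -h output */
#define BSS_FILE_OFF	0x019e0000UL
#define BSS_SIZE	0x000a1000UL
#define BRK_LMA		0x02681000UL
#define BRK_SIZE	0x00026000UL
#define TEXT_LMA	0x01000000UL	/* PhysAddr of the first LOAD segment */

/* What the shell script computes: file offset plus both section sizes. */
static unsigned long run_size_script(void)
{
	return BSS_FILE_OFF + BSS_SIZE + BRK_SIZE;
}

/* Same idea using load addresses: end of .brk minus start of .text. */
static unsigned long run_size_lma(void)
{
	assert(BRK_LMA >= TEXT_LMA);
	return BRK_LMA + BRK_SIZE - TEXT_LMA;
}

/* Span of the quoted parse_elf ranges; the end address is inclusive. */
static unsigned long run_size_parse_elf(void)
{
	return 0x009b9dffffUL - 0x009a200000UL + 1;
}
```

These give 0x1aa7000, 0x16a7000 and 0x17e0000 respectively, so the script
value is 0x400000 above the LMA-based one and 0x2c7000 above the
parse_elf span.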

> Secondly it is not necessary. As run_size is simple constant, we don't
> need to pass it around and we already have voffset.h for that.
>
> We can share voffset.h between misc.c and header.S instead of adding
> other way to get run_size.

Can you demonstrate that "VO__end - VO__text" matches an expected
value? This isn't obvious to me. I would expect one of 0x1aa7000,
0x16a7000, or 0x17e0000 here.
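
One way to check this is to evaluate the expression by hand. The sketch
below mimics what the generated voffset.h would contain, using the _AC()
form from the sed rule in the quoted patch. The two addresses are
assumptions: _text is taken from the VirtAddr of the first LOAD segment,
and _end is assumed to sit right at the end of .brk (0xffffffff82681000 +
0x26000), which the quoted output does not show directly. vo_run_size()
is a hypothetical helper, and the UL suffix assumes a 64-bit unsigned
long.

```c
#include <assert.h>

/* In C, _AC(X, Y) pastes the suffix onto the constant (X##Y). */
#ifndef _AC
#define __AC(X, Y)	(X##Y)
#define _AC(X, Y)	__AC(X, Y)
#endif

/* Assumed contents of the generated voffset.h, see the caveats above. */
#define VO__text	_AC(0xffffffff81000000, UL)
#define VO__end		_AC(0xffffffff826a7000, UL)

/* The expression the patch assigns to run_size in misc.c. */
static unsigned long vo_run_size(void)
{
	assert(VO__end > VO__text);
	return VO__end - VO__text;
}
```

Under those assumptions the result is 0x16a7000, the same as the
LMA-based figure; whether the real _end matches has to be read from nm
output on an actual vmlinux.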

Thanks! Regardless of my confusion, I would _love_ to get rid of the
external calculation of run_size. It's a really terrible hack, but I
hadn't been able to convince myself anything else was correct. I just
want to make sure we're getting the right value. :)

-Kees

>
> In this patch, we move voffset.h creation code to boot/compressed/Makefile.
>
> Dependence was:
> boot/header.S ==> boot/voffset.h ==> vmlinux
> boot/header.S ==> compressed/vmlinux ==> compressed/misc.c
> Now become:
> boot/header.S ==> compressed/vmlinux ==> compressed/misc.c ==> boot/voffset.h ==> vmlinux
>
> Use macro in misc.c to replace passed run_size.
>
> Fixes: e6023367d779 ("x86, kaslr: Prevent .bss from overlaping initrd")
> Cc: Junjie Mao <eternal.n08@gmail.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Matt Fleming <matt.fleming@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/boot/Makefile            | 11 +----------
>  arch/x86/boot/compressed/Makefile | 12 ++++++++++++
>  arch/x86/boot/compressed/misc.c   |  3 +++
>  3 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
> index 57bbf2f..4d27e8b 100644
> --- a/arch/x86/boot/Makefile
> +++ b/arch/x86/boot/Makefile
> @@ -77,15 +77,6 @@ $(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
>
>  SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
>
> -sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|_end\)$$/\#define VO_\2 0x\1/p'
> -
> -quiet_cmd_voffset = VOFFSET $@
> -      cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
> -
> -targets += voffset.h
> -$(obj)/voffset.h: vmlinux FORCE
> -       $(call if_changed,voffset)
> -
>  sed-zoffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|input_data\|_end\|z_.*\)$$/\#define ZO_\2 0x\1/p'
>
>  quiet_cmd_zoffset = ZOFFSET $@
> @@ -97,7 +88,7 @@ $(obj)/zoffset.h: $(obj)/compressed/vmlinux FORCE
>
>
>  AFLAGS_header.o += -I$(obj)
> -$(obj)/header.o: $(obj)/voffset.h $(obj)/zoffset.h
> +$(obj)/header.o: $(obj)/zoffset.h
>
>  LDFLAGS_setup.elf      := -T
>  $(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 0a291cd..d9fee82 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -40,6 +40,18 @@ LDFLAGS_vmlinux := -T
>  hostprogs-y    := mkpiggy
>  HOST_EXTRACFLAGS += -I$(srctree)/tools/include
>
> +sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
> +
> +quiet_cmd_voffset = VOFFSET $@
> +      cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
> +
> +targets += ../voffset.h
> +
> +$(obj)/../voffset.h: vmlinux FORCE
> +       $(call if_changed,voffset)
> +
> +$(obj)/misc.o: $(obj)/../voffset.h
> +
>  vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
>         $(obj)/string.o $(obj)/cmdline.o \
>         $(obj)/piggy.o $(obj)/cpuflags.o
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index ebf72ce..a88b591 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -11,6 +11,7 @@
>
>  #include "misc.h"
>  #include "../string.h"
> +#include "../voffset.h"
>
>  /* WARNING!!
>   * This code is compiled with -fPIC and it is relocated dynamically
> @@ -393,6 +394,8 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
>         lines = real_mode->screen_info.orig_video_lines;
>         cols = real_mode->screen_info.orig_video_cols;
>
> +       run_size = VO__end - VO__text;
> +
>         console_init();
>         debug_putstr("early console in decompress_kernel\n");
>
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 04/42] x86, kaslr: Kill not needed and wrong run_size calculation code.
  2015-07-07 20:19 ` [PATCH 04/42] x86, kaslr: Kill not needed and wrong run_size calculation code Yinghai Lu
@ 2015-07-07 22:18   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 22:18 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: H. Peter Anvin, Baoquan He, LKML, Josh Triplett, Matt Fleming,
	Andrew Morton, Ard Biesheuvel, Junjie Mao

On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> We use simple and correct version to get run_size now, remove code for
> wrong run_size calculation.

I feel like this should be merged with the prior patch, but I'm on the
fence. The prior patch fixes the calculation. Why leave the clean up
until here? I don't really mind it, I'm just curious why you chose to
separate this?

-Kees

>
> Fixes: e6023367d779 ("x86, kaslr: Prevent .bss from overlaping initrd")
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Matt Fleming <matt.fleming@intel.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Junjie Mao <eternal.n08@gmail.com>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/boot/compressed/Makefile  |  4 +---
>  arch/x86/boot/compressed/head_32.S |  3 +--
>  arch/x86/boot/compressed/head_64.S |  3 ---
>  arch/x86/boot/compressed/misc.c    |  6 ++----
>  arch/x86/boot/compressed/mkpiggy.c |  9 ++------
>  arch/x86/tools/calc_run_size.sh    | 42 --------------------------------------
>  6 files changed, 6 insertions(+), 61 deletions(-)
>  delete mode 100644 arch/x86/tools/calc_run_size.sh
>
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index d9fee82..50daea7 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -104,10 +104,8 @@ suffix-$(CONFIG_KERNEL_XZ) := xz
>  suffix-$(CONFIG_KERNEL_LZO)    := lzo
>  suffix-$(CONFIG_KERNEL_LZ4)    := lz4
>
> -RUN_SIZE = $(shell $(OBJDUMP) -h vmlinux | \
> -            $(CONFIG_SHELL) $(srctree)/arch/x86/tools/calc_run_size.sh)
>  quiet_cmd_mkpiggy = MKPIGGY $@
> -      cmd_mkpiggy = $(obj)/mkpiggy $< $(RUN_SIZE) > $@ || ( rm -f $@ ; false )
> +      cmd_mkpiggy = $(obj)/mkpiggy $< > $@ || ( rm -f $@ ; false )
>
>  targets += piggy.S
>  $(obj)/piggy.S: $(obj)/vmlinux.bin.$(suffix-y) $(obj)/mkpiggy FORCE
> diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
> index 0c140f9..122b32f 100644
> --- a/arch/x86/boot/compressed/head_32.S
> +++ b/arch/x86/boot/compressed/head_32.S
> @@ -210,7 +210,6 @@ relocated:
>   * Do the decompression, and jump to the new kernel..
>   */
>                                 /* push arguments for decompress_kernel: */
> -       pushl   $z_run_size     /* size of kernel with .bss and .brk */
>         pushl   $z_output_len   /* decompressed length, end of relocs */
>
>         movl    BP_init_size(%esi), %eax
> @@ -226,7 +225,7 @@ relocated:
>         pushl   %eax            /* heap area */
>         pushl   %esi            /* real mode pointer */
>         call    decompress_kernel /* returns kernel location in %eax */
> -       addl    $28, %esp
> +       addl    $24, %esp
>
>  /*
>   * Jump to the decompressed kernel.
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index 67dd8d3..3691451 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -407,8 +407,6 @@ relocated:
>   * Do the decompression, and jump to the new kernel..
>   */
>         pushq   %rsi                    /* Save the real mode argument */
> -       movq    $z_run_size, %r9        /* size of kernel with .bss and .brk */
> -       pushq   %r9
>         movq    %rsi, %rdi              /* real mode address */
>         leaq    boot_heap(%rip), %rsi   /* malloc area for uncompression */
>         leaq    input_data(%rip), %rdx  /* input_data */
> @@ -416,7 +414,6 @@ relocated:
>         movq    %rbp, %r8               /* output target address */
>         movq    $z_output_len, %r9      /* decompressed length, end of relocs */
>         call    decompress_kernel       /* returns kernel location in %rax */
> -       popq    %r9
>         popq    %rsi
>
>  /*
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index a88b591..96201aa 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -371,9 +371,9 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
>                                   unsigned char *input_data,
>                                   unsigned long input_len,
>                                   unsigned char *output,
> -                                 unsigned long output_len,
> -                                 unsigned long run_size)
> +                                 unsigned long output_len)
>  {
> +       unsigned long run_size = VO__end - VO__text;
>         unsigned char *output_orig = output;
>
>         real_mode = rmode;
> @@ -394,8 +394,6 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
>         lines = real_mode->screen_info.orig_video_lines;
>         cols = real_mode->screen_info.orig_video_cols;
>
> -       run_size = VO__end - VO__text;
> -
>         console_init();
>         debug_putstr("early console in decompress_kernel\n");
>
> diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
> index 5faad09..c03b009 100644
> --- a/arch/x86/boot/compressed/mkpiggy.c
> +++ b/arch/x86/boot/compressed/mkpiggy.c
> @@ -36,13 +36,11 @@ int main(int argc, char *argv[])
>         uint32_t olen;
>         long ilen;
>         unsigned long offs;
> -       unsigned long run_size;
>         FILE *f = NULL;
>         int retval = 1;
>
> -       if (argc < 3) {
> -               fprintf(stderr, "Usage: %s compressed_file run_size\n",
> -                               argv[0]);
> +       if (argc < 2) {
> +               fprintf(stderr, "Usage: %s compressed_file\n", argv[0]);
>                 goto bail;
>         }
>
> @@ -76,7 +74,6 @@ int main(int argc, char *argv[])
>         offs += olen >> 12;     /* Add 8 bytes for each 32K block */
>         offs += 64*1024 + 128;  /* Add 64K + 128 bytes slack */
>         offs = (offs+4095) & ~4095; /* Round to a 4K boundary */
> -       run_size = atoi(argv[2]);
>
>         printf(".section \".rodata..compressed\",\"a\",@progbits\n");
>         printf(".globl z_input_len\n");
> @@ -85,8 +82,6 @@ int main(int argc, char *argv[])
>         printf("z_output_len = %lu\n", (unsigned long)olen);
>         printf(".globl z_min_extract_offset\n");
>         printf("z_min_extract_offset = 0x%lx\n", offs);
> -       printf(".globl z_run_size\n");
> -       printf("z_run_size = %lu\n", run_size);
>
>         printf(".globl input_data, input_data_end\n");
>         printf("input_data:\n");
> diff --git a/arch/x86/tools/calc_run_size.sh b/arch/x86/tools/calc_run_size.sh
> deleted file mode 100644
> index 1a4c17b..0000000
> --- a/arch/x86/tools/calc_run_size.sh
> +++ /dev/null
> @@ -1,42 +0,0 @@
> -#!/bin/sh
> -#
> -# Calculate the amount of space needed to run the kernel, including room for
> -# the .bss and .brk sections.
> -#
> -# Usage:
> -# objdump -h a.out | sh calc_run_size.sh
> -
> -NUM='\([0-9a-fA-F]*[ \t]*\)'
> -OUT=$(sed -n 's/^[ \t0-9]*.b[sr][sk][ \t]*'"$NUM$NUM$NUM$NUM"'.*/\1\4/p')
> -if [ -z "$OUT" ] ; then
> -       echo "Never found .bss or .brk file offset" >&2
> -       exit 1
> -fi
> -
> -OUT=$(echo ${OUT# })
> -sizeA=$(printf "%d" 0x${OUT%% *})
> -OUT=${OUT#* }
> -offsetA=$(printf "%d" 0x${OUT%% *})
> -OUT=${OUT#* }
> -sizeB=$(printf "%d" 0x${OUT%% *})
> -OUT=${OUT#* }
> -offsetB=$(printf "%d" 0x${OUT%% *})
> -
> -run_size=$(( $offsetA + $sizeA + $sizeB ))
> -
> -# BFD linker shows the same file offset in ELF.
> -if [ "$offsetA" -ne "$offsetB" ] ; then
> -       # Gold linker shows them as consecutive.
> -       endB=$(( $offsetB + $sizeB ))
> -       if [ "$endB" != "$run_size" ] ; then
> -               printf "sizeA: 0x%x\n" $sizeA >&2
> -               printf "offsetA: 0x%x\n" $offsetA >&2
> -               printf "sizeB: 0x%x\n" $sizeB >&2
> -               printf "offsetB: 0x%x\n" $offsetB >&2
> -               echo ".bss and .brk are non-contiguous" >&2
> -               exit 1
> -       fi
> -fi
> -
> -printf "%d\n" $run_size
> -exit 0
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 06/42] x86, kaslr: Consolidate mem_avoid array filling
  2015-07-07 20:19 ` [PATCH 06/42] x86, kaslr: Consolidate mem_avoid array filling Yinghai Lu
@ 2015-07-07 22:36   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 22:36 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> We are going to support kaslr with 64bit above 4G, so the new random
> output buffer could be anywhere.
>
> The mem_avoid array is used by kaslr when searching for a new output buffer.
> Current code only tracks ranges that are after output+output_run_size.
>
> We need to track all ranges instead of just those after
> output+output_run_size.
>
> In the current code the first entry is the extra bytes after
> input+input_size, sized according to output_run_size. Other entries are
> for initrd, cmdline, and heap/stack for ZO running.
>
> First, consider the first entry that should be in the mem_avoid array.
>
> Now that ZO always sits at the end of the buffer, we can find out where
> the ZO text and data/bss etc. are.
>                                                 output+run_size
>                                                       |
> 0   output               input      input+input_size  |     output+init_size
> |     |                    |               |          |          |
> |-----|-----------------|--|---------------|------|---|----------|
>                         |                         |
>                output+init_size-ZO_SIZE   output+output_size
>
> [output, output+init_size) is the buffer for decompress.
>
> [output, output+run_size) is for VO run size.
> [output, output+output_size) is (VO (vmlinux after objcopy) plus relocs)
>
> [output+init_size-ZO_SIZE, output+init_size) is copied ZO.
> [input, input+input_size) is copied compressed (VO (vmlinux after objcopy)
> plus relocs), not the ZO.
>
> [input+input_size, output+init_size) is [_text, _end) for ZO. That could be
> the first range in mem_avoid.

This picture is great, thank you! I don't think it's correct, though.
In this picture, you have input and output overlapping. That only
happens after the output location has been chosen and then only if it
ended up in ASLR slot 0, meaning no relocation happened. Normally the
chosen output buffer is in some entirely different memory area.

> That new first entry already includes the heap and stack for ZO running, so
> we don't need to put them separately into the mem_avoid array.
>
> We also need to put [input, input+input_size) in the mem_avoid array, and it
> is adjacent to the first one, so merge them.
>
> Finally, we need to put boot_params into mem_avoid too, as a 64bit
> bootloader could put it anywhere.
>
> After those changes, all ranges that need to be avoided are in the
> mem_avoid array.

I don't think we can remove the regions you're suggesting we remove. I
do think we have to add an avoid for the real_mode memory, though.
(Currently it gets avoided in most boot loaders when they load stuff
low due to the minimum relocation value that gets checked.)

I feel like I'm missing something. :)

Thanks!

-Kees
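
For readers following the overlap argument, here is a minimal user-space
sketch of the kind of half-open range test a mem_avoid list relies on.
The struct mirrors the mem_vector in the quoted diff; ranges_overlap()
and overlaps_any() are hypothetical names used for illustration and are
not claimed to match the functions in aslr.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same shape as the mem_vector in the quoted diff. */
struct mem_vector {
	unsigned long start;
	unsigned long size;
};

/* True when [a->start, a->start + a->size) intersects b's range. */
static bool ranges_overlap(const struct mem_vector *a,
			   const struct mem_vector *b)
{
	if (a->start + a->size <= b->start)
		return false;	/* a ends before b begins */
	if (a->start >= b->start + b->size)
		return false;	/* a begins after b ends */
	return true;
}

/* True when the candidate touches any entry in the avoid list. */
static bool overlaps_any(const struct mem_vector *candidate,
			 const struct mem_vector *avoid, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (ranges_overlap(candidate, &avoid[i]))
			return true;
	return false;
}
```

A candidate output buffer would be rejected whenever overlaps_any()
returns true, which is why every range that must survive decompression
has to appear in the list.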

>
> Cc: Kees Cook <keescook@chromium.org>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/boot/compressed/aslr.c | 29 +++++++++++++----------------
>  1 file changed, 13 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
> index 0e1dac0..d753fb3 100644
> --- a/arch/x86/boot/compressed/aslr.c
> +++ b/arch/x86/boot/compressed/aslr.c
> @@ -109,7 +109,7 @@ struct mem_vector {
>         unsigned long size;
>  };
>
> -#define MEM_AVOID_MAX 5
> +#define MEM_AVOID_MAX 4
>  static struct mem_vector mem_avoid[MEM_AVOID_MAX];
>
>  static bool mem_contains(struct mem_vector *region, struct mem_vector *item)
> @@ -135,21 +135,22 @@ static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
>  }
>
>  static void mem_avoid_init(unsigned long input, unsigned long input_size,
> -                          unsigned long output, unsigned long output_run_size)
> +                          unsigned long output)
>  {
> +       unsigned long init_size = real_mode->hdr.init_size;
>         u64 initrd_start, initrd_size;
>         u64 cmd_line, cmd_line_size;
> -       unsigned long unsafe, unsafe_len;
>         char *ptr;
>
>         /*
>          * Avoid the region that is unsafe to overlap during
> -        * decompression (see calculations at top of misc.c).
> +        * decompression.
> +        * As we already move ZO (arch/x86/boot/compressed/vmlinux)
> +        * to the end of buffer, [input+input_size, output+init_size)
> +        * has [_text, _end) for ZO.
>          */
> -       unsafe_len = (output_run_size >> 12) + 32768 + 18;
> -       unsafe = (unsigned long)input + input_size - unsafe_len;
> -       mem_avoid[0].start = unsafe;
> -       mem_avoid[0].size = unsafe_len;
> +       mem_avoid[0].start = input;
> +       mem_avoid[0].size = (output + init_size) - input;
>
>         /* Avoid initrd. */
>         initrd_start  = (u64)real_mode->ext_ramdisk_image << 32;
> @@ -169,13 +170,9 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>         mem_avoid[2].start = cmd_line;
>         mem_avoid[2].size = cmd_line_size;
>
> -       /* Avoid heap memory. */
> -       mem_avoid[3].start = (unsigned long)free_mem_ptr;
> -       mem_avoid[3].size = BOOT_HEAP_SIZE;
> -
> -       /* Avoid stack memory. */
> -       mem_avoid[4].start = (unsigned long)free_mem_end_ptr;
> -       mem_avoid[4].size = BOOT_STACK_SIZE;
> +       /* Avoid params */
> +       mem_avoid[3].start = (unsigned long)real_mode;
> +       mem_avoid[3].size = sizeof(*real_mode);
>  }
>
>  /* Does this memory vector overlap a known avoided area? */
> @@ -319,7 +316,7 @@ unsigned char *choose_kernel_location(unsigned char *input,
>
>         /* Record the various known unsafe memory ranges. */
>         mem_avoid_init((unsigned long)input, input_size,
> -                      (unsigned long)output, output_run_size);
> +                      (unsigned long)output);
>
>         /* Walk e820 and find a random address. */
>         random = find_random_addr(choice, output_run_size);
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 08/42] x86, kaslr: Get correct max_addr for relocs pointer
  2015-07-07 20:19 ` [PATCH 08/42] x86, kaslr: Get correct max_addr for relocs pointer Yinghai Lu
@ 2015-07-07 22:40   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 22:40 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> The kaslr relocation handling bounds-checks each pointer.
>
> Current code uses output_len for that, which is the VO (vmlinux after
> objcopy) file size plus the vmlinux.relocs file size.
>
> That is not right, as we should use the loaded addresses used at run time.
>
> By that time parse_elf has already moved the sections according to the
> ELF headers.
>
> The valid range should be VO [_text, __bss_start) loaded physical addresses.
>
> In the patch, add export for __bss_start to voffset.h and use it to get
> max_addr.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>

This seems correct, thanks!

Acked-by: Kees Cook <keescook@chromium.org>
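
The bound described in the changelog can be sketched in user space as
follows. reloc_target_ok() is a hypothetical helper, its vo_text and
vo_bss_start parameters stand in for VO__text and VO___bss_start, and the
inclusive upper bound is an assumption of this sketch rather than
something the quoted hunk shows.

```c
#include <stdbool.h>

/*
 * Is ptr inside the loaded image, i.e. within
 * [output, output + (vo_bss_start - vo_text)]?
 */
static bool reloc_target_ok(unsigned long ptr, unsigned long output,
			    unsigned long vo_text,
			    unsigned long vo_bss_start)
{
	unsigned long min_addr = output;
	unsigned long max_addr = min_addr + (vo_bss_start - vo_text);

	return ptr >= min_addr && ptr <= max_addr;
}
```

The point of the change is that the window is derived from link-time
symbol addresses, not from the size of the file that was decompressed.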

> ---
>  arch/x86/boot/compressed/Makefile | 2 +-
>  arch/x86/boot/compressed/misc.c   | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 50daea7..e12a93c 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -40,7 +40,7 @@ LDFLAGS_vmlinux := -T
>  hostprogs-y    := mkpiggy
>  HOST_EXTRACFLAGS += -I$(srctree)/tools/include
>
> -sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
> +sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
>
>  quiet_cmd_voffset = VOFFSET $@
>        cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index db97bdf..8fb74ba 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -234,7 +234,7 @@ static void handle_relocations(void *output, unsigned long output_len)
>         int *reloc;
>         unsigned long delta, map, ptr;
>         unsigned long min_addr = (unsigned long)output;
> -       unsigned long max_addr = min_addr + output_len;
> +       unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
>
>         /*
>          * Calculate the delta between where vmlinux was linked to load
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 12/42] x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
  2015-07-07 20:19 ` [PATCH 12/42] x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G Yinghai Lu
@ 2015-07-07 22:42   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 22:42 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> From: Baoquan He <bhe@redhat.com>
>
> When processing 32 bit relocation tables, a local variable extended is
> defined to calculate the physical address of a relocs entry. However,
> its type is int, which is enough for i386 but not for x86_64.
> That's why relocation can only be handled when the kernel is loaded
> below 2G; otherwise an overflow will happen and cause a system hang.
>
> Here change it to long as 32 bit inverse relocation processing does,
> and this change is safe for i386 relocation handling too.
>
> Signed-off-by: Baoquan He <bhe@redhat.com>

This looks right, thanks!

Acked-by: Kees Cook <keescook@chromium.org>

-Kees
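
To see why the type matters, here is a small user-space sketch (not the
kernel code) that redoes the arithmetic with fixed-width types.
reloc_target_int() and reloc_target_long() are hypothetical names, and
the narrowing conversion in the int variant is implementation-defined in
C; the sketch assumes the two's-complement wraparound that GCC and Clang
document.

```c
#include <stdint.h>

/* Pre-patch behaviour: the sum is squeezed back into 32 bits. */
static uint64_t reloc_target_int(int32_t reloc, uint64_t map)
{
	int32_t extended = reloc;

	extended += map;	/* result truncated to 32 bits on assignment */
	return (uint64_t)(int64_t)extended;	/* sign-extends to 64 bits */
}

/* Post-patch behaviour: the sum keeps all 64 bits. */
static uint64_t reloc_target_long(int32_t reloc, uint64_t map)
{
	int64_t extended = reloc;	/* sign-extended reloc entry */

	extended += map;
	return (uint64_t)extended;
}
```

With the illustrative values reloc = (int32_t)0x81000010 and
map = 0x100000000, the long variant yields 0x81000010 while the int
variant yields 0xffffffff81000010, because the high bits contributed by
map are lost before the sign extension.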

> ---
>  arch/x86/boot/compressed/misc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index 83f98a5..bfa4f0a 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -273,7 +273,7 @@ static void handle_relocations(void *output, unsigned long output_len)
>          * So we work backwards from the end of the decompressed image.
>          */
>         for (reloc = output + output_len - sizeof(*reloc); *reloc; reloc--) {
> -               int extended = *reloc;
> +               long extended = *reloc;
>                 extended += map;
>
>                 ptr = (unsigned long)extended;
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 22/42] x86, setup: Check early serial console per string instead of one char
  2015-07-07 20:20 ` [PATCH 22/42] x86, setup: Check early serial console per string instead of one char Yinghai Lu
@ 2015-07-07 22:59   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 22:59 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:20 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> Move the serial_putchar() call out of putchar() and let puts() call
> serial_putchar() directly.
>
> That way early_serial_base only needs to be checked once per string.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>

If you're going to do this, I think putchar should be static, and all
callers should be switched to puts. There are callers that expect
putchar to send to both bios and serial.

-Kees
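
The concern can be reproduced with a small user-space mock of the
patched functions. Everything here is an assumption made for
illustration: the capture buffers replace the real BIOS and serial
output, the sketch_ prefix avoids clashing with libc, and
early_serial_base is simply preset to a non-zero value.

```c
#include <stddef.h>

/* Capture buffers standing in for the two consoles. */
static char bios_out[64], serial_out[64];
static size_t bios_len, serial_len;
static int early_serial_base = 0x3f8;	/* pretend a port is configured */

static void bios_putchar(int ch)
{
	if (bios_len < sizeof(bios_out) - 1)
		bios_out[bios_len++] = (char)ch;
}

static void serial_putchar(int ch)
{
	if (serial_len < sizeof(serial_out) - 1)
		serial_out[serial_len++] = (char)ch;
}

/* putchar() as in the quoted patch: BIOS only. */
static void sketch_putchar(int ch)
{
	if (ch == '\n')
		bios_putchar('\r');	/* \n -> \r\n */
	bios_putchar(ch);
}

/* puts() as in the quoted patch: serial checked once per string. */
static void sketch_puts(const char *str)
{
	if (early_serial_base) {
		const char *s = str;

		while (*s) {
			if (*s == '\n')
				serial_putchar('\r');
			serial_putchar(*s++);
		}
	}

	while (*str)
		sketch_putchar(*str++);
}
```

After sketch_puts("ok\n") both buffers hold the same four bytes, but a
following sketch_putchar('!') reaches only bios_out. Any caller that
still uses putchar() directly therefore loses its serial output.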

> ---
>  arch/x86/boot/tty.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/boot/tty.c b/arch/x86/boot/tty.c
> index def2451..114caea 100644
> --- a/arch/x86/boot/tty.c
> +++ b/arch/x86/boot/tty.c
> @@ -52,16 +52,22 @@ static void __attribute__((section(".inittext"))) bios_putchar(int ch)
>  void __attribute__((section(".inittext"))) putchar(int ch)
>  {
>         if (ch == '\n')
> -               putchar('\r');  /* \n -> \r\n */
> +               bios_putchar('\r');     /* \n -> \r\n */
>
>         bios_putchar(ch);
> -
> -       if (early_serial_base != 0)
> -               serial_putchar(ch);
>  }
>
>  void __attribute__((section(".inittext"))) puts(const char *str)
>  {
> +       if (early_serial_base) {
> +               const char *s = str;
> +               while (*s) {
> +                       if (*s == '\n')
> +                               serial_putchar('\r');
> +                       serial_putchar(*s++);
> +               }
> +       }
> +
>         while (*str)
>                 putchar(*str++);
>  }
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 38/42] x86: Fix typo in mark_rodata_ro
  2015-07-07 20:20 ` [PATCH 38/42] x86: Fix typo in mark_rodata_ro Yinghai Lu
@ 2015-07-07 23:05   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 23:05 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:20 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> The comment should refer to cleanup_highmap().
> Also remove the unneeded cast for _brk_end, as it is already
> unsigned long.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/mm/init_64.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 257ba4b..3b7453a 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1054,9 +1054,9 @@ void mark_rodata_ro(void)
>          * of the PMD will remain mapped executable.
>          *
>          * Any PMD which was setup after the one which covers _brk_end
> -        * has been zapped already via cleanup_highmem().
> +        * has been zapped already via cleanup_highmap().
>          */
> -       all_end = roundup((unsigned long)_brk_end, PMD_SIZE);
> +       all_end = roundup(_brk_end, PMD_SIZE);
>         set_memory_nx(rodata_start, (all_end - rodata_start) >> PAGE_SHIFT);
>
>         rodata_test();

This should also fix the casts in xen/mmu.c, kernel/setup.c, and the
earlier cast in mm/init_64.c.

-Kees

-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 29/42] x86: Find correct 64 bit ramdisk address for microcode early update
  2015-07-07 20:20 ` [PATCH 29/42] x86: Find correct 64 bit ramdisk address for microcode early update Yinghai Lu
@ 2015-07-07 23:08   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 23:08 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:20 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> When using kexec with a 64bit kernel, bzImage and ramdisk could be
> loaded above 4G. We need this to get the correct ramdisk address.
>
> Make get_ramdisk_image() global and use it for early microcode updating.

This looks correct, thanks!

Acked-by: Kees Cook <keescook@chromium.org>

-Kees
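
The address composition that the helpers perform can be sketched in user
space like this. struct ramdisk_fields is a hypothetical stand-in for
the handful of boot_params fields involved, and the function names are
chosen for the sketch only.

```c
#include <stdint.h>

/* Hypothetical subset of the boot_params fields used here. */
struct ramdisk_fields {
	uint32_t ramdisk_image;		/* low 32 bits of the address */
	uint32_t ext_ramdisk_image;	/* high 32 bits of the address */
	uint32_t ramdisk_size;		/* low 32 bits of the size */
	uint32_t ext_ramdisk_size;	/* high 32 bits of the size */
};

/* Combine the low and extended halves into one 64-bit address. */
static uint64_t ramdisk_image64(const struct ramdisk_fields *f)
{
	return (uint64_t)f->ramdisk_image |
	       ((uint64_t)f->ext_ramdisk_image << 32);
}

/* Same composition for the size. */
static uint64_t ramdisk_size64(const struct ramdisk_fields *f)
{
	return (uint64_t)f->ramdisk_size |
	       ((uint64_t)f->ext_ramdisk_size << 32);
}
```

Reading only the low half would report an address below 4G for a
ramdisk that actually sits above it, which is the failure the early
microcode loader needs to avoid.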

>
> -v2: update changelog.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/include/asm/setup.h                |  3 +++
>  arch/x86/kernel/cpu/microcode/amd_early.c   | 10 +++++-----
>  arch/x86/kernel/cpu/microcode/intel_early.c |  8 ++++----
>  arch/x86/kernel/setup.c                     | 28 ++++++++++++++--------------
>  4 files changed, 26 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
> index 3e5aa41..496515b 100644
> --- a/arch/x86/include/asm/setup.h
> +++ b/arch/x86/include/asm/setup.h
> @@ -119,6 +119,9 @@ void *extend_brk(size_t size, size_t align);
>         RESERVE_BRK(name, sizeof(type) * entries)
>
>  extern void probe_roms(void);
> +u64 get_ramdisk_image(struct boot_params *bp);
> +u64 get_ramdisk_size(struct boot_params *bp);
> +
>  #ifdef __i386__
>
>  asmlinkage void __init i386_start_kernel(void);
> diff --git a/arch/x86/kernel/cpu/microcode/amd_early.c b/arch/x86/kernel/cpu/microcode/amd_early.c
> index e8a215a..4c579c7 100644
> --- a/arch/x86/kernel/cpu/microcode/amd_early.c
> +++ b/arch/x86/kernel/cpu/microcode/amd_early.c
> @@ -51,12 +51,12 @@ static struct cpio_data __init find_ucode_in_initrd(void)
>          */
>         p       = (struct boot_params *)__pa_nodebug(&boot_params);
>         path    = (char *)__pa_nodebug(ucode_path);
> -       start   = (void *)p->hdr.ramdisk_image;
> -       size    = p->hdr.ramdisk_size;
> +       start   = (void *)(unsigned long)get_ramdisk_image(p);
> +       size    = get_ramdisk_size(p);
>  #else
>         path    = ucode_path;
> -       start   = (void *)(boot_params.hdr.ramdisk_image + PAGE_OFFSET);
> -       size    = boot_params.hdr.ramdisk_size;
> +       start   = (void *)(get_ramdisk_image(&boot_params) + PAGE_OFFSET);
> +       size    = get_ramdisk_size(&boot_params);
>  #endif
>
>         return find_cpio_data(path, start, size, &offset);
> @@ -396,7 +396,7 @@ int __init save_microcode_in_initrd_amd(void)
>          */
>         if (relocated_ramdisk)
>                 container = (u8 *)(__va(relocated_ramdisk) +
> -                            (cont - boot_params.hdr.ramdisk_image));
> +                            (cont - get_ramdisk_image(&boot_params)));
>         else
>                 container = cont_va;
>
> diff --git a/arch/x86/kernel/cpu/microcode/intel_early.c b/arch/x86/kernel/cpu/microcode/intel_early.c
> index 8187b72..c85dcb2 100644
> --- a/arch/x86/kernel/cpu/microcode/intel_early.c
> +++ b/arch/x86/kernel/cpu/microcode/intel_early.c
> @@ -736,16 +736,16 @@ void __init load_ucode_intel_bsp(void)
>         struct boot_params *p;
>
>         p       = (struct boot_params *)__pa_nodebug(&boot_params);
> -       start   = p->hdr.ramdisk_image;
> -       size    = p->hdr.ramdisk_size;
> +       start   = get_ramdisk_image(p);
> +       size    = get_ramdisk_size(p);
>
>         _load_ucode_intel_bsp(
>                         (struct mc_saved_data *)__pa_nodebug(&mc_saved_data),
>                         (unsigned long *)__pa_nodebug(&mc_saved_in_initrd),
>                         start, size);
>  #else
> -       start   = boot_params.hdr.ramdisk_image + PAGE_OFFSET;
> -       size    = boot_params.hdr.ramdisk_size;
> +       start   = get_ramdisk_image(&boot_params) + PAGE_OFFSET;
> +       size    = get_ramdisk_size(&boot_params);
>
>         _load_ucode_intel_bsp(&mc_saved_data, mc_saved_in_initrd, start, size);
>  #endif
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 80f874b..2d808e6 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -300,19 +300,19 @@ u64 relocated_ramdisk;
>
>  #ifdef CONFIG_BLK_DEV_INITRD
>
> -static u64 __init get_ramdisk_image(void)
> +u64 __init get_ramdisk_image(struct boot_params *bp)
>  {
> -       u64 ramdisk_image = boot_params.hdr.ramdisk_image;
> +       u64 ramdisk_image = bp->hdr.ramdisk_image;
>
> -       ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
> +       ramdisk_image |= (u64)bp->ext_ramdisk_image << 32;
>
>         return ramdisk_image;
>  }
> -static u64 __init get_ramdisk_size(void)
> +u64 __init get_ramdisk_size(struct boot_params *bp)
>  {
> -       u64 ramdisk_size = boot_params.hdr.ramdisk_size;
> +       u64 ramdisk_size = bp->hdr.ramdisk_size;
>
> -       ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
> +       ramdisk_size |= (u64)bp->ext_ramdisk_size << 32;
>
>         return ramdisk_size;
>  }
> @@ -321,8 +321,8 @@ static u64 __init get_ramdisk_size(void)
>  static void __init relocate_initrd(void)
>  {
>         /* Assume only end is not page aligned */
> -       u64 ramdisk_image = get_ramdisk_image();
> -       u64 ramdisk_size  = get_ramdisk_size();
> +       u64 ramdisk_image = get_ramdisk_image(&boot_params);
> +       u64 ramdisk_size  = get_ramdisk_size(&boot_params);
>         u64 area_size     = PAGE_ALIGN(ramdisk_size);
>         unsigned long slop, clen, mapaddr;
>         char *p, *q;
> @@ -360,8 +360,8 @@ static void __init relocate_initrd(void)
>                 ramdisk_size  -= clen;
>         }
>
> -       ramdisk_image = get_ramdisk_image();
> -       ramdisk_size  = get_ramdisk_size();
> +       ramdisk_image = get_ramdisk_image(&boot_params);
> +       ramdisk_size  = get_ramdisk_size(&boot_params);
>         printk(KERN_INFO "Move RAMDISK from [mem %#010llx-%#010llx] to"
>                 " [mem %#010llx-%#010llx]\n",
>                 ramdisk_image, ramdisk_image + ramdisk_size - 1,
> @@ -371,8 +371,8 @@ static void __init relocate_initrd(void)
>  static void __init early_reserve_initrd(void)
>  {
>         /* Assume only end is not page aligned */
> -       u64 ramdisk_image = get_ramdisk_image();
> -       u64 ramdisk_size  = get_ramdisk_size();
> +       u64 ramdisk_image = get_ramdisk_image(&boot_params);
> +       u64 ramdisk_size  = get_ramdisk_size(&boot_params);
>         u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
>
>         if (!boot_params.hdr.type_of_loader ||
> @@ -384,8 +384,8 @@ static void __init early_reserve_initrd(void)
>  static void __init reserve_initrd(void)
>  {
>         /* Assume only end is not page aligned */
> -       u64 ramdisk_image = get_ramdisk_image();
> -       u64 ramdisk_size  = get_ramdisk_size();
> +       u64 ramdisk_image = get_ramdisk_image(&boot_params);
> +       u64 ramdisk_size  = get_ramdisk_size(&boot_params);
>         u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
>         u64 mapped_size;
>
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 28/42] x86, boot: Allow 64bit EFI kernel to be loaded above 4G
  2015-07-07 20:20 ` [PATCH 28/42] x86, boot: Allow 64bit EFI kernel to be loaded above 4G Yinghai Lu
@ 2015-07-07 23:12   ` Kees Cook
  2015-07-08 18:00     ` Matt Fleming
  0 siblings, 1 reply; 79+ messages in thread
From: Kees Cook @ 2015-07-07 23:12 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:20 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> kexec can already place kernel/boot_params/cmd_line/initrd above 4G, but
> only through the legacy interface that enters startup_64 directly.
>
> This patch allows a 64bit EFI kernel to be loaded above 4G
> and uses the EFI HANDOVER PROTOCOL to start the kernel.
>
> The current 32bit code32_start is used for passing around the load address,
> so it will overflow when the kernel is loaded above 4G.
>
> The patch mainly adds ext_code32_start to hold the high 32 bits of the
> load address.
>
> After this patch, could use patched grub2-x86_64.efi to place
> kernel/boot_params/cmd_line/initrd all above 4G and execute the kernel
> above 4G.
>
> bootlog like:
>
> kernel: done                           [ linux  9.25MiB  100%  6.66MiB/s ]
> params: [1618fc000,1618fffff]
> cmdline: [1618fb000,1618fb7fe]
> kernel: [15e000000,161385fff]
> initrd: [15bcbe000,15dffffbb]
> initrd: 1 file done             [ initrd.img  35.26MiB  100%  11.93MiB/s ]
> early console in decompress_kernel
> decompress_kernel:
>   input: [0x15fd0b3b4-0x16063c803], output: 0x15e000000, heap: [0x160645b00-0x16064daff]
>
> Decompressing Linux... xz... Parsing ELF... done.
> Booting the kernel.
> [    0.000000] bootconsole [uart0] enabled
> [    0.000000]    real_mode_data :      phys 00000001618fc000
> [    0.000000]    real_mode_data :      virt ffff8801618fc000
> [    0.000000] Kernel Layout:
> [    0.000000]   .text: [0x15e000000-0x15f08f72c]
> [    0.000000] .rodata: [0x15f200000-0x15fa44fff]
> [    0.000000]   .data: [0x15fc00000-0x15fe545ff]
> [    0.000000]   .init: [0x15fe56000-0x16021afff]
> [    0.000000]    .bss: [0x160229000-0x16135ffff]
> [    0.000000]    .brk: [0x161360000-0x161385fff]
> [    0.000000] memblock_reserve: [0x0000000009f000-0x000000000fffff] flags 0x0 * BIOS reserved
> ...
> [    0.000000] memblock_reserve: [0x0000015e000000-0x0000016135ffff] flags 0x0 TEXT DATA BSS
> [    0.000000] memblock_reserve: [0x0000015bcbe000-0x0000015dffffff] flags 0x0 RAMDISK
>
> -v2: add cast to avoid warning with 32bit, also update description for
>      ext_code32_start in boot.txt
> -v3: change to 4.0 from 3.20.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  Documentation/x86/boot.txt            | 19 +++++++++++++++++++
>  arch/x86/boot/compressed/eboot.c      | 15 ++++++++++-----
>  arch/x86/boot/compressed/head_64.S    |  7 ++++++-
>  arch/x86/boot/header.S                |  3 ++-
>  arch/x86/include/uapi/asm/bootparam.h |  1 +
>  arch/x86/kernel/asm-offsets.c         |  1 +
>  6 files changed, 39 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
> index 9da6f35..90efaa2 100644
> --- a/Documentation/x86/boot.txt
> +++ b/Documentation/x86/boot.txt
> @@ -61,6 +61,9 @@ Protocol 2.12:        (Kernel 3.8) Added the xloadflags field and extension fields
>                 to struct boot_params for loading bzImage and ramdisk
>                 above 4G in 64bit.
>
> +Protocol 2.14: (Kernel 4.0) Added the ext_code32_start to support 64bit
> +               EFI kernel to be loaded above 4G.
> +

Should be at least kernel 4.3.

>  **** MEMORY LAYOUT
>
>  The traditional memory map for the kernel loader, used for Image or
> @@ -197,6 +200,7 @@ Offset      Proto   Name            Meaning
>  0258/8 2.10+   pref_address    Preferred loading address
>  0260/4 2.10+   init_size       Linear memory required during initialization
>  0264/4 2.11+   handover_offset Offset of handover entry point
> +0268/4 2.14+   ext_code32_start        Extended part for code32_start
>
>  (1) For backwards compatibility, if the setup_sects field contains 0, the
>      real value is 4.
> @@ -744,6 +748,14 @@ Offset/size:       0x264/4
>
>    See EFI HANDOVER PROTOCOL below for more details.
>
> +Field name:    ext_code32_start
> +Type:          modify (optional, reloc)
> +Offset/size:   0x268/4
> +Protocol:      2.14+
> +
> +  This field is the upper 32bits of load address when EFI 64bit kernel
> +  is loaded above 4G. And it is used with code32_start to compare to
> +  pref_address to decide if kernel need to be relocated further.
>
>  **** THE IMAGE CHECKSUM
>
> @@ -1127,4 +1139,11 @@ The boot loader *must* fill out the following fields in bp,
>      o hdr.ramdisk_image (if applicable)
>      o hdr.ramdisk_size  (if applicable)
>
> +for 64bit, when loading above 4G, *must* fill out the following fields,
> +
> +    o hdr.ext_code32_start
> +    o ext_cmd_line_ptr
> +    o ext_ramdisk_image (if applicable)
> +    o ext_ramdisk_size  (if applicable)
> +
>  All other fields should be zero.
> diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
> index 2c82bd1..05d77a5 100644
> --- a/arch/x86/boot/compressed/eboot.c
> +++ b/arch/x86/boot/compressed/eboot.c
> @@ -1394,6 +1394,7 @@ struct boot_params *efi_main(struct efi_config *c,
>         void *handle;
>         efi_system_table_t *_table;
>         bool is64;
> +       unsigned long loaded_addr;
>
>         efi_early = c;
>
> @@ -1435,9 +1436,12 @@ struct boot_params *efi_main(struct efi_config *c,
>          * If the kernel isn't already loaded at the preferred load
>          * address, relocate it.
>          */
> -       if (hdr->pref_address != hdr->code32_start) {
> -               unsigned long bzimage_addr = hdr->code32_start;
> -               status = efi_relocate_kernel(sys_table, &bzimage_addr,
> +       loaded_addr = hdr->code32_start;
> +       loaded_addr |= (unsigned long)((u64)hdr->ext_code32_start << 32);
> +       if (hdr->pref_address != loaded_addr) {
> +               unsigned long loaded_addr_orig = loaded_addr;
> +
> +               status = efi_relocate_kernel(sys_table, &loaded_addr,
>                                              hdr->init_size, hdr->init_size,
>                                              hdr->pref_address,
>                                              hdr->kernel_alignment);
> @@ -1446,8 +1450,9 @@ struct boot_params *efi_main(struct efi_config *c,
>                         goto fail;
>                 }
>
> -               hdr->pref_address = hdr->code32_start;
> -               hdr->code32_start = bzimage_addr;
> +               hdr->pref_address = loaded_addr_orig;
> +               hdr->code32_start = loaded_addr & 0xffffffff;
> +               hdr->ext_code32_start = (unsigned long)((u64)loaded_addr >> 32);
>         }
>
>         status = exit_boot(boot_params, handle, is64);
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index 075bb15..ab52d2c 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -266,6 +266,8 @@ ENTRY(efi_pe_entry)
>         mov     %rax, %rsi
>         leaq    startup_32(%rip), %rax
>         movl    %eax, BP_code32_start(%rsi)
> +       shr     $32, %rax
> +       movl    %eax, BP_ext_code32_start(%rsi)
>         jmp     2f              /* Skip the relocation */
>
>  handover_entry:
> @@ -289,7 +291,10 @@ fail:
>         hlt
>         jmp     fail
>  2:
> -       movl    BP_code32_start(%esi), %eax
> +       movl    BP_code32_start(%rsi), %eax
> +       movl    BP_ext_code32_start(%rsi), %ebx
> +       shl     $32, %rbx
> +       orq     %rbx, %rax
>         leaq    preferred_addr(%rax), %rax
>         jmp     *%rax
>
> diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
> index 99204e5..09e7c69 100644
> --- a/arch/x86/boot/header.S
> +++ b/arch/x86/boot/header.S
> @@ -301,7 +301,7 @@ _start:
>         # Part 2 of the header, from the old setup.S
>
>                 .ascii  "HdrS"          # header signature
> -               .word   0x020d          # header version number (>= 0x0105)
> +               .word   0x020e          # header version number (>= 0x0105)
>                                         # or else old loadlin-1.5 will fail)
>                 .globl realmode_swtch
>  realmode_swtch:        .word   0, 0            # default_switch, SETUPSEG
> @@ -478,6 +478,7 @@ pref_address:               .quad LOAD_PHYSICAL_ADDR        # preferred load addr
>  #endif
>  init_size:             .long INIT_SIZE         # kernel initialization size
>  handover_offset:       .long 0                 # Filled in by build.c
> +ext_code32_start:      .long 0                 # werid one!

Comment is a typo?

>
>  # End of setup header #####################################################
>
> diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> index ab456dc..bb9973d 100644
> --- a/arch/x86/include/uapi/asm/bootparam.h
> +++ b/arch/x86/include/uapi/asm/bootparam.h
> @@ -84,6 +84,7 @@ struct setup_header {
>         __u64   pref_address;
>         __u32   init_size;
>         __u32   handover_offset;
> +       __u32   ext_code32_start;
>  } __attribute__((packed));
>
>  struct sys_desc_table {
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index d2e00bc..3f9789f 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -90,6 +90,7 @@ void common(void) {
>         OFFSET(BP_init_size, boot_params, hdr.init_size);
>         OFFSET(BP_pref_address, boot_params, hdr.pref_address);
>         OFFSET(BP_code32_start, boot_params, hdr.code32_start);
> +       OFFSET(BP_ext_code32_start, boot_params, hdr.ext_code32_start);
>
>         BLANK();
>         DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
> --
> 1.8.4.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
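
For anyone following along, the address handling in the eboot.c hunk
above boils down to joining and splitting a 64-bit load address across
two 32-bit header fields. A minimal userspace sketch follows; the
struct and both helper names are hypothetical stand-ins made up for
illustration, not kernel code, and the address used to try it out is
an arbitrary value above 4G:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two setup_header fields in the patch. */
struct load_addr_fields {
	uint32_t code32_start;     /* low 32 bits of the load address */
	uint32_t ext_code32_start; /* high 32 bits, zero when below 4G */
};

/* Rebuild the full 64-bit load address from the two 32-bit halves. */
static uint64_t join_load_addr(const struct load_addr_fields *f)
{
	assert(f != NULL);
	return (uint64_t)f->code32_start |
	       ((uint64_t)f->ext_code32_start << 32);
}

/* Store a 64-bit load address back into the two 32-bit halves. */
static void split_load_addr(uint64_t addr, struct load_addr_fields *f)
{
	assert(f != NULL);
	f->code32_start = (uint32_t)(addr & 0xffffffffu);
	f->ext_code32_start = (uint32_t)(addr >> 32);
}
```

A loader that leaves ext_code32_start at zero gets the old behavior
from the join, which is presumably why the field can be optional for
loads below 4G.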

-Kees

-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 25/42] x86, boot: print compression suffix in decompress stage
  2015-07-07 20:20 ` [PATCH 25/42] x86, boot: print compression suffix in decompress stage Yinghai Lu
@ 2015-07-07 23:13   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 23:13 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:20 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/boot/compressed/misc.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index a428c03..9266f78 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -120,26 +120,32 @@ static int lines, cols;
>
>  #ifdef CONFIG_KERNEL_GZIP
>  #include "../../../../lib/decompress_inflate.c"
> +static char *suffix_str = "gz";
>  #endif
>
>  #ifdef CONFIG_KERNEL_BZIP2
>  #include "../../../../lib/decompress_bunzip2.c"
> +static char *suffix_str = "bz2";
>  #endif
>
>  #ifdef CONFIG_KERNEL_LZMA
>  #include "../../../../lib/decompress_unlzma.c"
> +static char *suffix_str = "lzma";
>  #endif
>
>  #ifdef CONFIG_KERNEL_XZ
>  #include "../../../../lib/decompress_unxz.c"
> +static char *suffix_str = "xz";
>  #endif
>
>  #ifdef CONFIG_KERNEL_LZO
>  #include "../../../../lib/decompress_unlzo.c"
> +static char *suffix_str = "lzo";
>  #endif
>
>  #ifdef CONFIG_KERNEL_LZ4
>  #include "../../../../lib/decompress_unlz4.c"
> +static char *suffix_str = "lz4";
>  #endif

I like the idea!

>
>  static void scroll(void)
> @@ -486,6 +492,8 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
>                 (unsigned long)input_data,
>                 (unsigned long)input_data + input_len - 1);
>         debug_putstr("\nDecompressing Linux... ");
> +       debug_putstr(suffix_str);
> +       debug_putstr("... ");

I wouldn't repeat the "...". Maybe remove "..." from the Decompressing putstr?
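
Roughly what I have in mind, as a userspace sketch only: append() is
a hypothetical stand-in for debug_putstr() that writes into a buffer
so the result can be inspected, and format_decompress_msg() is a
made-up name. The boot stub has no stdio, so this is not meant as a
drop-in:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for debug_putstr(): append s to buf. */
static void append(char *buf, size_t len, const char *s)
{
	size_t used = strlen(buf);

	assert(used < len);
	snprintf(buf + used, len - used, "%s", s);
}

/* Build the message with a single "..." placed after the suffix. */
static void format_decompress_msg(char *buf, size_t len, const char *suffix)
{
	assert(buf != NULL && suffix != NULL && len > 0);
	buf[0] = '\0';
	append(buf, len, "\nDecompressing Linux "); /* no "..." here */
	append(buf, len, suffix);
	append(buf, len, "... ");
}
```

With the patch as posted, the three calls would instead produce
"Decompressing Linux... xz... ".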

-Kees

>         decompress(input_data, input_len, NULL, NULL, output, NULL, error);
>         parse_elf(output);
>         handle_relocations(output, output_len, virt_offset);
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 26/42] x86: remove not needed clear_page calling
  2015-07-07 20:20 ` [PATCH 26/42] x86: remove not needed clear_page calling Yinghai Lu
@ 2015-07-07 23:14   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 23:14 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:20 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> remove not needed clear_page for init_level4_page in x86_64_start_kernel(),
> as it is with fill 512,8,0 already in head_64.S

Will all possible entry points have come through head_64.S?

-Kees

>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/kernel/head64.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 44dc63b..a9f0299 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -178,7 +178,6 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
>          */
>         load_ucode_bsp();
>
> -       clear_page(init_level4_pgt);
>         /* set init_level4_pgt kernel high mapping*/
>         init_level4_pgt[511] = early_level4_pgt[511];
>
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 40/42] x86, 64bit: remove highmap for not needed ranges
  2015-07-07 20:20 ` [PATCH 40/42] x86, 64bit: remove highmap for not needed ranges Yinghai Lu
@ 2015-07-07 23:17   ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2015-07-07 23:17 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:20 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> add cleanup_highmap_late to remove highmap for initmem, around rodata, and
> [_brk_end, all_end).
>
> Kernel Layout:
>
> [    0.000000]   .text: [0x01000000-0x0200df88]
> [    0.000000] .rodata: [0x02200000-0x02a1dfff]
> [    0.000000]   .data: [0x02c00000-0x02e510ff]
> [    0.000000]   .init: [0x02e53000-0x03213fff]
> [    0.000000]    .bss: [0x03222000-0x0437cfff]
> [    0.000000]    .brk: [0x0437d000-0x043a2fff]
>
> Actually used brk:
> [    0.270365] memblock_reserve: [0x0000000437d000-0x00000004383fff] flags 0x0 BRK
>
> Before patch:
> ---[ High Kernel Mapping ]---
> 0xffffffff80000000-0xffffffff81000000          16M                           pmd
> 0xffffffff81000000-0xffffffff82000000          16M     ro         PSE GLB x  pmd
> 0xffffffff82000000-0xffffffff82011000          68K     ro             GLB x  pte
> 0xffffffff82011000-0xffffffff82200000        1980K     RW             GLB x  pte

What change introduced this RW + x area? I don't see any of those
currently in my page tables.

-Kees

> 0xffffffff82200000-0xffffffff82a00000           8M     ro         PSE GLB NX pmd
> 0xffffffff82a00000-0xffffffff82a1e000         120K     ro             GLB NX pte
> 0xffffffff82a1e000-0xffffffff82c00000        1928K     RW             GLB NX pte
> 0xffffffff82c00000-0xffffffff82e00000           2M     RW         PSE GLB NX pmd
> 0xffffffff82e00000-0xffffffff83000000           2M     RW             GLB NX pte
> 0xffffffff83000000-0xffffffff83200000           2M     RW         PSE GLB NX pmd
> 0xffffffff83200000-0xffffffff83400000           2M     RW             GLB NX pte
> 0xffffffff83400000-0xffffffff84400000          16M     RW         PSE GLB NX pmd
> 0xffffffff84400000-0xffffffffa0000000         444M                           pmd
>
> After patch:
> ---[ High Kernel Mapping ]---
> 0xffffffff80000000-0xffffffff81000000          16M                           pmd
> 0xffffffff81000000-0xffffffff82000000          16M     ro         PSE GLB x  pmd
> 0xffffffff82000000-0xffffffff82012000          72K     ro             GLB x  pte
> 0xffffffff82012000-0xffffffff82200000        1976K                           pte
> 0xffffffff82200000-0xffffffff82a00000           8M     ro         PSE GLB NX pmd
> 0xffffffff82a00000-0xffffffff82a1e000         120K     ro             GLB NX pte
> 0xffffffff82a1e000-0xffffffff82c00000        1928K                           pte
> 0xffffffff82c00000-0xffffffff82e00000           2M     RW         PSE GLB NX pmd
> 0xffffffff82e00000-0xffffffff82e53000         332K     RW             GLB NX pte
> 0xffffffff82e53000-0xffffffff83000000        1716K                           pte
> 0xffffffff83000000-0xffffffff83200000           2M                           pmd
> 0xffffffff83200000-0xffffffff83214000          80K                           pte
> 0xffffffff83214000-0xffffffff83400000        1968K     RW             GLB NX pte
> 0xffffffff83400000-0xffffffff84200000          14M     RW         PSE GLB NX pmd
> 0xffffffff84200000-0xffffffff84384000        1552K     RW             GLB NX pte
> 0xffffffff84384000-0xffffffff84400000         496K                           pte
> 0xffffffff84400000-0xffffffffa0000000         444M                           pmd
>
> So remove some range around rodata.
>
> -v4: adapt it to all_end change.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/mm/init_64.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 62 insertions(+)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 2507b98..38aa59c 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1010,6 +1010,61 @@ void __init mem_init(void)
>  }
>
>  #ifdef CONFIG_DEBUG_RODATA
> +static void remove_highmap_2m(unsigned long addr)
> +{
> +       pgd_t *pgd = pgd_offset_k(addr);
> +       pud_t *pud = (pud_t *)pgd_page_vaddr(*pgd) + pud_index(addr);
> +       pmd_t *pmd = (pmd_t *)pud_page_vaddr(*pud) + pmd_index(addr);
> +
> +       set_pmd(pmd, __pmd(0));
> +}
> +
> +static void remove_highmap_2m_partial(unsigned long addr, unsigned long end)
> +{
> +       int i;
> +       pgd_t *pgd = pgd_offset_k(addr);
> +       pud_t *pud = (pud_t *)pgd_page_vaddr(*pgd) + pud_index(addr);
> +       pmd_t *pmd = (pmd_t *)pud_page_vaddr(*pud) + pmd_index(addr);
> +       pte_t *pte = (pte_t *)pmd_page_vaddr(*pmd) + pte_index(addr);
> +
> +       for (i = pte_index(addr); i < pte_index(end - 1) + 1; i++, pte++)
> +               set_pte(pte, __pte(0));
> +}
> +
> +static void cleanup_highmap_late(unsigned long start, unsigned long end)
> +{
> +       unsigned long addr;
> +       unsigned long start_2m_aligned = roundup(start, PMD_SIZE);
> +       unsigned long end_2m_aligned = rounddown(end, PMD_SIZE);
> +
> +       start = PFN_ALIGN(start);
> +       end &= PAGE_MASK;
> +
> +       if (start >= end)
> +               return;
> +
> +       if (start < start_2m_aligned) {
> +               unsigned long tmp = min(start_2m_aligned, end);
> +
> +               set_memory_4k(start, (tmp - start) >> PAGE_SHIFT);
> +               remove_highmap_2m_partial(start, tmp);
> +       }
> +
> +       for (addr = start_2m_aligned; addr < end_2m_aligned; addr += PMD_SIZE)
> +               remove_highmap_2m(addr);
> +
> +       if (start <= end_2m_aligned && end_2m_aligned < end) {
> +               set_memory_4k(end_2m_aligned,
> +                               (end - end_2m_aligned) >> PAGE_SHIFT);
> +               remove_highmap_2m_partial(end_2m_aligned, end);
> +       }
> +
> +       subtract_range(pfn_highmapped, NR_RANGE,
> +                       __pa_symbol(start) >> PAGE_SHIFT,
> +                       __pa_symbol(end) >> PAGE_SHIFT);
> +       nr_pfn_highmapped = clean_sort_range(pfn_highmapped, NR_RANGE);
> +}
> +
>  const int rodata_test_data = 0xC3;
>  EXPORT_SYMBOL_GPL(rodata_test_data);
>
> @@ -1058,6 +1113,7 @@ void mark_rodata_ro(void)
>         unsigned long end = (unsigned long) &__end_rodata_hpage_align;
>         unsigned long text_end = PFN_ALIGN(&__stop___ex_table);
>         unsigned long rodata_end = PFN_ALIGN(&__end_rodata);
> +       unsigned long data_start = PFN_ALIGN(&_sdata);
>         unsigned long all_end;
>
>         printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
> @@ -1081,6 +1137,12 @@ void mark_rodata_ro(void)
>         all_end = roundup(_brk_end, PMD_SIZE);
>         set_memory_nx(rodata_start, (all_end - rodata_start) >> PAGE_SHIFT);
>
> +       cleanup_highmap_late(text_end, rodata_start);
> +       cleanup_highmap_late(rodata_end, data_start);
> +       cleanup_highmap_late(PFN_ALIGN(_brk_end), all_end);
> +       cleanup_highmap_late((unsigned long)(&__init_begin),
> +                               (unsigned long)(&__init_end));
> +
>         rodata_test();
>
>  #ifdef CONFIG_CPA_DEBUG
> --
> 1.8.4.5
>
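
To check my reading of cleanup_highmap_late() above, here is a
userspace sketch of just its range arithmetic: a partial head handled
as 4K pages, whole 2M entries in the middle, and a partial tail. The
struct, the SKETCH_* constants and the helper names are hypothetical,
and nothing here touches page tables:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000ULL   /* 4K, assumed */
#define SKETCH_PMD_SIZE  0x200000ULL /* 2M, assumed */

/* How one [start, end) range would be carved up. */
struct highmap_split {
	uint64_t head_pages; /* 4K pages before the first 2M boundary */
	uint64_t full_pmds;  /* whole 2M entries cleared in one go */
	uint64_t tail_pages; /* 4K pages after the last 2M boundary */
};

static uint64_t round_up_to(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

static uint64_t round_down_to(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

static struct highmap_split split_highmap_range(uint64_t start, uint64_t end)
{
	struct highmap_split s = { 0, 0, 0 };
	uint64_t start_2m = round_up_to(start, SKETCH_PMD_SIZE);
	uint64_t end_2m = round_down_to(end, SKETCH_PMD_SIZE);
	uint64_t addr;

	/* Same page alignment as PFN_ALIGN(start) and end &= PAGE_MASK. */
	start = round_up_to(start, SKETCH_PAGE_SIZE);
	end = round_down_to(end, SKETCH_PAGE_SIZE);
	if (start >= end)
		return s;

	if (start < start_2m) {
		uint64_t tmp = start_2m < end ? start_2m : end;

		s.head_pages = (tmp - start) / SKETCH_PAGE_SIZE;
	}

	for (addr = start_2m; addr < end_2m; addr += SKETCH_PMD_SIZE)
		s.full_pmds++;

	if (start <= end_2m && end_2m < end)
		s.tail_pages = (end - end_2m) / SKETCH_PAGE_SIZE;

	/* A partial head or tail never spans a full 2M entry. */
	assert(s.head_pages < SKETCH_PMD_SIZE / SKETCH_PAGE_SIZE);
	assert(s.tail_pages < SKETCH_PMD_SIZE / SKETCH_PAGE_SIZE);
	return s;
}
```

Feeding it the .init range from the layout above,
[0xffffffff82e53000, 0xffffffff83214000), gives 429 head pages
(1716K), one full 2M entry and 20 tail pages (80K), which lines up
with the three unmapped rows in the "After patch" table.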



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (41 preceding siblings ...)
  2015-07-07 20:20 ` [PATCH 42/42] x86: fix msr print again Yinghai Lu
@ 2015-07-07 23:21 ` Kees Cook
  2015-10-02 20:16   ` Kees Cook
  2015-07-08 10:51 ` Ingo Molnar
  43 siblings, 1 reply; 79+ messages in thread
From: Kees Cook @ 2015-07-07 23:21 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> Those patches are rebased on v4.2-rc1 that I sent before but were rejected
> by Ingo on changelog.
>
> Kees Cook said that he would like to give a try to make improvement on changelog
> to get things moving.

Thanks for working on this! I think it might be best to split this
long series into shorter ones. It seems like there are several areas:

- fixing kASLR
- extended kASLR above 4G
- setup_data cleanup
- various other cleanups

It might make sense to keep them separate for easier review?

-Kees

>
> First part are kaslr related:
> 1. First put compressed kernel ZO near end of the buffer before decompressing
> so we can find the ZO position easily for kaslr buffer searchin
> 2. kill run_size calculation shell scripts.
> 3. create new ident mapping for kasl 64bit, so we can cover
>    above 4G random kernel base
> 4. 7 patches from He that support random random, as I already used his patches
>    to test the ident mapping code.
> 5. some debug patches for boot/kaslr.
>
> Second part are setup_data related:
> Now setup_data is reserved via memblock and e820 and different
> handlers have different ways, and it is confusing.
> 1. SETUP_E820_EXT: is consumed early and will not copy or access again.
>         have memory wasted.
> 2. SETUP_EFI: is accessed via ioremap every time at early stage.
>         have memory wasted.
> 3. SETUP_DTB: is copied locally.
>         have memory wasted.
> 4. SETUP_PCI: is accessed via ioremap for every pci devices, even run-time.
> Also setup_data is exported to debugfs for debug purpose.
> Here will convert to let every handler to decide how to handle it.
> and will not reserve the setup_data generally, so will not
> waste memory and also make memblock/e820 keep page aligned.
> 1. not touch E820 anymore.
> 2. copy SETUP_EFI to __initdata variable and access it without ioremap.
> 3. SETUP_DTB: reserver and copy to local and free.
> 4. SETUP_PCI: reverve localy and convert to list, to avoid keeping ioremap.
> 5. export SETUP_PCI via sysfs.
>
> Third part are some small cleanup patches.
>
> put those patches at
> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-v4.3-next
>
> Thanks
>
> Yinghai
>
>
> Baoquan He (7):
>   x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
>   x86, kaslr: Introduce struct slot_area to manage randomization slot info
>   x86, kaslr: Add two functions which will be used later
>   x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
>   x86, kaslr: Randomize physical and virtual address of kernel separately
>   x86, kaslr: Add support of kernel physical address randomization above 4G
>   x86, kaslr: Remove useless codes
>
> Yinghai Lu (35):
>   x86, kasl: Remove not needed parameter for choose_kernel_location
>   x86, boot: Move compressed kernel to end of buffer before decompressing
>   x86, boot: Fix run_size calculation
>   x86, kaslr: Kill not needed and wrong run_size calculation code.
>   x86, kaslr: rename output_size to output_run_size
>   x86, kaslr: Consolidate mem_avoid array filling
>   x86, boot: Move z_extract_offset calculation to header.S
>   x86, kaslr: Get correct max_addr for relocs pointer
>   x86, boot: Split kernel_ident_mapping_init to another file
>   x86, 64bit: Set ident_mapping for kaslr
>   x86, boot: Add checking for memcpy
>   x86, kaslr: Allow random address could be below loaded address
>   x86, boot: Add printf support for early console in compressed/misc.c
>   x86, boot: Add more debug printout in compressed/misc.c
>   x86, setup: Check early serial console per string instead of one char
>   x86, setup: Use puts() instead of printf() in edd code
>   x86: Setup early console as early as possible in x86_start_kernel()
>   x86, boot: print compression suffix in decompress stage
>   x86: remove not needed clear_page calling
>   x86: restore end_of_ram to E820_RAM
>   x86, boot: Allow 64bit EFI kernel to be loaded above 4G
>   x86: Find correct 64 bit ramdisk address for microcode early update
>   x86: Kill E820_RESERVED_KERN
>   x86, efi: Copy SETUP_EFI data and access directly
>   x86, of: Let add_dtb reserve setup_data locally
>   x86, boot: Add add_pci handler for SETUP_PCI
>   x86: Kill not used setup_data handling code
>   x86, boot, PCI: Convert SETUP_PCI data to list
>   x86, boot, PCI: Copy SETUP_PCI rom to kernel space
>   x86, boot, PCI: Export SETUP_PCI data via sysfs
>   x86: Fix typo in mark_rodata_ro
>   x86, 64bit: add pfn_range_is_highmapped()
>   x86, 64bit: remove highmap for not needed ranges
>   x86, 64bit: Add __pa_high/__va_high
>   x86: fix msr print again
>
>  Documentation/x86/boot.txt                  |  19 ++
>  arch/x86/boot/Makefile                      |  13 +-
>  arch/x86/boot/compressed/Makefile           |  21 +-
>  arch/x86/boot/compressed/aslr.c             | 258 ++++++++++++++++-------
>  arch/x86/boot/compressed/eboot.c            |  15 +-
>  arch/x86/boot/compressed/head_32.S          |  14 +-
>  arch/x86/boot/compressed/head_64.S          |  22 +-
>  arch/x86/boot/compressed/misc.c             | 129 +++++++++---
>  arch/x86/boot/compressed/misc.h             |  41 +++-
>  arch/x86/boot/compressed/misc_pgt.c         |  91 ++++++++
>  arch/x86/boot/compressed/mkpiggy.c          |  28 +--
>  arch/x86/boot/compressed/printf.c           |   5 +
>  arch/x86/boot/compressed/string.c           |  28 ++-
>  arch/x86/boot/compressed/vmlinux.lds.S      |   1 +
>  arch/x86/boot/edd.c                         |   4 +-
>  arch/x86/boot/header.S                      |  34 ++-
>  arch/x86/boot/tty.c                         |  14 +-
>  arch/x86/include/asm/boot.h                 |  19 ++
>  arch/x86/include/asm/efi.h                  |   2 +-
>  arch/x86/include/asm/page.h                 |   5 +
>  arch/x86/include/asm/pci.h                  |   4 +
>  arch/x86/include/asm/pgtable_64.h           |   2 +
>  arch/x86/include/asm/processor.h            |   1 -
>  arch/x86/include/asm/prom.h                 |   9 +-
>  arch/x86/include/asm/setup.h                |   5 +
>  arch/x86/include/uapi/asm/bootparam.h       |   1 +
>  arch/x86/include/uapi/asm/e820.h            |   8 -
>  arch/x86/kernel/asm-offsets.c               |   2 +
>  arch/x86/kernel/cpu/common.c                |  61 +++---
>  arch/x86/kernel/cpu/microcode/amd_early.c   |  10 +-
>  arch/x86/kernel/cpu/microcode/intel_early.c |   8 +-
>  arch/x86/kernel/devicetree.c                |  39 ++--
>  arch/x86/kernel/e820.c                      |  18 +-
>  arch/x86/kernel/head.c                      |  26 +++
>  arch/x86/kernel/head32.c                    |   1 +
>  arch/x86/kernel/head64.c                    |  21 +-
>  arch/x86/kernel/kdebugfs.c                  | 142 -------------
>  arch/x86/kernel/setup.c                     |  79 ++-----
>  arch/x86/kernel/tboot.c                     |   3 +-
>  arch/x86/kernel/vmlinux.lds.S               |   1 +
>  arch/x86/mm/ident_map.c                     |  74 +++++++
>  arch/x86/mm/init_64.c                       | 173 +++++++--------
>  arch/x86/mm/pageattr.c                      |  16 +-
>  arch/x86/pci/common.c                       | 313 ++++++++++++++++++++++++++--
>  arch/x86/platform/efi/efi.c                 |  13 +-
>  arch/x86/platform/efi/efi_64.c              |  10 +-
>  arch/x86/platform/efi/quirks.c              |  23 +-
>  arch/x86/tools/calc_run_size.sh             |  42 ----
>  drivers/tty/serial/8250/8250_early.c        |  17 ++
>  kernel/printk/printk.c                      |  11 +-
>  50 files changed, 1235 insertions(+), 661 deletions(-)
>  create mode 100644 arch/x86/boot/compressed/misc_pgt.c
>  create mode 100644 arch/x86/boot/compressed/printf.c
>  create mode 100644 arch/x86/mm/ident_map.c
>  delete mode 100644 arch/x86/tools/calc_run_size.sh
>
> --
> 1.8.4.5
>



-- 
Kees Cook
Chrome OS Security


* Re: [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
  2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
                   ` (42 preceding siblings ...)
  2015-07-07 23:21 ` [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Kees Cook
@ 2015-07-08 10:51 ` Ingo Molnar
  43 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2015-07-08 10:51 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Kees Cook, H. Peter Anvin, Baoquan He, linux-kernel,
	Thomas Gleixner


* Yinghai Lu <yinghai@kernel.org> wrote:

> Those patches are rebased on v4.2-rc1 that I sent before but were rejected by 
> Ingo on changelog.

What does 'on changelog' mean?

So I rejected them because the patches had poor organization and poor 
explanations. I just re-checked this series and the organization is still poor, 
and it has similar problems which I pointed out 3 months ago - which need to be 
fixed. (Due to that I couldn't even begin to evaluate any deeper merits (or 
problems) of the changes.)

> Kees Cook said that he would like to give a try to make improvement on changelog 
> to get things moving.

So if that means that Kees is willing to write proper changelogs, then in the mail 
below I'm pointing out the most glaring problem with your patches, but there are 
other details missing as well - as usual I have to (sadly) say.

But if Kees sends truly new changelogs that meet the technical requirements below 
I'll have a look - but the simple Acked-by's from Kees in this thread won't 
suffice to fix the deficiencies.


Thanks,

	Ingo

=========================================>

In your changelogs you are typically talking only about the
low level code and about low level symptoms, while in contrast to
that the primary question with _any_ change to the kernel is always:

   Why are we changing the kernel, what bad high level behavior can a
   user observe if the bug or problem is not fixed?

Your descriptions totally ignore the high level effects of the bug on 
the system and on users of the machine, and you fail to describe them 
properly. You totally concentrate on the low level: your descriptions 
are missing the forest for the trees.

That makes 90% of your patch descriptions useless.

In fact, 90% of your patches had that deficiency and had it for the 
past 4 years, non-stop, and maintainers were complaining to you about 
that, non-stop as well. Do you think maintainers enjoy complaining 
about deficiencies? Do you wonder why maintainers were forced to 
simply stop looking at any complex series of yours after you refused 
to change?

> will refresh the patchset.

So let me try this again, one very last time.

That sentence demonstrates your problem: it's not a 'refresh' your 
patches need, but a 'hard reboot', a totally new viewpoint that 
concentrates on what matters: that zooms out of the small details of 
the patch!

For any serious change to the Linux kernel, analyzing small details is 
fine and required as well, AFTER the high level has been discussed 
properly:

  - What happened, what high level concern motivates you to change the 
    Linux kernel?

       And no, starting a changelog with:

          commit e6023367d779 ("x86, kaslr: Prevent .bss from 
          overlaping initrd") introduced one run_size for kaslr.

       is not 'high level' in any way, it talks about code in the 
       first sentence! Talking about code, not talking about high 
       level concerns is a BUG().

  - What was the previous (often bad) high level behavior of the
    kernel?

       And no, 'KASLR will not find a proper area' is NOT a high level
       description, it's a very low level description! Not discussing 
       high level behavior of the kernel in a changelog is a BUG().

  - What new high level behavior do we want to happen instead?

       And no, saying that 'KASLR should be passed init_size instead
       of run_size' is not a description of desired new high level
       behavior! Not discussing the desired high level behavior of the 
       kernel in a changelog is a BUG().

  - What was the high level design of the old code, was that design
    fine, should it be changed, and if yes, in what way?

       Note that we describe the high level design even if we don't
       change it, partly to give context to the reader, partly to 
       prove that you know what you are doing, to build trust in your 
       patch! Not discussing the old (and new) design of that area of 
       the kernel is a BUG().

and only then do we:

  - Describe the old implementation, and describe how the new
    implementation works in all that context.

       Here is where 99.9% of your existing changelogs fit in.
       Put differently: your changelogs are missing the first 3 
       essential components of a changelog.

       And note, mentioning 'run_size' in a low level description is 
       fine. Mentioning it in a high level description is a BUG(): 
       that is why Boris was trying to change your changelogs into 
       spoken English, to recover some of that missing high level 
       context. Maintainers like Boris should not be forced to do 
       that, you are supposed to offer this with any significant 
       patch, as a passport to prove that your changes to the code are 
       worth looking at.

And yes, we do it in that order. Take a look at a recent single line 
change Linus made in 53da3bc2ba9e48, attached to this mail.

Check how the changelog is structured: it discusses high level 
concepts first. It's a _ONE LINER FIX_ from Linus, yet it's spread 
over 8 paragraphs and 50 lines, because the change justified that kind 
of description.

And observe the moment when Linus, in his 8-paragraph, 50-line 
description, starts talking about low level implementation details, 
when he mentions lines of code, function names, such as 
do_numa_page(), 'pte_write()' or 'pte_dirty()'.

He doesn't!

It's not needed for a one-liner most of the time: but the high level 
concepts around that code are very much necessary to convey.

Now contrast that with your changelogs: your changelogs are chock-full 
of non-English phrases resembling code more than a fluid description, 
they are chock-full of low level details, mentions of function 
names, variables and fields with no high level context conveyed 
whatsoever.

Let me try to put it to you in a way that might come across: when I 
run maintainer code over your changelogs it runs into an instant BUG() 
most of the time, forcing me to drop your patches.

High level discussion of old behavior, new behavior, old design and 
new design isn't just some detail to be slapped on a change after the 
fact: it is a serious and required technological aspect of the Linux 
kernel, and 90% of your patches are buggy in that respect.

In fact, I noticed that both your descriptions and your followups to 
them are totally lacking the high level viewpoint!

Either you:

   - refuse to think in high level concepts when you are writing 
     patches, in which case we need to keep your patches away from
     the Linux kernel,

   - or you are unwilling to document such high level thinking 
     processes, in which case we need to keep your patches away from 
     the Linux kernel as well.

If your approach to writing kernel patches does not change then I will 
be forced to take action: currently you are this -->.<-- close to 
being filtered out from my default inbox for your total refusal to 
change the technology of how you are writing patches.

Thanks,

	Ingo

[ Sample 'good' changelog from Linus: ]

======================>
From 53da3bc2ba9e4899f32707b5cd7d18421b943687 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 12 Mar 2015 08:45:46 -0700
Subject: [PATCH] mm: fix up numa read-only thread grouping logic

Dave Chinner reported that commit 4d9424669946 ("mm: convert
p[te|md]_mknonnuma and remaining page table manipulations") slowed down
his xfsrepair test enormously.  In particular, it was using more system
time due to extra TLB flushing.

The ultimate reason turns out to be how the change to use the regular
page table accessor functions broke the NUMA grouping logic.  The old
special mknuma/mknonnuma code accessed the page table present bit and
the magic NUMA bit directly, while the new code just changes the page
protections using PROT_NONE and the regular vma protections.

That sounds equivalent, and from a fault standpoint it really is, but a
subtle side effect is that the *other* protection bits of the page table
entries also change.  And the code to decide how to group the NUMA
entries together used the writable bit to decide whether a particular
page was likely to be shared read-only or not.

And with the change to make the NUMA handling use the regular permission
setting functions, that writable bit was basically always cleared for
private mappings due to COW.  So even if the page actually ends up being
written to in the end, the NUMA balancing would act as if it was always
shared RO.

This code is a heuristic anyway, so the fix - at least for now - is to
instead check whether the page is dirty rather than writable.  The bit
doesn't change with protection changes.

NOTE! This also adds a FIXME comment to revisit this issue,

Not only should we probably re-visit the whole "is this a shared
read-only page" heuristic (we might want to take the vma permissions
into account and base this more on those than the per-page ones, and
also look at whether the particular access that triggers it is a write
or not), but the whole COW issue shows that we should think about the
NUMA fault handling some more.

For example, maybe we should do the early-COW thing that a regular fault
does.  Or maybe we should accept that while using the same bits as
PROTNONE was a good thing (and got rid of the special NUMA bit), we
might still want to just preserve the other protection bits across NUMA
faulting.

Those are bigger questions, left for later.  This just fixes up the
heuristic so that it at least approximates working again.  More analysis
and work needed.

Reported-by: Dave Chinner <david@fromorbit.com>
Tested-by: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>,
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 mm/memory.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 8068893697bb..411144f977b1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3072,8 +3072,13 @@ static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * Avoid grouping on DSO/COW pages in specific and RO pages
 	 * in general, RO pages shouldn't hurt as much anyway since
 	 * they can be in shared cache state.
+	 *
+	 * FIXME! This checks "pmd_dirty()" as an approximation of
+	 * "is this a read-only page", since checking "pmd_write()"
+	 * is even more broken. We haven't actually turned this into
+	 * a writable page, so pmd_write() will always be false.
 	 */
-	if (!pte_write(pte))
+	if (!pte_dirty(pte))
 		flags |= TNF_NO_GROUP;
 
 	/*
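
[ Illustrative sketch, not part of the original mail. The changelog above
  says the writable bit is lost across protection changes while the dirty
  bit is not. A minimal userspace model of that, assuming made-up flag
  values and helper names (MODEL_*, model_*, old_fault_flags(),
  new_fault_flags() are all hypothetical and are not kernel APIs): ]

```c
/* Hypothetical flag bits for a modelled page table entry. */
#define MODEL_PTE_WRITE    0x1u	/* entry currently allows writes */
#define MODEL_PTE_DIRTY    0x2u	/* page has been written to at some point */

/* Hypothetical stand-in for the "do not group this fault" flag. */
#define MODEL_TNF_NO_GROUP 0x1u

/* Old heuristic: any entry that is not writable counts as shared read-only. */
static unsigned int old_fault_flags(unsigned int pte)
{
	return (pte & MODEL_PTE_WRITE) ? 0 : MODEL_TNF_NO_GROUP;
}

/* New heuristic: only an entry that was never dirtied counts as read-only. */
static unsigned int new_fault_flags(unsigned int pte)
{
	return (pte & MODEL_PTE_DIRTY) ? 0 : MODEL_TNF_NO_GROUP;
}

/*
 * Model of a protection change on a private mapping: the write bit is
 * cleared, the dirty bit is left alone.
 */
static unsigned int model_protnone(unsigned int pte)
{
	return pte & ~MODEL_PTE_WRITE;
}
```

[ In this model a page that was written to and then had its protections
  changed is skipped for grouping by the old check but not by the new
  one, which is the behavior the changelog describes. ]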

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH 27/42] x86: restore end_of_ram to E820_RAM
  2015-07-07 20:20 ` [PATCH 27/42] x86: restore end_of_ram to E820_RAM Yinghai Lu
@ 2015-07-08 17:44   ` Matt Fleming
  2015-07-09  1:41     ` Dan Williams
  2015-07-09  7:45     ` Christoph Hellwig
  0 siblings, 2 replies; 79+ messages in thread
From: Matt Fleming @ 2015-07-08 17:44 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Kees Cook, H. Peter Anvin, Baoquan He, linux-kernel,
	Christoph Hellwig, Dan Williams

On Tue, 07 Jul, at 01:20:13PM, Yinghai Lu wrote:
> We don't need to create mapping for E820_PRAM.
> 
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/kernel/e820.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index a102564..46ec08d 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -753,7 +753,7 @@ u64 __init early_reserve_e820(u64 size, u64 align)
>  /*
>   * Find the highest page frame number we have available
>   */
> -static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
> +static unsigned long __init e820_end_pfn(unsigned long limit_pfn, unsigned type)
>  {
>  	int i;
>  	unsigned long last_pfn = 0;
> @@ -764,11 +764,7 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
>  		unsigned long start_pfn;
>  		unsigned long end_pfn;
>  
> -		/*
> -		 * Persistent memory is accounted as ram for purposes of
> -		 * establishing max_pfn and mem_map.
> -		 */
> -		if (ei->type != E820_RAM && ei->type != E820_PRAM)
> +		if (ei->type != type)
>  			continue;
>  
>  		start_pfn = ei->addr >> PAGE_SHIFT;
> @@ -793,12 +789,12 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
>  }
>  unsigned long __init e820_end_of_ram_pfn(void)
>  {
> -	return e820_end_pfn(MAX_ARCH_PFN);
> +	return e820_end_pfn(MAX_ARCH_PFN, E820_RAM);
>  }
>  
>  unsigned long __init e820_end_of_low_ram_pfn(void)
>  {
> -	return e820_end_pfn(1UL << (32-PAGE_SHIFT));
> +	return e820_end_pfn(1UL<<(32 - PAGE_SHIFT), E820_RAM);
>  }
>  
>  static void early_panic(char *msg)

Could you explain why you no longer want to allow persistent memory to be
used in figuring out max_pfn? This partially reverts commit ec776ef6bbe1
("x86/mm: Add support for the non-standard protected e820 type").

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 28/42] x86, boot: Allow 64bit EFI kernel to be loaded above 4G
  2015-07-07 23:12   ` Kees Cook
@ 2015-07-08 18:00     ` Matt Fleming
  0 siblings, 0 replies; 79+ messages in thread
From: Matt Fleming @ 2015-07-08 18:00 UTC (permalink / raw)
  To: Kees Cook; +Cc: Yinghai Lu, H. Peter Anvin, Baoquan He, LKML

On Tue, 07 Jul, at 04:12:26PM, Kees Cook wrote:
> > @@ -301,7 +301,7 @@ _start:
> >         # Part 2 of the header, from the old setup.S
> >
> >                 .ascii  "HdrS"          # header signature
> > -               .word   0x020d          # header version number (>= 0x0105)
> > +               .word   0x020e          # header version number (>= 0x0105)
> >                                         # or else old loadlin-1.5 will fail)
> >                 .globl realmode_swtch
> >  realmode_swtch:        .word   0, 0            # default_switch, SETUPSEG
> > @@ -478,6 +478,7 @@ pref_address:               .quad LOAD_PHYSICAL_ADDR        # preferred load addr
> >  #endif
> >  init_size:             .long INIT_SIZE         # kernel initialization size
> >  handover_offset:       .long 0                 # Filled in by build.c
> > +ext_code32_start:      .long 0                 # werid one!
> 
> Comment is a typo?
 
Yeah, this was one of the things I pointed out the last time this patch
was sent and it's disheartening to see it wasn't fixed (as trivial as it
may be),

  https://lkml.kernel.org/r/20150224215501.GB9758@codeblueprint.co.uk

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 27/42] x86: restore end_of_ram to E820_RAM
  2015-07-08 17:44   ` Matt Fleming
@ 2015-07-09  1:41     ` Dan Williams
  2015-07-09  7:45     ` Christoph Hellwig
  1 sibling, 0 replies; 79+ messages in thread
From: Dan Williams @ 2015-07-09  1:41 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Yinghai Lu, Kees Cook, H. Peter Anvin, Baoquan He,
	linux-kernel@vger.kernel.org, Christoph Hellwig

On Wed, Jul 8, 2015 at 10:44 AM, Matt Fleming <matt@codeblueprint.co.uk> wrote:
> On Tue, 07 Jul, at 01:20:13PM, Yinghai Lu wrote:
>> We don't need to create mapping for E820_PRAM.
>>
>> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>> ---
>>  arch/x86/kernel/e820.c | 12 ++++--------
>>  1 file changed, 4 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
>> index a102564..46ec08d 100644
>> --- a/arch/x86/kernel/e820.c
>> +++ b/arch/x86/kernel/e820.c
>> @@ -753,7 +753,7 @@ u64 __init early_reserve_e820(u64 size, u64 align)
>>  /*
>>   * Find the highest page frame number we have available
>>   */
>> -static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
>> +static unsigned long __init e820_end_pfn(unsigned long limit_pfn, unsigned type)
>>  {
>>       int i;
>>       unsigned long last_pfn = 0;
>> @@ -764,11 +764,7 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
>>               unsigned long start_pfn;
>>               unsigned long end_pfn;
>>
>> -             /*
>> -              * Persistent memory is accounted as ram for purposes of
>> -              * establishing max_pfn and mem_map.
>> -              */
>> -             if (ei->type != E820_RAM && ei->type != E820_PRAM)
>> +             if (ei->type != type)
>>                       continue;
>>
>>               start_pfn = ei->addr >> PAGE_SHIFT;
>> @@ -793,12 +789,12 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
>>  }
>>  unsigned long __init e820_end_of_ram_pfn(void)
>>  {
>> -     return e820_end_pfn(MAX_ARCH_PFN);
>> +     return e820_end_pfn(MAX_ARCH_PFN, E820_RAM);
>>  }
>>
>>  unsigned long __init e820_end_of_low_ram_pfn(void)
>>  {
>> -     return e820_end_pfn(1UL << (32-PAGE_SHIFT));
>> +     return e820_end_pfn(1UL<<(32 - PAGE_SHIFT), E820_RAM);
>>  }
>>
>>  static void early_panic(char *msg)
>
> Could you explain why you no longer want to allow persistent memory to be
> used in figuring out max_pfn? This partially reverts commit ec776ef6bbe1
> ("x86/mm: Add support for the non-standard protected e820 type").
>

pmem is accessed through the driver or through ->direct_access().
Existing NVDIMM devices are already pushing hundreds of gigabytes
which is too large to provide "struct page" coverage by default.
We're looking at other means to provide "struct page" for pmem.
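
[ Illustrative sketch, not part of the original mail. The effect of the
  added 'type' argument can be modelled in userspace roughly as below.
  This is a simplified model under stated assumptions, not the kernel
  function: the struct, the MODEL_* constants and model_end_pfn() are
  hypothetical names, and the final clamp against the architecture
  maximum is left out. ]

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12

/* Model values only; chosen to tell the two range types apart. */
enum model_e820_type {
	MODEL_E820_RAM  = 1,
	MODEL_E820_PRAM = 12,
};

struct model_e820_entry {
	uint64_t addr;			/* start of the range in bytes */
	uint64_t size;			/* length of the range in bytes */
	enum model_e820_type type;
};

/* Highest page frame number below limit_pfn among entries of one type. */
static uint64_t model_end_pfn(const struct model_e820_entry *map, size_t n,
			      uint64_t limit_pfn, enum model_e820_type type)
{
	uint64_t last_pfn = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t start_pfn, end_pfn;

		/* With the patch, only the requested type is considered. */
		if (map[i].type != type)
			continue;

		start_pfn = map[i].addr >> MODEL_PAGE_SHIFT;
		end_pfn = (map[i].addr + map[i].size) >> MODEL_PAGE_SHIFT;

		if (start_pfn >= limit_pfn)
			continue;
		if (end_pfn > limit_pfn) {
			last_pfn = limit_pfn;
			break;
		}
		if (end_pfn > last_pfn)
			last_pfn = end_pfn;
	}
	return last_pfn;
}
```

[ With 4G of RAM followed by 4G of persistent memory, asking for
  MODEL_E820_RAM stops at the end of RAM, so the persistent range no
  longer raises the result and gets no "struct page" coverage by default. ]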

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 27/42] x86: restore end_of_ram to E820_RAM
  2015-07-08 17:44   ` Matt Fleming
  2015-07-09  1:41     ` Dan Williams
@ 2015-07-09  7:45     ` Christoph Hellwig
  2015-07-09 11:17       ` Matt Fleming
  1 sibling, 1 reply; 79+ messages in thread
From: Christoph Hellwig @ 2015-07-09  7:45 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Yinghai Lu, Kees Cook, H. Peter Anvin, Baoquan He, linux-kernel,
	Christoph Hellwig, Dan Williams

Btw, where is this patch coming from?  It looks reasonable but I didn't
see it on any list.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 27/42] x86: restore end_of_ram to E820_RAM
  2015-07-09  7:45     ` Christoph Hellwig
@ 2015-07-09 11:17       ` Matt Fleming
  0 siblings, 0 replies; 79+ messages in thread
From: Matt Fleming @ 2015-07-09 11:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Yinghai Lu, Kees Cook, H. Peter Anvin, Baoquan He, linux-kernel,
	Dan Williams

On Thu, 09 Jul, at 09:45:41AM, Christoph Hellwig wrote:
> Btw, where is this patch coming from?  It looks reasonable but I didn't
> see it on any list.

It's on LKML, original patch is here,

  https://lkml.kernel.org/r/1436300428-21163-28-git-send-email-yinghai@kernel.org

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 33/42] x86, boot: Add add_pci handler for SETUP_PCI
  2015-07-07 20:20 ` [PATCH 33/42] x86, boot: Add add_pci handler for SETUP_PCI Yinghai Lu
@ 2015-07-14 22:30   ` Bjorn Helgaas
  0 siblings, 0 replies; 79+ messages in thread
From: Bjorn Helgaas @ 2015-07-14 22:30 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Kees Cook, H. Peter Anvin, Baoquan He, linux-kernel, Matt Fleming,
	linux-pci

On Tue, Jul 07, 2015 at 01:20:19PM -0700, Yinghai Lu wrote:
> Let it reserve setup_data, and keep it's own list.

s/it's/its/

> Also clear the hdr.setup_data, as all handler now handle or
> reserve setup_data locally already.
> 
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Matt Fleming <matt.fleming@intel.com>
> Cc: linux-pci@vger.kernel.org
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/include/asm/pci.h |  2 ++
>  arch/x86/kernel/setup.c    |  8 ++++++++
>  arch/x86/pci/common.c      | 42 ++++++++++++++++++++++++++++--------------
>  3 files changed, 38 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
> index 4625943..7d2468c 100644
> --- a/arch/x86/include/asm/pci.h
> +++ b/arch/x86/include/asm/pci.h
> @@ -80,8 +80,10 @@ extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
>  
>  #ifdef CONFIG_PCI
>  extern void early_quirks(void);
> +void add_pci(u64 pa_data);
>  #else
>  static inline void early_quirks(void) { }
> +static inline void add_pci(u64 pa_data) { }
>  #endif
>  
>  extern void pci_iommu_alloc(void);
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index a3b65f1..de0f830 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -440,6 +440,8 @@ static void __init parse_setup_data(void)
>  		pa_next = data->next;
>  		early_memunmap(data, sizeof(*data));
>  
> +		printk(KERN_DEBUG "setup_data type: %d @ %#010llx\n",
> +				data_type, pa_data);
>  		switch (data_type) {
>  		case SETUP_E820_EXT:
>  			parse_e820_ext(pa_data, data_len);
> @@ -447,14 +449,20 @@ static void __init parse_setup_data(void)
>  		case SETUP_DTB:
>  			add_dtb(pa_data);
>  			break;
> +		case SETUP_PCI:
> +			add_pci(pa_data);
> +			break;
>  		case SETUP_EFI:
>  			parse_efi_setup(pa_data, data_len);
>  			break;
>  		default:
> +			pr_warn("Unknown setup_data type: %d @ %#010llx ignored!\n",
> +				data_type, pa_data);
>  			break;
>  		}
>  		pa_data = pa_next;
>  	}
> +	boot_params.hdr.setup_data = 0; /* all done */
>  }
>  
>  static void __init memblock_x86_reserve_range_setup_data(void)
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 8fd6f44..16ace12 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -9,6 +9,7 @@
>  #include <linux/pci-acpi.h>
>  #include <linux/ioport.h>
>  #include <linux/init.h>
> +#include <linux/memblock.h>
>  #include <linux/dmi.h>
>  #include <linux/slab.h>
>  
> @@ -641,31 +642,44 @@ unsigned int pcibios_assign_all_busses(void)
>  	return (pci_probe & PCI_ASSIGN_ALL_BUSSES) ? 1 : 0;
>  }
>  
> +static u64 pci_setup_data;
> +void __init add_pci(u64 pa_data)
> +{
> +	struct setup_data *data;
> +
> +	data = early_memremap(pa_data, sizeof(*data));
> +	memblock_reserve(pa_data, sizeof(*data) + data->len);
> +	data->next = pci_setup_data;
> +	pci_setup_data = pa_data;
> +	early_memunmap(data, sizeof(*data));
> +}
> +
>  int pcibios_add_device(struct pci_dev *dev)
>  {
>  	struct setup_data *data;
>  	struct pci_setup_rom *rom;
>  	u64 pa_data;
>  
> -	pa_data = boot_params.hdr.setup_data;
> +	pa_data = pci_setup_data;
>  	while (pa_data) {
>  		data = ioremap(pa_data, sizeof(*rom));
>  		if (!data)
>  			return -ENOMEM;
>  
> -		if (data->type == SETUP_PCI) {
> -			rom = (struct pci_setup_rom *)data;
> -
> -			if ((pci_domain_nr(dev->bus) == rom->segment) &&
> -			    (dev->bus->number == rom->bus) &&
> -			    (PCI_SLOT(dev->devfn) == rom->device) &&
> -			    (PCI_FUNC(dev->devfn) == rom->function) &&
> -			    (dev->vendor == rom->vendor) &&
> -			    (dev->device == rom->devid)) {
> -				dev->rom = pa_data +
> -				      offsetof(struct pci_setup_rom, romdata);
> -				dev->romlen = rom->pcilen;
> -			}
> +		rom = (struct pci_setup_rom *)data;
> +
> +		if ((pci_domain_nr(dev->bus) == rom->segment) &&
> +		    (dev->bus->number == rom->bus) &&
> +		    (PCI_SLOT(dev->devfn) == rom->device) &&
> +		    (PCI_FUNC(dev->devfn) == rom->function) &&
> +		    (dev->vendor == rom->vendor) &&
> +		    (dev->device == rom->devid)) {
> +			dev->rom = pa_data +
> +			      offsetof(struct pci_setup_rom, romdata);
> +			dev->romlen = rom->pcilen;
> +			dev_printk(KERN_DEBUG, &dev->dev, "set rom to [%#010lx, %#010lx] via SETUP_PCI\n",
> +				   (unsigned long)dev->rom,
> +				   (unsigned long)(dev->rom + dev->romlen - 1));

"set ROM to [mem %#010lx-%#010lx] via SETUP_PCI" so it matches the way we
print other MMIO ranges.
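
[ Illustrative sketch, not part of the original mail. The suggested
  format string can be tried in userspace with snprintf(); the helper
  name format_rom_range() is hypothetical and stands in for the
  dev_printk() call in the patch. ]

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a ROM range as "[mem start-end]" with an inclusive end address.
 * Returns what snprintf() returns.
 */
static int format_rom_range(char *buf, size_t len,
			    unsigned long start, unsigned long romlen)
{
	/* A zero length would make the inclusive end wrap below start. */
	assert(romlen > 0);

	return snprintf(buf, len,
			"set ROM to [mem %#010lx-%#010lx] via SETUP_PCI",
			start, start + romlen - 1);
}
```

[ For a start of 0xc0000 and a length of 0x10000 this yields
  "set ROM to [mem 0x000c0000-0x000cffff] via SETUP_PCI". ]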

>  		}
>  		pa_data = data->next;
>  		iounmap(data);
> -- 
> 1.8.4.5
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 35/42] x86, boot, PCI: Convert SETUP_PCI data to list
  2015-07-07 20:20 ` [PATCH 35/42] x86, boot, PCI: Convert SETUP_PCI data to list Yinghai Lu
@ 2015-07-14 22:35   ` Bjorn Helgaas
  2015-07-15  1:57     ` Yinghai Lu
  0 siblings, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2015-07-14 22:35 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Kees Cook, H. Peter Anvin, Baoquan He, linux-kernel, linux-pci

On Tue, Jul 07, 2015 at 01:20:21PM -0700, Yinghai Lu wrote:
> So we could avoid ioremap every time later.

The changelog (not just the subject) should say what you're doing.

> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 16ace12..32d4f21 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c

> +struct firmware_setup_pci_entry {
> +	struct list_head list;
> +	uint16_t vendor;
> +	uint16_t devid;
> +	uint64_t pcilen;

Is there a reason to use uint16_t and uint64_t instead of u16 and u64?

> +	unsigned long segment;
> +	unsigned long bus;
> +	unsigned long device;
> +	unsigned long function;
> +	phys_addr_t romdata;
> +};
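
[ Illustrative sketch, not part of the original mail. The idea of caching
  the SETUP_PCI records in a list, so that a later per-device lookup
  needs no ioremap, can be modelled in userspace as below. The field
  names follow the quoted struct; the singly linked list, struct
  model_pci_id, model_add_entry() and model_find_rom() are assumptions
  made for the example and are not taken from the patch. ]

```c
#include <stddef.h>
#include <stdint.h>

/* One cached ROM record, filled in once at boot in this model. */
struct model_rom_entry {
	struct model_rom_entry *next;
	uint16_t vendor;
	uint16_t devid;
	uint64_t pcilen;
	unsigned long segment;
	unsigned long bus;
	unsigned long device;
	unsigned long function;
	uint64_t romdata;	/* physical address of the ROM image */
};

/* Identity of the device being added (hypothetical helper type). */
struct model_pci_id {
	unsigned long segment, bus, device, function;
	uint16_t vendor, devid;
};

/* Prepend a record to the cache. */
static void model_add_entry(struct model_rom_entry **head,
			    struct model_rom_entry *e)
{
	e->next = *head;
	*head = e;
}

/* Walk the cache in ordinary memory; no mapping calls are needed. */
static const struct model_rom_entry *
model_find_rom(const struct model_rom_entry *head,
	       const struct model_pci_id *id)
{
	for (; head; head = head->next) {
		if (head->segment == id->segment &&
		    head->bus == id->bus &&
		    head->device == id->device &&
		    head->function == id->function &&
		    head->vendor == id->vendor &&
		    head->devid == id->devid)
			return head;
	}
	return NULL;
}
```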

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 35/42] x86, boot, PCI: Convert SETUP_PCI data to list
  2015-07-14 22:35   ` Bjorn Helgaas
@ 2015-07-15  1:57     ` Yinghai Lu
  0 siblings, 0 replies; 79+ messages in thread
From: Yinghai Lu @ 2015-07-15  1:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Kees Cook, H. Peter Anvin, Baoquan He, Linux Kernel Mailing List,
	linux-pci@vger.kernel.org

On Tue, Jul 14, 2015 at 3:35 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>> index 16ace12..32d4f21 100644
>> --- a/arch/x86/pci/common.c
>> +++ b/arch/x86/pci/common.c
>
>> +struct firmware_setup_pci_entry {
>> +     struct list_head list;
>> +     uint16_t vendor;
>> +     uint16_t devid;
>> +     uint64_t pcilen;
>
> Is there a reason to use uint16_t and uint64_t instead of u16 and u64?

They are kept the same as in arch/x86/include/asm/pci.h::pci_setup_rom,

which we have from:

commit dd5fc854de5fd37adfcef8a366cd21a55aa01d3d
Author: Matthew Garrett <mjg@redhat.com>
Date:   Wed Dec 5 14:33:26 2012 -0700

    EFI: Stash ROMs if they're not in the PCI BAR

    EFI provides support for providing PCI ROMs via means other than the ROM
    BAR. This support vanishes after we've exited boot services, so add support
    for stashing copies of the ROMs in setup_data if they're not otherwise
    available.

    Signed-off-by: Matthew Garrett <mjg@redhat.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Tested-by: Seth Forshee <seth.forshee@canonical.com>

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 6e41b93..dba7805 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -171,4 +171,16 @@ cpumask_of_pcibus(const struct pci_bus *bus)
 }
 #endif

+struct pci_setup_rom {
+       struct setup_data data;
+       uint16_t vendor;
+       uint16_t devid;
+       uint64_t pcilen;
+       unsigned long segment;
+       unsigned long bus;
+       unsigned long device;
+       unsigned long function;
+       uint8_t romdata[0];
+};
+
 #endif /* _ASM_X86_PCI_H */

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH 31/42] x86, efi: Copy SETUP_EFI data and access directly
@ 2015-07-22 10:58     ` Matt Fleming
  0 siblings, 0 replies; 79+ messages in thread
From: Matt Fleming @ 2015-07-22 10:58 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Kees Cook, H. Peter Anvin, Baoquan He, linux-kernel, Matt Fleming,
	linux-efi, Dave Young

(Pulling in Dave since he wrote the code)

On Tue, 07 Jul, at 01:20:17PM, Yinghai Lu wrote:
> The copy will be in __initdata, and it is small.
> 
> We can use pointer to access the setup_data instead of using early_memmap
> everywhere.
 
Is that the only advantage to copying the data, that we save ourselves
the overhead of calling early_memremap()? That seems fair enough, I just
want to be clear.
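
[ Illustrative sketch, not part of the original mail. The copy-once
  pattern under discussion can be modelled in userspace as below: the
  data is copied into static storage a single time, and later readers
  follow a pointer instead of mapping the source again. All names
  (model_*, cached_*) are hypothetical, only the 'tables' and 'smbios'
  fields are taken from the quoted code, and the length check is an
  addition of this sketch that the quoted patch does not have. ]

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reduced model of the setup payload. */
struct model_efi_setup_data {
	uint64_t tables;
	uint64_t smbios;
};

static struct model_efi_setup_data cached_copy;
static const struct model_efi_setup_data *cached_setup;	/* NULL until parsed */

/* Copy the payload once; 'mapped' stands in for a temporary mapping. */
static void model_parse_efi_setup(const void *mapped, size_t len)
{
	if (!mapped || len < sizeof(cached_copy))
		return;

	memcpy(&cached_copy, mapped, sizeof(cached_copy));
	cached_setup = &cached_copy;
}

/* Later users read the cached copy directly. */
static uint64_t model_smbios(void)
{
	return cached_setup ? cached_setup->smbios : 0;
}
```

[ Once the copy exists, the source buffer can change or be unmapped
  without affecting readers, which also removes the map/unmap error
  paths from every caller. ]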

> Cc: Matt Fleming <matt.fleming@intel.com>
> Cc: linux-efi@vger.kernel.org
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  arch/x86/include/asm/efi.h     |  2 +-
>  arch/x86/platform/efi/efi.c    | 13 ++-----------
>  arch/x86/platform/efi/efi_64.c | 10 +++++++++-
>  arch/x86/platform/efi/quirks.c | 23 ++++++-----------------
>  4 files changed, 18 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
> index 155162e..a3e3aee 100644
> --- a/arch/x86/include/asm/efi.h
> +++ b/arch/x86/include/asm/efi.h
> @@ -116,7 +116,7 @@ struct efi_setup_data {
>  	u64 reserved[8];
>  };
>  
> -extern u64 efi_setup;
> +extern struct efi_setup_data *efi_setup;
>  
>  #ifdef CONFIG_EFI
>  
> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> index cfba30f..33036ce 100644
> --- a/arch/x86/platform/efi/efi.c
> +++ b/arch/x86/platform/efi/efi.c
> @@ -68,7 +68,7 @@ static efi_config_table_type_t arch_tables[] __initdata = {
>  	{NULL_GUID, NULL, NULL},
>  };
>  
> -u64 efi_setup;		/* efi setup_data physical address */
> +struct efi_setup_data *efi_setup __initdata; /* cached efi setup_data pointer */
>  
>  static int add_efi_memmap __initdata;
>  static int __init setup_add_efi_memmap(char *arg)
> @@ -257,20 +257,13 @@ static int __init efi_systab_init(void *phys)
>  {
>  	if (efi_enabled(EFI_64BIT)) {
>  		efi_system_table_64_t *systab64;
> -		struct efi_setup_data *data = NULL;
> +		struct efi_setup_data *data = efi_setup;
>  		u64 tmp = 0;
>  
> -		if (efi_setup) {
> -			data = early_memremap(efi_setup, sizeof(*data));
> -			if (!data)
> -				return -ENOMEM;
> -		}
>  		systab64 = early_memremap((unsigned long)phys,
>  					 sizeof(*systab64));
>  		if (systab64 == NULL) {
>  			pr_err("Couldn't map the system table!\n");
> -			if (data)
> -				early_memunmap(data, sizeof(*data));
>  			return -ENOMEM;
>  		}
>  
> @@ -303,8 +296,6 @@ static int __init efi_systab_init(void *phys)
>  		tmp |= data ? data->tables : systab64->tables;
>  
>  		early_memunmap(systab64, sizeof(*systab64));
> -		if (data)
> -			early_memunmap(data, sizeof(*data));
>  #ifdef CONFIG_X86_32
>  		if (tmp >> 32) {
>  			pr_err("EFI data located above 4GB, disabling EFI.\n");
> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
> index a0ac0f9..a255491 100644
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -295,9 +295,17 @@ void __iomem *__init efi_ioremap(unsigned long phys_addr, unsigned long size,
>  	return (void __iomem *)__va(phys_addr);
>  }
>  
> +static struct efi_setup_data efi_setup_data __initdata;
> +
>  void __init parse_efi_setup(u64 phys_addr, u32 data_len)
>  {
> -	efi_setup = phys_addr + sizeof(struct setup_data);
> +	struct efi_setup_data *data;
> +
> +	data = early_memremap(phys_addr + sizeof(struct setup_data),
> +			      sizeof(*data));
> +	efi_setup_data = *data;
> +	early_memunmap(data, sizeof(*data));
> +	efi_setup = &efi_setup_data;
>  }
>  
>  void __init efi_runtime_mkexec(void)
> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> index 1c7380d..45fec7d 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -203,9 +203,8 @@ void __init efi_free_boot_services(void)
>   */
>  int __init efi_reuse_config(u64 tables, int nr_tables)
>  {
> -	int i, sz, ret = 0;
> +	int i, sz;
>  	void *p, *tablep;
> -	struct efi_setup_data *data;
>  
>  	if (!efi_setup)
>  		return 0;
> @@ -213,22 +212,15 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
>  	if (!efi_enabled(EFI_64BIT))
>  		return 0;
>  
> -	data = early_memremap(efi_setup, sizeof(*data));
> -	if (!data) {
> -		ret = -ENOMEM;
> -		goto out;
> -	}
> -
> -	if (!data->smbios)
> -		goto out_memremap;
> +	if (!efi_setup->smbios)
> +		return 0;
>  
>  	sz = sizeof(efi_config_table_64_t);
>  
>  	p = tablep = early_memremap(tables, nr_tables * sz);
>  	if (!p) {
>  		pr_err("Could not map Configuration table!\n");
> -		ret = -ENOMEM;
> -		goto out_memremap;
> +		return -ENOMEM;
>  	}
>  
>  	for (i = 0; i < efi.systab->nr_tables; i++) {
> @@ -237,15 +229,12 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
>  		guid = ((efi_config_table_64_t *)p)->guid;
>  
>  		if (!efi_guidcmp(guid, SMBIOS_TABLE_GUID))
> -			((efi_config_table_64_t *)p)->table = data->smbios;
> +			((efi_config_table_64_t *)p)->table = efi_setup->smbios;
>  		p += sz;
>  	}
>  	early_memunmap(tablep, nr_tables * sz);
>  
> -out_memremap:
> -	early_memunmap(data, sizeof(*data));
> -out:
> -	return ret;
> +	return 0;
>  }
>  
>  void __init efi_apply_memmap_quirks(void)
> -- 
> 1.8.4.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 31/42] x86, efi: Copy SETUP_EFI data and access directly
@ 2015-07-22 10:58     ` Matt Fleming
  0 siblings, 0 replies; 79+ messages in thread
From: Matt Fleming @ 2015-07-22 10:58 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Kees Cook, H. Peter Anvin, Baoquan He,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Dave Young

(Pulling in Dave since he wrote the code)

On Tue, 07 Jul, at 01:20:17PM, Yinghai Lu wrote:
> The copy will be in __initdata, and it is small.
> 
> We can use pointer to access the setup_data instead of using early_memmap
> everywhere.
 
Is that the only advantage to copying the data, that we save ourselves
the overhead of calling early_memremap()? That seems fair enough, I just
want to be clear.

> Cc: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Cc: linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Signed-off-by: Yinghai Lu <yinghai-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> ---
>  arch/x86/include/asm/efi.h     |  2 +-
>  arch/x86/platform/efi/efi.c    | 13 ++-----------
>  arch/x86/platform/efi/efi_64.c | 10 +++++++++-
>  arch/x86/platform/efi/quirks.c | 23 ++++++-----------------
>  4 files changed, 18 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
> index 155162e..a3e3aee 100644
> --- a/arch/x86/include/asm/efi.h
> +++ b/arch/x86/include/asm/efi.h
> @@ -116,7 +116,7 @@ struct efi_setup_data {
>  	u64 reserved[8];
>  };
>  
> -extern u64 efi_setup;
> +extern struct efi_setup_data *efi_setup;
>  
>  #ifdef CONFIG_EFI
>  
> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> index cfba30f..33036ce 100644
> --- a/arch/x86/platform/efi/efi.c
> +++ b/arch/x86/platform/efi/efi.c
> @@ -68,7 +68,7 @@ static efi_config_table_type_t arch_tables[] __initdata = {
>  	{NULL_GUID, NULL, NULL},
>  };
>  
> -u64 efi_setup;		/* efi setup_data physical address */
> +struct efi_setup_data *efi_setup __initdata; /* cached efi setup_data pointer */
>  
>  static int add_efi_memmap __initdata;
>  static int __init setup_add_efi_memmap(char *arg)
> @@ -257,20 +257,13 @@ static int __init efi_systab_init(void *phys)
>  {
>  	if (efi_enabled(EFI_64BIT)) {
>  		efi_system_table_64_t *systab64;
> -		struct efi_setup_data *data = NULL;
> +		struct efi_setup_data *data = efi_setup;
>  		u64 tmp = 0;
>  
> -		if (efi_setup) {
> -			data = early_memremap(efi_setup, sizeof(*data));
> -			if (!data)
> -				return -ENOMEM;
> -		}
>  		systab64 = early_memremap((unsigned long)phys,
>  					 sizeof(*systab64));
>  		if (systab64 == NULL) {
>  			pr_err("Couldn't map the system table!\n");
> -			if (data)
> -				early_memunmap(data, sizeof(*data));
>  			return -ENOMEM;
>  		}
>  
> @@ -303,8 +296,6 @@ static int __init efi_systab_init(void *phys)
>  		tmp |= data ? data->tables : systab64->tables;
>  
>  		early_memunmap(systab64, sizeof(*systab64));
> -		if (data)
> -			early_memunmap(data, sizeof(*data));
>  #ifdef CONFIG_X86_32
>  		if (tmp >> 32) {
>  			pr_err("EFI data located above 4GB, disabling EFI.\n");
> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
> index a0ac0f9..a255491 100644
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -295,9 +295,17 @@ void __iomem *__init efi_ioremap(unsigned long phys_addr, unsigned long size,
>  	return (void __iomem *)__va(phys_addr);
>  }
>  
> +static struct efi_setup_data efi_setup_data __initdata;
> +
>  void __init parse_efi_setup(u64 phys_addr, u32 data_len)
>  {
> -	efi_setup = phys_addr + sizeof(struct setup_data);
> +	struct efi_setup_data *data;
> +
> +	data = early_memremap(phys_addr + sizeof(struct setup_data),
> +			      sizeof(*data));
> +	efi_setup_data = *data;
> +	early_memunmap(data, sizeof(*data));
> +	efi_setup = &efi_setup_data;
>  }
>  
>  void __init efi_runtime_mkexec(void)
> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> index 1c7380d..45fec7d 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -203,9 +203,8 @@ void __init efi_free_boot_services(void)
>   */
>  int __init efi_reuse_config(u64 tables, int nr_tables)
>  {
> -	int i, sz, ret = 0;
> +	int i, sz;
>  	void *p, *tablep;
> -	struct efi_setup_data *data;
>  
>  	if (!efi_setup)
>  		return 0;
> @@ -213,22 +212,15 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
>  	if (!efi_enabled(EFI_64BIT))
>  		return 0;
>  
> -	data = early_memremap(efi_setup, sizeof(*data));
> -	if (!data) {
> -		ret = -ENOMEM;
> -		goto out;
> -	}
> -
> -	if (!data->smbios)
> -		goto out_memremap;
> +	if (!efi_setup->smbios)
> +		return 0;
>  
>  	sz = sizeof(efi_config_table_64_t);
>  
>  	p = tablep = early_memremap(tables, nr_tables * sz);
>  	if (!p) {
>  		pr_err("Could not map Configuration table!\n");
> -		ret = -ENOMEM;
> -		goto out_memremap;
> +		return -ENOMEM;
>  	}
>  
>  	for (i = 0; i < efi.systab->nr_tables; i++) {
> @@ -237,15 +229,12 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
>  		guid = ((efi_config_table_64_t *)p)->guid;
>  
>  		if (!efi_guidcmp(guid, SMBIOS_TABLE_GUID))
> -			((efi_config_table_64_t *)p)->table = data->smbios;
> +			((efi_config_table_64_t *)p)->table = efi_setup->smbios;
>  		p += sz;
>  	}
>  	early_memunmap(tablep, nr_tables * sz);
>  
> -out_memremap:
> -	early_memunmap(data, sizeof(*data));
> -out:
> -	return ret;
> +	return 0;
>  }
>  
>  void __init efi_apply_memmap_quirks(void)
> -- 
> 1.8.4.5

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 31/42] x86, efi: Copy SETUP_EFI data and access directly
@ 2015-07-24  2:07     ` Dave Young
  0 siblings, 0 replies; 79+ messages in thread
From: Dave Young @ 2015-07-24  2:07 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Kees Cook, H. Peter Anvin, Baoquan He, linux-kernel, Matt Fleming,
	linux-efi

Hi,

On 07/07/15 at 01:20pm, Yinghai Lu wrote:
> The copy will be in __initdata, and it is small.
> 
> We can use a pointer to access the setup_data instead of calling
> early_memremap() everywhere.

Looks good to me except for one issue: the early_memremap() return value is
not checked. See the comment inline.

> 
> Cc: Matt Fleming <matt.fleming@intel.com>
> Cc: linux-efi@vger.kernel.org
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---

[snip]

> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -295,9 +295,17 @@ void __iomem *__init efi_ioremap(unsigned long phys_addr, unsigned long size,
>  	return (void __iomem *)__va(phys_addr);
>  }
>  
> +static struct efi_setup_data efi_setup_data __initdata;
> +
>  void __init parse_efi_setup(u64 phys_addr, u32 data_len)
>  {
> -	efi_setup = phys_addr + sizeof(struct setup_data);
> +	struct efi_setup_data *data;
> +
> +	data = early_memremap(phys_addr + sizeof(struct setup_data),
> +			      sizeof(*data));

The return value should be checked here.

> +	efi_setup_data = *data;
> +	early_memunmap(data, sizeof(*data));
> +	efi_setup = &efi_setup_data;
>  }
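
A minimal sketch of what such a check could look like, written as a userspace
mock so that it compiles on its own. The early_memremap()/early_memunmap()
stubs, the mock_map_fails switch, and the trimmed struct definitions are
assumptions made for illustration and are not the kernel's code. Whether the
real function should log, or do something else on failure, is left to the
patch author.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the fixed header of the kernel's struct setup_data. */
struct setup_data {
	uint64_t next;
	uint32_t type;
	uint32_t len;
};

/* Stand-in for struct efi_setup_data (layout assumed for this sketch). */
struct efi_setup_data {
	uint64_t fw_vendor;
	uint64_t runtime;
	uint64_t tables;
	uint64_t smbios;
	uint64_t reserved[8];
};

/* Hypothetical switch: set to 1 to simulate a failed early mapping. */
static int mock_map_fails;

/* Mock: treats the "physical" address as an ordinary pointer. */
static void *early_memremap(uint64_t phys_addr, size_t size)
{
	(void)size;
	return mock_map_fails ? NULL : (void *)(uintptr_t)phys_addr;
}

/* Mock: nothing to unmap in userspace; unmapping NULL would be a bug. */
static void early_memunmap(void *addr, size_t size)
{
	assert(addr != NULL);
	(void)size;
}

static struct efi_setup_data efi_setup_data;	/* __initdata in the kernel */
static struct efi_setup_data *efi_setup;	/* stays NULL if parsing fails */

static void parse_efi_setup(uint64_t phys_addr, uint32_t data_len)
{
	struct efi_setup_data *data;

	(void)data_len;
	data = early_memremap(phys_addr + sizeof(struct setup_data),
			      sizeof(*data));
	if (!data) {
		/* Leave efi_setup NULL so callers see "no setup data". */
		fprintf(stderr, "Could not map EFI setup_data!\n");
		return;
	}
	efi_setup_data = *data;		/* copy once into the local variable */
	early_memunmap(data, sizeof(*data));
	efi_setup = &efi_setup_data;	/* later users just follow the pointer */
}
```

With this shape, a failed mapping leaves efi_setup as NULL, which the existing
"if (!efi_setup) return 0;" test in efi_reuse_config() already treats as
"nothing to reuse".
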
>  

[snip]

Thanks
Dave

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
  2015-07-07 23:21 ` [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Kees Cook
@ 2015-10-02 20:16   ` Kees Cook
  2016-02-06 11:50     ` Baoquan He
  0 siblings, 1 reply; 79+ messages in thread
From: Kees Cook @ 2015-10-02 20:16 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Baoquan He, LKML

Hi,

Has there been any more work on this series of patches? I asked many
questions in my earlier review, but none of them were answered.

Thanks,

-Kees

On Tue, Jul 7, 2015 at 4:21 PM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> These patches, which I sent before but which Ingo rejected because of
>> their changelogs, are rebased on v4.2-rc1.
>>
>> Kees Cook said that he would like to try improving the changelogs
>> to get things moving.
>
> Thanks for working on this! I think it might be best to split this
> long series into shorter ones. It seems like there are several areas:
>
> - fixing kASLR
> - extended kASLR above 4G
> - setup_data cleanup
> - various other cleanups
>
> It might make sense to keep them separate for easier review?
>
> -Kees
>
>>
>> The first part is kaslr related:
>> 1. First put the compressed kernel (ZO) near the end of the buffer before
>>    decompressing, so we can find the ZO position easily when searching for
>>    a kaslr buffer.
>> 2. Kill the run_size calculation shell script.
>> 3. Create a new ident mapping for 64-bit kaslr, so we can cover a random
>>    kernel base above 4G.
>> 4. 7 patches from He that randomize the physical and virtual addresses
>>    separately; I already used his patches to test the ident mapping code.
>> 5. Some debug patches for boot/kaslr.
>>
>> The second part is setup_data related:
>> Currently setup_data is reserved via memblock and e820, and different
>> handlers treat it in different ways, which is confusing.
>> 1. SETUP_E820_EXT: consumed early and never copied or accessed again;
>>         memory is wasted.
>> 2. SETUP_EFI: accessed via ioremap every time at the early stage;
>>         memory is wasted.
>> 3. SETUP_DTB: copied locally;
>>         memory is wasted.
>> 4. SETUP_PCI: accessed via ioremap for every PCI device, even at run time.
>> setup_data is also exported to debugfs for debugging purposes.
>> This series lets every handler decide how to handle its setup_data
>> and no longer reserves setup_data generally, so memory is not
>> wasted and memblock/e820 stay page aligned.
>> 1. Do not touch E820 anymore.
>> 2. SETUP_EFI: copy to an __initdata variable and access it without ioremap.
>> 3. SETUP_DTB: reserve, copy locally, and free.
>> 4. SETUP_PCI: reserve locally and convert to a list, to avoid keeping ioremap.
>> 5. Export SETUP_PCI via sysfs.
>>
>> The third part is some small cleanup patches.
>>
>> These patches are available at
>> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-v4.3-next
>>
>> Thanks
>>
>> Yinghai
>>
>>
>> Baoquan He (7):
>>   x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
>>   x86, kaslr: Introduce struct slot_area to manage randomization slot info
>>   x86, kaslr: Add two functions which will be used later
>>   x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
>>   x86, kaslr: Randomize physical and virtual address of kernel separately
>>   x86, kaslr: Add support of kernel physical address randomization above 4G
>>   x86, kaslr: Remove useless codes
>>
>> Yinghai Lu (35):
>>   x86, kasl: Remove not needed parameter for choose_kernel_location
>>   x86, boot: Move compressed kernel to end of buffer before decompressing
>>   x86, boot: Fix run_size calculation
>>   x86, kaslr: Kill not needed and wrong run_size calculation code.
>>   x86, kaslr: rename output_size to output_run_size
>>   x86, kaslr: Consolidate mem_avoid array filling
>>   x86, boot: Move z_extract_offset calculation to header.S
>>   x86, kaslr: Get correct max_addr for relocs pointer
>>   x86, boot: Split kernel_ident_mapping_init to another file
>>   x86, 64bit: Set ident_mapping for kaslr
>>   x86, boot: Add checking for memcpy
>>   x86, kaslr: Allow random address could be below loaded address
>>   x86, boot: Add printf support for early console in compressed/misc.c
>>   x86, boot: Add more debug printout in compressed/misc.c
>>   x86, setup: Check early serial console per string instead of one char
>>   x86, setup: Use puts() instead of printf() in edd code
>>   x86: Setup early console as early as possible in x86_start_kernel()
>>   x86, boot: print compression suffix in decompress stage
>>   x86: remove not needed clear_page calling
>>   x86: restore end_of_ram to E820_RAM
>>   x86, boot: Allow 64bit EFI kernel to be loaded above 4G
>>   x86: Find correct 64 bit ramdisk address for microcode early update
>>   x86: Kill E820_RESERVED_KERN
>>   x86, efi: Copy SETUP_EFI data and access directly
>>   x86, of: Let add_dtb reserve setup_data locally
>>   x86, boot: Add add_pci handler for SETUP_PCI
>>   x86: Kill not used setup_data handling code
>>   x86, boot, PCI: Convert SETUP_PCI data to list
>>   x86, boot, PCI: Copy SETUP_PCI rom to kernel space
>>   x86, boot, PCI: Export SETUP_PCI data via sysfs
>>   x86: Fix typo in mark_rodata_ro
>>   x86, 64bit: add pfn_range_is_highmapped()
>>   x86, 64bit: remove highmap for not needed ranges
>>   x86, 64bit: Add __pa_high/__va_high
>>   x86: fix msr print again
>>
>>  Documentation/x86/boot.txt                  |  19 ++
>>  arch/x86/boot/Makefile                      |  13 +-
>>  arch/x86/boot/compressed/Makefile           |  21 +-
>>  arch/x86/boot/compressed/aslr.c             | 258 ++++++++++++++++-------
>>  arch/x86/boot/compressed/eboot.c            |  15 +-
>>  arch/x86/boot/compressed/head_32.S          |  14 +-
>>  arch/x86/boot/compressed/head_64.S          |  22 +-
>>  arch/x86/boot/compressed/misc.c             | 129 +++++++++---
>>  arch/x86/boot/compressed/misc.h             |  41 +++-
>>  arch/x86/boot/compressed/misc_pgt.c         |  91 ++++++++
>>  arch/x86/boot/compressed/mkpiggy.c          |  28 +--
>>  arch/x86/boot/compressed/printf.c           |   5 +
>>  arch/x86/boot/compressed/string.c           |  28 ++-
>>  arch/x86/boot/compressed/vmlinux.lds.S      |   1 +
>>  arch/x86/boot/edd.c                         |   4 +-
>>  arch/x86/boot/header.S                      |  34 ++-
>>  arch/x86/boot/tty.c                         |  14 +-
>>  arch/x86/include/asm/boot.h                 |  19 ++
>>  arch/x86/include/asm/efi.h                  |   2 +-
>>  arch/x86/include/asm/page.h                 |   5 +
>>  arch/x86/include/asm/pci.h                  |   4 +
>>  arch/x86/include/asm/pgtable_64.h           |   2 +
>>  arch/x86/include/asm/processor.h            |   1 -
>>  arch/x86/include/asm/prom.h                 |   9 +-
>>  arch/x86/include/asm/setup.h                |   5 +
>>  arch/x86/include/uapi/asm/bootparam.h       |   1 +
>>  arch/x86/include/uapi/asm/e820.h            |   8 -
>>  arch/x86/kernel/asm-offsets.c               |   2 +
>>  arch/x86/kernel/cpu/common.c                |  61 +++---
>>  arch/x86/kernel/cpu/microcode/amd_early.c   |  10 +-
>>  arch/x86/kernel/cpu/microcode/intel_early.c |   8 +-
>>  arch/x86/kernel/devicetree.c                |  39 ++--
>>  arch/x86/kernel/e820.c                      |  18 +-
>>  arch/x86/kernel/head.c                      |  26 +++
>>  arch/x86/kernel/head32.c                    |   1 +
>>  arch/x86/kernel/head64.c                    |  21 +-
>>  arch/x86/kernel/kdebugfs.c                  | 142 -------------
>>  arch/x86/kernel/setup.c                     |  79 ++-----
>>  arch/x86/kernel/tboot.c                     |   3 +-
>>  arch/x86/kernel/vmlinux.lds.S               |   1 +
>>  arch/x86/mm/ident_map.c                     |  74 +++++++
>>  arch/x86/mm/init_64.c                       | 173 +++++++--------
>>  arch/x86/mm/pageattr.c                      |  16 +-
>>  arch/x86/pci/common.c                       | 313 ++++++++++++++++++++++++++--
>>  arch/x86/platform/efi/efi.c                 |  13 +-
>>  arch/x86/platform/efi/efi_64.c              |  10 +-
>>  arch/x86/platform/efi/quirks.c              |  23 +-
>>  arch/x86/tools/calc_run_size.sh             |  42 ----
>>  drivers/tty/serial/8250/8250_early.c        |  17 ++
>>  kernel/printk/printk.c                      |  11 +-
>>  50 files changed, 1235 insertions(+), 661 deletions(-)
>>  create mode 100644 arch/x86/boot/compressed/misc_pgt.c
>>  create mode 100644 arch/x86/boot/compressed/printf.c
>>  create mode 100644 arch/x86/mm/ident_map.c
>>  delete mode 100644 arch/x86/tools/calc_run_size.sh
>>
>> --
>> 1.8.4.5
>>
>
>
>
> --
> Kees Cook
> Chrome OS Security



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
  2015-10-02 20:16   ` Kees Cook
@ 2016-02-06 11:50     ` Baoquan He
  2016-02-09  4:31         ` [kernel-hardening] " Kees Cook
  0 siblings, 1 reply; 79+ messages in thread
From: Baoquan He @ 2016-02-06 11:50 UTC (permalink / raw)
  To: Kees Cook; +Cc: Yinghai Lu, H. Peter Anvin, LKML, bp, mingo, luto, vgoyal

Hi,

Recently, people running large servers have also become very interested in
kaslr and want it to enhance security. So allowing kaslr to randomize above 4G
makes sense for many kinds of systems. I would like to repost the patches
related to kaslr in this patchset and leave the rest to Yinghai. Alternatively,
I can try to understand and adjust the rest with help from yh and the
reviewers, then post them. But first I will focus on kaslr and try to get it
merged into Linus's tree.

Since this patchset covers too many issues, and people usually prefer reviewing
a post that addresses one main issue per thread, I will start from the thread
below. It mainly includes kaslr above-4G support, bug fixes, and several
cleanup patches.

x86, boot: kaslr cleanup and 64bit kaslr support 
https://lwn.net/Articles/637115/

The following patch list is taken from yh's cover letter for the above patch thread.

************************
Baoquan He (7):
  x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
  x86, kaslr: Introduce struct slot_area to manage randomization slot info
  x86, kaslr: Add two functions which will be used later
  x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
  x86, kaslr: Randomize physical and virtual address of kernel separately
  x86, kaslr: Add support of kernel physical address randomization above 4G
  x86, kaslr: Remove useless codes

Jiri Kosina (1):
  x86, kaslr: Propagate base load address calculation v2
*** This one has been merged. Jiri posted it first; later Boris reverted it and
handled it in a better way in commit (78cac48 x86/mm/KASLR: Propagate KASLR status to kernel proper)

Yinghai Lu (11):
  x86, boot: Make data from decompress_kernel stage live longer
  x86, boot: Simplify run_size calculation
  x86, kaslr: Kill not used run_size related code.
  x86, kaslr: Use output_run_size
  x86, kaslr: Consolidate mem_avoid array filling
  x86, boot: Move z_extract_offset calculation to header.S
  x86, kaslr: Get correct max_addr for relocs pointer
  x86, boot: Split kernel_ident_mapping_init to another file
  x86, 64bit: Set ident_mapping for kaslr
  x86, boot: Add checking for memcpy
  x86, kaslr: Allow random address could be below loaded address

**************************
My plan is to split them into:
1) kaslr above 4G support
  x86, boot: Split kernel_ident_mapping_init to another file
  x86, 64bit: Set ident_mapping for kaslr
  x86, boot: Add checking for memcpy
  x86, boot: Move z_extract_offset calculation to header.S
  x86, boot: Simplify run_size calculation
  x86, kaslr: Kill not used run_size related code.
  x86, kaslr: Use output_run_size
  x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
  x86, kaslr: Introduce struct slot_area to manage randomization slot info
  x86, kaslr: Add two functions which will be used later
  x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
  x86, kaslr: Randomize physical and virtual address of kernel separately
  x86, kaslr: Add support of kernel physical address randomization above 4G
  x86, kaslr: Remove useless codes
2) allow kaslr to choose slots below loaded address
  x86, kaslr: Consolidate mem_avoid array filling
  x86, kaslr: Allow random address could be below loaded address
3) Make data from decompress_kernel stage live longer (bug fix)
  x86, boot: Make data from decompress_kernel stage live longer
4) Get correct max_addr for relocs pointer (improvement)
  x86, kaslr: Get correct max_addr for relocs pointer

The patches in 2) could be added to the 1) post. I took them out because the
mem_avoid issue is very complicated and can be discussed in a separate thread,
so that the 1) post focuses only on kaslr above-4G support.

That's all I plan to do. Suggestions or comments are welcome.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
  2016-02-06 11:50     ` Baoquan He
@ 2016-02-09  4:31         ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2016-02-09  4:31 UTC (permalink / raw)
  To: Baoquan He
  Cc: Yinghai Lu, H. Peter Anvin, LKML, Borislav Petkov, Ingo Molnar,
	Andy Lutomirski, Vivek Goyal, kernel-hardening@lists.openwall.com

On Sat, Feb 6, 2016 at 3:50 AM, Baoquan He <bhe@redhat.com> wrote:
> Hi,
>
> Recently, people running large servers have also become very interested in
> kaslr and want it to enhance security. So allowing kaslr to randomize above 4G
> makes sense for many kinds of systems. I would like to repost the patches
> related to kaslr in this patchset and leave the rest to Yinghai. Alternatively,
> I can try to understand and adjust the rest with help from yh and the
> reviewers, then post them. But first I will focus on kaslr and try to get it
> merged into Linus's tree.
>
> Since this patchset covers too many issues, and people usually prefer reviewing
> a post that addresses one main issue per thread, I will start from the thread
> below. It mainly includes kaslr above-4G support, bug fixes, and several
> cleanup patches.
>
> x86, boot: kaslr cleanup and 64bit kaslr support
> https://lwn.net/Articles/637115/
>
> The following patch list is taken from yh's cover letter for the above patch thread.
>
> ************************
> Baoquan He (7):
>   x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
>   x86, kaslr: Introduce struct slot_area to manage randomization slot info
>   x86, kaslr: Add two functions which will be used later
>   x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
>   x86, kaslr: Randomize physical and virtual address of kernel separately
>   x86, kaslr: Add support of kernel physical address randomization above 4G
>   x86, kaslr: Remove useless codes
>
> Jiri Kosina (1):
>   x86, kaslr: Propagate base load address calculation v2
> *** This one has been merged. Jiri posted it first; later Boris reverted it and
> handled it in a better way in commit (78cac48 x86/mm/KASLR: Propagate KASLR status to kernel proper)
>
> Yinghai Lu (11):
>   x86, boot: Make data from decompress_kernel stage live longer
>   x86, boot: Simplify run_size calculation
>   x86, kaslr: Kill not used run_size related code.
>   x86, kaslr: Use output_run_size
>   x86, kaslr: Consolidate mem_avoid array filling
>   x86, boot: Move z_extract_offset calculation to header.S
>   x86, kaslr: Get correct max_addr for relocs pointer
>   x86, boot: Split kernel_ident_mapping_init to another file
>   x86, 64bit: Set ident_mapping for kaslr
>   x86, boot: Add checking for memcpy
>   x86, kaslr: Allow random address could be below loaded address
>
> **************************
> My plan is to split them into:
> 1) kaslr above 4G support
>   x86, boot: Split kernel_ident_mapping_init to another file
>   x86, 64bit: Set ident_mapping for kaslr
>   x86, boot: Add checking for memcpy
>   x86, boot: Move z_extract_offset calculation to header.S
>   x86, boot: Simplify run_size calculation
>   x86, kaslr: Kill not used run_size related code.
>   x86, kaslr: Use output_run_size
>   x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
>   x86, kaslr: Introduce struct slot_area to manage randomization slot info
>   x86, kaslr: Add two functions which will be used later
>   x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
>   x86, kaslr: Randomize physical and virtual address of kernel separately
>   x86, kaslr: Add support of kernel physical address randomization above 4G
>   x86, kaslr: Remove useless codes
> 2) allow kaslr to choose slots below loaded address
>   x86, kaslr: Consolidate mem_avoid array filling
>   x86, kaslr: Allow random address could be below loaded address
> 3) Make data from decompress_kernel stage live longer (bug fix)
>   x86, boot: Make data from decompress_kernel stage live longer
> 4) Get correct max_addr for relocs pointer (improvement)
>   x86, kaslr: Get correct max_addr for relocs pointer
>
> The patches in 2) could be added to the 1) post. I took them out because the
> mem_avoid issue is very complicated and can be discussed in a separate thread,
> so that the 1) post focuses only on kaslr above-4G support.
>
> That's all I plan to do. Suggestions or comments are welcome.

That sounds great, thanks! Please check the rest of the thread where I
asked a number of questions that remain unanswered. If we can get some
clarification on those points, I think it would help move this along
more quickly.

Thanks for continuing this work!

-Kees

>
> Thanks
> Baoquan
>
> ----- Original Message -----
> From: "Kees Cook" <keescook@chromium.org>
> To: "Yinghai Lu" <yinghai@kernel.org>
> Cc: "H. Peter Anvin" <hpa@zytor.com>, "Baoquan He" <bhe@redhat.com>, "LKML" <linux-kernel@vger.kernel.org>
> Sent: Saturday, October 3, 2015 4:16:40 AM
> Subject: Re: [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
>
> Hi,
>
> Has there been any more work on these series of patches? I asked many
> questions in my earlier review, but nothing was answered.
>
> Thanks,
>
> -Kees
>
> On Tue, Jul 7, 2015 at 4:21 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>> These patches, which I sent before but which Ingo rejected because of the
>>> changelogs, are rebased on v4.2-rc1.
>>>
>>> Kees Cook said that he would like to try improving the changelogs
>>> to get things moving.
>>
>> Thanks for working on this! I think it might be best to split this
>> long series into shorter ones. It seems like there are several areas:
>>
>> - fixing kASLR
>> - extended kASLR above 4G
>> - setup_data cleanup
>> - various other cleanups
>>
>> It might make sense to keep them separate for easier review?
>>
>> -Kees
>>
>>>
>>> The first part is kaslr related:
>>> 1. First put the compressed kernel ZO near the end of the buffer before
>>>    decompressing, so we can find the ZO position easily when searching for
>>>    the kaslr buffer.
>>> 2. Kill the run_size calculation shell scripts.
>>> 3. Create a new ident mapping for 64bit kaslr, so we can cover
>>>    a random kernel base above 4G.
>>> 4. 7 patches from He that support random random, as I already used his patches
>>>    to test the ident mapping code.
>>> 5. Some debug patches for boot/kaslr.
>>>
>>> The second part is setup_data related:
>>> Currently setup_data is reserved via memblock and e820, and different
>>> handlers treat it in different ways, which is confusing.
>>> 1. SETUP_E820_EXT: is consumed early and is not copied or accessed again.
>>>         Memory is wasted.
>>> 2. SETUP_EFI: is accessed via ioremap every time at early stage.
>>>         Memory is wasted.
>>> 3. SETUP_DTB: is copied locally.
>>>         Memory is wasted.
>>> 4. SETUP_PCI: is accessed via ioremap for every pci device, even at run-time.
>>> Also, setup_data is exported to debugfs for debugging purposes.
>>> This series lets every handler decide how to handle its own data
>>> and no longer reserves setup_data generally, so memory is not
>>> wasted and memblock/e820 stay page aligned.
>>> 1. Do not touch E820 anymore.
>>> 2. Copy SETUP_EFI to an __initdata variable and access it without ioremap.
>>> 3. SETUP_DTB: reserve, copy to a local buffer, and free.
>>> 4. SETUP_PCI: reserve locally and convert to a list, to avoid keeping ioremap.
>>> 5. Export SETUP_PCI via sysfs.
>>>
>>> The third part is some small cleanup patches.
>>>
>>> The patches are available at
>>> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-v4.3-next
>>>
>>> Thanks
>>>
>>> Yinghai
>>>
>>>
>>> Baoquan He (7):
>>>   x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
>>>   x86, kaslr: Introduce struct slot_area to manage randomization slot info
>>>   x86, kaslr: Add two functions which will be used later
>>>   x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
>>>   x86, kaslr: Randomize physical and virtual address of kernel separately
>>>   x86, kaslr: Add support of kernel physical address randomization above 4G
>>>   x86, kaslr: Remove useless codes
>>>
>>> Yinghai Lu (35):
>>>   x86, kasl: Remove not needed parameter for choose_kernel_location
>>>   x86, boot: Move compressed kernel to end of buffer before decompressing
>>>   x86, boot: Fix run_size calculation
>>>   x86, kaslr: Kill not needed and wrong run_size calculation code.
>>>   x86, kaslr: rename output_size to output_run_size
>>>   x86, kaslr: Consolidate mem_avoid array filling
>>>   x86, boot: Move z_extract_offset calculation to header.S
>>>   x86, kaslr: Get correct max_addr for relocs pointer
>>>   x86, boot: Split kernel_ident_mapping_init to another file
>>>   x86, 64bit: Set ident_mapping for kaslr
>>>   x86, boot: Add checking for memcpy
>>>   x86, kaslr: Allow random address could be below loaded address
>>>   x86, boot: Add printf support for early console in compressed/misc.c
>>>   x86, boot: Add more debug printout in compressed/misc.c
>>>   x86, setup: Check early serial console per string instead of one char
>>>   x86, setup: Use puts() instead of printf() in edd code
>>>   x86: Setup early console as early as possible in x86_start_kernel()
>>>   x86, boot: print compression suffix in decompress stage
>>>   x86: remove not needed clear_page calling
>>>   x86: restore end_of_ram to E820_RAM
>>>   x86, boot: Allow 64bit EFI kernel to be loaded above 4G
>>>   x86: Find correct 64 bit ramdisk address for microcode early update
>>>   x86: Kill E820_RESERVED_KERN
>>>   x86, efi: Copy SETUP_EFI data and access directly
>>>   x86, of: Let add_dtb reserve setup_data locally
>>>   x86, boot: Add add_pci handler for SETUP_PCI
>>>   x86: Kill not used setup_data handling code
>>>   x86, boot, PCI: Convert SETUP_PCI data to list
>>>   x86, boot, PCI: Copy SETUP_PCI rom to kernel space
>>>   x86, boot, PCI: Export SETUP_PCI data via sysfs
>>>   x86: Fix typo in mark_rodata_ro
>>>   x86, 64bit: add pfn_range_is_highmapped()
>>>   x86, 64bit: remove highmap for not needed ranges
>>>   x86, 64bit: Add __pa_high/__va_high
>>>   x86: fix msr print again
>>>
>>>  Documentation/x86/boot.txt                  |  19 ++
>>>  arch/x86/boot/Makefile                      |  13 +-
>>>  arch/x86/boot/compressed/Makefile           |  21 +-
>>>  arch/x86/boot/compressed/aslr.c             | 258 ++++++++++++++++-------
>>>  arch/x86/boot/compressed/eboot.c            |  15 +-
>>>  arch/x86/boot/compressed/head_32.S          |  14 +-
>>>  arch/x86/boot/compressed/head_64.S          |  22 +-
>>>  arch/x86/boot/compressed/misc.c             | 129 +++++++++---
>>>  arch/x86/boot/compressed/misc.h             |  41 +++-
>>>  arch/x86/boot/compressed/misc_pgt.c         |  91 ++++++++
>>>  arch/x86/boot/compressed/mkpiggy.c          |  28 +--
>>>  arch/x86/boot/compressed/printf.c           |   5 +
>>>  arch/x86/boot/compressed/string.c           |  28 ++-
>>>  arch/x86/boot/compressed/vmlinux.lds.S      |   1 +
>>>  arch/x86/boot/edd.c                         |   4 +-
>>>  arch/x86/boot/header.S                      |  34 ++-
>>>  arch/x86/boot/tty.c                         |  14 +-
>>>  arch/x86/include/asm/boot.h                 |  19 ++
>>>  arch/x86/include/asm/efi.h                  |   2 +-
>>>  arch/x86/include/asm/page.h                 |   5 +
>>>  arch/x86/include/asm/pci.h                  |   4 +
>>>  arch/x86/include/asm/pgtable_64.h           |   2 +
>>>  arch/x86/include/asm/processor.h            |   1 -
>>>  arch/x86/include/asm/prom.h                 |   9 +-
>>>  arch/x86/include/asm/setup.h                |   5 +
>>>  arch/x86/include/uapi/asm/bootparam.h       |   1 +
>>>  arch/x86/include/uapi/asm/e820.h            |   8 -
>>>  arch/x86/kernel/asm-offsets.c               |   2 +
>>>  arch/x86/kernel/cpu/common.c                |  61 +++---
>>>  arch/x86/kernel/cpu/microcode/amd_early.c   |  10 +-
>>>  arch/x86/kernel/cpu/microcode/intel_early.c |   8 +-
>>>  arch/x86/kernel/devicetree.c                |  39 ++--
>>>  arch/x86/kernel/e820.c                      |  18 +-
>>>  arch/x86/kernel/head.c                      |  26 +++
>>>  arch/x86/kernel/head32.c                    |   1 +
>>>  arch/x86/kernel/head64.c                    |  21 +-
>>>  arch/x86/kernel/kdebugfs.c                  | 142 -------------
>>>  arch/x86/kernel/setup.c                     |  79 ++-----
>>>  arch/x86/kernel/tboot.c                     |   3 +-
>>>  arch/x86/kernel/vmlinux.lds.S               |   1 +
>>>  arch/x86/mm/ident_map.c                     |  74 +++++++
>>>  arch/x86/mm/init_64.c                       | 173 +++++++--------
>>>  arch/x86/mm/pageattr.c                      |  16 +-
>>>  arch/x86/pci/common.c                       | 313 ++++++++++++++++++++++++++--
>>>  arch/x86/platform/efi/efi.c                 |  13 +-
>>>  arch/x86/platform/efi/efi_64.c              |  10 +-
>>>  arch/x86/platform/efi/quirks.c              |  23 +-
>>>  arch/x86/tools/calc_run_size.sh             |  42 ----
>>>  drivers/tty/serial/8250/8250_early.c        |  17 ++
>>>  kernel/printk/printk.c                      |  11 +-
>>>  50 files changed, 1235 insertions(+), 661 deletions(-)
>>>  create mode 100644 arch/x86/boot/compressed/misc_pgt.c
>>>  create mode 100644 arch/x86/boot/compressed/printf.c
>>>  create mode 100644 arch/x86/mm/ident_map.c
>>>  delete mode 100644 arch/x86/tools/calc_run_size.sh
>>>
>>> --
>>> 1.8.4.5
>>>
>>
>>
>>
>> --
>> Kees Cook
>> Chrome OS Security
>
>
>
> --
> Kees Cook
> Chrome OS Security



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
  2016-02-09  4:31         ` [kernel-hardening] " Kees Cook
@ 2016-02-15  7:29           ` Baoquan He
  -1 siblings, 0 replies; 79+ messages in thread
From: Baoquan He @ 2016-02-15  7:29 UTC (permalink / raw)
  To: Kees Cook
  Cc: Yinghai Lu, H. Peter Anvin, LKML, Borislav Petkov, Ingo Molnar,
	Andy Lutomirski, Vivek Goyal, kernel-hardening@lists.openwall.com

On 02/08/16 at 08:31pm, Kees Cook wrote:
> On Sat, Feb 6, 2016 at 3:50 AM, Baoquan He <bhe@redhat.com> wrote:
> > Hi,
> >
> > Recently people using big box servers have also become very interested in kaslr
> > and want it to enhance security. So allowing kaslr to randomize above 4G
> > makes a lot of sense for different kinds of systems. I would like to repost the
> > patches related to kaslr in this patchset, and leave the rest to Yinghai. Or I can
> > try to understand and adjust the rest with help from yh and reviewers, then post.
> > But first I will focus on kaslr and try to get it merged into Linus's tree.
> >
> > Since this patchset covers too many issues, and people usually prefer reviewing
> > a post that takes care of one main issue per thread, I will start from the thread
> > below. It mainly includes kaslr above 4G support, bug fixes, and several cleanup
> > patches.
> >
> > x86, boot: kaslr cleanup and 64bit kaslr support
> > https://lwn.net/Articles/637115/
> >
> > The following patch list is taken from yh's cover letter of the above patch thread.
> >
> > **************************
> > My plan is to split them into
> > 1) kaslr above 4G support
> >   x86, boot: Split kernel_ident_mapping_init to another file
> >   x86, 64bit: Set ident_mapping for kaslr
> >   x86, boot: Add checking for memcpy
> >   x86, boot: Move z_extract_offset calculation to header.S
> >   x86, boot: Simplify run_size calculation
> >   x86, kaslr: Kill not used run_size related code.
> >   x86, kaslr: Use output_run_size
> >   x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
> >   x86, kaslr: Introduce struct slot_area to manage randomization slot info
> >   x86, kaslr: Add two functions which will be used later
> >   x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
> >   x86, kaslr: Randomize physical and virtual address of kernel separately
> >   x86, kaslr: Add support of kernel physical address randomization above 4G
> >   x86, kaslr: Remove useless codes
> > 2) allow kaslr to choose slots below loaded address
> >   x86, kaslr: Consolidate mem_avoid array filling
> >   x86, kaslr: Allow random address could be below loaded address
> > 3) Make data from decompress_kernel stage live longer (bug fix)
> >   x86, boot: Make data from decompress_kernel stage live longer
> > 4) Get correct max_addr for relocs pointer (improvement)
> >   x86, kaslr: Get correct max_addr for relocs pointer
> >
> > Part 2) could be added to the 1) post. I took it out because the mem_avoid issue is very
> > complicated and can be discussed in a separate thread, so the 1) post only focuses on
> > kaslr above 4G support.
> >
> > That's all I plan to do. Suggestions or comments are welcome.
> 
> That sounds great, thanks! Please check the rest of the thread where I
> asked a number of questions that remain unanswered. If we can get some
> clarification on those points, I think it would help move this along
> more quickly.

Hi Kees,

Thanks for your suggestion. I am trying to understand all the patches and
make some adjustments, and meanwhile adjusting the patch logs to match my
understanding. Your questions help me understand them more deeply. I will
post after updating. I hope you, Yinghai and other experts can help review
and give valuable comments and suggestions.

Thanks
Baoquan

> 
> -Kees
> 
> >
> > Thanks
> > Baoquan
> >
> > ----- Original Message -----
> > From: "Kees Cook" <keescook@chromium.org>
> > To: "Yinghai Lu" <yinghai@kernel.org>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>, "Baoquan He" <bhe@redhat.com>, "LKML" <linux-kernel@vger.kernel.org>
> > Sent: Saturday, October 3, 2015 4:16:40 AM
> > Subject: Re: [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
> >
> > Hi,
> >
> > Has there been any more work on these series of patches? I asked many
> > questions in my earlier review, but nothing was answered.
> >
> > Thanks,
> >
> > -Kees
> >
> > On Tue, Jul 7, 2015 at 4:21 PM, Kees Cook <keescook@chromium.org> wrote:
> >> On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> >>> These patches, which I sent before but which Ingo rejected because of the
> >>> changelogs, are rebased on v4.2-rc1.
> >>>
> >>> Kees Cook said that he would like to try improving the changelogs
> >>> to get things moving.
> >>
> >> Thanks for working on this! I think it might be best to split this
> >> long series into shorter ones. It seems like there are several areas:
> >>
> >> - fixing kASLR
> >> - extended kASLR above 4G
> >> - setup_data cleanup
> >> - various other cleanups
> >>
> >> It might make sense to keep them separate for easier review?
> >>
> >> -Kees
> >>
> >>>
> >>> The first part is kaslr related:
> >>> 1. First put the compressed kernel ZO near the end of the buffer before
> >>>    decompressing, so we can find the ZO position easily when searching for
> >>>    the kaslr buffer.
> >>> 2. Kill the run_size calculation shell scripts.
> >>> 3. Create a new ident mapping for 64bit kaslr, so we can cover
> >>>    a random kernel base above 4G.
> >>> 4. 7 patches from He that support random random, as I already used his patches
> >>>    to test the ident mapping code.
> >>> 5. Some debug patches for boot/kaslr.
> >>>
> >>> The second part is setup_data related:
> >>> Currently setup_data is reserved via memblock and e820, and different
> >>> handlers treat it in different ways, which is confusing.
> >>> 1. SETUP_E820_EXT: is consumed early and is not copied or accessed again.
> >>>         Memory is wasted.
> >>> 2. SETUP_EFI: is accessed via ioremap every time at early stage.
> >>>         Memory is wasted.
> >>> 3. SETUP_DTB: is copied locally.
> >>>         Memory is wasted.
> >>> 4. SETUP_PCI: is accessed via ioremap for every pci device, even at run-time.
> >>> Also, setup_data is exported to debugfs for debugging purposes.
> >>> This series lets every handler decide how to handle its own data
> >>> and no longer reserves setup_data generally, so memory is not
> >>> wasted and memblock/e820 stay page aligned.
> >>> 1. Do not touch E820 anymore.
> >>> 2. Copy SETUP_EFI to an __initdata variable and access it without ioremap.
> >>> 3. SETUP_DTB: reserve, copy to a local buffer, and free.
> >>> 4. SETUP_PCI: reserve locally and convert to a list, to avoid keeping ioremap.
> >>> 5. Export SETUP_PCI via sysfs.
> >>>
> >>> The third part is some small cleanup patches.
> >>>
> >>> The patches are available at
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-v4.3-next
> >>>
> >>> Thanks
> >>>
> >>> Yinghai
> >>>
> >>>
> >>> Baoquan He (7):
> >>>   x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
> >>>   x86, kaslr: Introduce struct slot_area to manage randomization slot info
> >>>   x86, kaslr: Add two functions which will be used later
> >>>   x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
> >>>   x86, kaslr: Randomize physical and virtual address of kernel separately
> >>>   x86, kaslr: Add support of kernel physical address randomization above 4G
> >>>   x86, kaslr: Remove useless codes
> >>>
> >>> Yinghai Lu (35):
> >>>   x86, kasl: Remove not needed parameter for choose_kernel_location
> >>>   x86, boot: Move compressed kernel to end of buffer before decompressing
> >>>   x86, boot: Fix run_size calculation
> >>>   x86, kaslr: Kill not needed and wrong run_size calculation code.
> >>>   x86, kaslr: rename output_size to output_run_size
> >>>   x86, kaslr: Consolidate mem_avoid array filling
> >>>   x86, boot: Move z_extract_offset calculation to header.S
> >>>   x86, kaslr: Get correct max_addr for relocs pointer
> >>>   x86, boot: Split kernel_ident_mapping_init to another file
> >>>   x86, 64bit: Set ident_mapping for kaslr
> >>>   x86, boot: Add checking for memcpy
> >>>   x86, kaslr: Allow random address could be below loaded address
> >>>   x86, boot: Add printf support for early console in compressed/misc.c
> >>>   x86, boot: Add more debug printout in compressed/misc.c
> >>>   x86, setup: Check early serial console per string instead of one char
> >>>   x86, setup: Use puts() instead of printf() in edd code
> >>>   x86: Setup early console as early as possible in x86_start_kernel()
> >>>   x86, boot: print compression suffix in decompress stage
> >>>   x86: remove not needed clear_page calling
> >>>   x86: restore end_of_ram to E820_RAM
> >>>   x86, boot: Allow 64bit EFI kernel to be loaded above 4G
> >>>   x86: Find correct 64 bit ramdisk address for microcode early update
> >>>   x86: Kill E820_RESERVED_KERN
> >>>   x86, efi: Copy SETUP_EFI data and access directly
> >>>   x86, of: Let add_dtb reserve setup_data locally
> >>>   x86, boot: Add add_pci handler for SETUP_PCI
> >>>   x86: Kill not used setup_data handling code
> >>>   x86, boot, PCI: Convert SETUP_PCI data to list
> >>>   x86, boot, PCI: Copy SETUP_PCI rom to kernel space
> >>>   x86, boot, PCI: Export SETUP_PCI data via sysfs
> >>>   x86: Fix typo in mark_rodata_ro
> >>>   x86, 64bit: add pfn_range_is_highmapped()
> >>>   x86, 64bit: remove highmap for not needed ranges
> >>>   x86, 64bit: Add __pa_high/__va_high
> >>>   x86: fix msr print again
> >>>
> >>>  Documentation/x86/boot.txt                  |  19 ++
> >>>  arch/x86/boot/Makefile                      |  13 +-
> >>>  arch/x86/boot/compressed/Makefile           |  21 +-
> >>>  arch/x86/boot/compressed/aslr.c             | 258 ++++++++++++++++-------
> >>>  arch/x86/boot/compressed/eboot.c            |  15 +-
> >>>  arch/x86/boot/compressed/head_32.S          |  14 +-
> >>>  arch/x86/boot/compressed/head_64.S          |  22 +-
> >>>  arch/x86/boot/compressed/misc.c             | 129 +++++++++---
> >>>  arch/x86/boot/compressed/misc.h             |  41 +++-
> >>>  arch/x86/boot/compressed/misc_pgt.c         |  91 ++++++++
> >>>  arch/x86/boot/compressed/mkpiggy.c          |  28 +--
> >>>  arch/x86/boot/compressed/printf.c           |   5 +
> >>>  arch/x86/boot/compressed/string.c           |  28 ++-
> >>>  arch/x86/boot/compressed/vmlinux.lds.S      |   1 +
> >>>  arch/x86/boot/edd.c                         |   4 +-
> >>>  arch/x86/boot/header.S                      |  34 ++-
> >>>  arch/x86/boot/tty.c                         |  14 +-
> >>>  arch/x86/include/asm/boot.h                 |  19 ++
> >>>  arch/x86/include/asm/efi.h                  |   2 +-
> >>>  arch/x86/include/asm/page.h                 |   5 +
> >>>  arch/x86/include/asm/pci.h                  |   4 +
> >>>  arch/x86/include/asm/pgtable_64.h           |   2 +
> >>>  arch/x86/include/asm/processor.h            |   1 -
> >>>  arch/x86/include/asm/prom.h                 |   9 +-
> >>>  arch/x86/include/asm/setup.h                |   5 +
> >>>  arch/x86/include/uapi/asm/bootparam.h       |   1 +
> >>>  arch/x86/include/uapi/asm/e820.h            |   8 -
> >>>  arch/x86/kernel/asm-offsets.c               |   2 +
> >>>  arch/x86/kernel/cpu/common.c                |  61 +++---
> >>>  arch/x86/kernel/cpu/microcode/amd_early.c   |  10 +-
> >>>  arch/x86/kernel/cpu/microcode/intel_early.c |   8 +-
> >>>  arch/x86/kernel/devicetree.c                |  39 ++--
> >>>  arch/x86/kernel/e820.c                      |  18 +-
> >>>  arch/x86/kernel/head.c                      |  26 +++
> >>>  arch/x86/kernel/head32.c                    |   1 +
> >>>  arch/x86/kernel/head64.c                    |  21 +-
> >>>  arch/x86/kernel/kdebugfs.c                  | 142 -------------
> >>>  arch/x86/kernel/setup.c                     |  79 ++-----
> >>>  arch/x86/kernel/tboot.c                     |   3 +-
> >>>  arch/x86/kernel/vmlinux.lds.S               |   1 +
> >>>  arch/x86/mm/ident_map.c                     |  74 +++++++
> >>>  arch/x86/mm/init_64.c                       | 173 +++++++--------
> >>>  arch/x86/mm/pageattr.c                      |  16 +-
> >>>  arch/x86/pci/common.c                       | 313 ++++++++++++++++++++++++++--
> >>>  arch/x86/platform/efi/efi.c                 |  13 +-
> >>>  arch/x86/platform/efi/efi_64.c              |  10 +-
> >>>  arch/x86/platform/efi/quirks.c              |  23 +-
> >>>  arch/x86/tools/calc_run_size.sh             |  42 ----
> >>>  drivers/tty/serial/8250/8250_early.c        |  17 ++
> >>>  kernel/printk/printk.c                      |  11 +-
> >>>  50 files changed, 1235 insertions(+), 661 deletions(-)
> >>>  create mode 100644 arch/x86/boot/compressed/misc_pgt.c
> >>>  create mode 100644 arch/x86/boot/compressed/printf.c
> >>>  create mode 100644 arch/x86/mm/ident_map.c
> >>>  delete mode 100644 arch/x86/tools/calc_run_size.sh
> >>>
> >>> --
> >>> 1.8.4.5
> >>>
> >>
> >>
> >>
> >> --
> >> Kees Cook
> >> Chrome OS Security
> >
> >
> >
> > --
> > Kees Cook
> > Chrome OS Security
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 
> 
> 
> -- 
> Kees Cook
> Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3
  2016-02-15  7:29           ` [kernel-hardening] " Baoquan He
@ 2016-02-16 23:50             ` Kees Cook
  -1 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2016-02-16 23:50 UTC (permalink / raw)
  To: Baoquan He
  Cc: Yinghai Lu, H. Peter Anvin, LKML, Borislav Petkov, Ingo Molnar,
	Andy Lutomirski, Vivek Goyal, kernel-hardening@lists.openwall.com

On Sun, Feb 14, 2016 at 11:29 PM, Baoquan He <bhe@redhat.com> wrote:
> On 02/08/16 at 08:31pm, Kees Cook wrote:
>> On Sat, Feb 6, 2016 at 3:50 AM, Baoquan He <bhe@redhat.com> wrote:
>> > Hi,
>> >
>> > Recently, people running large servers have also become very interested in kaslr and
>> > want it for the added security. So allowing kaslr to randomize above 4G
>> > makes sense for many kinds of systems. I would like to repost the patches
>> > related to kaslr in this patchset and leave the rest to Yinghai. Alternatively, I can
>> > try to understand and adjust the rest with help from yh and the reviewers, then post
>> > them. But first I will focus on kaslr and try to get it merged into Linus's tree.
>> >
>> > Since this patchset covers too many issues, and people usually prefer reviewing
>> > a posting that addresses one main issue per thread, I will start from the thread
>> > below. It mainly includes kaslr above-4G support, bug fixes, and several cleanup
>> > patches.
>> >
>> > x86, boot: kaslr cleanup and 64bit kaslr support
>> > https://lwn.net/Articles/637115/
>> >
>> > The following patch list is taken from yh's cover letter for the above patch thread.
>> >
>> > **************************
>> > My plan is to split them into
>> > 1) kaslr above 4G support
>> >   x86, boot: Split kernel_ident_mapping_init to another file
>> >   x86, 64bit: Set ident_mapping for kaslr
>> >   x86, boot: Add checking for memcpy
>> >   x86, boot: Move z_extract_offset calculation to header.S
>> >   x86, boot: Simplify run_size calculation
>> >   x86, kaslr: Kill not used run_size related code.
>> >   x86, kaslr: Use output_run_size
>> >   x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G
>> >   x86, kaslr: Introduce struct slot_area to manage randomization slot info
>> >   x86, kaslr: Add two functions which will be used later
>> >   x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address
>> >   x86, kaslr: Randomize physical and virtual address of kernel separately
>> >   x86, kaslr: Add support of kernel physical address randomization above 4G
>> >   x86, kaslr: Remove useless codes
>> > 2) allow kaslr to choose slots below loaded address
>> >   x86, kaslr: Consolidate mem_avoid array filling
>> >   x86, kaslr: Allow random address could be below loaded address
>> > 3) Make data from decompress_kernel stage live longer (bug fix)
>> >   x86, boot: Make data from decompress_kernel stage live longer
>> > 4) Get correct max_addr for relocs pointer (improvement)
>> >   x86, kaslr: Get correct max_addr for relocs pointer
>> >
>> > Item 2) could be folded into the 1) posting. I split it out because the mem_avoid issue is
>> > very complicated and can be discussed in a separate thread, so that the 1) posting focuses
>> > only on kaslr above-4G support.
>> >
>> > That's all I plan to do. Suggestions or comments are welcome.
>>
>> That sounds great, thanks! Please check the rest of the thread where I
>> asked a number of questions that remain unanswered. If we can get some
>> clarification on those points, I think it would help move this along
>> more quickly.
>
> Hi Kees,
>
> Thanks for your suggestion. I am trying to understand all the patches and
> make some adjustments, and meanwhile update the patch logs based on my understanding.
> Your questions help me understand them more deeply. I will post after
> updating. I hope you, Yinghai and other experts can help review and give
> valuable comments and suggestions.

Sounds great! I look forward to them. :)

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 79+ messages in thread

end of thread, other threads:[~2016-02-16 23:50 UTC | newest]

Thread overview: 79+ messages
2015-07-07 20:19 [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Yinghai Lu
2015-07-07 20:19 ` [PATCH 01/42] x86, kasl: Remove not needed parameter for choose_kernel_location Yinghai Lu
2015-07-07 20:57   ` Kees Cook
2015-07-07 20:19 ` [PATCH 02/42] x86, boot: Move compressed kernel to end of buffer before decompressing Yinghai Lu
2015-07-07 21:22   ` Kees Cook
2015-07-07 20:19 ` [PATCH 03/42] x86, boot: Fix run_size calculation Yinghai Lu
2015-07-07 22:15   ` Kees Cook
2015-07-07 20:19 ` [PATCH 04/42] x86, kaslr: Kill not needed and wrong run_size calculation code Yinghai Lu
2015-07-07 22:18   ` Kees Cook
2015-07-07 20:19 ` [PATCH 05/42] x86, kaslr: rename output_size to output_run_size Yinghai Lu
2015-07-07 20:19 ` [PATCH 06/42] x86, kaslr: Consolidate mem_avoid array filling Yinghai Lu
2015-07-07 22:36   ` Kees Cook
2015-07-07 20:19 ` [PATCH 07/42] x86, boot: Move z_extract_offset calculation to header.S Yinghai Lu
2015-07-07 20:19 ` [PATCH 08/42] x86, kaslr: Get correct max_addr for relocs pointer Yinghai Lu
2015-07-07 22:40   ` Kees Cook
2015-07-07 20:19 ` [PATCH 09/42] x86, boot: Split kernel_ident_mapping_init to another file Yinghai Lu
2015-07-07 20:19 ` [PATCH 10/42] x86, 64bit: Set ident_mapping for kaslr Yinghai Lu
2015-07-07 20:19 ` [PATCH 11/42] x86, boot: Add checking for memcpy Yinghai Lu
2015-07-07 20:19 ` [PATCH 12/42] x86, kaslr: Fix a bug that relocation can not be handled when kernel is loaded above 2G Yinghai Lu
2015-07-07 22:42   ` Kees Cook
2015-07-07 20:19 ` [PATCH 13/42] x86, kaslr: Introduce struct slot_area to manage randomization slot info Yinghai Lu
2015-07-07 20:20 ` [PATCH 14/42] x86, kaslr: Add two functions which will be used later Yinghai Lu
2015-07-07 20:20 ` [PATCH 15/42] x86, kaslr: Introduce fetch_random_virt_offset to randomize the kernel text mapping address Yinghai Lu
2015-07-07 20:20 ` [PATCH 16/42] x86, kaslr: Randomize physical and virtual address of kernel separately Yinghai Lu
2015-07-07 20:20 ` [PATCH 17/42] x86, kaslr: Add support of kernel physical address randomization above 4G Yinghai Lu
2015-07-07 20:20 ` [PATCH 18/42] x86, kaslr: Remove useless codes Yinghai Lu
2015-07-07 20:20 ` [PATCH 19/42] x86, kaslr: Allow random address could be below loaded address Yinghai Lu
2015-07-07 20:20 ` [PATCH 20/42] x86, boot: Add printf support for early console in compressed/misc.c Yinghai Lu
2015-07-07 20:20 ` [PATCH 21/42] x86, boot: Add more debug printout " Yinghai Lu
2015-07-07 20:20 ` [PATCH 22/42] x86, setup: Check early serial console per string instead of one char Yinghai Lu
2015-07-07 22:59   ` Kees Cook
2015-07-07 20:20 ` [PATCH 23/42] x86, setup: Use puts() instead of printf() in edd code Yinghai Lu
2015-07-07 20:20 ` [PATCH 24/42] x86: Setup early console as early as possible in x86_start_kernel() Yinghai Lu
2015-07-07 20:20 ` [PATCH 25/42] x86, boot: print compression suffix in decompress stage Yinghai Lu
2015-07-07 23:13   ` Kees Cook
2015-07-07 20:20 ` [PATCH 26/42] x86: remove not needed clear_page calling Yinghai Lu
2015-07-07 23:14   ` Kees Cook
2015-07-07 20:20 ` [PATCH 27/42] x86: restore end_of_ram to E820_RAM Yinghai Lu
2015-07-08 17:44   ` Matt Fleming
2015-07-09  1:41     ` Dan Williams
2015-07-09  7:45     ` Christoph Hellwig
2015-07-09 11:17       ` Matt Fleming
2015-07-07 20:20 ` [PATCH 28/42] x86, boot: Allow 64bit EFI kernel to be loaded above 4G Yinghai Lu
2015-07-07 23:12   ` Kees Cook
2015-07-08 18:00     ` Matt Fleming
2015-07-07 20:20 ` [PATCH 29/42] x86: Find correct 64 bit ramdisk address for microcode early update Yinghai Lu
2015-07-07 23:08   ` Kees Cook
2015-07-07 20:20 ` [PATCH 30/42] x86: Kill E820_RESERVED_KERN Yinghai Lu
2015-07-07 20:20 ` [PATCH 31/42] x86, efi: Copy SETUP_EFI data and access directly Yinghai Lu
2015-07-22 10:58   ` Matt Fleming
2015-07-22 10:58     ` Matt Fleming
2015-07-24  2:07   ` Dave Young
2015-07-24  2:07     ` Dave Young
2015-07-07 20:20 ` [PATCH 32/42] x86, of: Let add_dtb reserve setup_data locally Yinghai Lu
2015-07-07 20:20 ` [PATCH 33/42] x86, boot: Add add_pci handler for SETUP_PCI Yinghai Lu
2015-07-14 22:30   ` Bjorn Helgaas
2015-07-07 20:20 ` [PATCH 34/42] x86: Kill not used setup_data handling code Yinghai Lu
2015-07-07 20:20 ` [PATCH 35/42] x86, boot, PCI: Convert SETUP_PCI data to list Yinghai Lu
2015-07-14 22:35   ` Bjorn Helgaas
2015-07-15  1:57     ` Yinghai Lu
2015-07-07 20:20 ` [PATCH 36/42] x86, boot, PCI: Copy SETUP_PCI rom to kernel space Yinghai Lu
2015-07-07 20:20 ` [PATCH 37/42] x86, boot, PCI: Export SETUP_PCI data via sysfs Yinghai Lu
2015-07-07 20:20 ` [PATCH 38/42] x86: Fix typo in mark_rodata_ro Yinghai Lu
2015-07-07 23:05   ` Kees Cook
2015-07-07 20:20 ` [PATCH 39/42] x86, 64bit: add pfn_range_is_highmapped() Yinghai Lu
2015-07-07 20:20 ` [PATCH 40/42] x86, 64bit: remove highmap for not needed ranges Yinghai Lu
2015-07-07 23:17   ` Kees Cook
2015-07-07 20:20 ` [PATCH 41/42] x86, 64bit: Add __pa_high/__va_high Yinghai Lu
2015-07-07 20:20 ` [PATCH 42/42] x86: fix msr print again Yinghai Lu
2015-07-07 23:21 ` [PATCH 00/42] x86: updated patches for kaslr and setup_data etc for v4.3 Kees Cook
2015-10-02 20:16   ` Kees Cook
2016-02-06 11:50     ` Baoquan He
2016-02-09  4:31       ` Kees Cook
2016-02-09  4:31         ` [kernel-hardening] " Kees Cook
2016-02-15  7:29         ` Baoquan He
2016-02-15  7:29           ` [kernel-hardening] " Baoquan He
2016-02-16 23:50           ` Kees Cook
2016-02-16 23:50             ` [kernel-hardening] " Kees Cook
2015-07-08 10:51 ` Ingo Molnar
