From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755973AbbIBXfz (ORCPT ); Wed, 2 Sep 2015 19:35:55 -0400 Received: from mga01.intel.com ([192.55.52.88]:50192 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754849AbbIBXb3 (ORCPT ); Wed, 2 Sep 2015 19:31:29 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.17,457,1437462000"; d="scan'208";a="761194983" Subject: [PATCH 03/15] x86, fpu: remove XSTATE_RESERVE To: dave@sr71.net Cc: dave.hansen@linux.intel.com, mingo@redhat.com, x86@kernel.org, bp@alien8.de, fenghua.yu@intel.com, tim.c.chen@linux.intel.com, linux-kernel@vger.kernel.org From: Dave Hansen Date: Wed, 02 Sep 2015 16:31:25 -0700 References: <20150902233123.3A7E5FB0@viggo.jf.intel.com> In-Reply-To: <20150902233123.3A7E5FB0@viggo.jf.intel.com> Message-Id: <20150902233125.D948D475@viggo.jf.intel.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Dave Hansen The original purpose of XSTATE_RESERVE was to carve out space to store all of the possible extended state components that get saved with the XSAVE instruction(s). However, we are now almost entirely dynamically allocating the buffers we use for XSAVE by placing them at the end of the task_struct and them sizing them at boot. The one exception for that is the init_task. The maximum extended state component size that we have today is on systems with space for AVX-512 and Memory Protection Keys: 2696 bytes. We have reserved a PAGE_SIZE buffer in the init_task via fpregs_state->__padding. This check ensures that even if the component sizes or layout were changed (which we do not expect), that we will still not overflow the init_task's buffer. In the case that we detect we might overflow the buffer, we completely disable XSAVE support in the kernel and try to boot as if we had 'legacy x87 FPU' support in place. This is a crippled state without any of the XSAVE-enabled features (MPX, AVX, etc...). But, it at least let us boot safely. Signed-off-by: Dave Hansen Cc: Ingo Molnar Cc: x86@kernel.org Cc: Borislav Petkov Cc: Fenghua Yu Cc: Tim Chen Cc: linux-kernel@vger.kernel.org --- b/arch/x86/include/asm/fpu/types.h | 15 ++++----- b/arch/x86/kernel/fpu/xstate.c | 60 ++++++++++++++++++++++++++++++++----- 2 files changed, 59 insertions(+), 16 deletions(-) diff -puN arch/x86/include/asm/fpu/types.h~remove-xstate_reserve arch/x86/include/asm/fpu/types.h --- a/arch/x86/include/asm/fpu/types.h~remove-xstate_reserve 2015-09-02 15:52:48.449823403 -0700 +++ b/arch/x86/include/asm/fpu/types.h 2015-09-02 16:26:23.562389888 -0700 @@ -159,22 +159,19 @@ struct xstate_header { u64 reserved[6]; } __attribute__((packed)); -/* New processor state extensions should be added here: */ -#define XSTATE_RESERVE (sizeof(struct ymmh_struct) + \ - sizeof(struct lwp_struct) + \ - sizeof(struct mpx_struct) ) /* * This is our most modern FPU state format, as saved by the XSAVE * and restored by the XRSTOR instructions. * * It consists of a legacy fxregs portion, an xstate header and - * subsequent fixed size areas as defined by the xstate header. - * Not all CPUs support all the extensions. + * subsequent areas as defined by the xstate header. Not all CPUs + * support all the extensions, so the size of the extended area + * can vary quite a bit between CPUs. */ struct xregs_state { struct fxregs_state i387; struct xstate_header header; - u8 __reserved[XSTATE_RESERVE]; + u8 extended_state_area[0]; } __attribute__ ((packed, aligned (64))); /* @@ -182,7 +179,9 @@ struct xregs_state { * put together, so that we can pick the right one runtime. * * The size of the structure is determined by the largest - * member - which is the xsave area: + * member - which is the xsave area. The padding is there + * to ensure that statically-allocated task_structs (just + * the init_task today) have enough space. */ union fpregs_state { struct fregs_state fsave; diff -puN arch/x86/kernel/fpu/xstate.c~remove-xstate_reserve arch/x86/kernel/fpu/xstate.c --- a/arch/x86/kernel/fpu/xstate.c~remove-xstate_reserve 2015-09-02 15:52:48.451823494 -0700 +++ b/arch/x86/kernel/fpu/xstate.c 2015-09-02 16:26:19.740216279 -0700 @@ -312,24 +312,64 @@ static void __init setup_init_fpu_buf(vo /* * Calculate total size of enabled xstates in XCR0/xfeatures_mask. */ -static void __init init_xstate_size(void) +static unsigned int __init calculate_xstate_size(void) { unsigned int eax, ebx, ecx, edx; + unsigned int calculated_xstate_size; int i; if (!cpu_has_xsaves) { cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx); - xstate_size = ebx; - return; + calculated_xstate_size = ebx; + return calculated_xstate_size; } - xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE; + calculated_xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE; for (i = 2; i < 64; i++) { if (test_bit(i, (unsigned long *)&xfeatures_mask)) { cpuid_count(XSTATE_CPUID, i, &eax, &ebx, &ecx, &edx); - xstate_size += eax; + calculated_xstate_size += eax; } } + return calculated_xstate_size; +} + +/* + * Will the runtime-enumerated 'xstate_size' fit in the init + * task's statically-allocated buffer? + */ +static bool is_supported_xstate_size(unsigned int test_xstate_size) +{ + if (test_xstate_size <= sizeof(union fpregs_state)) + return true; + + pr_warn("x86/fpu: xstate buffer too small (%zu < %d), disabling xsave\n", + sizeof(union fpregs_state), test_xstate_size); + return false; +} + +static int init_xstate_size(void) +{ + /* Recompute the context size for enabled features: */ + unsigned int possible_xstate_size = calculate_xstate_size(); + + /* Ensure we have the space to store all enabled: */ + if (!is_supported_xstate_size(possible_xstate_size)) + return -EINVAL; + + /* + * The size is OK, we are definitely going to use xsave, + * make it known to the world that we need more space. + */ + xstate_size = possible_xstate_size; + return 0; +} + +void fpu__init_disable_system_xstate(void) +{ + xfeatures_mask = 0; + cr4_clear_bits(X86_CR4_OSXSAVE); + fpu__xstate_clear_all_cpu_caps(); } /* @@ -340,6 +380,7 @@ void __init fpu__init_system_xstate(void { unsigned int eax, ebx, ecx, edx; static int on_boot_cpu = 1; + int err; WARN_ON_FPU(!on_boot_cpu); on_boot_cpu = 0; @@ -367,9 +408,12 @@ void __init fpu__init_system_xstate(void /* Enable xstate instructions to be able to continue with initialization: */ fpu__init_cpu_xstate(); - - /* Recompute the context size for enabled features: */ - init_xstate_size(); + err = init_xstate_size(); + if (err) { + /* something went wrong, boot without any XSAVE support */ + fpu__init_disable_system_xstate(); + return; + } update_regset_xstate_info(xstate_size, xfeatures_mask); fpu__init_prepare_fx_sw_frame(); _