From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753343AbbGQTcs (ORCPT ); Fri, 17 Jul 2015 15:32:48 -0400
Received: from mail-wi0-f177.google.com ([209.85.212.177]:33277 "EHLO mail-wi0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751990AbbGQTcr (ORCPT ); Fri, 17 Jul 2015 15:32:47 -0400
Date: Fri, 17 Jul 2015 21:32:42 +0200
From: Ingo Molnar
To: Dave Hansen
Cc: linux-kernel@vger.kernel.org, Andy Lutomirski , Borislav Petkov , Fenghua Yu , "H. Peter Anvin" , Linus Torvalds , Oleg Nesterov , Thomas Gleixner , Ross Zwisler
Subject: Re: [REGRESSION] 4.2-rc2: early boot memory corruption from FPU rework
Message-ID: <20150717193242.GA26148@gmail.com>
References: <1430848300-27877-1-git-send-email-mingo@kernel.org> <1430848300-27877-19-git-send-email-mingo@kernel.org> <55A56709.6020201@linux.intel.com> <20150715110726.GA26611@gmail.com> <55A6FC31.5010102@linux.intel.com> <20150717074555.GA31873@gmail.com> <55A9341B.2050401@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <55A9341B.2050401@linux.intel.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Dave Hansen wrote:

> On 07/17/2015 12:45 AM, Ingo Molnar wrote:
> > Just curious: does any released hardware have AVX-512? I went by Wikipedia, which
> > seems to list pre-release hw:
> >
> >> We might know the size and composition of the individual components, but we do
> >> not know the size of the buffer. Different implementations of a given feature
> >> are quite free to have different data stored in the buffer, or even to rearrange
> >> or pad it. That's why the sizes are not explicitly called out by the
> >> architecture and why we enumerated them before your patch that caused this
> >> regression.
> >
> > But we _have_ to know their structure and layout of the XSAVE context for any
> > reasonable ptrace and signal frame support.
>
> There are two different things here. One is the structure and layout inside of
> the state components. That obviously needs full kernel knowledge and can not be
> opaque, especially when the kernel needs to go looking at it (like with MPX's
> BNDCSR for instance).
>
> But, the relative layout of the components is up for grabs. The CPU is
> completely free (architecturally) to pad components or rearrange things.
>
> It's not opaque (it's fully enumerated in CPUID), but it's far from something
> which is static or which we can realistically represent in a static structure.

Ok, agreed.

> > Can you set/get AVX-512 registers via ptrace? MPX state?
>
> The xsave buffer is just copied out to userspace with REGSET_XSTATE. Userspace
> needs to do the same song and dance with CPUID to parse it that the kernel does.

Indeed - I missed REGSET_XSTATE and its interaction with
update_regset_xstate_info().

Good - I have no other complaints.

> > This needs some (very minor) changes to kernel/fork.c to allow an architecture
> > to determine the full task_struct size dynamically - but looks very doable and
> > clean. Wanna try this, or should I?
>
> I think you already did this later in the thread.

Yeah, wanted to get a fix for the regression to Linus ASAP. If we go changing core
code in kernel/fork.c we better have it in -rc3.

So right now I have these two applied:

  0f6df268588f x86/fpu, sched: Dynamically allocate 'struct fpu'
  218d096a24b4 x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86

... do we need any of the other patches you sent to get working AVX512 support? I
think we should be fine, but I don't have the hardware.

Thanks,

	Ingo
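
For reference, the CPUID "song and dance" discussed above can be sketched
roughly as follows. This is only a minimal userspace illustration, not the
kernel's code: struct xcomp, xsave_enumerate() and xsave_std_size() are
hypothetical names made up for this sketch. It assumes GCC or Clang on x86
(for <cpuid.h> and __get_cpuid_count()) and only looks at the standard,
non-compacted XSAVE format, where CPUID leaf 0xD sub-leaf i (i >= 2) reports
the size of component i in EAX and its offset in EBX.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Fixed part of every XSAVE area: legacy x87/SSE region plus the header. */
#define XSAVE_LEGACY_SIZE 512u
#define XSAVE_HEADER_SIZE 64u

/* Hypothetical helper type: where one state component lives in the buffer. */
struct xcomp {
	uint32_t offset;	/* byte offset from the start of the XSAVE area */
	uint32_t size;		/* size of the component in bytes */
};

/*
 * Fill c[0..n-1] from CPUID leaf 0xD and return the supported XCR0 feature
 * bits in *supported. Entries 0 and 1 (x87, SSE) stay zeroed because they
 * live in the fixed legacy region. Returns 0 on success, -1 when leaf 0xD
 * is unavailable or when not built for x86.
 */
int xsave_enumerate(struct xcomp *c, unsigned n, uint64_t *supported)
{
	assert(n <= 64);
#if defined(__x86_64__) || defined(__i386__)
	unsigned eax, ebx, ecx, edx;

	/* Sub-leaf 0: EDX:EAX is the bitmap of supported state components. */
	if (!__get_cpuid_count(0xD, 0, &eax, &ebx, &ecx, &edx))
		return -1;
	*supported = ((uint64_t)edx << 32) | eax;

	for (unsigned i = 0; i < n; i++) {
		c[i].offset = 0;
		c[i].size = 0;
		if (i < 2 || !(*supported & (1ULL << i)))
			continue;
		/* Sub-leaf i: EAX = size, EBX = offset (standard format). */
		__get_cpuid_count(0xD, i, &eax, &ebx, &ecx, &edx);
		c[i].size = eax;
		c[i].offset = ebx;
	}
	return 0;
#else
	(void)c; (void)n; (void)supported;
	return -1;
#endif
}

/*
 * Size of a standard-format XSAVE buffer for the components set in mask:
 * the end of the highest-placed enabled component, never smaller than the
 * legacy region plus header. The layout comes from the table, not from a
 * static structure.
 */
uint32_t xsave_std_size(const struct xcomp *c, unsigned n, uint64_t mask)
{
	uint32_t end = XSAVE_LEGACY_SIZE + XSAVE_HEADER_SIZE;

	assert(n <= 64);
	for (unsigned i = 2; i < n; i++) {
		if (!(mask & (1ULL << i)))
			continue;
		if (c[i].offset + c[i].size > end)
			end = c[i].offset + c[i].size;
	}
	return end;
}
```

A caller would run xsave_enumerate() once and then use the resulting table
both to size the buffer and to locate a component inside a buffer obtained
via REGSET_XSTATE. With a sample table that places component 2 at offset 576
with size 256, xsave_std_size() returns 832 for mask 0x7 and 576 for mask
0x3. Those numbers are just sample input here; the point made in the thread
is that such values have to be read from CPUID rather than hard-coded.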