From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 27 Dec 2021 20:13:25 +0000
Message-ID: <87mtklztzu.wl-maz@kernel.org>
From: Marc Zyngier
To: Andrew Jones
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, kernel-team@android.com, kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH v2 4/5] hw/arm/virt: Use the PA range to compute the memory map
In-Reply-To: <20211004101110.imtfcufnrdwhneev@gator>
References: <20211003164605.3116450-1-maz@kernel.org> <20211003164605.3116450-5-maz@kernel.org> <20211004101110.imtfcufnrdwhneev@gator>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
X-BeenThere: kvmarm@lists.cs.columbia.edu
List-Id: Where KVM/ARM decisions are made

On Mon, 04 Oct 2021 11:11:10 +0100, Andrew Jones wrote:
>
> On Sun, Oct 03, 2021 at 05:46:04PM +0100, Marc Zyngier wrote:
> > The highmem attribute is nothing but another way to express the
> > PA range of a VM. To support HW that has a smaller PA range then
> > what QEMU assumes, pass this PA range to the virt_set_memmap()
> > function, allowing it to correctly exclude highmem devices
> > if they are outside of the PA range.
> >
> > Signed-off-by: Marc Zyngier
> > ---
> >  hw/arm/virt.c | 46 +++++++++++++++++++++++++++++++++++-----------
> >  1 file changed, 35 insertions(+), 11 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 9d2abdbd5f..a572e0c9d9 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -1610,10 +1610,10 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
> >      return arm_cpu_mp_affinity(idx, clustersz);
> >  }
> >
> > -static void virt_set_memmap(VirtMachineState *vms)
> > +static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
> >  {
> >      MachineState *ms = MACHINE(vms);
> > -    hwaddr base, device_memory_base, device_memory_size;
> > +    hwaddr base, device_memory_base, device_memory_size, memtop;
> >      int i;
> >
> >      vms->memmap = extended_memmap;
> > @@ -1628,9 +1628,12 @@ static void virt_set_memmap(VirtMachineState *vms)
> >          exit(EXIT_FAILURE);
> >      }
> >
> > -    if (!vms->highmem &&
> > -        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
> > -        error_report("highmem=off, but memory crosses the 4GiB limit\n");
> > +    if (!vms->highmem)
> > +        pa_bits = 32;
> > +
> > +    if (vms->memmap[VIRT_MEM].base + ms->maxram_size > BIT_ULL(pa_bits)) {
> > +        error_report("Addressing limited to %d bits, but memory exceeds it by %llu bytes\n",
> > +                     pa_bits, vms->memmap[VIRT_MEM].base + ms->maxram_size - BIT_ULL(pa_bits));
> >          exit(EXIT_FAILURE);
> >      }
> >      /*
> > @@ -1645,7 +1648,7 @@ static void virt_set_memmap(VirtMachineState *vms)
> >      device_memory_size = ms->maxram_size - ms->ram_size + ms->ram_slots * GiB;
> >
> >      /* Base address of the high IO region */
> > -    base = device_memory_base + ROUND_UP(device_memory_size, GiB);
> > +    memtop = base = device_memory_base + ROUND_UP(device_memory_size, GiB);
> >      if (base < device_memory_base) {
> >          error_report("maxmem/slots too huge");
> >          exit(EXIT_FAILURE);
> > @@ -1662,9 +1665,17 @@ static void virt_set_memmap(VirtMachineState *vms)
> >          vms->memmap[i].size = size;
> >          base += size;
> >      }
> > -    vms->highest_gpa = (vms->highmem ?
> > -                        base :
> > -                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
> > +
> > +    /*
> > +     * If base fits within pa_bits, all good. If it doesn't, limit it
> > +     * to the end of RAM, which is guaranteed to fit within pa_bits.
>
> We tested that
>
>   vms->memmap[VIRT_MEM].base + ms->maxram_size
>
> fits within pa_bits, but here we're setting highest_gpa to
>
>   ROUND_UP(vms->memmap[VIRT_MEM].base + ms->ram_size, GiB) +
>   ROUND_UP(ms->maxram_size - ms->ram_size + ms->ram_slots * GiB, GiB)
>
> which will be larger. Shouldn't we test memtop instead to make this
> guarantee?

Yes, well spotted.

>
> > +     */
> > +    if (base <= BIT_ULL(pa_bits)) {
> > +        vms->highest_gpa = base -1;
> > +    } else {
> > +        vms->highest_gpa = memtop - 1;
> > +    }
> > +
> >      if (device_memory_size > 0) {
> >          ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
> >          ms->device_memory->base = device_memory_base;
> > @@ -1860,7 +1871,20 @@ static void machvirt_init(MachineState *machine)
> >       * to create a VM with the right number of IPA bits.
> >       */
> >      if (!vms->memmap) {
> > -        virt_set_memmap(vms);
> > +        ARMCPU *armcpu = ARM_CPU(first_cpu);
>
>
> I think it's too early to use first_cpu here (although, I'll admit I'm
> always confused as to what gets initialized when...) Assuming we need to
> realize the cpus first, then we don't do that until a bit further down
> in this function. I wonder if it's possible to delay this memmap setup
> until after cpu realization. I see the memmap getting used prior when
> calculating virt_max_cpus, but that looks like it needs to be updated
> anyway to take highmem into account as to whether or not we should
> consider the high gicv3 redist region in the calculation.

OK, this is nothing short of total hell. You can't create the memory
map later, as MTE and the secure world both get in the way (they
really want a valid memory map). And as you pointed out, using
first_cpu is not appropriate here (obviously, I didn't test this
nearly enough).

I could split the creation of the CPUs in two sequences with the
memory map creation in between, but this quickly becomes quite
invasive.

My current approach is to keep the current flow, but to create a
temporary CPU, find whatever I need to know about it, and free
it. Yes, this is a bit overkill, but it solves the chicken and egg
issue simply enough.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
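
For reference, the gap Andrew points out (the end of RAM fitting
within pa_bits while memtop does not) can be reproduced with a small
standalone C sketch. This is only an illustration under simplifying
assumptions, not the code from the patch or from QEMU: struct
memmap_inputs and the three helper functions are hypothetical names,
GiB, BIT_ULL and ROUND_UP are redefined locally, address overflow is
ignored, and pa_bits is assumed to be between 1 and 63.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the macros used in the quoted diff. */
#define GiB            (1ULL << 30)
#define BIT_ULL(n)     (1ULL << (n))
#define ROUND_UP(n, d) ((((n) + (d) - 1) / (d)) * (d))

/* Hypothetical, simplified view of the values virt_set_memmap() reads. */
struct memmap_inputs {
    uint64_t ram_base;    /* stands for vms->memmap[VIRT_MEM].base */
    uint64_t ram_size;    /* stands for ms->ram_size */
    uint64_t maxram_size; /* stands for ms->maxram_size */
    uint64_t ram_slots;   /* stands for ms->ram_slots */
};

/* The quantity the quoted patch compares against BIT_ULL(pa_bits). */
static bool ram_end_fits(const struct memmap_inputs *in, int pa_bits)
{
    return in->ram_base + in->maxram_size <= BIT_ULL(pa_bits);
}

/*
 * Top of RAM plus the device memory (hotplug) window, using the same
 * expression Andrew spells out: both terms are rounded up to 1 GiB and
 * every slot reserves an extra 1 GiB.
 */
static uint64_t compute_memtop(const struct memmap_inputs *in)
{
    uint64_t device_memory_base = ROUND_UP(in->ram_base + in->ram_size, GiB);
    uint64_t device_memory_size =
        in->maxram_size - in->ram_size + in->ram_slots * GiB;

    return device_memory_base + ROUND_UP(device_memory_size, GiB);
}

/* The stricter test suggested in the review: compare memtop instead. */
static bool memtop_fits(const struct memmap_inputs *in, int pa_bits)
{
    return compute_memtop(in) <= BIT_ULL(pa_bits);
}
```

As an example, with RAM based at 1 GiB, 2 GiB of RAM, maxmem of 3 GiB
and 2 slots, the end of RAM lands exactly on the 4 GiB (32-bit) limit
and passes the first test, whereas memtop works out to
3 GiB + 3 GiB = 6 GiB, so falling back to memtop - 1 for highest_gpa
would not stay within 32 bits.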