From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 11 Apr 2024 14:44:26 -0700
In-Reply-To: <20240126085444.324918-24-xiong.y.zhang@linux.intel.com>
References: <20240126085444.324918-1-xiong.y.zhang@linux.intel.com> <20240126085444.324918-24-xiong.y.zhang@linux.intel.com>
Subject: Re: [RFC PATCH 23/41] KVM: x86/pmu: Implement the save/restore of PMU state for Intel CPU
From: Sean Christopherson
To: Xiong Zhang
Cc: pbonzini@redhat.com, peterz@infradead.org, mizhang@google.com, kan.liang@intel.com, zhenyuw@linux.intel.com, dapeng1.mi@linux.intel.com, jmattson@google.com, kvm@vger.kernel.org, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, zhiyuan.lv@intel.com, eranian@google.com, irogers@google.com, samantha.alt@intel.com, like.xu.linux@gmail.com, chao.gao@intel.com

On Fri, Jan 26, 2024, Xiong Zhang wrote:
> From: Dapeng Mi
> 
> Implement the save/restore of PMU state for the passthrough PMU on Intel.
> In passthrough mode, KVM exclusively owns the PMU hardware while control
> flow is within the scope of the passthrough PMU.  Thus, KVM needs to save
> the host PMU state and take full ownership of the PMU hardware.  Conversely,
> the host regains ownership of the PMU hardware from KVM when control flow
> leaves the scope of the passthrough PMU.
> 
> Implement PMU context switches for Intel CPUs and opportunistically use
> rdpmcl() instead of rdmsrl() when reading counters, since the former has
> lower latency on Intel CPUs.
> 
> Co-developed-by: Mingwei Zhang
> Signed-off-by: Mingwei Zhang
> Signed-off-by: Dapeng Mi
> ---
>  arch/x86/kvm/vmx/pmu_intel.c | 73 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 73 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 0d58fe7d243e..f79bebe7093d 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -823,10 +823,83 @@ void intel_passthrough_pmu_msrs(struct kvm_vcpu *vcpu)
> 
>  static void intel_save_pmu_context(struct kvm_vcpu *vcpu)

I would prefer there be a "guest" in there somewhere, e.g.
intel_save_guest_pmu_context().

>  {
> +	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> +	struct kvm_pmc *pmc;
> +	u32 i;
> +
> +	if (pmu->version != 2) {
> +		pr_warn("only PerfMon v2 is supported for passthrough PMU");
> +		return;
> +	}
> +
> +	/* Global ctrl register is already saved at VM-exit. */
> +	rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, pmu->global_status);
> +	/* Clear hardware MSR_CORE_PERF_GLOBAL_STATUS MSR, if non-zero. */
> +	if (pmu->global_status)
> +		wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, pmu->global_status);
> +
> +	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
> +		pmc = &pmu->gp_counters[i];
> +		rdpmcl(i, pmc->counter);
> +		rdmsrl(i + MSR_ARCH_PERFMON_EVENTSEL0, pmc->eventsel);
> +		/*
> +		 * Clear hardware PERFMON_EVENTSELx and its counter to avoid
> +		 * leakage and also avoid this guest GP counter get accidentally
> +		 * enabled during host running when host enable global ctrl.
> +		 */
> +		if (pmc->eventsel)
> +			wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + i, 0);
> +		if (pmc->counter)
> +			wrmsrl(MSR_IA32_PMC0 + i, 0);

This doesn't make much sense.  The kernel already has full access to the guest;
I don't see what is gained by zeroing out the MSRs just to hide them from perf.

Similarly, if perf enables a counter via PERF_GLOBAL_CTRL without first
restoring the event selector, we've got problems.  Same thing for the fixed
counters below.

Can't this just be?
	for (i = 0; i < pmu->nr_arch_gp_counters; i++)
		rdpmcl(i, pmu->gp_counters[i].counter);

	for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
		rdpmcl(INTEL_PMC_FIXED_RDPMC_BASE | i,
		       pmu->fixed_counters[i].counter);

> +	}
> +
> +	rdmsrl(MSR_CORE_PERF_FIXED_CTR_CTRL, pmu->fixed_ctr_ctrl);
> +	/*
> +	 * Clear hardware FIXED_CTR_CTRL MSR to avoid information leakage and
> +	 * also avoid these guest fixed counters get accidentially enabled
> +	 * during host running when host enable global ctrl.
> +	 */
> +	if (pmu->fixed_ctr_ctrl)
> +		wrmsrl(MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> +	for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
> +		pmc = &pmu->fixed_counters[i];
> +		rdpmcl(INTEL_PMC_FIXED_RDPMC_BASE | i, pmc->counter);
> +		if (pmc->counter)
> +			wrmsrl(MSR_CORE_PERF_FIXED_CTR0 + i, 0);
> +	}
>  }
> 
>  static void intel_restore_pmu_context(struct kvm_vcpu *vcpu)
>  {
> +	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> +	struct kvm_pmc *pmc;
> +	u64 global_status;
> +	int i;
> +
> +	if (pmu->version != 2) {
> +		pr_warn("only PerfMon v2 is supported for passthrough PMU");
> +		return;
> +	}
> +
> +	/* Clear host global_ctrl and global_status MSR if non-zero. */
> +	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0);

Why?  PERF_GLOBAL_CTRL will be auto-loaded at VM-Enter, why do it now?

> +	rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, global_status);
> +	if (global_status)
> +		wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, global_status);

This seems especially silly; isn't the full MSR being written below?  Or am I
misunderstanding how these things work?
> +	wrmsrl(MSR_CORE_PERF_GLOBAL_STATUS_SET, pmu->global_status);
> +
> +	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
> +		pmc = &pmu->gp_counters[i];
> +		wrmsrl(MSR_IA32_PMC0 + i, pmc->counter);
> +		wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + i, pmc->eventsel);
> +	}
> +
> +	wrmsrl(MSR_CORE_PERF_FIXED_CTR_CTRL, pmu->fixed_ctr_ctrl);
> +	for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
> +		pmc = &pmu->fixed_counters[i];
> +		wrmsrl(MSR_CORE_PERF_FIXED_CTR0 + i, pmc->counter);
> +	}
>  }
> 
>  struct kvm_pmu_ops intel_pmu_ops __initdata = {
> -- 
> 2.34.1
> 