LKML Archive mirror
 help / color / mirror / Atom feed
From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Mingwei Zhang <mizhang@google.com>,
	"Mi, Dapeng" <dapeng1.mi@linux.intel.com>
Cc: Sean Christopherson <seanjc@google.com>,
	maobibo <maobibo@loongson.cn>,
	Xiong Zhang <xiong.y.zhang@linux.intel.com>,
	pbonzini@redhat.com, peterz@infradead.org, kan.liang@intel.com,
	zhenyuw@linux.intel.com, jmattson@google.com,
	kvm@vger.kernel.org, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org, zhiyuan.lv@intel.com,
	eranian@google.com, irogers@google.com, samantha.alt@intel.com,
	like.xu.linux@gmail.com, chao.gao@intel.com
Subject: Re: [RFC PATCH 23/41] KVM: x86/pmu: Implement the save/restore of PMU state for Intel CPU
Date: Fri, 26 Apr 2024 10:09:52 -0400	[thread overview]
Message-ID: <ff4a4229-04ac-4cbf-8aea-c84ccfa96e0b@linux.intel.com> (raw)
In-Reply-To: <CAL715WJCHJD_wcJ+r4TyWfvmk9uNT_kPy7Pt=CHkB-Sf0D4Rqw@mail.gmail.com>



On 2024-04-25 11:12 p.m., Mingwei Zhang wrote:
>>>> the perf_guest_enter(). The stale information is saved in the KVM. Perf
>>>> will schedule the event in the next perf_guest_exit(). KVM will not know it.
>>> Ya, the creation of an event on a CPU that currently has guest PMU state loaded
>>> is what I had in mind when I suggested a callback in my sketch:
>>>
>>>   :  D. Add a perf callback that is invoked from IRQ context when perf wants to
>>>   :     configure a new PMU-based events, *before* actually programming the MSRs,
>>>   :     and have KVM's callback put the guest PMU state
>>
>> when host creates a perf event with exclude_guest attribute which is
>> used to profile KVM/VMM user space, the vCPU process could work at three
>> places.
>>
>> 1. in guest state (non-root mode)
>>
>> 2. inside vcpu-loop
>>
>> 3. outside vcpu-loop
>>
>> Since the PMU state has already been switched to host state, we don't
>> need to consider the case 3 and only care about cases 1 and 2.
>>
>> when host creates a perf event with exclude_guest attribute to profile
>> KVM/VMM user space,  an IPI is triggered to enable the perf event
>> eventually like the following code shows.
>>
>> event_function_call(event, __perf_event_enable, NULL);
>>
>> For case 1,  a vm-exit is triggered and KVM starts to process the
>> vm-exit and then run IPI irq handler, exactly speaking
>> __perf_event_enable() to enable the perf event.
>>
>> For case 2, the IPI irq handler would preempt the vcpu-loop and call
>> __perf_event_enable() to enable the perf event.
>>
>> So IMO KVM just needs to provide a callback to switch guest/host PMU
>> state, and __perf_event_enable() calls this callback before really
>> touching PMU MSRs.
> ok, in this case, do we still need KVM to query perf if there are
> active exclude_guest events? yes? Because there is an ordering issue.
> The above suggests that the host-level perf profiling comes when a VM
> is already running, there is an IPI that can invoke the callback and
> trigger preemption. In this case, KVM should switch the context from
> guest to host. What if it is the other way around, ie., host-level
> profiling runs first and then VM runs?
> 
> In this case, just before entering the vcpu loop, kvm should check
> whether there is an active host event and save that into a pmu data
> structure. 

KVM doesn't need to save/restore the host state. Host perf has the
information and will reload the values whenever the host events are
rescheduled. But I think KVM should clear the registers used by the host
to prevent the value leaks to the guest.

> If none, do the context switch early (so that KVM saves a
> huge amount of unnecessary PMU context switches in the future).
> Otherwise, keep the host PMU context until vm-enter. At the time of
> vm-exit, do the check again using the data stored in pmu structure. If
> there is an active event do the context switch to the host PMU,
> otherwise defer that until exiting the vcpu loop. Of course, in the
> meantime, if there is any perf profiling started causing the IPI, the
> irq handler calls the callback, preempting the guest PMU context. If
> that happens, at the time of exiting the vcpu boundary, PMU context
> switch is skipped since it is already done. Of course, note that the
> irq could come at any time, so the PMU context switch in all 4
> locations need to check the state flag (and skip the context switch if
> needed).
> 
> So this requires vcpu->pmu has two pieces of state information: 1) the
> flag similar to TIF_NEED_FPU_LOAD; 2) host perf context info (phase #1
> just a boolean; phase #2, bitmap of occupied counters).
> 
> This is a non-trivial optimization on the PMU context switch. I am
> thinking about splitting them into the following phases:
> 
> 1) lazy PMU context switch, i.e., wait until the guest touches PMU MSR
> for the 1st time.
> 2) fast PMU context switch on KVM side, i.e., KVM checking event
> selector value (enable/disable) and selectively switch PMU state
> (reducing rd/wr msrs)
> 3) dynamic PMU context boundary, ie., KVM can dynamically choose PMU
> context switch boundary depending on existing active host-level
> events.
> 3.1) more accurate dynamic PMU context switch, ie., KVM checking
> host-level counter position and further reduces the number of msr
> accesses.
> 4) guest PMU context preemption, i.e., any new host-level perf
> profiling can immediately preempt the guest PMU in the vcpu loop
> (instead of waiting for the next PMU context switch in KVM).

I'm not quit sure about the 4.
The new host-level perf must be an exclude_guest event. It should not be
scheduled when a guest is using the PMU. Why do we want to preempt the
guest PMU? The current implementation in perf doesn't schedule any
exclude_guest events when a guest is running.

Thanks,
Kan

  parent reply	other threads:[~2024-04-26 14:09 UTC|newest]

Thread overview: 181+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-26  8:54 [RFC PATCH 00/41] KVM: x86/pmu: Introduce passthrough vPM Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 01/41] perf: x86/intel: Support PERF_PMU_CAP_VPMU_PASSTHROUGH Xiong Zhang
2024-04-11 17:04   ` Sean Christopherson
2024-04-11 17:21     ` Liang, Kan
2024-04-11 17:24       ` Jim Mattson
2024-04-11 17:46         ` Sean Christopherson
2024-04-11 19:13           ` Liang, Kan
2024-04-11 20:43             ` Sean Christopherson
2024-04-11 21:04               ` Liang, Kan
2024-04-11 19:32           ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 02/41] perf: Support guest enter/exit interfaces Xiong Zhang
2024-03-20 16:40   ` Raghavendra Rao Ananta
2024-03-20 17:12     ` Liang, Kan
2024-04-11 18:06   ` Sean Christopherson
2024-04-11 19:53     ` Liang, Kan
2024-04-12 19:17       ` Sean Christopherson
2024-04-12 20:56         ` Liang, Kan
2024-04-15 16:03           ` Liang, Kan
2024-04-16  5:34             ` Zhang, Xiong Y
2024-04-16 12:48               ` Liang, Kan
2024-04-17  9:42                 ` Zhang, Xiong Y
2024-04-18 16:11                   ` Sean Christopherson
2024-04-19  1:37                     ` Zhang, Xiong Y
2024-04-26  4:09       ` Zhang, Xiong Y
2024-01-26  8:54 ` [RFC PATCH 03/41] perf: Set exclude_guest onto nmi_watchdog Xiong Zhang
2024-04-11 18:56   ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 04/41] perf: core/x86: Add support to register a new vector for PMI handling Xiong Zhang
2024-04-11 17:10   ` Sean Christopherson
2024-04-11 19:05     ` Sean Christopherson
2024-04-12  3:56     ` Zhang, Xiong Y
2024-04-13  1:17       ` Mi, Dapeng
2024-01-26  8:54 ` [RFC PATCH 05/41] KVM: x86/pmu: Register PMI handler for passthrough PMU Xiong Zhang
2024-04-11 19:07   ` Sean Christopherson
2024-04-12  5:44     ` Zhang, Xiong Y
2024-01-26  8:54 ` [RFC PATCH 06/41] perf: x86: Add function to switch PMI handler Xiong Zhang
2024-04-11 19:17   ` Sean Christopherson
2024-04-11 19:34     ` Sean Christopherson
2024-04-12  6:03       ` Zhang, Xiong Y
2024-04-12  5:57     ` Zhang, Xiong Y
2024-01-26  8:54 ` [RFC PATCH 07/41] perf/x86: Add interface to reflect virtual LVTPC_MASK bit onto HW Xiong Zhang
2024-04-11 19:21   ` Sean Christopherson
2024-04-12  6:17     ` Zhang, Xiong Y
2024-01-26  8:54 ` [RFC PATCH 08/41] KVM: x86/pmu: Add get virtual LVTPC_MASK bit function Xiong Zhang
2024-04-11 19:22   ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 09/41] perf: core/x86: Forbid PMI handler when guest own PMU Xiong Zhang
2024-04-11 19:26   ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 10/41] perf: core/x86: Plumb passthrough PMU capability from x86_pmu to x86_pmu_cap Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 11/41] KVM: x86/pmu: Introduce enable_passthrough_pmu module parameter and propage to KVM instance Xiong Zhang
2024-04-11 20:54   ` Sean Christopherson
2024-04-11 21:03   ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 12/41] KVM: x86/pmu: Plumb through passthrough PMU to vcpu for Intel CPUs Xiong Zhang
2024-04-11 20:57   ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 13/41] KVM: x86/pmu: Add a helper to check if passthrough PMU is enabled Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 14/41] KVM: x86/pmu: Allow RDPMC pass through Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 15/41] KVM: x86/pmu: Manage MSR interception for IA32_PERF_GLOBAL_CTRL Xiong Zhang
2024-04-11 21:21   ` Sean Christopherson
2024-04-11 22:30     ` Jim Mattson
2024-04-11 23:27       ` Sean Christopherson
2024-04-13  2:10       ` Mi, Dapeng
2024-01-26  8:54 ` [RFC PATCH 16/41] KVM: x86/pmu: Create a function prototype to disable MSR interception Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 17/41] KVM: x86/pmu: Implement pmu function for Intel CPU " Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 18/41] KVM: x86/pmu: Intercept full-width GP counter MSRs by checking with perf capabilities Xiong Zhang
2024-04-11 21:23   ` Sean Christopherson
2024-04-11 21:50     ` Jim Mattson
2024-04-12 16:01       ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 19/41] KVM: x86/pmu: Whitelist PMU MSRs for passthrough PMU Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 20/41] KVM: x86/pmu: Introduce PMU operation prototypes for save/restore PMU context Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 21/41] KVM: x86/pmu: Introduce function prototype for Intel CPU to " Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 22/41] x86: Introduce MSR_CORE_PERF_GLOBAL_STATUS_SET for passthrough PMU Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 23/41] KVM: x86/pmu: Implement the save/restore of PMU state for Intel CPU Xiong Zhang
2024-04-11 21:26   ` Sean Christopherson
2024-04-13  2:29     ` Mi, Dapeng
2024-04-11 21:44   ` Sean Christopherson
2024-04-11 22:19     ` Jim Mattson
2024-04-11 23:31       ` Sean Christopherson
2024-04-13  3:19         ` Mi, Dapeng
2024-04-13  3:03     ` Mi, Dapeng
2024-04-13  3:34       ` Mingwei Zhang
2024-04-13  4:25         ` Mi, Dapeng
2024-04-15  6:06           ` Mingwei Zhang
2024-04-15 10:04             ` Mi, Dapeng
2024-04-15 16:44               ` Mingwei Zhang
2024-04-15 17:38                 ` Sean Christopherson
2024-04-15 17:54                   ` Mingwei Zhang
2024-04-15 22:45                     ` Sean Christopherson
2024-04-22  2:14                       ` maobibo
2024-04-22 17:01                         ` Sean Christopherson
2024-04-23  1:01                           ` maobibo
2024-04-23  2:44                             ` Mi, Dapeng
2024-04-23  2:53                               ` maobibo
2024-04-23  3:13                                 ` Mi, Dapeng
2024-04-23  3:26                                   ` maobibo
2024-04-23  3:59                                     ` Mi, Dapeng
2024-04-23  3:55                                   ` maobibo
2024-04-23  4:23                                     ` Mingwei Zhang
2024-04-23  6:08                                       ` maobibo
2024-04-23  6:45                                         ` Mi, Dapeng
2024-04-23  7:10                                           ` Mingwei Zhang
2024-04-23  8:24                                             ` Mi, Dapeng
2024-04-23  8:51                                               ` maobibo
2024-04-23 16:50                                               ` Mingwei Zhang
2024-04-23 12:12                                       ` maobibo
2024-04-23 17:02                                         ` Mingwei Zhang
2024-04-24  1:07                                           ` maobibo
2024-04-24  8:18                                           ` Mi, Dapeng
2024-04-24 15:00                                             ` Sean Christopherson
2024-04-25  3:55                                               ` Mi, Dapeng
2024-04-25  4:24                                                 ` Mingwei Zhang
2024-04-25 16:13                                                   ` Liang, Kan
2024-04-25 20:16                                                     ` Mingwei Zhang
2024-04-25 20:43                                                       ` Liang, Kan
2024-04-25 21:46                                                         ` Sean Christopherson
2024-04-26  1:46                                                           ` Mi, Dapeng
2024-04-26  3:12                                                             ` Mingwei Zhang
2024-04-26  4:02                                                               ` Mi, Dapeng
2024-04-26  4:46                                                                 ` Mingwei Zhang
2024-04-26 14:09                                                               ` Liang, Kan [this message]
2024-04-26 18:41                                                                 ` Mingwei Zhang
2024-04-26 19:06                                                                   ` Liang, Kan
2024-04-26 19:46                                                                     ` Sean Christopherson
2024-04-27  3:04                                                                       ` Mingwei Zhang
2024-04-28  0:58                                                                         ` Mi, Dapeng
2024-04-28  6:01                                                                           ` Mingwei Zhang
2024-04-29 17:44                                                                             ` Sean Christopherson
2024-05-01 17:43                                                                               ` Mingwei Zhang
2024-05-01 18:00                                                                                 ` Liang, Kan
2024-05-01 20:36                                                                                 ` Sean Christopherson
2024-04-29 13:08                                                                         ` Liang, Kan
2024-04-26 13:53                                                           ` Liang, Kan
2024-04-26  1:50                                                         ` Mi, Dapeng
2024-04-18 21:21                   ` Mingwei Zhang
2024-04-18 21:41                     ` Mingwei Zhang
2024-04-19  1:02                     ` Mi, Dapeng
2024-01-26  8:54 ` [RFC PATCH 24/41] KVM: x86/pmu: Zero out unexposed Counters/Selectors to avoid information leakage Xiong Zhang
2024-04-11 21:36   ` Sean Christopherson
2024-04-11 21:56     ` Jim Mattson
2024-01-26  8:54 ` [RFC PATCH 25/41] KVM: x86/pmu: Introduce macro PMU_CAP_PERF_METRICS Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 26/41] KVM: x86/pmu: Add host_perf_cap field in kvm_caps to record host PMU capability Xiong Zhang
2024-04-11 21:49   ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 27/41] KVM: x86/pmu: Clear PERF_METRICS MSR for guest Xiong Zhang
2024-04-11 21:50   ` Sean Christopherson
2024-04-13  3:29     ` Mi, Dapeng
2024-01-26  8:54 ` [RFC PATCH 28/41] KVM: x86/pmu: Switch IA32_PERF_GLOBAL_CTRL at VM boundary Xiong Zhang
2024-04-11 21:54   ` Sean Christopherson
2024-04-11 22:10     ` Jim Mattson
2024-04-11 22:54       ` Sean Christopherson
2024-04-11 23:08         ` Jim Mattson
2024-01-26  8:54 ` [RFC PATCH 29/41] KVM: x86/pmu: Exclude existing vLBR logic from the passthrough PMU Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 30/41] KVM: x86/pmu: Switch PMI handler at KVM context switch boundary Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 31/41] KVM: x86/pmu: Call perf_guest_enter() at PMU context switch Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 32/41] KVM: x86/pmu: Add support for PMU context switch at VM-exit/enter Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 33/41] KVM: x86/pmu: Make check_pmu_event_filter() an exported function Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 34/41] KVM: x86/pmu: Intercept EVENT_SELECT MSR Xiong Zhang
2024-04-11 21:55   ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 35/41] KVM: x86/pmu: Allow writing to event selector for GP counters if event is allowed Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 36/41] KVM: x86/pmu: Intercept FIXED_CTR_CTRL MSR Xiong Zhang
2024-04-11 21:56   ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 37/41] KVM: x86/pmu: Allow writing to fixed counter selector if counter is exposed Xiong Zhang
2024-04-11 22:03   ` Sean Christopherson
2024-04-13  4:12     ` Mi, Dapeng
2024-01-26  8:54 ` [RFC PATCH 38/41] KVM: x86/pmu: Introduce PMU helper to increment counter Xiong Zhang
2024-01-26  8:54 ` [RFC PATCH 39/41] KVM: x86/pmu: Implement emulated counter increment for passthrough PMU Xiong Zhang
2024-04-11 23:12   ` Sean Christopherson
2024-04-11 23:17     ` Sean Christopherson
2024-01-26  8:54 ` [RFC PATCH 40/41] KVM: x86/pmu: Separate passthrough PMU logic in set/get_msr() from non-passthrough vPMU Xiong Zhang
2024-04-11 23:18   ` Sean Christopherson
2024-04-18 21:54     ` Mingwei Zhang
2024-01-26  8:54 ` [RFC PATCH 41/41] KVM: nVMX: Add nested virtualization support for passthrough PMU Xiong Zhang
2024-04-11 23:21   ` Sean Christopherson
2024-04-11 17:03 ` [RFC PATCH 00/41] KVM: x86/pmu: Introduce passthrough vPM Sean Christopherson
2024-04-12  2:19   ` Zhang, Xiong Y
2024-04-12 18:32     ` Sean Christopherson
2024-04-15  1:06       ` Zhang, Xiong Y
2024-04-15 15:05         ` Sean Christopherson
2024-04-16  5:11           ` Zhang, Xiong Y
2024-04-18 20:46   ` Mingwei Zhang
2024-04-18 21:52     ` Mingwei Zhang
2024-04-19 19:14     ` Sean Christopherson
2024-04-19 22:02       ` Mingwei Zhang
2024-04-11 23:25 ` Sean Christopherson
2024-04-11 23:56   ` Mingwei Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ff4a4229-04ac-4cbf-8aea-c84ccfa96e0b@linux.intel.com \
    --to=kan.liang@linux.intel.com \
    --cc=chao.gao@intel.com \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=eranian@google.com \
    --cc=irogers@google.com \
    --cc=jmattson@google.com \
    --cc=kan.liang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=like.xu.linux@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=maobibo@loongson.cn \
    --cc=mizhang@google.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=samantha.alt@intel.com \
    --cc=seanjc@google.com \
    --cc=xiong.y.zhang@linux.intel.com \
    --cc=zhenyuw@linux.intel.com \
    --cc=zhiyuan.lv@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).