From: Isaku Yamahata <isaku.yamahata@intel.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
seanjc@google.com, michael.roth@amd.com,
isaku.yamahata@intel.com, isaku.yamahata@linux.intel.com
Subject: Re: [PATCH 09/11] KVM: guest_memfd: Add interface for populating gmem pages with user data
Date: Tue, 23 Apr 2024 16:50:13 -0700 [thread overview]
Message-ID: <20240423235013.GO3596705@ls.amr.corp.intel.com> (raw)
In-Reply-To: <20240404185034.3184582-10-pbonzini@redhat.com>
On Thu, Apr 04, 2024 at 02:50:31PM -0400,
Paolo Bonzini <pbonzini@redhat.com> wrote:
> During guest run-time, kvm_arch_gmem_prepare() is issued as needed to
> prepare newly-allocated gmem pages prior to mapping them into the guest.
> In the case of SEV-SNP, this mainly involves setting the pages to
> private in the RMP table.
>
> However, for the GPA ranges comprising the initial guest payload, which
> are encrypted/measured prior to starting the guest, the gmem pages need
> to be accessed prior to setting them to private in the RMP table so they
> can be initialized with the userspace-provided data. Additionally, an
> SNP firmware call is needed afterward to encrypt them in-place and
> measure the contents into the guest's launch digest.
>
> While it is possible to bypass the kvm_arch_gmem_prepare() hooks so that
> this handling can be done in an open-coded/vendor-specific manner, this
> may expose more gmem-internal state/dependencies to external callers
> than necessary. Try to avoid this by implementing an interface that
> tries to handle as much of the common functionality inside gmem as
> possible, while also making it generic enough to potentially be
> usable/extensible for TDX as well.
I explored how TDX would use this hook, but it ended up not using the hook at all; instead TDX uses kvm_tdp_mmu_get_walk() with a twist. The patch is below.

Because SEV-SNP manages the RMP, which is not tied directly to the NPT, SEV-SNP can ignore the TDP MMU page tables when updating the RMP. TDX, on the other hand, essentially updates the Secure-EPT when it adds a page to the guest via TDH.MEM.PAGE.ADD(), so it needs to protect the KVM TDP MMU page tables with mmu_lock, not the guest_memfd file mapping with invalidate_lock. The hook therefore doesn't fit TDX well. The resulting KVM_TDX_INIT_MEM_REGION logic is as follows:
  get_user_pages_fast(source addr)
  read_lock(mmu_lock)
  kvm_tdp_mmu_get_walk_private_pfn(vcpu, gpa, &pfn);
  if the page table doesn't map gpa, error
  TDH.MEM.PAGE.ADD()
  TDH.MR.EXTEND()
  read_unlock(mmu_lock)
  put_page()
From 7d4024049b51969a2431805c2117992fc7ec0981 Mon Sep 17 00:00:00 2001
Message-ID: <7d4024049b51969a2431805c2117992fc7ec0981.1713913379.git.isaku.yamahata@intel.com>
In-Reply-To: <cover.1713913379.git.isaku.yamahata@intel.com>
References: <cover.1713913379.git.isaku.yamahata@intel.com>
From: Isaku Yamahata <isaku.yamahata@intel.com>
Date: Tue, 23 Apr 2024 11:33:44 -0700
Subject: [PATCH] KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU
KVM_TDX_INIT_MEM_REGION needs to check whether the given GFN is already
populated. Add wrapping logic to kvm_tdp_mmu_get_walk() and export it.

The alternatives considered are listed below; the approach in this patch
was chosen as the least intrusive change.

- Refactor the KVM page fault handler into a populating part and an
  unlock function, so that the handler populates while keeping mmu_lock
  held, TDH.MEM.PAGE.ADD() runs, and then the lock is released.

- Add a callback function to struct kvm_page_fault and call it from the
  page fault handler, before unlocking mmu_lock and releasing the PFN.
Based on the feedback of
https://lore.kernel.org/kvm/ZfBkle1eZFfjPI8l@google.com/
https://lore.kernel.org/kvm/Zh8DHbb8FzoVErgX@google.com/
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
arch/x86/kvm/mmu.h | 3 +++
arch/x86/kvm/mmu/tdp_mmu.c | 44 ++++++++++++++++++++++++++++++++------
2 files changed, 40 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 712e9408f634..4f61f4b9fd64 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -287,6 +287,9 @@ extern bool tdp_mmu_enabled;
#define tdp_mmu_enabled false
#endif
+int kvm_tdp_mmu_get_walk_private_pfn(struct kvm_vcpu *vcpu, u64 gpa,
+ kvm_pfn_t *pfn);
+
static inline bool kvm_memslots_have_rmaps(struct kvm *kvm)
{
return !tdp_mmu_enabled || kvm_shadow_root_allocated(kvm);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 3592ae4e485f..bafcd8aeb3b3 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -2035,14 +2035,25 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
*
* Must be called between kvm_tdp_mmu_walk_lockless_{begin,end}.
*/
-int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
- int *root_level)
+static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+ bool is_private)
{
struct tdp_iter iter;
struct kvm_mmu *mmu = vcpu->arch.mmu;
gfn_t gfn = addr >> PAGE_SHIFT;
int leaf = -1;
+ tdp_mmu_for_each_pte(iter, mmu, is_private, gfn, gfn + 1) {
+ leaf = iter.level;
+ sptes[leaf] = iter.old_spte;
+ }
+
+ return leaf;
+}
+
+int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+ int *root_level)
+{
*root_level = vcpu->arch.mmu->root_role.level;
/*
@@ -2050,15 +2061,34 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
* instructions in protected guest memory can't be parsed by VMM.
*/
if (WARN_ON_ONCE(kvm_gfn_shared_mask(vcpu->kvm)))
- return leaf;
+ return -1;
- tdp_mmu_for_each_pte(iter, mmu, false, gfn, gfn + 1) {
- leaf = iter.level;
- sptes[leaf] = iter.old_spte;
+ return __kvm_tdp_mmu_get_walk(vcpu, addr, sptes, false);
+}
+
+int kvm_tdp_mmu_get_walk_private_pfn(struct kvm_vcpu *vcpu, u64 gpa,
+ kvm_pfn_t *pfn)
+{
+ u64 sptes[PT64_ROOT_MAX_LEVEL + 1], spte;
+ int leaf;
+
+ lockdep_assert_held(&vcpu->kvm->mmu_lock);
+
+ kvm_tdp_mmu_walk_lockless_begin();
+ leaf = __kvm_tdp_mmu_get_walk(vcpu, gpa, sptes, true);
+ kvm_tdp_mmu_walk_lockless_end();
+ if (leaf < 0)
+ return -ENOENT;
+
+ spte = sptes[leaf];
+ if (is_shadow_present_pte(spte) && is_last_spte(spte, leaf)) {
+ *pfn = spte_to_pfn(spte);
+ return leaf;
}
- return leaf;
+ return -ENOENT;
}
+EXPORT_SYMBOL_GPL(kvm_tdp_mmu_get_walk_private_pfn);
/*
* Returns the last level spte pointer of the shadow page walk for the given
--
2.43.2
--
Isaku Yamahata <isaku.yamahata@intel.com>