* [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
@ 2024-04-21 18:01 Michael Roth
From: Michael Roth @ 2024-04-21 18:01 UTC
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

This patchset is also available at:

  https://github.com/amdese/linux/commits/snp-host-v14

and is based on commit 20cc50a0410f (just before the v13 SNP patches) from:

  https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=kvm-coco-queue


Patch Layout
------------

01-04: These patches add some basic infrastructure and introduce a new
       KVM_X86_SNP_VM vm_type to handle differences versus the existing
       KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types.

05-07: These implement the KVM API to handle the creation of a
       cryptographic launch context, encrypt/measure the initial image
       into guest memory, and finalize the launch context before the
       guest is started.

08-13: These implement handling for various guest-generated events such
       as page state changes, onlining of additional vCPUs, etc.

14-17: These implement the gmem hooks needed to prepare gmem-allocated
       pages before mapping them into guest private memory ranges, as
       well as to clean them up prior to returning them to the host for
       use as normal memory. Because this supplants certain activities
       like issuing WBINVDs during KVM MMU invalidations, there's also
       a patch to avoid duplicating that work and incurring unnecessary
       overhead.

18:    With all the core support in place, this patch adds a kvm_amd
       module parameter to enable SNP support.

19-22: These patches all deal with the servicing of guest requests to handle
       things like attestation, as well as some related host-management
       interfaces.


Testing
-------

For testing this via QEMU, use the following tree:

  https://github.com/amdese/qemu/commits/snp-v4-wip3b

A patched OVMF is also needed due to upstream KVM no longer supporting MMIO
ranges that are mapped as private. It is recommended you build the AmdSevX64
variant as it provides the kernel-hashing support present in this series:

  https://github.com/amdese/ovmf/commits/apic-mmio-fix1d

A basic command-line invocation for SNP would be:

 qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2
  -machine q35,confidential-guest-support=sev0,memory-backend=ram1
  -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false
  -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=
  -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd

With kernel-hashing and certificate data supplied:

 qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2
  -machine q35,confidential-guest-support=sev0,memory-backend=ram1
  -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false
  -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=,certs-path=/home/mroth/cert.blob,kernel-hashes=on
  -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd
  -kernel /boot/vmlinuz-$ver
  -initrd /boot/initrd.img-$ver
  -append "root=UUID=d72a6d1c-06cf-4b79-af43-f1bac4f620f9 ro console=ttyS0,115200n8"

With a standard X64 OVMF package and a separate image for persistent NVRAM:

 qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2
  -machine q35,confidential-guest-support=sev0,memory-backend=ram1
  -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false
  -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=
  -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d.fd 
  -drive if=pflash,format=raw,unit=0,file=OVMF_VARS-upstream-20240410-apic-mmio-fix1d.fd,readonly=off


Known issues / TODOs
--------------------

 * Base tree in some cases reports "Unpatched return thunk in use. This should
   not happen!" the first time it runs an SVM/SEV/SNP guest. This is a recent
   upstream regression unrelated to this series:

     https://lore.kernel.org/linux-kernel/CANpmjNOcKzEvLHoGGeL-boWDHJobwfwyVxUqMq2kWeka3N4tXA@mail.gmail.com/T/

 * 2MB hugepage support has been dropped pending discussion on how we plan to
   re-enable it in gmem.

 * Host kexec should work, but there is a known issue with host kdump support
   while SNP guests are running that will be addressed as a follow-up.

 * SNP kselftests are currently a WIP and will be included as part of SNP
   upstreaming efforts in the near-term.


SEV-SNP Overview
----------------

This part of the Secure Nested Paging (SEV-SNP) series focuses on the
changes required to add KVM support for SEV-SNP. This series builds upon
SEV-SNP guest support, which is now in mainline, and SEV-SNP host
initialization support, which is now in linux-next.

While this series provides the basic building blocks to support booting
SEV-SNP VMs, it does not cover all the security enhancements introduced
by SEV-SNP, such as interrupt protection, which will be added in the
future.

With SNP, when pages are marked as guest-owned in the RMP table, they are
assigned to a specific guest/ASID, as well as a specific GFN within the
guest. Any attempt to map a page in the RMP table to a different
guest/ASID, or a different GFN within a guest/ASID, will result in an RMP
nested page fault.
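
As a rough illustration of that binding on the host side, here is a
minimal sketch using the RMP helpers this series relies on
(snp_lookup_rmpentry()/rmp_make_private(); error handling and large
pages are elided, and the function name is hypothetical):

  /*
   * Sketch: bind a backing PFN to a specific guest/ASID and GPA as a
   * private 4K page, but only if its RMP entry is still in the
   * default shared state.
   */
  static int example_rmp_assign(u64 pfn, u64 gpa, unsigned int asid)
  {
          bool assigned;
          int level, rc;

          rc = snp_lookup_rmpentry(pfn, &assigned, &level);
          if (rc || assigned)
                  return -EINVAL;

          return rmp_make_private(pfn, gpa, PG_LEVEL_4K, asid, false);
  }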

Prior to accessing a guest-owned page, the guest must validate it with
the PVALIDATE instruction, which sets the Validated bit in the page's RMP
entry. This is the only way to set the Validated bit outside of the
initial pre-encrypted guest payload/image; any attempt outside the guest
to modify the RMP entry from that point forward will result in the
Validated bit being cleared, at which point the guest will trigger an
exception if it attempts to access that page, so it can be made aware of
possible tampering.
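
For reference, the guest issues PVALIDATE via a small wrapper along the
lines of the one in the mainline guest support (a sketch modeled on
arch/x86/include/asm/sev.h; the raw opcode is used since the mnemonic
requires a recent binutils):

  static int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
  {
          bool no_rmpupdate;
          int rc;

          /* PVALIDATE: rax = vaddr, rcx = page size, rdx = validate */
          asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFF\n\t"
                       "setc %[no_rmpupdate]"
                       : [no_rmpupdate] "=qm" (no_rmpupdate), "=a" (rc)
                       : "a" (vaddr), "c" (rmp_psize), "d" (validate)
                       : "memory", "cc");

          /* Carry flag set => RMP entry was already in that state. */
          if (no_rmpupdate)
                  return PVALIDATE_FAIL_NOUPDATE;

          return rc;
  }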

One exception to this is the initial guest payload, which is pre-validated
by the firmware prior to launching. The guest can use Guest Message requests 
to fetch an attestation report which will include the measurement of the
initial image so that the guest can verify it was booted with the expected
image/environment.
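
From within a booted guest, fetching that report boils down to an ioctl
against the sev-guest driver. A minimal userspace sketch, assuming the
mainline /dev/sev-guest SNP_GET_REPORT interface:

  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/sev-guest.h>

  /* Sketch: fetch an attestation report from inside an SNP guest. */
  static int fetch_attestation_report(struct snp_report_resp *resp)
  {
          struct snp_report_req req = {};         /* user_data: nonce, etc. */
          struct snp_guest_request_ioctl guest_req = {
                  .msg_version = 1,
                  .req_data    = (__u64)(unsigned long)&req,
                  .resp_data   = (__u64)(unsigned long)resp,
          };
          int fd, ret;

          fd = open("/dev/sev-guest", O_RDWR);
          if (fd < 0)
                  return -1;

          /* Encrypted guest message to firmware; report lands in resp->data. */
          ret = ioctl(fd, SNP_GET_REPORT, &guest_req);
          close(fd);
          return ret;
  }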

After boot, guests can use Page State Change requests to switch pages
between shared/hypervisor-owned and private/guest-owned to share data for
things like DMA, virtio buffers, and other GHCB requests.
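
For single pages the guest can use the GHCB MSR protocol directly. A
sketch of one such request, with the MSR encoding per the GHCB spec and
the sev_es_wr_ghcb_msr()/VMGEXIT() helpers assumed from the guest's GHCB
plumbing:

  /*
   * Sketch: request conversion of one GFN via the MSR protocol.
   * GHCBData[55:52] = operation (1 = private, 2 = shared),
   * GHCBData[51:12] = GFN, GHCBData[11:0] = 0x014 (PSC request).
   */
  static void snp_psc_msr_request(unsigned long gfn, enum psc_op op)
  {
          u64 val = GHCB_MSR_PSC_REQ | ((u64)op << 52) |
                    ((gfn & GENMASK_ULL(39, 0)) << 12);

          sev_es_wr_ghcb_msr(val);
          VMGEXIT();
          /* Hypervisor replies in the same MSR with GHCB_MSR_PSC_RESP. */
  }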

In this implementation of SEV-SNP, private guest memory is managed by a new
kernel framework called guest_memfd (gmem). With gmem, a new
KVM_SET_MEMORY_ATTRIBUTES KVM ioctl has been added to tell the KVM
MMU whether a particular GFN should be backed by shared (normal) memory or
private (gmem-allocated) memory. To tie into this, Page State Change
requests are forwarded to userspace via KVM_EXIT_VMGEXIT exits, which will
then issue the corresponding KVM_SET_MEMORY_ATTRIBUTES call to set the
private/shared state in the KVM MMU.
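
On the userspace side, the attribute flip is a single ioctl on the VM fd.
A sketch of what a VMM might issue in response to a private-conversion
request (vm_fd/gpa/size are assumed from surrounding VMM code):

  /* Sketch: mark [gpa, gpa + size) private so the KVM MMU uses gmem. */
  struct kvm_memory_attributes attrs = {
          .address    = gpa,
          .size       = size,     /* must be page-aligned */
          .attributes = KVM_MEMORY_ATTRIBUTE_PRIVATE,   /* 0 => shared */
  };

  if (ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, &attrs) < 0)
          perror("KVM_SET_MEMORY_ATTRIBUTES");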

The gmem / KVM MMU hooks implemented in this series will then update the RMP
table entries for the backing PFNs to set them to guest-owned/private when
mapping private pages into the guest via KVM MMU, or use the normal KVM MMU
handling in the case of shared pages where the corresponding RMP table
entries are left in the default shared/hypervisor-owned state.
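
Conceptually, the two hooks reduce to something like the following
sketch (the actual patches also handle 2MB mappings, reclaim, and error
cases; the function names here are hypothetical):

  /* Sketch: prepare a gmem PFN before mapping it as guest-private. */
  static int example_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn)
  {
          return rmp_make_private(pfn, gfn_to_gpa(gfn), PG_LEVEL_4K,
                                  sev_get_asid(kvm), false);
  }

  /* Sketch: return a gmem PFN to the host as normal/shared memory. */
  static void example_gmem_invalidate(kvm_pfn_t pfn)
  {
          /* If this fails, the page is unsafe to reuse; leak it instead. */
          if (rmp_make_shared(pfn, PG_LEVEL_4K))
                  snp_leak_pages(pfn, 1);
  }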

Feedback/review is very much appreciated!

-Mike


Changes since v13:

 * rebase to new kvm-coco-queue and wire up to PFERR_PRIVATE_ACCESS (Paolo)
 * handle setting kvm->arch.has_private_mem in same location as
   kvm->arch.has_protected_state (Paolo)
 * add flags and additional padding fields to
   snp_launch{start,update,finish} APIs to address alignment and
   expandability (Paolo)
 * update snp_launch_update() to update input struct values to reflect
   current progress of the command in situations where multiple calls are
   needed (Paolo)
 * update snp_launch_update() to avoid copying/accessing 'src' parameter
   when dealing with zero pages. (Paolo)
 * update snp_launch_update() to use u64 as length input parameter instead
   of u32 and adjust padding accordingly
 * modify ordering of SNP_POLICY_MASK_* definitions to be consistent with
   bit order of corresponding flags
 * let firmware handle enforcement of policy bits corresponding to
   user-specified minimum API version
 * add missing "0x" prefixes in pr_debug()'s for snp_launch_start()
 * fix handling of VMSAs during in-place migration (Paolo)

Changes since v12:

 * rebased to latest kvm-coco-queue branch (commit 4d2deb62185f)
 * add more input validation for SNP_LAUNCH_START, especially for handling
   things like MBO/MBZ policy bits, and API major/minor minimums. (Paolo)
 * block SNP KVM instances from being able to run legacy SEV commands (Paolo)
 * don't attempt to measure VMSA for vcpu 0/BSP before the others, let
   userspace deal with the ordering just like with SEV-ES (Paolo)
 * fix up docs for SNP_LAUNCH_FINISH (Paolo)
 * introduce svm->sev_es.snp_has_guest_vmsa flag to better distinguish
   handling for guest-mapped vs non-guest-mapped VMSAs, rename
   'snp_ap_create' flag to 'snp_ap_waiting_for_reset' (Paolo)
 * drop "KVM: SEV: Use a VMSA physical address variable for populating VMCB"
   as it is no longer needed due to above VMSA rework
 * replace pr_debug_ratelimited() messages for RMP #NPFs with a single trace
   event
 * handle transient PSMASH_FAIL_INUSE return codes in kvm_gmem_invalidate(),
   switch to WARN_ON*()'s to indicate remaining error cases are not expected
   and should not be seen in practice. (Paolo)
 * add a cond_resched() in kvm_gmem_invalidate() to avoid soft lock-ups when
   cleaning up large guest memory ranges.
 * rename VLEK_REQUIRED to VCEK_DISABLE; it'd be more applicable if another
   key type ever gets added.
 * don't allow attestation to be paused while an attestation request is
   being processed by firmware (Tom)
 * add missing Documentation entry for SNP_VLEK_LOAD
 * collect Reviewed-by's from Paolo and Tom

Changes since v11:

 * Rebase series on kvm-coco-queue and re-work to leverage more
   infrastructure between SNP/TDX series.
 * Drop KVM_SNP_INIT in favor of the new KVM_SEV_INIT2 interface introduced
   here (Paolo):
     https://lore.kernel.org/lkml/20240318233352.2728327-1-pbonzini@redhat.com/
 * Drop exposure of API fields related to things like VMPL levels, migration
   agents, etc., until they are actually supported/used (Sean)
 * Rework KVM_SEV_SNP_LAUNCH_UPDATE handling to use a new
   kvm_gmem_populate() interface instead of copying data directly into
   gmem-allocated pages (Sean)
 * Add support for SNP_LOAD_VLEK, rework the SNP_SET_CONFIG_{START,END} to
   have simpler semantics that are applicable to management of SNP_LOAD_VLEK
   updates as well, rename interfaces to the now more appropriate
   SNP_{PAUSE,RESUME}_ATTESTATION
 * Fix up documentation wording and do print warnings for
   userspace-triggerable failures (Peter, Sean)
 * Fix a race with AP_CREATION wake-up events (Jacob, Sean)
 * Fix a memory leak with VMSA pages (Sean)
 * Tighten up handling of RMP page faults to better distinguish between real
   and spurious cases (Tom)
 * Various patch/documentation rewording, cleanups, etc.


----------------------------------------------------------------
Ashish Kalra (1):
      KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP

Brijesh Singh (11):
      KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
      KVM: SEV: Add initial SEV-SNP support
      KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
      KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
      KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command
      KVM: SEV: Add support to handle GHCB GPA register VMGEXIT
      KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
      KVM: SEV: Add support to handle Page State Change VMGEXIT
      KVM: SEV: Add support to handle RMP nested page faults
      KVM: SVM: Add module parameter to enable SEV-SNP
      KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event

Michael Roth (8):
      KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y
      KVM: SEV: Add support for GHCB-based termination requests
      KVM: SEV: Implement gmem hook for initializing private pages
      KVM: SEV: Implement gmem hook for invalidating private pages
      KVM: x86: Implement gmem hook for determining max NPT mapping level
      crypto: ccp: Add the SNP_VLEK_LOAD command
      crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
      KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event

Tom Lendacky (2):
      KVM: SEV: Add support to handle AP reset MSR protocol
      KVM: SEV: Support SEV-SNP AP Creation NAE event

 Documentation/virt/coco/sev-guest.rst              |   69 +-
 Documentation/virt/kvm/api.rst                     |   73 +
 .../virt/kvm/x86/amd-memory-encryption.rst         |  110 +-
 arch/x86/include/asm/kvm_host.h                    |    2 +
 arch/x86/include/asm/sev-common.h                  |   22 +-
 arch/x86/include/asm/sev.h                         |   15 +
 arch/x86/include/asm/svm.h                         |    9 +-
 arch/x86/include/uapi/asm/kvm.h                    |   48 +
 arch/x86/kvm/Kconfig                               |    3 +
 arch/x86/kvm/mmu.h                                 |    2 -
 arch/x86/kvm/mmu/mmu.c                             |    1 +
 arch/x86/kvm/svm/sev.c                             | 1447 +++++++++++++++++++-
 arch/x86/kvm/svm/svm.c                             |   44 +-
 arch/x86/kvm/svm/svm.h                             |   50 +
 arch/x86/kvm/trace.h                               |   31 +
 arch/x86/kvm/x86.c                                 |   17 +
 arch/x86/virt/svm/sev.c                            |   80 ++
 drivers/crypto/ccp/sev-dev.c                       |   83 ++
 include/linux/psp-sev.h                            |    4 +-
 include/uapi/linux/kvm.h                           |   28 +
 include/uapi/linux/psp-sev.h                       |   39 +
 include/uapi/linux/sev-guest.h                     |    9 +
 virt/kvm/guest_memfd.c                             |    4 +-
 23 files changed, 2155 insertions(+), 35 deletions(-)




* [PATCH v14 01/22] KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y
From: Michael Roth @ 2024-04-21 18:01 UTC
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

SEV-SNP relies on private memory support to run guests, so make sure to
enable that support via the CONFIG_KVM_GENERIC_PRIVATE_MEM config
option.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index d64fb2b3eb69..5e72faca4e8f 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -136,6 +136,7 @@ config KVM_AMD_SEV
 	depends on KVM_AMD && X86_64
 	depends on CRYPTO_DEV_SP_PSP && !(KVM_AMD=y && CRYPTO_DEV_CCP_DD=m)
 	select ARCH_HAS_CC_PLATFORM
+	select KVM_GENERIC_PRIVATE_MEM
 	help
 	  Provides support for launching Encrypted VMs (SEV) and Encrypted VMs
 	  with Encrypted State (SEV-ES) on AMD processors.
-- 
2.25.1



* [PATCH v14 02/22] KVM: SEV: Add support to handle AP reset MSR protocol
From: Michael Roth @ 2024-04-21 18:01 UTC
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

Add support for AP Reset Hold being invoked using the GHCB MSR protocol,
available in version 2 of the GHCB specification.
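
For context, the guest side of this exchange looks roughly as follows
(a sketch; the sev_es_wr_ghcb_msr()/sev_es_rd_ghcb_msr()/VMGEXIT()
helpers are assumed from the guest's GHCB code):

  /*
   * Sketch: park an AP via the MSR-based AP Reset Hold, then use the
   * result field of the response to distinguish a SIPI wake-up.
   */
  static bool ap_reset_hold_msr_proto(void)
  {
          u64 val;

          sev_es_wr_ghcb_msr(GHCB_MSR_AP_RESET_HOLD_REQ);
          VMGEXIT();

          val = sev_es_rd_ghcb_msr();
          if (GHCB_MSR_INFO(val) != GHCB_MSR_AP_RESET_HOLD_RESP)
                  return false;

          /* Non-zero result => the vCPU was woken by a SIPI. */
          return (val >> GHCB_MSR_AP_RESET_HOLD_RESULT_POS) &
                 GHCB_MSR_AP_RESET_HOLD_RESULT_MASK;
  }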

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev-common.h |  6 ++--
 arch/x86/kvm/svm/sev.c            | 56 ++++++++++++++++++++++++++-----
 arch/x86/kvm/svm/svm.h            |  1 +
 3 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index b463fcbd4b90..01261f7054ad 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -54,8 +54,10 @@
 	(((unsigned long)fn) << 32))
 
 /* AP Reset Hold */
-#define GHCB_MSR_AP_RESET_HOLD_REQ	0x006
-#define GHCB_MSR_AP_RESET_HOLD_RESP	0x007
+#define GHCB_MSR_AP_RESET_HOLD_REQ		0x006
+#define GHCB_MSR_AP_RESET_HOLD_RESP		0x007
+#define GHCB_MSR_AP_RESET_HOLD_RESULT_POS	12
+#define GHCB_MSR_AP_RESET_HOLD_RESULT_MASK	GENMASK_ULL(51, 0)
 
 /* GHCB GPA Register */
 #define GHCB_MSR_REG_GPA_REQ		0x012
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 598d78b4107f..6e31cb408dd8 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -49,6 +49,10 @@ static bool sev_es_debug_swap_enabled = true;
 module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0444);
 static u64 sev_supported_vmsa_features;
 
+#define AP_RESET_HOLD_NONE		0
+#define AP_RESET_HOLD_NAE_EVENT		1
+#define AP_RESET_HOLD_MSR_PROTO		2
+
 static u8 sev_enc_bit;
 static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
@@ -2727,6 +2731,9 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 
 void sev_es_unmap_ghcb(struct vcpu_svm *svm)
 {
+	/* Clear any indication that the vCPU is in a type of AP Reset Hold */
+	svm->sev_es.ap_reset_hold_type = AP_RESET_HOLD_NONE;
+
 	if (!svm->sev_es.ghcb)
 		return;
 
@@ -2938,6 +2945,22 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 				  GHCB_MSR_INFO_POS);
 		break;
 	}
+	case GHCB_MSR_AP_RESET_HOLD_REQ:
+		svm->sev_es.ap_reset_hold_type = AP_RESET_HOLD_MSR_PROTO;
+		ret = kvm_emulate_ap_reset_hold(&svm->vcpu);
+
+		/*
+		 * Preset the result to a non-SIPI return and then only set
+		 * the result to non-zero when delivering a SIPI.
+		 */
+		set_ghcb_msr_bits(svm, 0,
+				  GHCB_MSR_AP_RESET_HOLD_RESULT_MASK,
+				  GHCB_MSR_AP_RESET_HOLD_RESULT_POS);
+
+		set_ghcb_msr_bits(svm, GHCB_MSR_AP_RESET_HOLD_RESP,
+				  GHCB_MSR_INFO_MASK,
+				  GHCB_MSR_INFO_POS);
+		break;
 	case GHCB_MSR_TERM_REQ: {
 		u64 reason_set, reason_code;
 
@@ -3037,6 +3060,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		ret = 1;
 		break;
 	case SVM_VMGEXIT_AP_HLT_LOOP:
+		svm->sev_es.ap_reset_hold_type = AP_RESET_HOLD_NAE_EVENT;
 		ret = kvm_emulate_ap_reset_hold(vcpu);
 		break;
 	case SVM_VMGEXIT_AP_JUMP_TABLE: {
@@ -3280,15 +3304,31 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 		return;
 	}
 
-	/*
-	 * Subsequent SIPI: Return from an AP Reset Hold VMGEXIT, where
-	 * the guest will set the CS and RIP. Set SW_EXIT_INFO_2 to a
-	 * non-zero value.
-	 */
-	if (!svm->sev_es.ghcb)
-		return;
+	/* Subsequent SIPI */
+	switch (svm->sev_es.ap_reset_hold_type) {
+	case AP_RESET_HOLD_NAE_EVENT:
+		/*
+		 * Return from an AP Reset Hold VMGEXIT, where the guest will
+		 * set the CS and RIP. Set SW_EXIT_INFO_2 to a non-zero value.
+		 */
+		ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
+		break;
+	case AP_RESET_HOLD_MSR_PROTO:
+		/*
+		 * Return from an AP Reset Hold VMGEXIT, where the guest will
+		 * set the CS and RIP. Set GHCB data field to a non-zero value.
+		 */
+		set_ghcb_msr_bits(svm, 1,
+				  GHCB_MSR_AP_RESET_HOLD_RESULT_MASK,
+				  GHCB_MSR_AP_RESET_HOLD_RESULT_POS);
 
-	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
+		set_ghcb_msr_bits(svm, GHCB_MSR_AP_RESET_HOLD_RESP,
+				  GHCB_MSR_INFO_MASK,
+				  GHCB_MSR_INFO_POS);
+		break;
+	default:
+		break;
+	}
 }
 
 struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 323901782547..6fd0f5862681 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -199,6 +199,7 @@ struct vcpu_sev_es_state {
 	u8 valid_bitmap[16];
 	struct kvm_host_map ghcb_map;
 	bool received_first_sipi;
+	unsigned int ap_reset_hold_type;
 
 	/* SEV-ES scratch area support */
 	u64 sw_scratch;
-- 
2.25.1



* [PATCH v14 03/22] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
From: Michael Roth @ 2024-04-21 18:01 UTC
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

Version 2 of the GHCB specification introduced advertisement of features
that are supported by the Hypervisor.

Now that KVM supports version 2 of the GHCB specification, bump the
maximum supported protocol version.
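
On the guest side, the query this enables looks roughly like the
following sketch (modeled on the guest's feature negotiation; the MSR
read/write and VMGEXIT() helpers are assumed):

  /* Sketch: query the hypervisor feature bitmap over the MSR protocol. */
  static u64 query_hv_features(void)
  {
          u64 val;

          sev_es_wr_ghcb_msr(GHCB_MSR_HV_FT_REQ);
          VMGEXIT();

          val = sev_es_rd_ghcb_msr();
          if (GHCB_MSR_INFO(val) != GHCB_MSR_HV_FT_RESP)
                  return 0;

          /* e.g. GHCB_HV_FT_SNP set here means the host supports SNP. */
          return GHCB_MSR_HV_FT_RESP_VAL(val);
  }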

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev-common.h |  2 ++
 arch/x86/kvm/svm/sev.c            | 16 +++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 01261f7054ad..5a8246dd532f 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -101,6 +101,8 @@ enum psc_op {
 /* GHCB Hypervisor Feature Request/Response */
 #define GHCB_MSR_HV_FT_REQ		0x080
 #define GHCB_MSR_HV_FT_RESP		0x081
+#define GHCB_MSR_HV_FT_POS		12
+#define GHCB_MSR_HV_FT_MASK		GENMASK_ULL(51, 0)
 #define GHCB_MSR_HV_FT_RESP_VAL(v)			\
 	/* GHCBData[63:12] */				\
 	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6e31cb408dd8..1d2264e93afe 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -33,9 +33,11 @@
 #include "cpuid.h"
 #include "trace.h"
 
-#define GHCB_VERSION_MAX	1ULL
+#define GHCB_VERSION_MAX	2ULL
 #define GHCB_VERSION_MIN	1ULL
 
+#define GHCB_HV_FT_SUPPORTED	GHCB_HV_FT_SNP
+
 /* enable/disable SEV support */
 static bool sev_enabled = true;
 module_param_named(sev, sev_enabled, bool, 0444);
@@ -2701,6 +2703,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_AP_HLT_LOOP:
 	case SVM_VMGEXIT_AP_JUMP_TABLE:
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
+	case SVM_VMGEXIT_HV_FEATURES:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -2961,6 +2964,12 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 				  GHCB_MSR_INFO_MASK,
 				  GHCB_MSR_INFO_POS);
 		break;
+	case GHCB_MSR_HV_FT_REQ:
+		set_ghcb_msr_bits(svm, GHCB_HV_FT_SUPPORTED,
+				  GHCB_MSR_HV_FT_MASK, GHCB_MSR_HV_FT_POS);
+		set_ghcb_msr_bits(svm, GHCB_MSR_HV_FT_RESP,
+				  GHCB_MSR_INFO_MASK, GHCB_MSR_INFO_POS);
+		break;
 	case GHCB_MSR_TERM_REQ: {
 		u64 reason_set, reason_code;
 
@@ -3085,6 +3094,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		ret = 1;
 		break;
 	}
+	case SVM_VMGEXIT_HV_FEATURES:
+		ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, GHCB_HV_FT_SUPPORTED);
+
+		ret = 1;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
-- 
2.25.1



* [PATCH v14 04/22] KVM: SEV: Add initial SEV-SNP support
From: Michael Roth @ 2024-04-21 18:01 UTC
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
new hardware-based security protection. SEV-SNP adds strong memory
encryption and integrity protection to help prevent malicious
hypervisor-based attacks such as data replay, memory re-mapping, and
more, to create an isolated execution environment.

Define a new KVM_X86_SNP_VM type which makes use of these capabilities
and extend the KVM_SEV_INIT2 ioctl to support it. Also add a basic
helper to check whether SNP is enabled and set PFERR_PRIVATE_ACCESS for
private #NPFs so they are handled appropriately by KVM MMU.
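
From userspace, selecting the new VM type looks roughly like this
sketch (kvm_fd/sev_fd setup and the remaining fields of struct
kvm_sev_init are elided/assumed):

  /* Sketch: create a VM of the new SNP type and initialize it. */
  int vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, KVM_X86_SNP_VM);
  struct kvm_sev_init init = {};          /* default vmsa_features */
  struct kvm_sev_cmd cmd = {
          .id     = KVM_SEV_INIT2,
          .data   = (__u64)(unsigned long)&init,
          .sev_fd = sev_fd,               /* open fd to /dev/sev */
  };

  if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd) < 0)
          perror("KVM_SEV_INIT2");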

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/svm.h      |  3 ++-
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/svm/sev.c          | 21 ++++++++++++++++++++-
 arch/x86/kvm/svm/svm.c          |  8 +++++++-
 arch/x86/kvm/svm/svm.h          | 12 ++++++++++++
 5 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 728c98175b9c..544a43c1cf11 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -285,7 +285,8 @@ static_assert((X2AVIC_MAX_PHYSICAL_ID & AVIC_PHYSICAL_MAX_INDEX_MASK) == X2AVIC_
 
 #define AVIC_HPA_MASK	~((0xFFFULL << 52) | 0xFFF)
 
-#define SVM_SEV_FEAT_DEBUG_SWAP                        BIT(5)
+#define SVM_SEV_FEAT_SNP_ACTIVE				BIT(0)
+#define SVM_SEV_FEAT_DEBUG_SWAP				BIT(5)
 
 struct vmcb_seg {
 	u16 selector;
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 72ad5ace118d..9a8b81d20314 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -872,5 +872,6 @@ struct kvm_hyperv_eventfd {
 #define KVM_X86_SW_PROTECTED_VM	1
 #define KVM_X86_SEV_VM		2
 #define KVM_X86_SEV_ES_VM	3
+#define KVM_X86_SNP_VM		4
 
 #endif /* _ASM_X86_KVM_H */
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 1d2264e93afe..c41cc73a1efe 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -46,6 +46,9 @@ module_param_named(sev, sev_enabled, bool, 0444);
 static bool sev_es_enabled = true;
 module_param_named(sev_es, sev_es_enabled, bool, 0444);
 
+/* enable/disable SEV-SNP support */
+static bool sev_snp_enabled;
+
 /* enable/disable SEV-ES DebugSwap support */
 static bool sev_es_debug_swap_enabled = true;
 module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0444);
@@ -275,6 +278,9 @@ static int __sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp,
 	sev->es_active = es_active;
 	sev->vmsa_features = data->vmsa_features;
 
+	if (vm_type == KVM_X86_SNP_VM)
+		sev->vmsa_features |= SVM_SEV_FEAT_SNP_ACTIVE;
+
 	ret = sev_asid_new(sev);
 	if (ret)
 		goto e_no_asid;
@@ -326,7 +332,8 @@ static int sev_guest_init2(struct kvm *kvm, struct kvm_sev_cmd *argp)
 		return -EINVAL;
 
 	if (kvm->arch.vm_type != KVM_X86_SEV_VM &&
-	    kvm->arch.vm_type != KVM_X86_SEV_ES_VM)
+	    kvm->arch.vm_type != KVM_X86_SEV_ES_VM &&
+	    kvm->arch.vm_type != KVM_X86_SNP_VM)
 		return -EINVAL;
 
 	if (copy_from_user(&data, u64_to_user_ptr(argp->data), sizeof(data)))
@@ -2306,11 +2313,16 @@ void __init sev_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_SEV_ES);
 		kvm_caps.supported_vm_types |= BIT(KVM_X86_SEV_ES_VM);
 	}
+	if (sev_snp_enabled) {
+		kvm_cpu_cap_set(X86_FEATURE_SEV_SNP);
+		kvm_caps.supported_vm_types |= BIT(KVM_X86_SNP_VM);
+	}
 }
 
 void __init sev_hardware_setup(void)
 {
 	unsigned int eax, ebx, ecx, edx, sev_asid_count, sev_es_asid_count;
+	bool sev_snp_supported = false;
 	bool sev_es_supported = false;
 	bool sev_supported = false;
 
@@ -2391,6 +2403,7 @@ void __init sev_hardware_setup(void)
 	sev_es_asid_count = min_sev_asid - 1;
 	WARN_ON_ONCE(misc_cg_set_capacity(MISC_CG_RES_SEV_ES, sev_es_asid_count));
 	sev_es_supported = true;
+	sev_snp_supported = sev_snp_enabled && cc_platform_has(CC_ATTR_HOST_SEV_SNP);
 
 out:
 	if (boot_cpu_has(X86_FEATURE_SEV))
@@ -2403,9 +2416,15 @@ void __init sev_hardware_setup(void)
 		pr_info("SEV-ES %s (ASIDs %u - %u)\n",
 			sev_es_supported ? "enabled" : "disabled",
 			min_sev_asid > 1 ? 1 : 0, min_sev_asid - 1);
+	if (boot_cpu_has(X86_FEATURE_SEV_SNP))
+		pr_info("SEV-SNP %s (ASIDs %u - %u)\n",
+			sev_snp_supported ? "enabled" : "disabled",
+			min_sev_asid > 1 ? 1 : 0, min_sev_asid - 1);
 
 	sev_enabled = sev_supported;
 	sev_es_enabled = sev_es_supported;
+	sev_snp_enabled = sev_snp_supported;
+
 	if (!sev_es_enabled || !cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP) ||
 	    !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP))
 		sev_es_debug_swap_enabled = false;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 535018f152a3..422b452fbc3b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2056,6 +2056,9 @@ static int npf_interception(struct kvm_vcpu *vcpu)
 	if (WARN_ON_ONCE(error_code & PFERR_SYNTHETIC_MASK))
 		error_code &= ~PFERR_SYNTHETIC_MASK;
 
+	if (sev_snp_guest(vcpu->kvm) && (error_code & PFERR_GUEST_ENC_MASK))
+		error_code |= PFERR_PRIVATE_ACCESS;
+
 	trace_kvm_page_fault(vcpu, fault_address, error_code);
 	return kvm_mmu_page_fault(vcpu, fault_address, error_code,
 			static_cpu_has(X86_FEATURE_DECODEASSISTS) ?
@@ -4899,8 +4902,11 @@ static int svm_vm_init(struct kvm *kvm)
 
 	if (type != KVM_X86_DEFAULT_VM &&
 	    type != KVM_X86_SW_PROTECTED_VM) {
-		kvm->arch.has_protected_state = (type == KVM_X86_SEV_ES_VM);
+		kvm->arch.has_protected_state =
+			(type == KVM_X86_SEV_ES_VM || type == KVM_X86_SNP_VM);
 		to_kvm_sev_info(kvm)->need_init = true;
+
+		kvm->arch.has_private_mem = (type == KVM_X86_SNP_VM);
 	}
 
 	if (!pause_filter_count || !pause_filter_thresh)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6fd0f5862681..7f2e9c7fc4ca 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -348,6 +348,18 @@ static __always_inline bool sev_es_guest(struct kvm *kvm)
 #endif
 }
 
+static __always_inline bool sev_snp_guest(struct kvm *kvm)
+{
+#ifdef CONFIG_KVM_AMD_SEV
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+
+	return (sev->vmsa_features & SVM_SEV_FEAT_SNP_ACTIVE) &&
+	       !WARN_ON_ONCE(!sev_es_guest(kvm));
+#else
+	return false;
+#endif
+}
+
 static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
 {
 	vmcb->control.clean = 0;
-- 
2.25.1



* [PATCH v14 05/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
From: Michael Roth @ 2024-04-21 18:01 UTC
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

KVM_SEV_SNP_LAUNCH_START begins the launch process for an SEV-SNP guest.
The command initializes a cryptographic digest context used to construct
the measurement of the guest. Other commands can then at that point be
used to load/encrypt data into the guest's initial launch image.

For more information see the SEV-SNP specification.
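
A sketch of the userspace invocation, using the policy bits this patch
defines (the SMT bit and the reserved must-be-one bit are required by
the checks below; vm_fd/sev_fd are assumed):

  struct kvm_sev_snp_launch_start start = {
          .policy = (1ULL << 17) |        /* reserved, must be one */
                    (1ULL << 16),         /* SMT allowed */
  };
  struct kvm_sev_cmd cmd = {
          .id     = KVM_SEV_SNP_LAUNCH_START,
          .data   = (__u64)(unsigned long)&start,
          .sev_fd = sev_fd,
  };

  if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd) < 0)
          perror("KVM_SEV_SNP_LAUNCH_START");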

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/x86/amd-memory-encryption.rst    |  28 ++-
 arch/x86/include/uapi/asm/kvm.h               |  11 +
 arch/x86/kvm/svm/sev.c                        | 195 +++++++++++++++++-
 arch/x86/kvm/svm/svm.h                        |   1 +
 4 files changed, 231 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
index 3381556d596d..d4c4a0b90bc9 100644
--- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
@@ -459,6 +459,30 @@ issued by the hypervisor to make the guest ready for execution.
 
 Returns: 0 on success, -negative on error
 
+18. KVM_SEV_SNP_LAUNCH_START
+----------------------------
+
+The KVM_SEV_SNP_LAUNCH_START command is used for creating the memory
+encryption context for the SEV-SNP guest. It must be called prior to issuing
+KVM_SEV_SNP_LAUNCH_UPDATE or KVM_SEV_SNP_LAUNCH_FINISH.
+
+Parameters (in): struct  kvm_sev_snp_launch_start
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_snp_launch_start {
+                __u64 policy;           /* Guest policy to use. */
+                __u8 gosvw[16];         /* Guest OS visible workarounds. */
+                __u16 flags;            /* Must be zero. */
+                __u8 pad0[6];
+                __u64 pad1[4];
+        };
+
+See SNP_LAUNCH_START in the SEV-SNP specification [snp-fw-abi]_ for further
+details on the input parameters in ``struct kvm_sev_snp_launch_start``.
+
 Device attribute API
 ====================
 
@@ -490,9 +514,11 @@ References
 ==========
 
 
-See [white-paper]_, [api-spec]_, [amd-apm]_ and [kvm-forum]_ for more info.
+See [white-paper]_, [api-spec]_, [amd-apm]_, [kvm-forum]_, and [snp-fw-abi]_
+for more info.
 
 .. [white-paper] https://developer.amd.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf
 .. [api-spec] https://support.amd.com/TechDocs/55766_SEV-KM_API_Specification.pdf
 .. [amd-apm] https://support.amd.com/TechDocs/24593.pdf (section 15.34)
 .. [kvm-forum]  https://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf
+.. [snp-fw-abi] https://www.amd.com/system/files/TechDocs/56860.pdf
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 9a8b81d20314..5765391f0fdb 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -697,6 +697,9 @@ enum sev_cmd_id {
 	/* Second time is the charm; improved versions of the above ioctls.  */
 	KVM_SEV_INIT2,
 
+	/* SNP-specific commands */
+	KVM_SEV_SNP_LAUNCH_START = 100,
+
 	KVM_SEV_NR_MAX,
 };
 
@@ -822,6 +825,14 @@ struct kvm_sev_receive_update_data {
 	__u32 pad2;
 };
 
+struct kvm_sev_snp_launch_start {
+	__u64 policy;
+	__u8 gosvw[16];
+	__u16 flags;
+	__u8 pad0[6];
+	__u64 pad1[4];
+};
+
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
 #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c41cc73a1efe..9d08d1202544 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -25,6 +25,7 @@
 #include <asm/fpu/xcr.h>
 #include <asm/fpu/xstate.h>
 #include <asm/debugreg.h>
+#include <asm/sev.h>
 
 #include "mmu.h"
 #include "x86.h"
@@ -58,6 +59,21 @@ static u64 sev_supported_vmsa_features;
 #define AP_RESET_HOLD_NAE_EVENT		1
 #define AP_RESET_HOLD_MSR_PROTO		2
 
+/* As defined by SEV-SNP Firmware ABI, under "Guest Policy". */
+#define SNP_POLICY_MASK_API_MINOR	GENMASK_ULL(7, 0)
+#define SNP_POLICY_MASK_API_MAJOR	GENMASK_ULL(15, 8)
+#define SNP_POLICY_MASK_SMT		BIT_ULL(16)
+#define SNP_POLICY_MASK_RSVD_MBO	BIT_ULL(17)
+#define SNP_POLICY_MASK_DEBUG		BIT_ULL(19)
+#define SNP_POLICY_MASK_SINGLE_SOCKET	BIT_ULL(20)
+
+#define SNP_POLICY_MASK_VALID		(SNP_POLICY_MASK_API_MINOR	| \
+					 SNP_POLICY_MASK_API_MAJOR	| \
+					 SNP_POLICY_MASK_SMT		| \
+					 SNP_POLICY_MASK_RSVD_MBO	| \
+					 SNP_POLICY_MASK_DEBUG		| \
+					 SNP_POLICY_MASK_SINGLE_SOCKET)
+
 static u8 sev_enc_bit;
 static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
@@ -68,6 +84,8 @@ static unsigned int nr_asids;
 static unsigned long *sev_asid_bitmap;
 static unsigned long *sev_reclaim_asid_bitmap;
 
+static int snp_decommission_context(struct kvm *kvm);
+
 struct enc_region {
 	struct list_head list;
 	unsigned long npages;
@@ -94,12 +112,17 @@ static int sev_flush_asids(unsigned int min_asid, unsigned int max_asid)
 	down_write(&sev_deactivate_lock);
 
 	wbinvd_on_all_cpus();
-	ret = sev_guest_df_flush(&error);
+
+	if (sev_snp_enabled)
+		ret = sev_do_cmd(SEV_CMD_SNP_DF_FLUSH, NULL, &error);
+	else
+		ret = sev_guest_df_flush(&error);
 
 	up_write(&sev_deactivate_lock);
 
 	if (ret)
-		pr_err("SEV: DF_FLUSH failed, ret=%d, error=%#x\n", ret, error);
+		pr_err("SEV%s: DF_FLUSH failed, ret=%d, error=%#x\n",
+		       sev_snp_enabled ? "-SNP" : "", ret, error);
 
 	return ret;
 }
@@ -1976,6 +1999,125 @@ int sev_dev_get_attr(u32 group, u64 attr, u64 *val)
 	}
 }
 
+/*
+ * The guest context contains all the information, keys and metadata
+ * associated with the guest that the firmware tracks to implement SEV
+ * and SNP features. The firmware stores the guest context in hypervisor
+ * provide page via the SNP_GCTX_CREATE command.
+ */
+static void *snp_context_create(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct sev_data_snp_addr data = {};
+	void *context;
+	int rc;
+
+	/* Allocate memory for context page */
+	context = snp_alloc_firmware_page(GFP_KERNEL_ACCOUNT);
+	if (!context)
+		return NULL;
+
+	data.address = __psp_pa(context);
+	rc = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_GCTX_CREATE, &data, &argp->error);
+	if (rc) {
+		pr_warn("Failed to create SEV-SNP context, rc %d fw_error %d",
+			rc, argp->error);
+		snp_free_firmware_page(context);
+		return NULL;
+	}
+
+	return context;
+}
+
+static int snp_bind_asid(struct kvm *kvm, int *error)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_snp_activate data = {0};
+
+	data.gctx_paddr = __psp_pa(sev->snp_context);
+	data.asid = sev_get_asid(kvm);
+	return sev_issue_cmd(kvm, SEV_CMD_SNP_ACTIVATE, &data, error);
+}
+
+static int snp_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_snp_launch_start start = {0};
+	struct kvm_sev_snp_launch_start params;
+	int rc;
+
+	if (!sev_snp_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
+		return -EFAULT;
+
+	/* Don't allow userspace to allocate memory for more than 1 SNP context. */
+	if (sev->snp_context) {
+		pr_debug("%s: SEV-SNP context already exists. Refusing to allocate an additional one.\n",
+			 __func__);
+		return -EINVAL;
+	}
+
+	sev->snp_context = snp_context_create(kvm, argp);
+	if (!sev->snp_context)
+		return -ENOTTY;
+
+	if (params.flags) {
+		pr_debug("%s: SEV-SNP hypervisor does not support requested flags 0x%x\n",
+			 __func__, params.flags);
+		return -EINVAL;
+	}
+
+	if (params.policy & ~SNP_POLICY_MASK_VALID) {
+		pr_debug("%s: SEV-SNP hypervisor does not support requested policy 0x%llx (supported 0x%llx).\n",
+			 __func__, params.policy, SNP_POLICY_MASK_VALID);
+		return -EINVAL;
+	}
+
+	if (!(params.policy & SNP_POLICY_MASK_RSVD_MBO)) {
+		pr_debug("%s: SEV-SNP hypervisor does not support requested policy 0x%llx (must be set 0x%llx).\n",
+			 __func__, params.policy, SNP_POLICY_MASK_RSVD_MBO);
+		return -EINVAL;
+	}
+
+	if (params.policy & SNP_POLICY_MASK_SINGLE_SOCKET) {
+		pr_debug("%s: SEV-SNP hypervisor does not support limiting guests to a single socket.\n",
+			 __func__);
+		return -EINVAL;
+	}
+
+	if (!(params.policy & SNP_POLICY_MASK_SMT)) {
+		pr_debug("%s: SEV-SNP hypervisor does not support limiting guests to a single SMT thread.\n",
+			 __func__);
+		return -EINVAL;
+	}
+
+	start.gctx_paddr = __psp_pa(sev->snp_context);
+	start.policy = params.policy;
+	memcpy(start.gosvw, params.gosvw, sizeof(params.gosvw));
+	rc = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_LAUNCH_START, &start, &argp->error);
+	if (rc) {
+		pr_debug("%s: SEV_CMD_SNP_LAUNCH_START firmware command failed, rc %d\n",
+			 __func__, rc);
+		goto e_free_context;
+	}
+
+	sev->fd = argp->sev_fd;
+	rc = snp_bind_asid(kvm, &argp->error);
+	if (rc) {
+		pr_debug("%s: Failed to bind ASID to SEV-SNP context, rc %d\n",
+			 __func__, rc);
+		goto e_free_context;
+	}
+
+	return 0;
+
+e_free_context:
+	snp_decommission_context(kvm);
+
+	return rc;
+}
+
 int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -1999,6 +2141,15 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 		goto out;
 	}
 
+	/*
+	 * Once KVM_SEV_INIT2 initializes a KVM instance as an SNP guest, only
+	 * allow the use of SNP-specific commands.
+	 */
+	if (sev_snp_guest(kvm) && sev_cmd.id < KVM_SEV_SNP_LAUNCH_START) {
+		r = -EPERM;
+		goto out;
+	}
+
 	switch (sev_cmd.id) {
 	case KVM_SEV_ES_INIT:
 		if (!sev_es_enabled) {
@@ -2063,6 +2214,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_RECEIVE_FINISH:
 		r = sev_receive_finish(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SNP_LAUNCH_START:
+		r = snp_launch_start(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
@@ -2258,6 +2412,33 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 	return ret;
 }
 
+static int snp_decommission_context(struct kvm *kvm)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_snp_addr data = {};
+	int ret;
+
+	/* If context is not created then do nothing */
+	if (!sev->snp_context)
+		return 0;
+
+	data.address = __sme_pa(sev->snp_context);
+	down_write(&sev_deactivate_lock);
+	ret = sev_do_cmd(SEV_CMD_SNP_DECOMMISSION, &data, NULL);
+	if (WARN_ONCE(ret, "failed to release guest context")) {
+		up_write(&sev_deactivate_lock);
+		return ret;
+	}
+
+	up_write(&sev_deactivate_lock);
+
+	/* free the context page now */
+	snp_free_firmware_page(sev->snp_context);
+	sev->snp_context = NULL;
+
+	return 0;
+}
+
 void sev_vm_destroy(struct kvm *kvm)
 {
 	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
@@ -2299,7 +2480,15 @@ void sev_vm_destroy(struct kvm *kvm)
 		}
 	}
 
-	sev_unbind_asid(kvm, sev->handle);
+	if (sev_snp_guest(kvm)) {
+		if (snp_decommission_context(kvm)) {
+			WARN_ONCE(1, "Failed to free SNP guest context, leaking asid!\n");
+			return;
+		}
+	} else {
+		sev_unbind_asid(kvm, sev->handle);
+	}
+
 	sev_asid_free(sev);
 }
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7f2e9c7fc4ca..0654fc91d4db 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -92,6 +92,7 @@ struct kvm_sev_info {
 	struct list_head mirror_entry; /* Use as a list entry of mirrors */
 	struct misc_cg *misc_cg; /* For misc cgroup accounting */
 	atomic_t migration_in_progress;
+	void *snp_context;      /* SNP guest context page */
 };
 
 struct kvm_svm {
-- 
2.25.1



* [PATCH v14 06/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
From: Michael Roth @ 2024-04-21 18:01 UTC
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

A key aspect of launching an SNP guest is initializing it with a
known/measured payload which is then encrypted into guest memory as
pre-validated private pages and then measured into the cryptographic
launch context created with KVM_SEV_SNP_LAUNCH_START so that the guest
can attest itself after booting.

Since all private pages are provided by guest_memfd, make use of the
kvm_gmem_populate() interface to handle this. The general flow is that
guest_memfd will handle allocating the pages associated with the GPA
ranges being initialized by each particular call of
KVM_SEV_SNP_LAUNCH_UPDATE, copying data from userspace into those pages,
and then the post_populate callback will do the work of setting the
RMP entries for these pages to private and issuing the SNP firmware
calls to encrypt/measure them.

For more information see the SEV-SNP specification.
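
A sketch of the resulting userspace loop (the ioctl updates
gfn_start/uaddr/len in place to reflect progress, and -EAGAIN means
simply retry; gpa/src_buf/image_len/vm_fd/sev_fd are assumed):

  /* Sketch: encrypt/measure an initial image into guest memory. */
  struct kvm_sev_snp_launch_update update = {
          .gfn_start = gpa >> 12,
          .uaddr     = (__u64)(unsigned long)src_buf,
          .len       = image_len,         /* must be 4K-aligned */
          .type      = KVM_SEV_SNP_PAGE_TYPE_NORMAL,
  };
  struct kvm_sev_cmd cmd = {
          .id     = KVM_SEV_SNP_LAUNCH_UPDATE,
          .data   = (__u64)(unsigned long)&update,
          .sev_fd = sev_fd,
  };
  int ret;

  do {
          ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
  } while (update.len && (!ret || errno == EAGAIN));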

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/x86/amd-memory-encryption.rst    |  54 ++++
 arch/x86/include/uapi/asm/kvm.h               |  19 ++
 arch/x86/kvm/svm/sev.c                        | 237 ++++++++++++++++++
 3 files changed, 310 insertions(+)

diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
index d4c4a0b90bc9..60728868c5c6 100644
--- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
@@ -483,6 +483,60 @@ Returns: 0 on success, -negative on error
 See SNP_LAUNCH_START in the SEV-SNP specification [snp-fw-abi]_ for further
 details on the input parameters in ``struct kvm_sev_snp_launch_start``.
 
+19. KVM_SEV_SNP_LAUNCH_UPDATE
+-----------------------------
+
+The KVM_SEV_SNP_LAUNCH_UPDATE command is used for loading userspace-provided
+data into a guest GPA range, measuring the contents into the SNP guest context
+created by KVM_SEV_SNP_LAUNCH_START, and then encrypting/validating that GPA
+range so that it will be immediately readable using the encryption key
+associated with the guest context once it is booted, after which point it can
+attest the measurement associated with its context before unlocking any
+secrets.
+
+It is required that the GPA ranges initialized by this command have had the
+KVM_MEMORY_ATTRIBUTE_PRIVATE attribute set in advance. See the documentation
+for KVM_SET_MEMORY_ATTRIBUTES for more details on this aspect.
+
+Upon success, this command is not guaranteed to have processed the entire
+range requested. Instead, the ``gfn_start``, ``uaddr``, and ``len`` fields of
+``struct kvm_sev_snp_launch_update`` will be updated to correspond to the
+remaining range that has yet to be processed. The caller should continue
+calling this command until those fields indicate the entire range has been
+processed, e.g. ``len`` is 0, ``gfn_start`` is equal to the last GFN in the
+range plus 1, and ``uaddr`` is the last byte of the userspace-provided source
+buffer address plus 1. In the case where ``type`` is KVM_SEV_SNP_PAGE_TYPE_ZERO,
+``uaddr`` will be ignored completely.
+
+Parameters (in): struct  kvm_sev_snp_launch_update
+
+Returns: 0 on success, < 0 on error, -EAGAIN if caller should retry
+
+::
+
+        struct kvm_sev_snp_launch_update {
+                __u64 gfn_start;        /* Guest page number to load/encrypt data into. */
+                __u64 uaddr;            /* Userspace address of data to be loaded/encrypted. */
+                __u64 len;              /* 4k-aligned length in bytes to copy into guest memory.*/
+                __u8 type;              /* The type of the guest pages being initialized. */
+                __u8 pad0;
+                __u16 flags;            /* Must be zero. */
+                __u32 pad1;
+                __u64 pad2[4];
+
+        };
+
+where the allowed values for page_type are #define'd as::
+
+        KVM_SEV_SNP_PAGE_TYPE_NORMAL
+        KVM_SEV_SNP_PAGE_TYPE_ZERO
+        KVM_SEV_SNP_PAGE_TYPE_UNMEASURED
+        KVM_SEV_SNP_PAGE_TYPE_SECRETS
+        KVM_SEV_SNP_PAGE_TYPE_CPUID
+
+See the SEV-SNP spec [snp-fw-abi]_ for further details on how each page type is
+used/measured.
+
 Device attribute API
 ====================
 
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 5765391f0fdb..3c9255de76db 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -699,6 +699,7 @@ enum sev_cmd_id {
 
 	/* SNP-specific commands */
 	KVM_SEV_SNP_LAUNCH_START = 100,
+	KVM_SEV_SNP_LAUNCH_UPDATE,
 
 	KVM_SEV_NR_MAX,
 };
@@ -833,6 +834,24 @@ struct kvm_sev_snp_launch_start {
 	__u64 pad1[4];
 };
 
+/* Kept in sync with firmware values for simplicity. */
+#define KVM_SEV_SNP_PAGE_TYPE_NORMAL		0x1
+#define KVM_SEV_SNP_PAGE_TYPE_ZERO		0x3
+#define KVM_SEV_SNP_PAGE_TYPE_UNMEASURED	0x4
+#define KVM_SEV_SNP_PAGE_TYPE_SECRETS		0x5
+#define KVM_SEV_SNP_PAGE_TYPE_CPUID		0x6
+
+struct kvm_sev_snp_launch_update {
+	__u64 gfn_start;
+	__u64 uaddr;
+	__u64 len;
+	__u8 type;
+	__u8 pad0;
+	__u16 flags;
+	__u32 pad1;
+	__u64 pad2[4];
+};
+
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
 #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 9d08d1202544..d3ae4ded91df 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -258,6 +258,35 @@ static void sev_decommission(unsigned int handle)
 	sev_guest_decommission(&decommission, NULL);
 }
 
+static int snp_page_reclaim(u64 pfn)
+{
+	struct sev_data_snp_page_reclaim data = {0};
+	int err, rc;
+
+	data.paddr = __sme_set(pfn << PAGE_SHIFT);
+	rc = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
+	if (WARN_ON_ONCE(rc)) {
+		/*
+		 * This shouldn't happen under normal circumstances, but if the
+		 * reclaim failed, then the page is no longer safe to use.
+		 */
+		snp_leak_pages(pfn, 1);
+	}
+
+	return rc;
+}
+
+static int host_rmp_make_shared(u64 pfn, enum pg_level level)
+{
+	int rc;
+
+	rc = rmp_make_shared(pfn, level);
+	if (rc)
+		snp_leak_pages(pfn, page_level_size(level) >> PAGE_SHIFT);
+
+	return rc;
+}
+
 static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
 {
 	struct sev_data_deactivate deactivate;
@@ -2118,6 +2147,211 @@ static int snp_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return rc;
 }
 
+struct sev_gmem_populate_args {
+	__u8 type;
+	int sev_fd;
+	int fw_error;
+};
+
+static int sev_gmem_post_populate(struct kvm *kvm, gfn_t gfn_start, kvm_pfn_t pfn,
+				  void __user *src, int order, void *opaque)
+{
+	struct sev_gmem_populate_args *sev_populate_args = opaque;
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	int n_private = 0, ret, i;
+	int npages = (1 << order);
+	gfn_t gfn;
+
+	pr_debug("%s: gfn_start 0x%llx pfn_start 0x%llx npages %d\n",
+		 __func__, gfn_start, pfn, npages);
+
+	if (WARN_ON_ONCE(sev_populate_args->type != KVM_SEV_SNP_PAGE_TYPE_ZERO && !src)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	for (gfn = gfn_start, i = 0; gfn < gfn_start + npages; gfn++, i++) {
+		struct sev_data_snp_launch_update fw_args = {0};
+		bool assigned;
+		void *vaddr;
+		int level;
+
+		if (!kvm_mem_is_private(kvm, gfn)) {
+			pr_debug("%s: Failed to ensure GFN 0x%llx has private memory attribute set\n",
+				 __func__, gfn);
+			ret = -EINVAL;
+			break;
+		}
+
+		ret = snp_lookup_rmpentry((u64)pfn + i, &assigned, &level);
+		if (ret || assigned) {
+			pr_debug("%s: Failed to ensure GFN 0x%llx RMP entry is initial shared state, ret: %d assigned: %d\n",
+				 __func__, gfn, ret, assigned);
+			ret = -EINVAL;
+			break;
+		}
+
+		if (src) {
+			vaddr = kmap_local_pfn(pfn + i);
+			ret = copy_from_user(vaddr, src + i * PAGE_SIZE, PAGE_SIZE);
+			if (ret) {
+				pr_debug("Failed to copy source page into GFN 0x%llx\n", gfn);
+				goto out_unmap;
+			}
+		}
+
+		ret = rmp_make_private(pfn + i, gfn << PAGE_SHIFT, PG_LEVEL_4K,
+				       sev_get_asid(kvm), true);
+		if (ret) {
+			pr_debug("%s: Failed to mark RMP entry for GFN 0x%llx as private, ret: %d\n",
+				 __func__, gfn, ret);
+			goto out_unmap;
+		}
+
+		n_private++;
+
+		fw_args.gctx_paddr = __psp_pa(sev->snp_context);
+		fw_args.address = __sme_set(pfn_to_hpa(pfn + i));
+		fw_args.page_size = PG_LEVEL_TO_RMP(PG_LEVEL_4K);
+		fw_args.page_type = sev_populate_args->type;
+		ret = __sev_issue_cmd(sev_populate_args->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE,
+				      &fw_args, &sev_populate_args->fw_error);
+		if (ret) {
+			pr_debug("%s: SEV-SNP launch update failed, ret: 0x%x, fw_error: 0x%x\n",
+				 __func__, ret, sev_populate_args->fw_error);
+
+			if (WARN_ON_ONCE(snp_page_reclaim(pfn + i)))
+				goto out_unmap;
+
+			/*
+			 * When invalid CPUID function entries are detected,
+			 * firmware writes the expected values into the page and
+			 * leaves it unencrypted so it can be used for debugging
+			 * and error-reporting.
+			 *
+			 * Copy this page back into the source buffer so
+			 * userspace can use this information to provide
+			 * information on which CPUID leaves/fields failed CPUID
+			 * validation.
+			 */
+			if (sev_populate_args->type == KVM_SEV_SNP_PAGE_TYPE_CPUID &&
+			    sev_populate_args->fw_error == SEV_RET_INVALID_PARAM) {
+				if (WARN_ON_ONCE(host_rmp_make_shared(pfn + i, PG_LEVEL_4K)))
+					goto out_unmap;
+
+				if (copy_to_user(src + i * PAGE_SIZE, vaddr,
+						 PAGE_SIZE))
+					pr_debug("Failed to write CPUID page back to userspace\n");
+
+				/* PFN is hypervisor-owned at this point, skip cleanup for it. */
+				n_private--;
+			}
+		}
+
+out_unmap:
+		if (vaddr)
+			kunmap_local(vaddr);
+		if (ret)
+			break;
+	}
+
+out:
+	if (ret) {
+		pr_debug("%s: exiting with error ret %d, restoring %d gmem PFNs to shared.\n",
+			 __func__, ret, n_private);
+		for (i = 0; i < n_private; i++)
+			WARN_ON_ONCE(host_rmp_make_shared(pfn + i, PG_LEVEL_4K));
+	}
+
+	return ret;
+}
+
+static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_gmem_populate_args sev_populate_args = {0};
+	struct kvm_sev_snp_launch_update params;
+	struct kvm_memory_slot *memslot;
+	long npages, count;
+	void __user *src;
+	int ret = 0;
+
+	if (!sev_snp_guest(kvm) || !sev->snp_context)
+		return -EINVAL;
+
+	if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
+		return -EFAULT;
+
+	pr_debug("%s: GFN start 0x%llx length 0x%llx type %d flags %d\n", __func__,
+		 params.gfn_start, params.len, params.type, params.flags);
+
+	if (!PAGE_ALIGNED(params.len) || params.flags ||
+	    (params.type != KVM_SEV_SNP_PAGE_TYPE_NORMAL &&
+	     params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO &&
+	     params.type != KVM_SEV_SNP_PAGE_TYPE_UNMEASURED &&
+	     params.type != KVM_SEV_SNP_PAGE_TYPE_SECRETS &&
+	     params.type != KVM_SEV_SNP_PAGE_TYPE_CPUID))
+		return -EINVAL;
+
+	npages = params.len / PAGE_SIZE;
+
+	/*
+	 * For each GFN that's being prepared as part of the initial guest
+	 * state, the following pre-conditions are verified:
+	 *
+	 *   1) The backing memslot is a valid private memslot.
+	 *   2) The GFN has been set to private via KVM_SET_MEMORY_ATTRIBUTES
+	 *      beforehand.
+	 *   3) The PFN of the guest_memfd has not already been set to private
+	 *      in the RMP table.
+	 *
+	 * The KVM MMU relies on kvm->mmu_invalidate_seq to retry nested page
+	 * faults if there's a race between a fault and an attribute update via
+	 * KVM_SET_MEMORY_ATTRIBUTES, and a similar approach could be utilized
+	 * here. However, kvm->slots_lock guards against both this as well as
+	 * concurrent memslot updates occurring while these checks are being
+	 * performed, so use that here to make it easier to reason about the
+	 * initial expected state and better guard against unexpected
+	 * situations.
+	 */
+	mutex_lock(&kvm->slots_lock);
+
+	memslot = gfn_to_memslot(kvm, params.gfn_start);
+	if (!kvm_slot_can_be_private(memslot)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	sev_populate_args.sev_fd = argp->sev_fd;
+	sev_populate_args.type = params.type;
+	src = params.type == KVM_SEV_SNP_PAGE_TYPE_ZERO ? NULL : u64_to_user_ptr(params.uaddr);
+
+	count = kvm_gmem_populate(kvm, params.gfn_start, src, npages,
+				  sev_gmem_post_populate, &sev_populate_args);
+	if (count < 0) {
+		argp->error = sev_populate_args.fw_error;
+		pr_debug("%s: kvm_gmem_populate failed, ret %ld (fw_error %d)\n",
+			 __func__, count, argp->error);
+		ret = -EIO;
+	} else if (count <= npages) {
+		params.gfn_start += count;
+		params.len -= count * PAGE_SIZE;
+		if (params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO)
+			params.uaddr += count * PAGE_SIZE;
+
+		ret = copy_to_user(u64_to_user_ptr(argp->data), &params, sizeof(params))
+		      ? -EIO : 0;
+	} else {
+		WARN_ONCE(1, "Completed page count %ld exceeds requested amount %ld",
+			  count, npages);
+		ret = -EINVAL;
+	}
+
+out:
+	mutex_unlock(&kvm->slots_lock);
+
+	return ret;
+}
+
 int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -2217,6 +2451,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SNP_LAUNCH_START:
 		r = snp_launch_start(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SNP_LAUNCH_UPDATE:
+		r = snp_launch_update(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 07/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (5 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 06/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 08/22] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh, Harald Hoyer

From: Brijesh Singh <brijesh.singh@amd.com>

Add a KVM_SEV_SNP_LAUNCH_FINISH command to finalize the cryptographic
launch digest which stores the measurement of the guest at launch time.
Also extend the existing SNP firmware data structures to support
disabling the use of Versioned Chip Endorsement Keys (VCEK) by guests as
part of this command.

While finalizing the launch flow, the code also issues the LAUNCH_UPDATE
SNP firmware commands to encrypt/measure the initial VMSA pages for each
configured vCPU, which requires setting the RMP entries for those pages
to private, so also add handling to clean up the RMP entries for these
pages when freeing vCPUs during shutdown.
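
As a point of reference, a minimal userspace sketch for driving this
command via the KVM_MEMORY_ENCRYPT_OP ioctl might look as follows
(assuming the usual vm_fd/sev_fd plumbing is already in place; this
omits the ID block and host_data for brevity):

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static int snp_launch_finish(int vm_fd, int sev_fd)
  {
          struct kvm_sev_snp_launch_finish finish = {};
          struct kvm_sev_cmd cmd = {};

          cmd.id = KVM_SEV_SNP_LAUNCH_FINISH;
          cmd.data = (__u64)(unsigned long)&finish;
          cmd.sev_fd = sev_fd;

          /* On failure, cmd.error carries the SEV firmware error code. */
          return ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
  }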

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Harald Hoyer <harald@profian.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/x86/amd-memory-encryption.rst    |  28 ++++
 arch/x86/include/uapi/asm/kvm.h               |  17 +++
 arch/x86/kvm/svm/sev.c                        | 126 ++++++++++++++++++
 include/linux/psp-sev.h                       |   4 +-
 4 files changed, 174 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
index 60728868c5c6..67bcede94bb5 100644
--- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
@@ -537,6 +537,34 @@ where the allowed values for page_type are #define'd as::
 See the SEV-SNP spec [snp-fw-abi]_ for further details on how each page type is
 used/measured.
 
+20. KVM_SEV_SNP_LAUNCH_FINISH
+-----------------------------
+
+After completion of the SNP guest launch flow, the KVM_SEV_SNP_LAUNCH_FINISH
+command can be issued to make the guest ready for execution.
+
+Parameters (in): struct kvm_sev_snp_launch_finish
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_snp_launch_finish {
+                __u64 id_block_uaddr;
+                __u64 id_auth_uaddr;
+                __u8 id_block_en;
+                __u8 auth_key_en;
+                __u8 vcek_disabled;
+                __u8 host_data[32];
+                __u8 pad0[3];
+                __u16 flags;                    /* Must be zero */
+                __u64 pad1[4];
+        };
+
+
+See SNP_LAUNCH_FINISH in the SEV-SNP specification [snp-fw-abi]_ for further
+details on the input parameters in ``struct kvm_sev_snp_launch_finish``.
+
 Device attribute API
 ====================
 
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 3c9255de76db..8007fbfe0160 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -700,6 +700,7 @@ enum sev_cmd_id {
 	/* SNP-specific commands */
 	KVM_SEV_SNP_LAUNCH_START = 100,
 	KVM_SEV_SNP_LAUNCH_UPDATE,
+	KVM_SEV_SNP_LAUNCH_FINISH,
 
 	KVM_SEV_NR_MAX,
 };
@@ -852,6 +853,22 @@ struct kvm_sev_snp_launch_update {
 	__u64 pad2[4];
 };
 
+#define KVM_SEV_SNP_ID_BLOCK_SIZE	96
+#define KVM_SEV_SNP_ID_AUTH_SIZE	4096
+#define KVM_SEV_SNP_FINISH_DATA_SIZE	32
+
+struct kvm_sev_snp_launch_finish {
+	__u64 id_block_uaddr;
+	__u64 id_auth_uaddr;
+	__u8 id_block_en;
+	__u8 auth_key_en;
+	__u8 vcek_disabled;
+	__u8 host_data[KVM_SEV_SNP_FINISH_DATA_SIZE];
+	__u8 pad0[3];
+	__u16 flags;
+	__u64 pad1[4];
+};
+
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
 #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index d3ae4ded91df..6ca1b13c9beb 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -74,6 +74,8 @@ static u64 sev_supported_vmsa_features;
 					 SNP_POLICY_MASK_DEBUG		| \
 					 SNP_POLICY_MASK_SINGLE_SOCKET)
 
+#define INITIAL_VMSA_GPA 0xFFFFFFFFF000
+
 static u8 sev_enc_bit;
 static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
@@ -2352,6 +2354,114 @@ static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_snp_launch_update data = {};
+	struct kvm_vcpu *vcpu;
+	unsigned long i;
+	int ret;
+
+	data.gctx_paddr = __psp_pa(sev->snp_context);
+	data.page_type = SNP_PAGE_TYPE_VMSA;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		struct vcpu_svm *svm = to_svm(vcpu);
+		u64 pfn = __pa(svm->sev_es.vmsa) >> PAGE_SHIFT;
+
+		/* Perform some pre-encryption checks against the VMSA */
+		ret = sev_es_sync_vmsa(svm);
+		if (ret)
+			return ret;
+
+		/* Transition the VMSA page to a firmware state. */
+		ret = rmp_make_private(pfn, INITIAL_VMSA_GPA, PG_LEVEL_4K, sev->asid, true);
+		if (ret)
+			return ret;
+
+		/* Issue the SNP command to encrypt the VMSA */
+		data.address = __sme_pa(svm->sev_es.vmsa);
+		ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE,
+				      &data, &argp->error);
+		if (ret) {
+			snp_page_reclaim(pfn);
+			return ret;
+		}
+
+		svm->vcpu.arch.guest_state_protected = true;
+	}
+
+	return 0;
+}
+
+static int snp_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct kvm_sev_snp_launch_finish params;
+	struct sev_data_snp_launch_finish *data;
+	void *id_block = NULL, *id_auth = NULL;
+	int ret;
+
+	if (!sev_snp_guest(kvm))
+		return -ENOTTY;
+
+	if (!sev->snp_context)
+		return -EINVAL;
+
+	if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
+		return -EFAULT;
+
+	if (params.flags)
+		return -EINVAL;
+
+	/* Measure all vCPUs using LAUNCH_UPDATE before finalizing the launch flow. */
+	ret = snp_launch_update_vmsa(kvm, argp);
+	if (ret)
+		return ret;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
+	if (!data)
+		return -ENOMEM;
+
+	if (params.id_block_en) {
+		id_block = psp_copy_user_blob(params.id_block_uaddr, KVM_SEV_SNP_ID_BLOCK_SIZE);
+		if (IS_ERR(id_block)) {
+			ret = PTR_ERR(id_block);
+			goto e_free;
+		}
+
+		data->id_block_en = 1;
+		data->id_block_paddr = __sme_pa(id_block);
+
+		id_auth = psp_copy_user_blob(params.id_auth_uaddr, KVM_SEV_SNP_ID_AUTH_SIZE);
+		if (IS_ERR(id_auth)) {
+			ret = PTR_ERR(id_auth);
+			goto e_free_id_block;
+		}
+
+		data->id_auth_paddr = __sme_pa(id_auth);
+
+		if (params.auth_key_en)
+			data->auth_key_en = 1;
+	}
+
+	data->vcek_disabled = params.vcek_disabled;
+
+	memcpy(data->host_data, params.host_data, KVM_SEV_SNP_FINISH_DATA_SIZE);
+	data->gctx_paddr = __psp_pa(sev->snp_context);
+	ret = sev_issue_cmd(kvm, SEV_CMD_SNP_LAUNCH_FINISH, data, &argp->error);
+
+	kfree(id_auth);
+
+e_free_id_block:
+	kfree(id_block);
+
+e_free:
+	kfree(data);
+
+	return ret;
+}
+
 int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -2454,6 +2564,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SNP_LAUNCH_UPDATE:
 		r = snp_launch_update(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SNP_LAUNCH_FINISH:
+		r = snp_launch_finish(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
@@ -2944,11 +3057,24 @@ void sev_free_vcpu(struct kvm_vcpu *vcpu)
 
 	svm = to_svm(vcpu);
 
+	/*
+	 * If it's an SNP guest, then the VMSA was marked in the RMP table as
+	 * a guest-owned page. Transition the page to hypervisor state before
+	 * releasing it back to the system.
+	 */
+	if (sev_snp_guest(vcpu->kvm)) {
+		u64 pfn = __pa(svm->sev_es.vmsa) >> PAGE_SHIFT;
+
+		if (host_rmp_make_shared(pfn, PG_LEVEL_4K))
+			goto skip_vmsa_free;
+	}
+
 	if (vcpu->arch.guest_state_protected)
 		sev_flush_encrypted_page(vcpu, svm->sev_es.vmsa);
 
 	__free_page(virt_to_page(svm->sev_es.vmsa));
 
+skip_vmsa_free:
 	if (svm->sev_es.ghcb_sa_free)
 		kvfree(svm->sev_es.ghcb_sa);
 }
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index 3705c2044fc0..903ddfea8585 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -658,6 +658,7 @@ struct sev_data_snp_launch_update {
  * @id_auth_paddr: system physical address of ID block authentication structure
  * @id_block_en: indicates whether ID block is present
  * @auth_key_en: indicates whether author key is present in authentication structure
+ * @vcek_disabled: indicates whether use of VCEK is allowed for attestation reports
  * @rsvd: reserved
  * @host_data: host-supplied data for guest, not interpreted by firmware
  */
@@ -667,7 +668,8 @@ struct sev_data_snp_launch_finish {
 	u64 id_auth_paddr;
 	u8 id_block_en:1;
 	u8 auth_key_en:1;
-	u64 rsvd:62;
+	u8 vcek_disabled:1;
+	u64 rsvd:61;
 	u8 host_data[32];
 } __packed;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 08/22] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (6 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 07/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

SEV-SNP guests are required to perform a GHCB GPA registration. Before
using a GHCB GPA for a vCPU the first time, a guest must register the
vCPU GHCB GPA. If the hypervisor can work with the guest-requested GPA,
it must respond back with the same GPA; otherwise it returns -1.

On VMEXIT, verify that the GHCB GPA matches the registered value. If a
mismatch is detected, abort the guest.
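
The MSR-based registration exchange, sketched from the guest's point of
view (constants per sev-common.h; this loosely mirrors the guest-side
registration helper and is illustrative only):

  u64 pfn = ghcb_gpa >> PAGE_SHIFT;
  u64 val;

  sev_es_wr_ghcb_msr(GHCB_MSR_REG_GPA_REQ_VAL(pfn));
  VMGEXIT();
  val = sev_es_rd_ghcb_msr();

  /* The hypervisor must echo the same GFN back with the RESP code. */
  if (GHCB_RESP_CODE(val) != GHCB_MSR_REG_GPA_RESP ||
      GHCB_MSR_REG_GPA_RESP_VAL(val) != pfn)
          sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_REGISTER);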

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev-common.h |  8 ++++++++
 arch/x86/kvm/svm/sev.c            | 27 +++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.h            |  7 +++++++
 3 files changed, 42 insertions(+)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 5a8246dd532f..1006bfffe07a 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -59,6 +59,14 @@
 #define GHCB_MSR_AP_RESET_HOLD_RESULT_POS	12
 #define GHCB_MSR_AP_RESET_HOLD_RESULT_MASK	GENMASK_ULL(51, 0)
 
+/* Preferred GHCB GPA Request */
+#define GHCB_MSR_PREF_GPA_REQ		0x010
+#define GHCB_MSR_GPA_VALUE_POS		12
+#define GHCB_MSR_GPA_VALUE_MASK		GENMASK_ULL(51, 0)
+
+#define GHCB_MSR_PREF_GPA_RESP		0x011
+#define GHCB_MSR_PREF_GPA_NONE		0xfffffffffffff
+
 /* GHCB GPA Register */
 #define GHCB_MSR_REG_GPA_REQ		0x012
 #define GHCB_MSR_REG_GPA_REQ_VAL(v)			\
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6ca1b13c9beb..76084e109f66 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3541,6 +3541,26 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 		set_ghcb_msr_bits(svm, GHCB_MSR_HV_FT_RESP,
 				  GHCB_MSR_INFO_MASK, GHCB_MSR_INFO_POS);
 		break;
+	case GHCB_MSR_PREF_GPA_REQ:
+		set_ghcb_msr_bits(svm, GHCB_MSR_PREF_GPA_NONE, GHCB_MSR_GPA_VALUE_MASK,
+				  GHCB_MSR_GPA_VALUE_POS);
+		set_ghcb_msr_bits(svm, GHCB_MSR_PREF_GPA_RESP, GHCB_MSR_INFO_MASK,
+				  GHCB_MSR_INFO_POS);
+		break;
+	case GHCB_MSR_REG_GPA_REQ: {
+		u64 gfn;
+
+		gfn = get_ghcb_msr_bits(svm, GHCB_MSR_GPA_VALUE_MASK,
+					GHCB_MSR_GPA_VALUE_POS);
+
+		svm->sev_es.ghcb_registered_gpa = gfn_to_gpa(gfn);
+
+		set_ghcb_msr_bits(svm, gfn, GHCB_MSR_GPA_VALUE_MASK,
+				  GHCB_MSR_GPA_VALUE_POS);
+		set_ghcb_msr_bits(svm, GHCB_MSR_REG_GPA_RESP, GHCB_MSR_INFO_MASK,
+				  GHCB_MSR_INFO_POS);
+		break;
+	}
 	case GHCB_MSR_TERM_REQ: {
 		u64 reason_set, reason_code;
 
@@ -3604,6 +3624,13 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 	trace_kvm_vmgexit_enter(vcpu->vcpu_id, svm->sev_es.ghcb);
 
 	sev_es_sync_from_ghcb(svm);
+
+	/* SEV-SNP guest requires that the GHCB GPA must be registered */
+	if (sev_snp_guest(svm->vcpu.kvm) && !ghcb_gpa_is_registered(svm, ghcb_gpa)) {
+		vcpu_unimpl(&svm->vcpu, "vmgexit: GHCB GPA [%#llx] is not registered.\n", ghcb_gpa);
+		return -EINVAL;
+	}
+
 	ret = sev_es_validate_vmgexit(svm);
 	if (ret)
 		return ret;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 0654fc91d4db..730f5ced2a2e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -208,6 +208,8 @@ struct vcpu_sev_es_state {
 	u32 ghcb_sa_len;
 	bool ghcb_sa_sync;
 	bool ghcb_sa_free;
+
+	u64 ghcb_registered_gpa;
 };
 
 struct vcpu_svm {
@@ -361,6 +363,11 @@ static __always_inline bool sev_snp_guest(struct kvm *kvm)
 #endif
 }
 
+static inline bool ghcb_gpa_is_registered(struct vcpu_svm *svm, u64 val)
+{
+	return svm->sev_es.ghcb_registered_gpa == val;
+}
+
 static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
 {
 	vmcb->control.clean = 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (7 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 08/22] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-24 20:59   ` Sean Christopherson
  2024-04-21 18:01 ` [PATCH v14 10/22] KVM: SEV: Add support to handle " Michael Roth
                   ` (14 subsequent siblings)
  23 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

SEV-SNP VMs can ask the hypervisor to change the page state in the RMP
table to be private or shared using the Page State Change MSR protocol
as defined in the GHCB specification.

When using gmem, private/shared memory is allocated through separate
pools, and KVM relies on userspace issuing a KVM_SET_MEMORY_ATTRIBUTES
KVM ioctl to tell the KVM MMU whether or not a particular GFN should be
backed by private memory or not.

Forward these page state change requests to userspace so that it can
issue the expected KVM ioctls. The KVM MMU will handle updating the RMP
entries when it is ready to map a private page into a guest.

Define a new KVM_EXIT_VMGEXIT for exits of this type, and structure it
so that it can be extended for other cases where VMGEXITs need some
level of handling in userspace.
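
A minimal sketch of the corresponding userspace handling (vm_fd and the
mmap'd kvm_run structure are assumed to be set up as usual; a real VMM
would fold this into its run loop):

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static void handle_psc_msr(int vm_fd, struct kvm_run *run)
  {
          struct kvm_memory_attributes attrs = {
                  .address = run->vmgexit.psc_msr.gpa,
                  .size = 4096,
                  .attributes =
                          run->vmgexit.psc_msr.op == KVM_USER_VMGEXIT_PSC_MSR_OP_PRIVATE ?
                          KVM_MEMORY_ATTRIBUTE_PRIVATE : 0,
          };

          /* A non-zero 'ret' reports failure back to the guest. */
          run->vmgexit.psc_msr.ret =
                  ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, &attrs) ? 1 : 0;
  }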

Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 Documentation/virt/kvm/api.rst    | 33 +++++++++++++++++++++++++++++++
 arch/x86/include/asm/sev-common.h |  6 ++++++
 arch/x86/kvm/svm/sev.c            | 33 +++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h          | 17 ++++++++++++++++
 4 files changed, 89 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index f0b76ff5030d..4a7a2945bc78 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7060,6 +7060,39 @@ Please note that the kernel is allowed to use the kvm_run structure as the
 primary storage for certain register types. Therefore, the kernel may use the
 values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 
+::
+
+		/* KVM_EXIT_VMGEXIT */
+		struct kvm_user_vmgexit {
+		#define KVM_USER_VMGEXIT_PSC_MSR	1
+			__u32 type; /* KVM_USER_VMGEXIT_* type */
+			union {
+				struct {
+					__u64 gpa;
+		#define KVM_USER_VMGEXIT_PSC_MSR_OP_PRIVATE	1
+		#define KVM_USER_VMGEXIT_PSC_MSR_OP_SHARED	2
+					__u8 op;
+					__u32 ret;
+				} psc_msr;
+			};
+		};
+
+If exit reason is KVM_EXIT_VMGEXIT then it indicates that an SEV-SNP guest
+has issued a VMGEXIT instruction (as documented by the AMD Architecture
+Programmer's Manual (APM)) to the hypervisor that needs to be serviced by
+userspace. These are generally handled by the host kernel, but in some
+cases some aspects of handling a VMGEXIT are handled by userspace.
+
+A kvm_user_vmgexit structure is defined to encapsulate the data to be
+sent to or returned by userspace. The type field defines the specific type
+of exit that needs to be serviced, and that type is used as a discriminator
+to determine which union type should be used for input/output.
+
+For the KVM_USER_VMGEXIT_PSC_MSR type, the psc_msr union type is used. The
+kernel will supply the 'gpa' and 'op' fields, and userspace is expected to
+update the private/shared state of the GPA using the corresponding
+KVM_SET_MEMORY_ATTRIBUTES ioctl. The 'ret' field is to be set to 0 by
+userspace on success, or some non-zero value on failure.
 
 6. Capabilities that can be enabled on vCPUs
 ============================================
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 1006bfffe07a..6d68db812de1 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -101,11 +101,17 @@ enum psc_op {
 	/* GHCBData[11:0] */				\
 	GHCB_MSR_PSC_REQ)
 
+#define GHCB_MSR_PSC_REQ_TO_GFN(msr) (((msr) & GENMASK_ULL(51, 12)) >> 12)
+#define GHCB_MSR_PSC_REQ_TO_OP(msr) (((msr) & GENMASK_ULL(55, 52)) >> 52)
+
 #define GHCB_MSR_PSC_RESP		0x015
 #define GHCB_MSR_PSC_RESP_VAL(val)			\
 	/* GHCBData[63:32] */				\
 	(((u64)(val) & GENMASK_ULL(63, 32)) >> 32)
 
+/* Set highest bit as a generic error response */
+#define GHCB_MSR_PSC_RESP_ERROR (BIT_ULL(63) | GHCB_MSR_PSC_RESP)
+
 /* GHCB Hypervisor Feature Request/Response */
 #define GHCB_MSR_HV_FT_REQ		0x080
 #define GHCB_MSR_HV_FT_RESP		0x081
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 76084e109f66..f6f54a889fde 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3463,6 +3463,36 @@ static void set_ghcb_msr(struct vcpu_svm *svm, u64 value)
 	svm->vmcb->control.ghcb_gpa = value;
 }
 
+static int snp_complete_psc_msr(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u64 vmm_ret = vcpu->run->vmgexit.psc_msr.ret;
+
+	set_ghcb_msr(svm, (vmm_ret << 32) | GHCB_MSR_PSC_RESP);
+
+	return 1; /* resume guest */
+}
+
+static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
+{
+	u64 gpa = gfn_to_gpa(GHCB_MSR_PSC_REQ_TO_GFN(ghcb_msr));
+	u8 op = GHCB_MSR_PSC_REQ_TO_OP(ghcb_msr);
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (op != SNP_PAGE_STATE_PRIVATE && op != SNP_PAGE_STATE_SHARED) {
+		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
+		return 1; /* resume guest */
+	}
+
+	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
+	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC_MSR;
+	vcpu->run->vmgexit.psc_msr.gpa = gpa;
+	vcpu->run->vmgexit.psc_msr.op = op;
+	vcpu->arch.complete_userspace_io = snp_complete_psc_msr;
+
+	return 0; /* forward request to userspace */
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3561,6 +3591,9 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 				  GHCB_MSR_INFO_POS);
 		break;
 	}
+	case GHCB_MSR_PSC_REQ:
+		ret = snp_begin_psc_msr(vcpu, control->ghcb_gpa);
+		break;
 	case GHCB_MSR_TERM_REQ: {
 		u64 reason_set, reason_code;
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2190adbe3002..54b81e46a9fa 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -135,6 +135,20 @@ struct kvm_xen_exit {
 	} u;
 };
 
+struct kvm_user_vmgexit {
+#define KVM_USER_VMGEXIT_PSC_MSR	1
+	__u32 type; /* KVM_USER_VMGEXIT_* type */
+	union {
+		struct {
+			__u64 gpa;
+#define KVM_USER_VMGEXIT_PSC_MSR_OP_PRIVATE	1
+#define KVM_USER_VMGEXIT_PSC_MSR_OP_SHARED	2
+			__u8 op;
+			__u32 ret;
+		} psc_msr;
+	};
+};
+
 #define KVM_S390_GET_SKEYS_NONE   1
 #define KVM_S390_SKEYS_MAX        1048576
 
@@ -178,6 +192,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_NOTIFY           37
 #define KVM_EXIT_LOONGARCH_IOCSR  38
 #define KVM_EXIT_MEMORY_FAULT     39
+#define KVM_EXIT_VMGEXIT          40
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -433,6 +448,8 @@ struct kvm_run {
 			__u64 gpa;
 			__u64 size;
 		} memory_fault;
+		/* KVM_EXIT_VMGEXIT */
+		struct kvm_user_vmgexit vmgexit;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 10/22] KVM: SEV: Add support to handle Page State Change VMGEXIT
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (8 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 11/22] KVM: SEV: Add support to handle RMP nested page faults Michael Roth
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

SEV-SNP VMs can ask the hypervisor to change the page state in the RMP
table to be private or shared using the Page State Change NAE event
as defined in the GHCB specification version 2.

Forward these requests to userspace as KVM_EXIT_VMGEXITs, similar to how
it is done for requests that don't use a GHCB page.
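
A rough sketch of the userspace side (the entry layout follows the Page
State Change structure in the GHCB spec; the VMM is assumed to already
have the shared GPA mapped at 'desc', with SNP_PAGE_STATE_PRIVATE == 1
as in sev-common.h, and the exact error encoding for the return value
is left to the GHCB spec):

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  struct psc_entry {
          __u64 cur_page  : 12;
          __u64 gfn       : 40;
          __u64 operation : 4;
          __u64 pagesize  : 1;
          __u64 reserved  : 7;
  };

  struct psc_desc {
          __u16 cur_entry;
          __u16 end_entry;
          __u32 reserved;
          struct psc_entry entries[253];
  };

  static __u64 handle_psc(int vm_fd, struct psc_desc *desc)
  {
          __u16 i;

          for (i = desc->cur_entry; i <= desc->end_entry; i++) {
                  struct psc_entry *e = &desc->entries[i];
                  struct kvm_memory_attributes attrs = {
                          .address = (__u64)e->gfn << 12,
                          .size = e->pagesize ? 0x200000 : 0x1000,
                          .attributes = e->operation == 1 ?
                                        KVM_MEMORY_ATTRIBUTE_PRIVATE : 0,
                  };

                  if (ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, &attrs))
                          return 1; /* must be a GHCB-defined error value */

                  desc->cur_entry = i;
          }

          return 0;
  }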

Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 Documentation/virt/kvm/api.rst | 14 ++++++++++++++
 arch/x86/kvm/svm/sev.c         | 16 ++++++++++++++++
 include/uapi/linux/kvm.h       |  5 +++++
 3 files changed, 35 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 4a7a2945bc78..85099198a10f 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7065,6 +7065,7 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 		/* KVM_EXIT_VMGEXIT */
 		struct kvm_user_vmgexit {
 		#define KVM_USER_VMGEXIT_PSC_MSR	1
+		#define KVM_USER_VMGEXIT_PSC		2
 			__u32 type; /* KVM_USER_VMGEXIT_* type */
 			union {
 				struct {
@@ -7074,9 +7075,14 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 					__u8 op;
 					__u32 ret;
 				} psc_msr;
+				struct {
+					__u64 shared_gpa;
+					__u64 ret;
+				} psc;
 			};
 		};
 
+
 If exit reason is KVM_EXIT_VMGEXIT then it indicates that an SEV-SNP guest
 has issued a VMGEXIT instruction (as documented by the AMD Architecture
 Programmer's Manual (APM)) to the hypervisor that needs to be serviced by
@@ -7094,6 +7100,14 @@ update the private/shared state of the GPA using the corresponding
 KVM_SET_MEMORY_ATTRIBUTES ioctl. The 'ret' field is to be set to 0 by
 userspace on success, or some non-zero value on failure.
 
+For the KVM_USER_VMGEXIT_PSC type, the psc union type is used. The kernel
+will supply the GPA of the Page State Structure defined in the GHCB spec.
+Userspace will process this structure as defined by the GHCB, and issue
+KVM_SET_MEMORY_ATTRIBUTES ioctls to set the GPAs therein to the expected
+private/shared state. Userspace will return a value in 'ret' that is in
+agreement with the GHCB-defined return values that the guest will expect
+in the SW_EXITINFO2 field of the GHCB in response to these requests.
+
 6. Capabilities that can be enabled on vCPUs
 ============================================
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index f6f54a889fde..e9519af8f14c 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3275,6 +3275,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_AP_JUMP_TABLE:
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 	case SVM_VMGEXIT_HV_FEATURES:
+	case SVM_VMGEXIT_PSC:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -3493,6 +3494,15 @@ static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
 	return 0; /* forward request to userspace */
 }
 
+static int snp_complete_psc(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, vcpu->run->vmgexit.psc.ret);
+
+	return 1; /* resume guest */
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3730,6 +3740,12 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 
 		ret = 1;
 		break;
+	case SVM_VMGEXIT_PSC:
+		vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
+		vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC;
+		vcpu->run->vmgexit.psc.shared_gpa = svm->sev_es.sw_scratch;
+		vcpu->arch.complete_userspace_io = snp_complete_psc;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 54b81e46a9fa..e33c48bfbd67 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -137,6 +137,7 @@ struct kvm_xen_exit {
 
 struct kvm_user_vmgexit {
 #define KVM_USER_VMGEXIT_PSC_MSR	1
+#define KVM_USER_VMGEXIT_PSC		2
 	__u32 type; /* KVM_USER_VMGEXIT_* type */
 	union {
 		struct {
@@ -146,6 +147,10 @@ struct kvm_user_vmgexit {
 			__u8 op;
 			__u32 ret;
 		} psc_msr;
+		struct {
+			__u64 shared_gpa;
+			__u64 ret;
+		} psc;
 	};
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 11/22] KVM: SEV: Add support to handle RMP nested page faults
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (9 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 10/22] KVM: SEV: Add support to handle " Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 12/22] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

When SEV-SNP is enabled in the guest, the hardware places restrictions
on all memory accesses based on the contents of the RMP table. When
hardware encounters an RMP check failure caused by a guest memory
access, it raises a #NPF. The error code contains additional information
on the access type. See APM volume 2 for further details.

When using gmem, RMP faults resulting from mismatches between the state
in the RMP table vs. what the guest expects via its page table result
in KVM_EXIT_MEMORY_FAULTs being forwarded to userspace to handle. This
means the only expected case that needs to be handled in the kernel is
when the page size of the entry in the RMP table is larger than the
mapping in the nested page table, in which case a PSMASH instruction
needs to be issued to split the large RMP entry into individual 4K
entries so that subsequent accesses can succeed.
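
As a rough worked example of the latter case, with hypothetical
addresses (the calls shown mirror the handling added in
sev_handle_rmp_fault() below):

  /*
   * The guest accesses a 4K NPT mapping backed by PFN 0x123456, but that
   * PFN sits under a 2MB RMP entry whose base is PFN 0x123400, so the
   * access faults with the RMP bit set in the error code. The host then:
   */
  pfn &= ~(KVM_PAGES_PER_HPAGE(PG_LEVEL_2M) - 1);  /* 0x123456 -> 0x123400 */
  ret = psmash(pfn);                               /* 2MB entry -> 512 4K entries */
  kvm_zap_gfn_range(kvm, gfn, gfn + PTRS_PER_PMD); /* force re-faults at 4K */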

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/include/asm/sev.h      |   3 +
 arch/x86/kvm/mmu.h              |   2 -
 arch/x86/kvm/mmu/mmu.c          |   1 +
 arch/x86/kvm/svm/sev.c          | 109 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c          |  21 ++++--
 arch/x86/kvm/svm/svm.h          |   3 +
 arch/x86/kvm/trace.h            |  31 +++++++++
 arch/x86/kvm/x86.c              |   1 +
 9 files changed, 166 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4c9d8a22840a..90f0de2b8645 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1945,6 +1945,7 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
 				   const struct kvm_memory_slot *memslot);
 void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned long kvm_nr_mmu_pages);
+void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
 
 int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3);
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 7f57382afee4..3a06f06b847a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -91,6 +91,9 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* RMUPDATE detected 4K page and 2MB page overlap. */
 #define RMPUPDATE_FAIL_OVERLAP		4
 
+/* PSMASH failed due to concurrent access by another CPU */
+#define PSMASH_FAIL_INUSE		3
+
 /* RMP page size */
 #define RMP_PG_SIZE_4K			0
 #define RMP_PG_SIZE_2M			1
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 2343c9f00e31..e3cb35b9396d 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -251,8 +251,6 @@ static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm)
 	return __kvm_mmu_honors_guest_mtrrs(kvm_arch_has_noncoherent_dma(kvm));
 }
 
-void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
-
 int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
 
 int kvm_mmu_post_init_vm(struct kvm *kvm);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index eebb1562c5bc..b6d0aa18b72b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6752,6 +6752,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 
 	return need_tlb_flush;
 }
+EXPORT_SYMBOL_GPL(kvm_zap_gfn_range);
 
 static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
 					   const struct kvm_memory_slot *slot)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index e9519af8f14c..65882033a82f 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3464,6 +3464,23 @@ static void set_ghcb_msr(struct vcpu_svm *svm, u64 value)
 	svm->vmcb->control.ghcb_gpa = value;
 }
 
+static int snp_rmptable_psmash(kvm_pfn_t pfn)
+{
+	int ret;
+
+	pfn = pfn & ~(KVM_PAGES_PER_HPAGE(PG_LEVEL_2M) - 1);
+
+	/*
+	 * PSMASH_FAIL_INUSE indicates another processor is modifying the
+	 * entry, so retry until that's no longer the case.
+	 */
+	do {
+		ret = psmash(pfn);
+	} while (ret == PSMASH_FAIL_INUSE);
+
+	return ret;
+}
+
 static int snp_complete_psc_msr(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -4023,3 +4040,95 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
 
 	return p;
 }
+
+void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code)
+{
+	struct kvm_memory_slot *slot;
+	struct kvm *kvm = vcpu->kvm;
+	int order, rmp_level, ret;
+	bool assigned;
+	kvm_pfn_t pfn;
+	gfn_t gfn;
+
+	gfn = gpa >> PAGE_SHIFT;
+
+	/*
+	 * The only time RMP faults occur for shared pages is when the guest is
+	 * triggering an RMP fault for an implicit page-state change from
+	 * shared->private. Implicit page-state changes are forwarded to
+	 * userspace via KVM_EXIT_MEMORY_FAULT events, however, so RMP faults
+	 * for shared pages should not end up here.
+	 */
+	if (!kvm_mem_is_private(kvm, gfn)) {
+		pr_warn_ratelimited("SEV: Unexpected RMP fault for non-private GPA 0x%llx\n",
+				    gpa);
+		return;
+	}
+
+	slot = gfn_to_memslot(kvm, gfn);
+	if (!kvm_slot_can_be_private(slot)) {
+		pr_warn_ratelimited("SEV: Unexpected RMP fault, non-private slot for GPA 0x%llx\n",
+				    gpa);
+		return;
+	}
+
+	ret = kvm_gmem_get_pfn(kvm, slot, gfn, &pfn, &order);
+	if (ret) {
+		pr_warn_ratelimited("SEV: Unexpected RMP fault, no backing page for private GPA 0x%llx\n",
+				    gpa);
+		return;
+	}
+
+	ret = snp_lookup_rmpentry(pfn, &assigned, &rmp_level);
+	if (ret || !assigned) {
+		pr_warn_ratelimited("SEV: Unexpected RMP fault, no assigned RMP entry found for GPA 0x%llx PFN 0x%llx error %d\n",
+				    gpa, pfn, ret);
+		goto out_no_trace;
+	}
+
+	/*
+	 * There are 2 cases where a PSMASH may be needed to resolve an #NPF
+	 * with PFERR_GUEST_RMP_BIT set:
+	 *
+	 * 1) RMPADJUST/PVALIDATE can trigger an #NPF with PFERR_GUEST_SIZEM
+	 *    bit set if the guest issues them with a smaller granularity than
+	 *    what is indicated by the page-size bit in the 2MB RMP entry for
+	 *    the PFN that backs the GPA.
+	 *
+	 * 2) Guest access via NPT can trigger an #NPF if the NPT mapping is
+	 *    smaller than what is indicated by the 2MB RMP entry for the PFN
+	 *    that backs the GPA.
+	 *
+	 * In both these cases, the corresponding 2M RMP entry needs to
+	 * be PSMASH'd to 512 4K RMP entries.  If the RMP entry is already
+	 * split into 4K RMP entries, then this is likely a spurious case which
+	 * can occur when there are concurrent accesses by the guest to a 2MB
+	 * GPA range that is backed by a 2MB-aligned PFN whose RMP entry is in
+	 * the process of being PSMASH'd into 4K entries. These cases should
+	 * resolve automatically on subsequent accesses, so just ignore them
+	 * here.
+	 */
+	if (rmp_level == PG_LEVEL_4K)
+		goto out;
+
+	ret = snp_rmptable_psmash(pfn);
+	if (ret) {
+		/*
+		 * Look it up again. If it's 4K now then the PSMASH may have
+		 * raced with another process and the issue has already resolved
+		 * itself.
+		 */
+		if (!snp_lookup_rmpentry(pfn, &assigned, &rmp_level) &&
+		    assigned && rmp_level == PG_LEVEL_4K)
+			goto out;
+
+		pr_warn_ratelimited("SEV: Unable to split RMP entry for GPA 0x%llx PFN 0x%llx ret %d\n",
+				    gpa, pfn, ret);
+	}
+
+	kvm_zap_gfn_range(kvm, gfn, gfn + PTRS_PER_PMD);
+out:
+	trace_kvm_rmp_fault(vcpu, gpa, pfn, error_code, rmp_level, ret);
+out_no_trace:
+	put_page(pfn_to_page(pfn));
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 422b452fbc3b..7c9807fdafc3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2043,6 +2043,7 @@ static int pf_interception(struct kvm_vcpu *vcpu)
 static int npf_interception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	int rc;
 
 	u64 fault_address = svm->vmcb->control.exit_info_2;
 	u64 error_code = svm->vmcb->control.exit_info_1;
@@ -2060,10 +2061,22 @@ static int npf_interception(struct kvm_vcpu *vcpu)
 		error_code |= PFERR_PRIVATE_ACCESS;
 
 	trace_kvm_page_fault(vcpu, fault_address, error_code);
-	return kvm_mmu_page_fault(vcpu, fault_address, error_code,
-			static_cpu_has(X86_FEATURE_DECODEASSISTS) ?
-			svm->vmcb->control.insn_bytes : NULL,
-			svm->vmcb->control.insn_len);
+	rc = kvm_mmu_page_fault(vcpu, fault_address, error_code,
+				static_cpu_has(X86_FEATURE_DECODEASSISTS) ?
+				svm->vmcb->control.insn_bytes : NULL,
+				svm->vmcb->control.insn_len);
+
+	/*
+	 * rc == 0 indicates a userspace exit is needed to handle page
+	 * transitions, so do that first before updating the RMP table.
+	 */
+	if (error_code & PFERR_GUEST_RMP_MASK) {
+		if (rc == 0)
+			return rc;
+		sev_handle_rmp_fault(vcpu, fault_address, error_code);
+	}
+
+	return rc;
 }
 
 static int db_interception(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 730f5ced2a2e..d2b0ec27d4fe 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -722,6 +722,7 @@ void sev_hardware_unsetup(void);
 int sev_cpu_init(struct svm_cpu_data *sd);
 int sev_dev_get_attr(u32 group, u64 attr, u64 *val);
 extern unsigned int max_sev_asid;
+void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -735,6 +736,8 @@ static inline void sev_hardware_unsetup(void) {}
 static inline int sev_cpu_init(struct svm_cpu_data *sd) { return 0; }
 static inline int sev_dev_get_attr(u32 group, u64 attr, u64 *val) { return -ENXIO; }
 #define max_sev_asid 0
+static inline void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code) {}
+
 #endif
 
 /* vmenter.S */
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index c6b4b1728006..3531a187d5d9 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1834,6 +1834,37 @@ TRACE_EVENT(kvm_vmgexit_msr_protocol_exit,
 		  __entry->vcpu_id, __entry->ghcb_gpa, __entry->result)
 );
 
+/*
+ * Tracepoint for #NPFs due to RMP faults.
+ */
+TRACE_EVENT(kvm_rmp_fault,
+	TP_PROTO(struct kvm_vcpu *vcpu, u64 gpa, u64 pfn, u64 error_code,
+		 int rmp_level, int psmash_ret),
+	TP_ARGS(vcpu, gpa, pfn, error_code, rmp_level, psmash_ret),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, vcpu_id)
+		__field(u64, gpa)
+		__field(u64, pfn)
+		__field(u64, error_code)
+		__field(int, rmp_level)
+		__field(int, psmash_ret)
+	),
+
+	TP_fast_assign(
+		__entry->vcpu_id	= vcpu->vcpu_id;
+		__entry->gpa		= gpa;
+		__entry->pfn		= pfn;
+		__entry->error_code	= error_code;
+		__entry->rmp_level	= rmp_level;
+		__entry->psmash_ret	= psmash_ret;
+	),
+
+	TP_printk("vcpu %u gpa %016llx pfn 0x%llx error_code 0x%llx rmp_level %d psmash_ret %d",
+		  __entry->vcpu_id, __entry->gpa, __entry->pfn,
+		  __entry->error_code, __entry->rmp_level, __entry->psmash_ret)
+);
+
 #endif /* _TRACE_KVM_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 83b8260443a3..14693effec6b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13996,6 +13996,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_enter);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_msr_protocol_enter);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_msr_protocol_exit);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fault);
 
 static int __init kvm_x86_init(void)
 {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 12/22] KVM: SEV: Support SEV-SNP AP Creation NAE event
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (10 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 11/22] KVM: SEV: Add support to handle RMP nested page faults Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 13/22] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

Add support for the SEV-SNP AP Creation NAE event. This allows SEV-SNP
guests to alter the register state of their APs on their own, giving the
guest a way to simulate INIT-SIPI.

A new request, KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, is created and used
to avoid updating the VMSA pointer while the vCPU is running.

For CREATE:
  The guest supplies the GPA of the VMSA to be used for the vCPU with
  the specified APIC ID. The GPA is saved in the svm struct of the
  target vCPU, the KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event is added
  to the vCPU and then the vCPU is kicked.

For CREATE_ON_INIT:
  The guest supplies the GPA of the VMSA to be used for the vCPU with
  the specified APIC ID the next time an INIT is performed. The GPA is
  saved in the svm struct of the target vCPU.

For DESTROY:
  The guest indicates it wishes to stop the vCPU. The GPA is cleared
  from the svm struct, the KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event is
  added to vCPU and then the vCPU is kicked.

The KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event handler will be invoked
as a result of the event or as a result of an INIT. If a new VMSA is to
be installed, the VMSA guest page is set as the VMSA in the vCPU VMCB
and the vCPU state is set to KVM_MP_STATE_RUNNABLE. If a new VMSA is not
to be installed, the VMSA is cleared in the vCPU VMCB and the vCPU state
is set to KVM_MP_STATE_HALTED to prevent it from being run.
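
For reference, the guest side of a CREATE request encodes the event
roughly as follows (a sketch loosely modeled on the guest AP bringup
path; the GHCB and the new VMSA page are assumed to be prepared
already):

  /*
   * SW_EXITINFO1[31:0] = request type, SW_EXITINFO1[63:32] = APIC ID,
   * SW_EXITINFO2 = GPA of the new VMSA. RAX carries the SEV features
   * the VMSA was built with, which sev_snp_ap_creation() checks against
   * the VM's configured features.
   */
  ghcb_set_rax(ghcb, vmsa->sev_features);
  ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION);
  ghcb_set_sw_exit_info_1(ghcb, ((u64)apic_id << 32) | SVM_VMGEXIT_AP_CREATE);
  ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa));
  VMGEXIT();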

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/include/asm/svm.h      |   6 +
 arch/x86/kvm/svm/sev.c          | 229 +++++++++++++++++++++++++++++++-
 arch/x86/kvm/svm/svm.c          |  11 +-
 arch/x86/kvm/svm/svm.h          |   9 ++
 arch/x86/kvm/x86.c              |  11 ++
 6 files changed, 264 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 90f0de2b8645..54aafcb50d8b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -121,6 +121,7 @@
 	KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_HV_TLB_FLUSH \
 	KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_UPDATE_PROTECTED_GUEST_STATE	KVM_ARCH_REQ(34)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 544a43c1cf11..f0dea3750ca9 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -286,8 +286,14 @@ static_assert((X2AVIC_MAX_PHYSICAL_ID & AVIC_PHYSICAL_MAX_INDEX_MASK) == X2AVIC_
 #define AVIC_HPA_MASK	~((0xFFFULL << 52) | 0xFFF)
 
 #define SVM_SEV_FEAT_SNP_ACTIVE				BIT(0)
+#define SVM_SEV_FEAT_RESTRICTED_INJECTION		BIT(3)
+#define SVM_SEV_FEAT_ALTERNATE_INJECTION		BIT(4)
 #define SVM_SEV_FEAT_DEBUG_SWAP				BIT(5)
 
+#define SVM_SEV_FEAT_INT_INJ_MODES		\
+	(SVM_SEV_FEAT_RESTRICTED_INJECTION |	\
+	 SVM_SEV_FEAT_ALTERNATE_INJECTION)
+
 struct vmcb_seg {
 	u16 selector;
 	u16 attrib;
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 65882033a82f..67e245a0d2bb 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -37,7 +37,7 @@
 #define GHCB_VERSION_MAX	2ULL
 #define GHCB_VERSION_MIN	1ULL
 
-#define GHCB_HV_FT_SUPPORTED	GHCB_HV_FT_SNP
+#define GHCB_HV_FT_SUPPORTED	(GHCB_HV_FT_SNP | GHCB_HV_FT_SNP_AP_CREATION)
 
 /* enable/disable SEV support */
 static bool sev_enabled = true;
@@ -3270,6 +3270,11 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 		if (!kvm_ghcb_sw_scratch_is_valid(svm))
 			goto vmgexit_err;
 		break;
+	case SVM_VMGEXIT_AP_CREATION:
+		if (lower_32_bits(control->exit_info_1) != SVM_VMGEXIT_AP_DESTROY)
+			if (!kvm_ghcb_rax_is_valid(svm))
+				goto vmgexit_err;
+		break;
 	case SVM_VMGEXIT_NMI_COMPLETE:
 	case SVM_VMGEXIT_AP_HLT_LOOP:
 	case SVM_VMGEXIT_AP_JUMP_TABLE:
@@ -3520,6 +3525,205 @@ static int snp_complete_psc(struct kvm_vcpu *vcpu)
 	return 1; /* resume guest */
 }
 
+static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	WARN_ON(!mutex_is_locked(&svm->sev_es.snp_vmsa_mutex));
+
+	/* Mark the vCPU as offline and not runnable */
+	vcpu->arch.pv.pv_unhalted = false;
+	vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
+
+	/* Clear use of the VMSA */
+	svm->vmcb->control.vmsa_pa = INVALID_PAGE;
+
+	if (VALID_PAGE(svm->sev_es.snp_vmsa_gpa)) {
+		gfn_t gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
+		struct kvm_memory_slot *slot;
+		kvm_pfn_t pfn;
+
+		slot = gfn_to_memslot(vcpu->kvm, gfn);
+		if (!slot)
+			return -EINVAL;
+
+		/*
+		 * The new VMSA will be private guest memory, so
+		 * retrieve the PFN from the gmem backend.
+		 */
+		if (kvm_gmem_get_pfn(vcpu->kvm, slot, gfn, &pfn, NULL))
+			return -EINVAL;
+
+		/*
+		 * From this point forward, the VMSA will always be a
+		 * guest-mapped page rather than the initial one allocated
+		 * by KVM in svm->sev_es.vmsa. In theory, svm->sev_es.vmsa
+		 * could be free'd and cleaned up here, but that involves
+		 * cleanups like wbinvd_on_all_cpus() which would ideally
+		 * be handled during teardown rather than guest boot.
+		 * Deferring that also allows the existing logic for SEV-ES
+		 * VMSAs to be re-used with minimal SNP-specific changes.
+		 */
+		svm->sev_es.snp_has_guest_vmsa = true;
+
+		/* Use the new VMSA */
+		svm->vmcb->control.vmsa_pa = pfn_to_hpa(pfn);
+
+		/* Mark the vCPU as runnable */
+		vcpu->arch.pv.pv_unhalted = false;
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+
+		svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
+
+		/*
+		 * gmem pages aren't currently migratable, but if this ever
+		 * changes then care should be taken to ensure
+		 * svm->sev_es.vmsa is pinned through some other means.
+		 */
+		kvm_release_pfn_clean(pfn);
+	}
+
+	/*
+	 * When replacing the VMSA during SEV-SNP AP creation,
+	 * mark the VMCB dirty so that full state is always reloaded.
+	 */
+	vmcb_mark_all_dirty(svm->vmcb);
+
+	return 0;
+}
+
+/*
+ * Invoked as part of svm_vcpu_reset() processing of an init event.
+ */
+void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	int ret;
+
+	if (!sev_snp_guest(vcpu->kvm))
+		return;
+
+	mutex_lock(&svm->sev_es.snp_vmsa_mutex);
+
+	if (!svm->sev_es.snp_ap_waiting_for_reset)
+		goto unlock;
+
+	svm->sev_es.snp_ap_waiting_for_reset = false;
+
+	ret = __sev_snp_update_protected_guest_state(vcpu);
+	if (ret)
+		vcpu_unimpl(vcpu, "snp: AP state update on init failed\n");
+
+unlock:
+	mutex_unlock(&svm->sev_es.snp_vmsa_mutex);
+}
+
+static int sev_snp_ap_creation(struct vcpu_svm *svm)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	struct kvm_vcpu *target_vcpu;
+	struct vcpu_svm *target_svm;
+	unsigned int request;
+	unsigned int apic_id;
+	bool kick;
+	int ret;
+
+	request = lower_32_bits(svm->vmcb->control.exit_info_1);
+	apic_id = upper_32_bits(svm->vmcb->control.exit_info_1);
+
+	/* Validate the APIC ID */
+	target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, apic_id);
+	if (!target_vcpu) {
+		vcpu_unimpl(vcpu, "vmgexit: invalid AP APIC ID [%#x] from guest\n",
+			    apic_id);
+		return -EINVAL;
+	}
+
+	ret = 0;
+
+	target_svm = to_svm(target_vcpu);
+
+	/*
+	 * The target vCPU is valid, so the vCPU will be kicked unless the
+	 * request is for CREATE_ON_INIT. For any errors at this stage, the
+	 * kick will place the vCPU in a non-runnable state.
+	 */
+	kick = true;
+
+	mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
+
+	target_svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
+	target_svm->sev_es.snp_ap_waiting_for_reset = true;
+
+	/* Interrupt injection mode shouldn't change for AP creation */
+	if (request < SVM_VMGEXIT_AP_DESTROY) {
+		u64 sev_features;
+
+		sev_features = vcpu->arch.regs[VCPU_REGS_RAX];
+		sev_features ^= sev->vmsa_features;
+
+		if (sev_features & SVM_SEV_FEAT_INT_INJ_MODES) {
+			vcpu_unimpl(vcpu, "vmgexit: invalid AP injection mode [%#lx] from guest\n",
+				    vcpu->arch.regs[VCPU_REGS_RAX]);
+			ret = -EINVAL;
+			goto out;
+		}
+	}
+
+	switch (request) {
+	case SVM_VMGEXIT_AP_CREATE_ON_INIT:
+		kick = false;
+		fallthrough;
+	case SVM_VMGEXIT_AP_CREATE:
+		if (!page_address_valid(vcpu, svm->vmcb->control.exit_info_2)) {
+			vcpu_unimpl(vcpu, "vmgexit: invalid AP VMSA address [%#llx] from guest\n",
+				    svm->vmcb->control.exit_info_2);
+			ret = -EINVAL;
+			goto out;
+		}
+
+		/*
+		 * A malicious guest can RMPADJUST a large page into a VMSA,
+		 * which hits the SNP erratum where the CPU will incorrectly
+		 * signal an RMP violation #PF if a hugepage collides with the
+		 * RMP entry of the VMSA page. Reject the AP CREATE request if
+		 * the VMSA address from the guest is 2M-aligned.
+		 */
+		if (IS_ALIGNED(svm->vmcb->control.exit_info_2, PMD_SIZE)) {
+			vcpu_unimpl(vcpu,
+				    "vmgexit: AP VMSA address [%llx] from guest is unsafe as it is 2M aligned\n",
+				    svm->vmcb->control.exit_info_2);
+			ret = -EINVAL;
+			goto out;
+		}
+
+		target_svm->sev_es.snp_vmsa_gpa = svm->vmcb->control.exit_info_2;
+		break;
+	case SVM_VMGEXIT_AP_DESTROY:
+		break;
+	default:
+		vcpu_unimpl(vcpu, "vmgexit: invalid AP creation request [%#x] from guest\n",
+			    request);
+		ret = -EINVAL;
+		break;
+	}
+
+out:
+	if (kick) {
+		kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
+
+		if (target_vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)
+			kvm_make_request(KVM_REQ_UNBLOCK, target_vcpu);
+
+		kvm_vcpu_kick(target_vcpu);
+	}
+
+	mutex_unlock(&target_svm->sev_es.snp_vmsa_mutex);
+
+	return ret;
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3763,6 +3967,15 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		vcpu->run->vmgexit.psc.shared_gpa = svm->sev_es.sw_scratch;
 		vcpu->arch.complete_userspace_io = snp_complete_psc;
 		break;
+	case SVM_VMGEXIT_AP_CREATION:
+		ret = sev_snp_ap_creation(svm);
+		if (ret) {
+			ghcb_set_sw_exit_info_1(svm->sev_es.ghcb, 2);
+			ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, GHCB_ERR_INVALID_INPUT);
+		}
+
+		ret = 1;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
@@ -3857,7 +4070,7 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
 	 * the VMSA will be NULL if this vCPU is the destination for intrahost
 	 * migration, and will be copied later.
 	 */
-	if (svm->sev_es.vmsa)
+	if (svm->sev_es.vmsa && !svm->sev_es.snp_has_guest_vmsa)
 		svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);
 
 	/* Can't intercept CR register access, HV can't modify CR registers */
@@ -3930,6 +4143,8 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
 	set_ghcb_msr(svm, GHCB_MSR_SEV_INFO(GHCB_VERSION_MAX,
 					    GHCB_VERSION_MIN,
 					    sev_enc_bit));
+
+	mutex_init(&svm->sev_es.snp_vmsa_mutex);
 }
 
 void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
@@ -4041,6 +4256,16 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
 	return p;
 }
 
+void sev_vcpu_unblocking(struct kvm_vcpu *vcpu)
+{
+	if (!sev_snp_guest(vcpu->kvm))
+		return;
+
+	if (kvm_test_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, vcpu) &&
+	    vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+}
+
 void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code)
 {
 	struct kvm_memory_slot *slot;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7c9807fdafc3..b70556608e8d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1398,6 +1398,9 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	svm->spec_ctrl = 0;
 	svm->virt_spec_ctrl = 0;
 
+	if (init_event)
+		sev_snp_init_protected_guest_state(vcpu);
+
 	init_vmcb(vcpu);
 
 	if (!init_event)
@@ -4944,6 +4947,12 @@ static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu)
 	return page_address(page);
 }
 
+static void svm_vcpu_unblocking(struct kvm_vcpu *vcpu)
+{
+	sev_vcpu_unblocking(vcpu);
+	avic_vcpu_unblocking(vcpu);
+}
+
 static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.name = KBUILD_MODNAME,
 
@@ -4966,7 +4975,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.vcpu_load = svm_vcpu_load,
 	.vcpu_put = svm_vcpu_put,
 	.vcpu_blocking = avic_vcpu_blocking,
-	.vcpu_unblocking = avic_vcpu_unblocking,
+	.vcpu_unblocking = svm_vcpu_unblocking,
 
 	.update_exception_bitmap = svm_update_exception_bitmap,
 	.get_msr_feature = svm_get_msr_feature,
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index d2b0ec27d4fe..81e335dca281 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -210,6 +210,11 @@ struct vcpu_sev_es_state {
 	bool ghcb_sa_free;
 
 	u64 ghcb_registered_gpa;
+
+	struct mutex snp_vmsa_mutex; /* Used to handle concurrent updates of VMSA. */
+	gpa_t snp_vmsa_gpa;
+	bool snp_ap_waiting_for_reset;
+	bool snp_has_guest_vmsa;
 };
 
 struct vcpu_svm {
@@ -723,6 +728,8 @@ int sev_cpu_init(struct svm_cpu_data *sd);
 int sev_dev_get_attr(u32 group, u64 attr, u64 *val);
 extern unsigned int max_sev_asid;
 void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code);
+void sev_vcpu_unblocking(struct kvm_vcpu *vcpu);
+void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -737,6 +744,8 @@ static inline int sev_cpu_init(struct svm_cpu_data *sd) { return 0; }
 static inline int sev_dev_get_attr(u32 group, u64 attr, u64 *val) { return -ENXIO; }
 #define max_sev_asid 0
 static inline void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code) {}
+static inline void sev_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+static inline void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu) {}
 
 #endif
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 14693effec6b..b20f6c1b8214 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10938,6 +10938,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 		if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
 			static_call(kvm_x86_update_cpu_dirty_logging)(vcpu);
+
+		if (kvm_check_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, vcpu)) {
+			kvm_vcpu_reset(vcpu, true);
+			if (vcpu->arch.mp_state != KVM_MP_STATE_RUNNABLE) {
+				r = 1;
+				goto out;
+			}
+		}
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win ||
@@ -13145,6 +13153,9 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
 	if (kvm_test_request(KVM_REQ_PMI, vcpu))
 		return true;
 
+	if (kvm_test_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, vcpu))
+		return true;
+
 	if (kvm_arch_interrupt_allowed(vcpu) &&
 	    (kvm_cpu_has_interrupt(vcpu) ||
 	    kvm_guest_apic_has_interrupt(vcpu)))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 13/22] KVM: SEV: Add support for GHCB-based termination requests
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (11 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 12/22] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 14/22] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

GHCB version 2 adds support for a GHCB-based termination request that
a guest can issue when it reaches an error state and wishes to inform
the hypervisor that it should be terminated. Implement support for that
similarly to GHCB MSR-based termination requests that are already
available to SEV-ES guests via earlier versions of the GHCB protocol.

See 'Termination Request' in the 'Invoking VMGEXIT' section of the GHCB
specification for more details.
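
As a rough usage sketch, a VMM might consume the resulting exit as shown
below. This is hypothetical code, not part of this series: the handler name
and the shutdown policy are illustrative only.

  #include <stdio.h>
  #include <stdlib.h>
  #include <linux/kvm.h>

  /* Hypothetical handler called from a VMM's vCPU run loop. */
  static void handle_system_event(struct kvm_run *run)
  {
	if (run->exit_reason != KVM_EXIT_SYSTEM_EVENT ||
	    run->system_event.type != KVM_SYSTEM_EVENT_SEV_TERM)
		return;

	/* data[0] carries the GPA of the GHCB used for the request */
	fprintf(stderr, "guest requested termination, GHCB GPA %#llx\n",
		(unsigned long long)run->system_event.data[0]);
	exit(EXIT_FAILURE);
  }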

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 67e245a0d2bb..1d18e3497b4e 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3281,6 +3281,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 	case SVM_VMGEXIT_HV_FEATURES:
 	case SVM_VMGEXIT_PSC:
+	case SVM_VMGEXIT_TERM_REQUEST:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -3976,6 +3977,14 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 
 		ret = 1;
 		break;
+	case SVM_VMGEXIT_TERM_REQUEST:
+		pr_info("SEV-ES guest requested termination: reason %#llx info %#llx\n",
+			control->exit_info_1, control->exit_info_2);
+		vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
+		vcpu->run->system_event.ndata = 1;
+		vcpu->run->system_event.data[0] = control->ghcb_gpa;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 14/22] KVM: SEV: Implement gmem hook for initializing private pages
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (12 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 13/22] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 15/22] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

This will handle the RMP table updates needed to put a page into a
private state before mapping it into an SEV-SNP guest.
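
For orientation, here is a condensed sketch of the call path and the
granularity decision the hook below implements (a paraphrase of the patch,
not additional behavior):

  /*
   * kvm_gmem_prepare_folio()                    virt/kvm/guest_memfd.c
   *   -> kvm_arch_gmem_prepare()                arch/x86/kvm/x86.c
   *     -> static_call(kvm_x86_gmem_prepare)    == sev_gmem_prepare()
   *
   * sev_gmem_prepare() picks the RMP granularity:
   *   - one 2M RMP entry if the folio order covers 2M and the whole
   *     2M-aligned PFN range is still shared in the RMP table
   *   - a 4K RMP entry otherwise
   * and then calls rmp_make_private() on the aligned PFN/GPA.
   */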

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/Kconfig   |  1 +
 arch/x86/kvm/svm/sev.c | 98 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c |  2 +
 arch/x86/kvm/svm/svm.h |  5 +++
 arch/x86/kvm/x86.c     |  5 +++
 virt/kvm/guest_memfd.c |  4 +-
 6 files changed, 113 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 5e72faca4e8f..10768f13b240 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -137,6 +137,7 @@ config KVM_AMD_SEV
 	depends on CRYPTO_DEV_SP_PSP && !(KVM_AMD=y && CRYPTO_DEV_CCP_DD=m)
 	select ARCH_HAS_CC_PLATFORM
 	select KVM_GENERIC_PRIVATE_MEM
+	select HAVE_KVM_GMEM_PREPARE
 	help
 	  Provides support for launching Encrypted VMs (SEV) and Encrypted VMs
 	  with Encrypted State (SEV-ES) on AMD processors.
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 1d18e3497b4e..2906fee3187d 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4366,3 +4366,101 @@ void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code)
 out_no_trace:
 	put_page(pfn_to_page(pfn));
 }
+
+static bool is_pfn_range_shared(kvm_pfn_t start, kvm_pfn_t end)
+{
+	kvm_pfn_t pfn = start;
+
+	while (pfn < end) {
+		int ret, rmp_level;
+		bool assigned;
+
+		ret = snp_lookup_rmpentry(pfn, &assigned, &rmp_level);
+		if (ret) {
+			pr_warn_ratelimited("SEV: Failed to retrieve RMP entry: PFN 0x%llx GFN start 0x%llx GFN end 0x%llx RMP level %d error %d\n",
+					    pfn, start, end, rmp_level, ret);
+			return false;
+		}
+
+		if (assigned) {
+			pr_debug("%s: overlap detected, PFN 0x%llx start 0x%llx end 0x%llx RMP level %d\n",
+				 __func__, pfn, start, end, rmp_level);
+			return false;
+		}
+
+		pfn++;
+	}
+
+	return true;
+}
+
+static u8 max_level_for_order(int order)
+{
+	if (order >= KVM_HPAGE_GFN_SHIFT(PG_LEVEL_2M))
+		return PG_LEVEL_2M;
+
+	return PG_LEVEL_4K;
+}
+
+static bool is_large_rmp_possible(struct kvm *kvm, kvm_pfn_t pfn, int order)
+{
+	kvm_pfn_t pfn_aligned = ALIGN_DOWN(pfn, PTRS_PER_PMD);
+
+	/*
+	 * If this is a large folio, and the entire 2M range containing the
+	 * PFN is currently shared, then the entire 2M-aligned range can be
+	 * set to private via a single 2M RMP entry.
+	 */
+	if (max_level_for_order(order) > PG_LEVEL_4K &&
+	    is_pfn_range_shared(pfn_aligned, pfn_aligned + PTRS_PER_PMD))
+		return true;
+
+	return false;
+}
+
+int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	kvm_pfn_t pfn_aligned;
+	gfn_t gfn_aligned;
+	int level, rc;
+	bool assigned;
+
+	if (!sev_snp_guest(kvm))
+		return 0;
+
+	rc = snp_lookup_rmpentry(pfn, &assigned, &level);
+	if (rc) {
+		pr_err_ratelimited("SEV: Failed to look up RMP entry: GFN %llx PFN %llx error %d\n",
+				   gfn, pfn, rc);
+		return -ENOENT;
+	}
+
+	if (assigned) {
+		pr_debug("%s: already assigned: gfn %llx pfn %llx max_order %d level %d\n",
+			 __func__, gfn, pfn, max_order, level);
+		return 0;
+	}
+
+	if (is_large_rmp_possible(kvm, pfn, max_order)) {
+		level = PG_LEVEL_2M;
+		pfn_aligned = ALIGN_DOWN(pfn, PTRS_PER_PMD);
+		gfn_aligned = ALIGN_DOWN(gfn, PTRS_PER_PMD);
+	} else {
+		level = PG_LEVEL_4K;
+		pfn_aligned = pfn;
+		gfn_aligned = gfn;
+	}
+
+	rc = rmp_make_private(pfn_aligned, gfn_to_gpa(gfn_aligned), level, sev->asid, false);
+	if (rc) {
+		pr_err_ratelimited("SEV: Failed to update RMP entry: GFN %llx PFN %llx level %d error %d\n",
+				   gfn, pfn, level, rc);
+		return -EINVAL;
+	}
+
+	pr_debug("%s: updated: gfn %llx pfn %llx pfn_aligned %llx max_order %d level %d\n",
+		 __func__, gfn, pfn, pfn_aligned, max_order, level);
+
+	return 0;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b70556608e8d..60783e9f2ae8 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5085,6 +5085,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
 	.vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
 	.alloc_apic_backing_page = svm_alloc_apic_backing_page,
+
+	.gmem_prepare = sev_gmem_prepare,
 };
 
 /*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 81e335dca281..7712ed90aae8 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -730,6 +730,7 @@ extern unsigned int max_sev_asid;
 void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code);
 void sev_vcpu_unblocking(struct kvm_vcpu *vcpu);
 void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu);
+int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -746,6 +747,10 @@ static inline int sev_dev_get_attr(u32 group, u64 attr, u64 *val) { return -ENXI
 static inline void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code) {}
 static inline void sev_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 static inline void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu) {}
+static inline int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order)
+{
+	return 0;
+}
 
 #endif
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b20f6c1b8214..0fb76ef9b7e9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13610,6 +13610,11 @@ bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
 EXPORT_SYMBOL_GPL(kvm_arch_no_poll);
 
 #ifdef CONFIG_HAVE_KVM_GMEM_PREPARE
+bool kvm_arch_gmem_prepare_needed(struct kvm *kvm)
+{
+	return kvm->arch.vm_type == KVM_X86_SNP_VM;
+}
+
 int kvm_arch_gmem_prepare(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, int max_order)
 {
 	return static_call(kvm_x86_gmem_prepare)(kvm, pfn, gfn, max_order);
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index a44f983eb673..7d3932e5a689 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -46,8 +46,8 @@ static int kvm_gmem_prepare_folio(struct inode *inode, pgoff_t index, struct fol
 		gfn = slot->base_gfn + index - slot->gmem.pgoff;
 		rc = kvm_arch_gmem_prepare(kvm, gfn, pfn, compound_order(compound_head(page)));
 		if (rc) {
-			pr_warn_ratelimited("gmem: Failed to prepare folio for index %lx, error %d.\n",
-					    index, rc);
+			pr_warn_ratelimited("gmem: Failed to prepare folio for index %lx GFN %llx PFN %llx error %d.\n",
+					    index, gfn, pfn, rc);
 			return rc;
 		}
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 15/22] KVM: SEV: Implement gmem hook for invalidating private pages
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (13 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 14/22] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 16/22] KVM: x86: Implement gmem hook for determining max NPT mapping level Michael Roth
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

Implement a platform hook to do the work of restoring the direct map
entries of gmem-managed pages and transitioning the corresponding RMP
table entries back to the default shared/hypervisor-owned state.
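
As a usage note, one run-time path into this hook is userspace punching a
hole in the guest_memfd. A minimal sketch, assuming the gmem fd was created
earlier via KVM_CREATE_GUEST_MEMFD:

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <linux/falloc.h>

  /*
   * Hypothetical helper: discard a range of private pages at run-time.
   * Freeing the backing pages is what ultimately reaches the
   * gmem_invalidate hook (sev_gmem_invalidate() for SNP guests).
   */
  static int discard_private_range(int gmem_fd, off_t offset, off_t len)
  {
	return fallocate(gmem_fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len);
  }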

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/Kconfig   |  1 +
 arch/x86/kvm/svm/sev.c | 64 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c |  1 +
 arch/x86/kvm/svm/svm.h |  2 ++
 4 files changed, 68 insertions(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 10768f13b240..2a7f69abcac3 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -138,6 +138,7 @@ config KVM_AMD_SEV
 	select ARCH_HAS_CC_PLATFORM
 	select KVM_GENERIC_PRIVATE_MEM
 	select HAVE_KVM_GMEM_PREPARE
+	select HAVE_KVM_GMEM_INVALIDATE
 	help
 	  Provides support for launching Encrypted VMs (SEV) and Encrypted VMs
 	  with Encrypted State (SEV-ES) on AMD processors.
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2906fee3187d..ff9b8c68ae56 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4464,3 +4464,67 @@ int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order)
 
 	return 0;
 }
+
+void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
+{
+	kvm_pfn_t pfn;
+
+	pr_debug("%s: PFN start 0x%llx PFN end 0x%llx\n", __func__, start, end);
+
+	for (pfn = start; pfn < end;) {
+		bool use_2m_update = false;
+		int rc, rmp_level;
+		bool assigned;
+
+		rc = snp_lookup_rmpentry(pfn, &assigned, &rmp_level);
+		if (WARN_ONCE(rc, "SEV: Failed to retrieve RMP entry for PFN 0x%llx error %d\n",
+			      pfn, rc))
+			goto next_pfn;
+
+		if (!assigned)
+			goto next_pfn;
+
+		use_2m_update = IS_ALIGNED(pfn, PTRS_PER_PMD) &&
+				end >= (pfn + PTRS_PER_PMD) &&
+				rmp_level > PG_LEVEL_4K;
+
+		/*
+		 * If an unaligned PFN corresponds to a 2M region assigned as a
+		 * large page in the RMP table, PSMASH the region into individual
+		 * 4K RMP entries before attempting to convert a 4K sub-page.
+		 */
+		if (!use_2m_update && rmp_level > PG_LEVEL_4K) {
+			/*
+			 * This shouldn't fail, but if it does, report it, but
+			 * still try to update RMP entry to shared and pray this
+			 * was a spurious error that can be addressed later.
+			 */
+			rc = snp_rmptable_psmash(pfn);
+			WARN_ONCE(rc, "SEV: Failed to PSMASH RMP entry for PFN 0x%llx error %d\n",
+				  pfn, rc);
+		}
+
+		rc = rmp_make_shared(pfn, use_2m_update ? PG_LEVEL_2M : PG_LEVEL_4K);
+		if (WARN_ONCE(rc, "SEV: Failed to update RMP entry for PFN 0x%llx error %d\n",
+			      pfn, rc))
+			goto next_pfn;
+
+		/*
+		 * SEV-ES avoids host/guest cache coherency issues through
+		 * WBINVD hooks issued via MMU notifiers during run-time, and
+		 * KVM's VM destroy path at shutdown. Those MMU notifier events
+		 * don't cover gmem since there is no requirement to map pages
+		 * to a HVA in order to use them for a running guest. While the
+		 * shutdown path would still likely cover things for SNP guests,
+		 * userspace may also free gmem pages during run-time via
+		 * hole-punching operations on the guest_memfd, so flush the
+		 * cache entries for these pages before freeing them back to
+		 * the host.
+		 */
+		clflush_cache_range(__va(pfn_to_hpa(pfn)),
+				    use_2m_update ? PMD_SIZE : PAGE_SIZE);
+next_pfn:
+		pfn += use_2m_update ? PTRS_PER_PMD : 1;
+		cond_resched();
+	}
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 60783e9f2ae8..29dc5fa28d97 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5087,6 +5087,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.alloc_apic_backing_page = svm_alloc_apic_backing_page,
 
 	.gmem_prepare = sev_gmem_prepare,
+	.gmem_invalidate = sev_gmem_invalidate,
 };
 
 /*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7712ed90aae8..6721e5c6cf73 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -731,6 +731,7 @@ void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code);
 void sev_vcpu_unblocking(struct kvm_vcpu *vcpu);
 void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu);
 int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order);
+void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -751,6 +752,7 @@ static inline int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, in
 {
 	return 0;
 }
+static inline void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end) {}
 
 #endif
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 16/22] KVM: x86: Implement gmem hook for determining max NPT mapping level
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (14 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 15/22] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-25  0:45   ` Sean Christopherson
  2024-04-21 18:01 ` [PATCH v14 17/22] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
                   ` (7 subsequent siblings)
  23 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

In the case of SEV-SNP, whether or not a 2MB page can be mapped via a
2MB mapping in the guest's nested page table depends on whether or not
any subpages within the range have already been initialized as private
in the RMP table. The existing mixed-attribute tracking in KVM is
insufficient here, for instance:

  - gmem allocates 2MB page
  - guest issues PVALIDATE on 2MB page
  - guest later converts a subpage to shared
  - SNP host code issues PSMASH to split 2MB RMP mapping to 4K
  - KVM MMU splits NPT mapping to 4K
  - guest later converts that shared page back to private

At this point there are no mixed attributes, and KVM would normally
allow for 2MB NPT mappings again. However, the RMP table mappings
remain 4K and cannot be promoted on the hypervisor side, so the NPT
mappings must still be limited to 4K to match.

Implement a kvm_x86_ops.gmem_validate_fault() hook for SEV that checks
for this condition and adjusts the mapping level accordingly.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 32 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c |  1 +
 arch/x86/kvm/svm/svm.h |  7 +++++++
 3 files changed, 40 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index ff9b8c68ae56..243369e302f4 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4528,3 +4528,35 @@ void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
 		cond_resched();
 	}
 }
+
+/*
+ * Re-check whether an #NPF for a private/gmem page can still be serviced, and
+ * adjust maximum mapping level if needed.
+ */
+int sev_gmem_validate_fault(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, bool is_private,
+			    u8 *max_level)
+{
+	int level, rc;
+	bool assigned;
+
+	if (!sev_snp_guest(kvm))
+		return 0;
+
+	rc = snp_lookup_rmpentry(pfn, &assigned, &level);
+	if (rc) {
+		pr_err_ratelimited("SEV: RMP entry not found: GFN %llx PFN %llx level %d error %d\n",
+				   gfn, pfn, level, rc);
+		return -ENOENT;
+	}
+
+	if (!assigned) {
+		pr_err_ratelimited("SEV: RMP entry is not assigned: GFN %llx PFN %llx level %d\n",
+				   gfn, pfn, level);
+		return -EINVAL;
+	}
+
+	if (level < *max_level)
+		*max_level = level;
+
+	return 0;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 29dc5fa28d97..c26a7a933b93 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5088,6 +5088,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 
 	.gmem_prepare = sev_gmem_prepare,
 	.gmem_invalidate = sev_gmem_invalidate,
+	.gmem_validate_fault = sev_gmem_validate_fault,
 };
 
 /*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6721e5c6cf73..8a8ee475ad86 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -732,6 +732,8 @@ void sev_vcpu_unblocking(struct kvm_vcpu *vcpu);
 void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu);
 int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order);
 void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end);
+int sev_gmem_validate_fault(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, bool is_private,
+			    u8 *max_level);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -753,6 +755,11 @@ static inline int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, in
 	return 0;
 }
 static inline void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end) {}
+static inline int sev_gmem_validate_fault(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn,
+					  bool is_private, u8 *max_level)
+{
+	return 0;
+}
 
 #endif
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 17/22] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (15 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 16/22] KVM: x86: Implement gmem hook for determining max NPT mapping level Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-25  0:17   ` Sean Christopherson
  2024-04-21 18:01 ` [PATCH v14 18/22] KVM: SVM: Add module parameter to enable SEV-SNP Michael Roth
                   ` (6 subsequent siblings)
  23 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

From: Ashish Kalra <ashish.kalra@amd.com>

With SNP/guest_memfd, private/encrypted memory should not be mappable,
and MMU notifications for HVA-mapped memory will only be relevant to
unencrypted guest memory. Therefore, the rationale behind issuing a
wbinvd_on_all_cpus() in sev_guest_memory_reclaimed() does not apply to
SNP guests, and the flush can be skipped.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[mdr: Add some clarifications in commit]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 243369e302f4..cf00a811aca5 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3042,7 +3042,14 @@ static void sev_flush_encrypted_page(struct kvm_vcpu *vcpu, void *va)
 
 void sev_guest_memory_reclaimed(struct kvm *kvm)
 {
-	if (!sev_guest(kvm))
+	/*
+	 * With SNP+gmem, private/encrypted memory should be
+	 * unreachable via the hva-based mmu notifiers. Additionally,
+	 * for shared->private translations, H/W coherency will ensure
+	 * first guest access to the page would clear out any existing
+	 * dirty copies of that cacheline.
+	 */
+	if (!sev_guest(kvm) || sev_snp_guest(kvm))
 		return;
 
 	wbinvd_on_all_cpus();
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 18/22] KVM: SVM: Add module parameter to enable SEV-SNP
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (16 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 17/22] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 19/22] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

Add a module parameter that can be used to enable or disable the SEV-SNP
feature. Now that KVM contains support for SNP, set the GHCB hypervisor
feature flag to indicate that SNP is supported.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/kvm/svm/sev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index cf00a811aca5..c354aca721e5 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -48,7 +48,8 @@ static bool sev_es_enabled = true;
 module_param_named(sev_es, sev_es_enabled, bool, 0444);
 
 /* enable/disable SEV-SNP support */
-static bool sev_snp_enabled;
+static bool sev_snp_enabled = true;
+module_param_named(sev_snp, sev_snp_enabled, bool, 0444);
 
 /* enable/disable SEV-ES DebugSwap support */
 static bool sev_es_debug_swap_enabled = true;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 19/22] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (17 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 18/22] KVM: SVM: Add module parameter to enable SEV-SNP Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 20/22] crypto: ccp: Add the SNP_VLEK_LOAD command Michael Roth
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh, Alexey Kardashevskiy

From: Brijesh Singh <brijesh.singh@amd.com>

Version 2 of the GHCB specification added support for the SNP Guest Request
Message NAE event. The event allows for an SEV-SNP guest to make
requests to the SEV-SNP firmware through hypervisor using the
SNP_GUEST_REQUEST API defined in the SEV-SNP firmware specification.

This is used by guests primarily to request attestation reports from
firmware. Other request types are available as well, but the
specifics of what guest requests are being made are opaque to the
hypervisor, which only serves as a proxy for the guest requests and
firmware responses.

Implement handling for these events.
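
For reference, the value written back to the guest's SW_EXITINFO2 packs the
VMM error into the upper 32 bits and the firmware error into the lower 32
bits. An illustration of the SNP_GUEST_ERR() encoding added below:

  /*
   * Illustration only:
   *
   *   SNP_GUEST_ERR(SNP_GUEST_VMM_ERR_GENERIC, 0) == (u64)BIT(31) << 32
   *   SNP_GUEST_ERR(0, fw_err)                    == fw_err & GENMASK_ULL(31, 0)
   *
   * A non-zero upper half tells the guest the VMM failed to proxy the
   * request; a non-zero lower half relays the SNP firmware's own status.
   */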

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
[mdr: ensure FW command failures are indicated to guest, drop extended
 request handling to be re-written as separate patch, massage commit]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c         | 83 ++++++++++++++++++++++++++++++++++
 include/uapi/linux/sev-guest.h |  9 ++++
 2 files changed, 92 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c354aca721e5..68db390b19d0 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -19,6 +19,7 @@
 #include <linux/misc_cgroup.h>
 #include <linux/processor.h>
 #include <linux/trace_events.h>
+#include <uapi/linux/sev-guest.h>
 
 #include <asm/pkru.h>
 #include <asm/trapnr.h>
@@ -3290,6 +3291,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_HV_FEATURES:
 	case SVM_VMGEXIT_PSC:
 	case SVM_VMGEXIT_TERM_REQUEST:
+	case SVM_VMGEXIT_GUEST_REQUEST:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -3733,6 +3735,83 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
 	return ret;
 }
 
+static bool snp_setup_guest_buf(struct kvm *kvm, struct sev_data_snp_guest_request *data,
+				gpa_t req_gpa, gpa_t resp_gpa)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	kvm_pfn_t req_pfn, resp_pfn;
+
+	if (!PAGE_ALIGNED(req_gpa) || !PAGE_ALIGNED(resp_gpa))
+		return false;
+
+	req_pfn = gfn_to_pfn(kvm, gpa_to_gfn(req_gpa));
+	if (is_error_noslot_pfn(req_pfn))
+		return false;
+
+	resp_pfn = gfn_to_pfn(kvm, gpa_to_gfn(resp_gpa));
+	if (is_error_noslot_pfn(resp_pfn))
+		return false;
+
+	if (rmp_make_private(resp_pfn, 0, PG_LEVEL_4K, 0, true))
+		return false;
+
+	data->gctx_paddr = __psp_pa(sev->snp_context);
+	data->req_paddr = __sme_set(req_pfn << PAGE_SHIFT);
+	data->res_paddr = __sme_set(resp_pfn << PAGE_SHIFT);
+
+	return true;
+}
+
+static bool snp_cleanup_guest_buf(struct sev_data_snp_guest_request *data)
+{
+	u64 pfn = __sme_clr(data->res_paddr) >> PAGE_SHIFT;
+
+	if (snp_page_reclaim(pfn))
+		return false;
+
+	if (rmp_make_shared(pfn, PG_LEVEL_4K))
+		return false;
+
+	return true;
+}
+
+static bool __snp_handle_guest_req(struct kvm *kvm, gpa_t req_gpa, gpa_t resp_gpa,
+				   sev_ret_code *fw_err)
+{
+	struct sev_data_snp_guest_request data = {0};
+	struct kvm_sev_info *sev;
+	bool ret = true;
+
+	if (!sev_snp_guest(kvm))
+		return false;
+
+	sev = &to_kvm_svm(kvm)->sev_info;
+
+	if (!snp_setup_guest_buf(kvm, &data, req_gpa, resp_gpa))
+		return false;
+
+	if (sev_issue_cmd(kvm, SEV_CMD_SNP_GUEST_REQUEST, &data, fw_err))
+		ret = false;
+
+	if (!snp_cleanup_guest_buf(&data))
+		ret = false;
+
+	return ret;
+}
+
+static void snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp_gpa)
+{
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	struct kvm *kvm = vcpu->kvm;
+	sev_ret_code fw_err = 0;
+	int vmm_ret = 0;
+
+	if (!__snp_handle_guest_req(kvm, req_gpa, resp_gpa, &fw_err))
+		vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
+
+	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3993,6 +4072,10 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		vcpu->run->system_event.ndata = 1;
 		vcpu->run->system_event.data[0] = control->ghcb_gpa;
 		break;
+	case SVM_VMGEXIT_GUEST_REQUEST:
+		snp_handle_guest_req(svm, control->exit_info_1, control->exit_info_2);
+		ret = 1;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
index 154a87a1eca9..7bd78e258569 100644
--- a/include/uapi/linux/sev-guest.h
+++ b/include/uapi/linux/sev-guest.h
@@ -89,8 +89,17 @@ struct snp_ext_report_req {
 #define SNP_GUEST_FW_ERR_MASK		GENMASK_ULL(31, 0)
 #define SNP_GUEST_VMM_ERR_SHIFT		32
 #define SNP_GUEST_VMM_ERR(x)		(((u64)x) << SNP_GUEST_VMM_ERR_SHIFT)
+#define SNP_GUEST_FW_ERR(x)		((x) & SNP_GUEST_FW_ERR_MASK)
+#define SNP_GUEST_ERR(vmm_err, fw_err)	(SNP_GUEST_VMM_ERR(vmm_err) | \
+					 SNP_GUEST_FW_ERR(fw_err))
 
+/*
+ * The GHCB spec only formally defines INVALID_LEN/BUSY VMM errors, so define
+ * a GENERIC error code such that it won't ever conflict with GHCB-defined
+ * errors if any get added in the future.
+ */
 #define SNP_GUEST_VMM_ERR_INVALID_LEN	1
 #define SNP_GUEST_VMM_ERR_BUSY		2
+#define SNP_GUEST_VMM_ERR_GENERIC	BIT(31)
 
 #endif /* __UAPI_LINUX_SEV_GUEST_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 20/22] crypto: ccp: Add the SNP_VLEK_LOAD command
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (18 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 19/22] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-21 18:01 ` [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands Michael Roth
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

When requesting an attestation report, a guest is able to specify whether
it wants SNP firmware to sign the report using either a Versioned Chip
Endorsement Key (VCEK), which is derived from chip-unique secrets, or a
Versioned Loaded Endorsement Key (VLEK) which is obtained from an AMD
Key Derivation Service (KDS) and derived from seeds allocated to
enrolled cloud service providers (CSPs).

For VLEK keys, an SNP_VLEK_LOAD SNP firmware command is used to load
them into the system after obtaining them from the KDS. Add a
corresponding userspace interface to allow the loading of VLEK keys
into the system.

See SEV-SNP Firmware ABI 1.54, SNP_VLEK_LOAD for more details.
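
As a usage sketch, loading a wrapped VLEK hashstick from userspace might
look like the following. This is hypothetical code: it assumes the blob has
already been fetched from KDS, and uses the existing /dev/sev node and
SEV_ISSUE_CMD ioctl of the ccp driver:

  #include <fcntl.h>
  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <unistd.h>
  #include <linux/psp-sev.h>

  /* Hypothetical: load a wrapped VLEK hashstick obtained from AMD KDS. */
  static int load_vlek(struct sev_user_data_snp_wrapped_vlek_hashstick *hs)
  {
	struct sev_user_data_snp_vlek_load load = {
		.len = sizeof(load),
		.vlek_wrapped_version = 0,
		.vlek_wrapped_address = (__u64)(uintptr_t)hs,
	};
	struct sev_issue_cmd cmd = {
		.cmd = SNP_VLEK_LOAD,
		.data = (__u64)(uintptr_t)&load,
	};
	int fd = open("/dev/sev", O_RDWR);
	int ret;

	if (fd < 0)
		return -1;
	ret = ioctl(fd, SEV_ISSUE_CMD, &cmd);
	close(fd);
	return ret;
  }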

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 Documentation/virt/coco/sev-guest.rst | 19 ++++++++++++++
 drivers/crypto/ccp/sev-dev.c          | 36 +++++++++++++++++++++++++++
 include/uapi/linux/psp-sev.h          | 27 ++++++++++++++++++++
 3 files changed, 82 insertions(+)

diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst
index e1eaf6a830ce..de68d3a4b540 100644
--- a/Documentation/virt/coco/sev-guest.rst
+++ b/Documentation/virt/coco/sev-guest.rst
@@ -176,6 +176,25 @@ to SNP_CONFIG command defined in the SEV-SNP spec. The current values of
 the firmware parameters affected by this command can be queried via
 SNP_PLATFORM_STATUS.
 
+2.7 SNP_VLEK_LOAD
+-----------------
+:Technology: sev-snp
+:Type: hypervisor ioctl cmd
+:Parameters (in): struct sev_user_data_snp_vlek_load
+:Returns (out): 0 on success, -negative on error
+
+When requesting an attestation report, a guest is able to specify whether
+it wants SNP firmware to sign the report using either a Versioned Chip
+Endorsement Key (VCEK), which is derived from chip-unique secrets, or a
+Versioned Loaded Endorsement Key (VLEK) which is obtained from an AMD
+Key Derivation Service (KDS) and derived from seeds allocated to
+enrolled cloud service providers.
+
+In the case of VLEK keys, the SNP_VLEK_LOAD SNP command is used to load
+them into the system after obtaining them from the KDS, and corresponds
+closely to the SNP_VLEK_LOAD firmware command specified in the SEV-SNP
+spec.
+
 3. SEV-SNP CPUID Enforcement
 ============================
 
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 2102377f727b..97a7959406ee 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -2027,6 +2027,39 @@ static int sev_ioctl_do_snp_set_config(struct sev_issue_cmd *argp, bool writable
 	return __sev_do_cmd_locked(SEV_CMD_SNP_CONFIG, &config, &argp->error);
 }
 
+static int sev_ioctl_do_snp_vlek_load(struct sev_issue_cmd *argp, bool writable)
+{
+	struct sev_device *sev = psp_master->sev_data;
+	struct sev_user_data_snp_vlek_load input;
+	void *blob;
+	int ret;
+
+	if (!sev->snp_initialized || !argp->data)
+		return -EINVAL;
+
+	if (!writable)
+		return -EPERM;
+
+	if (copy_from_user(&input, u64_to_user_ptr(argp->data), sizeof(input)))
+		return -EFAULT;
+
+	if (input.len != sizeof(input) || input.vlek_wrapped_version != 0)
+		return -EINVAL;
+
+	blob = psp_copy_user_blob(input.vlek_wrapped_address,
+				  sizeof(struct sev_user_data_snp_wrapped_vlek_hashstick));
+	if (IS_ERR(blob))
+		return PTR_ERR(blob);
+
+	input.vlek_wrapped_address = __psp_pa(blob);
+
+	ret = __sev_do_cmd_locked(SEV_CMD_SNP_VLEK_LOAD, &input, &argp->error);
+
+	kfree(blob);
+
+	return ret;
+}
+
 static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
@@ -2087,6 +2120,9 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 	case SNP_SET_CONFIG:
 		ret = sev_ioctl_do_snp_set_config(&input, writable);
 		break;
+	case SNP_VLEK_LOAD:
+		ret = sev_ioctl_do_snp_vlek_load(&input, writable);
+		break;
 	default:
 		ret = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index b7a2c2ee35b7..2289b7c76c59 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -31,6 +31,7 @@ enum {
 	SNP_PLATFORM_STATUS,
 	SNP_COMMIT,
 	SNP_SET_CONFIG,
+	SNP_VLEK_LOAD,
 
 	SEV_MAX,
 };
@@ -214,6 +215,32 @@ struct sev_user_data_snp_config {
 	__u8 rsvd1[52];
 } __packed;
 
+/**
+ * struct sev_user_data_snp_vlek_load - SNP_VLEK_LOAD structure
+ *
+ * @len: length of the command buffer read by the PSP
+ * @vlek_wrapped_version: version of wrapped VLEK hashstick (Must be 0h)
+ * @rsvd: reserved
+ * @vlek_wrapped_address: address of a wrapped VLEK hashstick
+ *                        (struct sev_user_data_snp_wrapped_vlek_hashstick)
+ */
+struct sev_user_data_snp_vlek_load {
+	__u32 len;				/* In */
+	__u8 vlek_wrapped_version;		/* In */
+	__u8 rsvd[3];				/* In */
+	__u64 vlek_wrapped_address;		/* In */
+} __packed;
+
+/**
+ * struct sev_user_data_snp_wrapped_vlek_hashstick - Wrapped VLEK data
+ *
+ * @data: Opaque data provided by AMD KDS (as described in SEV-SNP Firmware ABI
+ *        1.54, SNP_VLEK_LOAD)
+ */
+struct sev_user_data_snp_wrapped_vlek_hashstick {
+	__u8 data[432];				/* In */
+} __packed;
+
 /**
  * struct sev_issue_cmd - SEV ioctl parameters
  *
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (19 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 20/22] crypto: ccp: Add the SNP_VLEK_LOAD command Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-25  0:15   ` Sean Christopherson
  2024-04-21 18:01 ` [PATCH v14 22/22] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth
                   ` (2 subsequent siblings)
  23 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

These commands can be used to pause servicing of guest attestation
requests. This is useful when updating the reported TCB or signing key with
commands such as SNP_SET_CONFIG/SNP_COMMIT/SNP_VLEK_LOAD, since they may
in turn require updates to userspace-supplied certificates, and if an
attestation request happens to be in-flight at the time those updates
are occurring there is potential for a guest to receive a certificate
blob that is out of sync with the effective signing key for the
attestation report.

These interfaces also provide some flexibility in how similar
firmware/certificate update activities can be handled in the future.
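
A minimal sketch of the intended update flow from userspace, where
update_tcb_and_certs() is a hypothetical stand-in for whatever combination
of SNP_SET_CONFIG/SNP_COMMIT/SNP_VLEK_LOAD and certificate refresh the host
performs:

  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <linux/psp-sev.h>

  static void update_tcb_and_certs(void)
  {
	/* SNP_SET_CONFIG / SNP_COMMIT / SNP_VLEK_LOAD + cert refresh */
  }

  static int do_attestation_update(int sev_fd)
  {
	struct sev_user_data_snp_pause_attestation txn = { 0 };
	struct sev_issue_cmd cmd = {
		.cmd = SNP_PAUSE_ATTESTATION,
		.data = (__u64)(uintptr_t)&txn,
	};
	uint64_t start_id;

	if (ioctl(sev_fd, SEV_ISSUE_CMD, &cmd))
		return -1;	/* e.g. -EBUSY: another pause in flight */
	start_id = txn.id;

	update_tcb_and_certs();

	cmd.cmd = SNP_RESUME_ATTESTATION;
	if (ioctl(sev_fd, SEV_ISSUE_CMD, &cmd))
		return -1;

	/* mismatched IDs mean the transaction was interleaved or aborted */
	return txn.id == start_id ? 0 : -1;
  }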

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 Documentation/virt/coco/sev-guest.rst | 50 +++++++++++++++++++++++++--
 arch/x86/include/asm/sev.h            |  6 ++++
 arch/x86/virt/svm/sev.c               | 43 +++++++++++++++++++++++
 drivers/crypto/ccp/sev-dev.c          | 47 +++++++++++++++++++++++++
 include/uapi/linux/psp-sev.h          | 12 +++++++
 5 files changed, 156 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst
index de68d3a4b540..ab192a008ba7 100644
--- a/Documentation/virt/coco/sev-guest.rst
+++ b/Documentation/virt/coco/sev-guest.rst
@@ -128,8 +128,6 @@ the SEV-SNP specification for further details.
 
 The SNP_GET_EXT_REPORT ioctl is similar to the SNP_GET_REPORT. The difference is
 related to the additional certificate data that is returned with the report.
-The certificate data returned is being provided by the hypervisor through the
-SNP_SET_EXT_CONFIG.
 
 The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command provided by the SEV-SNP
 firmware to get the attestation report.
@@ -195,6 +193,54 @@ them into the system after obtaining them from the KDS, and corresponds
 closely to the SNP_VLEK_LOAD firmware command specified in the SEV-SNP
 spec.
 
+2.8 SNP_PAUSE_ATTESTATION / SNP_RESUME_ATTESTATION
+--------------------------------------------------
+:Technology: sev-snp
+:Type: hypervisor ioctl cmd
+:Parameters (out): struct sev_user_data_snp_pause_attestation
+:Returns (out): 0 on success, -negative on error
+
+When requesting attestation reports, SNP guests have the option of issuing
+an extended guest request which allows host userspace to supply additional
+certificate data that can be used to validate the signature used to sign
+the attestation report. This signature is generated using a key that is
+derived from the reported TCB that can be set via the SNP_SET_CONFIG and
+SNP_COMMIT ioctls, so the accompanying certificate data needs to be kept in
+sync with the changes made to the reported TCB via these ioctls.
+
+Similarly, interfaces like SNP_VLEK_LOAD can modify the key used to sign
+the attestation reports, which may in turn require updating the certificate
+data provided to guests via extended guest requests.
+
+To allow for updating the reported TCB, endorsement key, and any certificate
+data in a manner that is atomic to guests, the SNP_PAUSE_ATTESTATION and
+SNP_RESUME_ATTESTATION commands are provided.
+
+After SNP_PAUSE_ATTESTATION is issued, any attestation report requests via
+extended guest requests that are in-progress, or received after
+SNP_PAUSE_ATTESTATION is issued, will result in the guest receiving a
+GHCB-defined error message instructing it to retry the request. Once all
+the desired reported TCB, endorsement keys, or certificate data updates
+are completed on the host, the SNP_RESUME_ATTESTATION command must be
+issued to allow guest attestation requests to proceed.
+
+In general, hosts should serialize updates of this sort and never have more
+than one outstanding transaction in flight that could result in the
+interleaving of multiple SNP_PAUSE_ATTESTATION/SNP_RESUME_ATTESTATION pairs.
+To guard against this, SNP_PAUSE_ATTESTATION will fail if another process
+has already paused attestation requests.
+
+However, there may be occasions where a transaction needs to be aborted due
+to unexpected activity in userspace such as timeouts, crashes, etc., so
+SNP_RESUME_ATTESTATION will always succeed. Nonetheless, this could
+potentially lead to SNP_RESUME_ATTESTATION being called out of sequence, so
+to allow for callers of SNP_{PAUSE,RESUME}_ATTESTATION to detect such
+occurrences, each ioctl will return a transaction ID in the response so the
+caller can monitor whether the start/end IDs both match. If they don't, the
+caller should assume that attestation has been paused/resumed unexpectedly
+and take whatever measures it deems necessary, such as logging, reporting,
+or auditing the sequence of events.
+
 3. SEV-SNP CPUID Enforcement
 ============================
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 3a06f06b847a..ee24ef815e35 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -82,6 +82,8 @@ extern void vc_no_ghcb(void);
 extern void vc_boot_ghcb(void);
 extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 
+extern struct mutex snp_pause_attestation_lock;
+
 /* PVALIDATE return codes */
 #define PVALIDATE_FAIL_SIZEMISMATCH	6
 
@@ -272,6 +274,8 @@ int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 asid, bool immut
 int rmp_make_shared(u64 pfn, enum pg_level level);
 void snp_leak_pages(u64 pfn, unsigned int npages);
 void kdump_sev_callback(void);
+int snp_pause_attestation(u64 *transaction_id);
+void snp_resume_attestation(u64 *transaction_id);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
@@ -285,6 +289,8 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 as
 static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
 static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
 static inline void kdump_sev_callback(void) { }
+static inline int snp_pause_attestation(u64 *transaction_id) { return 0; }
+static inline void snp_resume_attestation(u64 *transaction_id) {}
 #endif
 
 #endif
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index ab0e8448bb6e..b75f2e7d4012 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -70,6 +70,11 @@ static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
 
 static unsigned long snp_nr_leaked_pages;
 
+/* For synchronizing TCB/certificate updates with extended guest requests */
+DEFINE_MUTEX(snp_pause_attestation_lock);
+static u64 snp_transaction_id;
+static bool snp_attestation_paused;
+
 #undef pr_fmt
 #define pr_fmt(fmt)	"SEV-SNP: " fmt
 
@@ -568,3 +573,41 @@ void kdump_sev_callback(void)
 	if (cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		wbinvd();
 }
+
+int snp_pause_attestation(u64 *transaction_id)
+{
+	mutex_lock(&snp_pause_attestation_lock);
+
+	if (snp_attestation_paused) {
+		mutex_unlock(&snp_pause_attestation_lock);
+		return -EBUSY;
+	}
+
+	/*
+	 * The actual transaction ID update will happen when
+	 * snp_resume_attestation() is called, so return
+	 * the *anticipated* transaction ID that will be
+	 * returned by snp_resume_attestation(). This is
+	 * to ensure that unbalanced/aborted transactions will
+	 * be noticeable when the caller that started the
+	 * transaction calls snp_resume_attestation().
+	 */
+	*transaction_id = snp_transaction_id + 1;
+	snp_attestation_paused = true;
+
+	mutex_unlock(&snp_pause_attestation_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(snp_pause_attestation);
+
+void snp_resume_attestation(u64 *transaction_id)
+{
+	mutex_lock(&snp_pause_attestation_lock);
+
+	snp_attestation_paused = false;
+	*transaction_id = ++snp_transaction_id;
+
+	mutex_unlock(&snp_pause_attestation_lock);
+}
+EXPORT_SYMBOL_GPL(snp_resume_attestation);
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 97a7959406ee..7eb18a273731 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -2060,6 +2060,47 @@ static int sev_ioctl_do_snp_vlek_load(struct sev_issue_cmd *argp, bool writable)
 	return ret;
 }
 
+static int sev_ioctl_do_snp_pause_attestation(struct sev_issue_cmd *argp, bool writable)
+{
+	struct sev_user_data_snp_pause_attestation transaction = {0};
+	struct sev_device *sev = psp_master->sev_data;
+	int ret;
+
+	if (!sev->snp_initialized || !argp->data)
+		return -EINVAL;
+
+	if (!writable)
+		return -EPERM;
+
+	ret = snp_pause_attestation(&transaction.id);
+	if (ret)
+		return ret;
+
+	if (copy_to_user((void __user *)argp->data, &transaction, sizeof(transaction)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int sev_ioctl_do_snp_resume_attestation(struct sev_issue_cmd *argp, bool writable)
+{
+	struct sev_user_data_snp_pause_attestation transaction = {0};
+	struct sev_device *sev = psp_master->sev_data;
+
+	if (!sev->snp_initialized || !argp->data)
+		return -EINVAL;
+
+	if (!writable)
+		return -EPERM;
+
+	snp_resume_attestation(&transaction.id);
+
+	if (copy_to_user((void __user *)argp->data, &transaction, sizeof(transaction)))
+		return -EFAULT;
+
+	return 0;
+}
+
 static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
@@ -2123,6 +2164,12 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 	case SNP_VLEK_LOAD:
 		ret = sev_ioctl_do_snp_vlek_load(&input, writable);
 		break;
+	case SNP_PAUSE_ATTESTATION:
+		ret = sev_ioctl_do_snp_pause_attestation(&input, writable);
+		break;
+	case SNP_RESUME_ATTESTATION:
+		ret = sev_ioctl_do_snp_resume_attestation(&input, writable);
+		break;
 	default:
 		ret = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index 2289b7c76c59..7b35b2814a99 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -32,6 +32,8 @@ enum {
 	SNP_COMMIT,
 	SNP_SET_CONFIG,
 	SNP_VLEK_LOAD,
+	SNP_PAUSE_ATTESTATION,
+	SNP_RESUME_ATTESTATION,
 
 	SEV_MAX,
 };
@@ -241,6 +243,16 @@ struct sev_user_data_snp_wrapped_vlek_hashstick {
 	__u8 data[432];				/* In */
 } __packed;
 
+/**
+ * struct sev_user_data_snp_pause_attestation - metadata for pausing attestation
+ *
+ * @id: the ID of the transaction started/ended by a call to SNP_PAUSE_ATTESTATION
+ *	or SNP_RESUME_ATTESTATION, respectively.
+ */
+struct sev_user_data_snp_pause_attestation {
+	__u64 id;				/* Out */
+} __packed;
+
 /**
  * struct sev_issue_cmd - SEV ioctl parameters
  *
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 22/22] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (20 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands Michael Roth
@ 2024-04-21 18:01 ` Michael Roth
  2024-04-25  0:10   ` Sean Christopherson
  2024-04-23 16:21 ` [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
  2024-04-23 16:31 ` [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
  23 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-21 18:01 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

Version 2 of the GHCB specification added support for the SNP Extended
Guest Request Message NAE event. This event serves a nearly identical
purpose to the previously-added SNP_GUEST_REQUEST event, but allows
certificate data to be supplied via an additional guest-supplied buffer,
used mainly for verifying the signature of an attestation report as
returned by firmware.

This certificate data is supplied by userspace, so unlike with
SNP_GUEST_REQUEST events, SNP_EXTENDED_GUEST_REQUEST events are first
forwarded to userspace via a KVM_EXIT_VMGEXIT exit type, and then the
firmware request is made only afterward.

Implement handling for these events.

Since there is a potential for race conditions where the
userspace-supplied certificate data may be out-of-sync relative to the
reported TCB or VLEK that firmware will use when signing attestation
reports, make use of the synchronization mechanisms wired up to the
SNP_{PAUSE,RESUME}_ATTESTATION SEV device ioctls such that the guest
will be told to retry the request while attestation has been paused due
to an update being underway on the system.
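
A rough sketch of how a VMM might service this exit; note that the kvm_run
'vmgexit' union member is the uapi proposed by this series, and
fetch_certs()/gpa_to_hva() are hypothetical stand-ins for VMM-specific
plumbing:

  #include <stdint.h>
  #include <string.h>
  #include <linux/kvm.h>

  void *gpa_to_hva(uint64_t gpa);          /* hypothetical */
  void *fetch_certs(uint64_t *npages);     /* hypothetical */

  static void handle_ext_guest_req(struct kvm_run *run)
  {
	uint64_t npages;
	void *certs = fetch_certs(&npages);

	if (!certs) {
		/* certs temporarily unavailable: tell the guest to retry */
		run->vmgexit.ext_guest_req.ret = 2;
		return;
	}

	if (run->vmgexit.ext_guest_req.data_npages < npages) {
		/* guest buffer too small: report the required size */
		run->vmgexit.ext_guest_req.data_npages = npages;
		run->vmgexit.ext_guest_req.ret = 1;
		return;
	}

	memcpy(gpa_to_hva(run->vmgexit.ext_guest_req.data_gpa),
	       certs, npages * 4096);
	run->vmgexit.ext_guest_req.ret = 0;
  }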

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 Documentation/virt/kvm/api.rst | 26 +++++++++++
 arch/x86/include/asm/sev.h     |  6 +++
 arch/x86/kvm/svm/sev.c         | 82 ++++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.h         |  3 ++
 arch/x86/virt/svm/sev.c        | 37 +++++++++++++++
 include/uapi/linux/kvm.h       |  6 +++
 6 files changed, 160 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 85099198a10f..6cf186ed8f66 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7066,6 +7066,7 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 		struct kvm_user_vmgexit {
 		#define KVM_USER_VMGEXIT_PSC_MSR	1
 		#define KVM_USER_VMGEXIT_PSC		2
+		#define KVM_USER_VMGEXIT_EXT_GUEST_REQ	3
 			__u32 type; /* KVM_USER_VMGEXIT_* type */
 			union {
 				struct {
@@ -7079,6 +7080,11 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 					__u64 shared_gpa;
 					__u64 ret;
 				} psc;
+				struct {
+					__u64 data_gpa;
+					__u64 data_npages;
+					__u32 ret;
+				} ext_guest_req;
 			};
 		};
 
@@ -7108,6 +7114,26 @@ private/shared state. Userspace will return a value in 'ret' that is in
 agreement with the GHCB-defined return values that the guest will expect
 in the SW_EXITINFO2 field of the GHCB in response to these requests.
 
+For the KVM_USER_VMGEXIT_EXT_GUEST_REQ type, the ext_guest_req union type
+is used. The kernel will supply in 'data_gpa' the value the guest supplies
+via the RAX field of the GHCB when issuing extended guest requests.
+'data_npages' will similarly contain the value the guest supplies in RBX,
+denoting the number of shared pages available to write the certificate
+data into.
+
+  - If the supplied number of pages is sufficient, userspace should write
+    the certificate data blob (in the format defined by the GHCB spec) in
+    the address indicated by 'data_gpa' and set 'ret' to 0.
+
+  - If the number of pages supplied is not sufficient, userspace must write
+    the required number of pages in 'data_npages' and then set 'ret' to 1.
+
+  - If userspace is temporarily unable to handle the request, 'ret' should
+    be set to 2 to inform the guest to retry later.
+
+  - If some other error occurred, userspace should set 'ret' to a non-zero
+    value that is distinct from the specific return values mentioned above.
+
 6. Capabilities that can be enabled on vCPUs
 ============================================
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index ee24ef815e35..dfc28ac4dd0e 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -276,6 +276,9 @@ void snp_leak_pages(u64 pfn, unsigned int npages);
 void kdump_sev_callback(void);
 int snp_pause_attestation(u64 *transaction_id);
 void snp_resume_attestation(u64 *transaction_id);
+u64 snp_transaction_get_id(void);
+bool __snp_transaction_is_stale(u64 transaction_id);
+bool snp_transaction_is_stale(u64 transaction_id);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
@@ -291,6 +294,9 @@ static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
 static inline void kdump_sev_callback(void) { }
 static inline int snp_pause_attestation(u64 *transaction_id) { return 0; }
 static inline void snp_resume_attestation(u64 *transaction_id) {}
+static inline u64 snp_transaction_get_id(void) { return 0; }
+static inline bool __snp_transaction_is_stale(u64 transaction_id) { return false; }
+static inline bool snp_transaction_is_stale(u64 transaction_id) { return false; }
 #endif
 
 #endif
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 68db390b19d0..1cec466e593b 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3292,6 +3292,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_PSC:
 	case SVM_VMGEXIT_TERM_REQUEST:
 	case SVM_VMGEXIT_GUEST_REQUEST:
+	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -3812,6 +3813,84 @@ static void snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp
 	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
 }
 
+static int snp_complete_ext_guest_req(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb_control_area *control;
+	struct kvm *kvm = vcpu->kvm;
+	sev_ret_code fw_err = 0;
+	int vmm_ret;
+
+	vmm_ret = vcpu->run->vmgexit.ext_guest_req.ret;
+	if (vmm_ret) {
+		if (vmm_ret == SNP_GUEST_VMM_ERR_INVALID_LEN)
+			vcpu->arch.regs[VCPU_REGS_RBX] =
+				vcpu->run->vmgexit.ext_guest_req.data_npages;
+		goto abort_request;
+	}
+
+	control = &svm->vmcb->control;
+
+	/*
+	 * To avoid the message sequence number getting out of sync between the
+	 * actual value seen by firmware verses the value expected by the guest,
+	 * make sure attestations can't get paused on the write-side at this
+	 * point by holding the lock for the entire duration of the firmware
+	 * request so that there is no situation where SNP_GUEST_VMM_ERR_BUSY
+	 * would need to be returned after firmware sees the request.
+	 */
+	mutex_lock(&snp_pause_attestation_lock);
+
+	if (__snp_transaction_is_stale(svm->snp_transaction_id))
+		vmm_ret = SNP_GUEST_VMM_ERR_BUSY;
+	else if (!__snp_handle_guest_req(kvm, control->exit_info_1,
+					 control->exit_info_2, &fw_err))
+		vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
+
+	mutex_unlock(&snp_pause_attestation_lock);
+
+abort_request:
+	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
+
+	return 1; /* resume guest */
+}
+
+static int snp_begin_ext_guest_req(struct kvm_vcpu *vcpu)
+{
+	int vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
+	struct vcpu_svm *svm = to_svm(vcpu);
+	unsigned long data_npages;
+	sev_ret_code fw_err;
+	gpa_t data_gpa;
+
+	if (!sev_snp_guest(vcpu->kvm))
+		goto abort_request;
+
+	data_gpa = vcpu->arch.regs[VCPU_REGS_RAX];
+	data_npages = vcpu->arch.regs[VCPU_REGS_RBX];
+
+	if (!IS_ALIGNED(data_gpa, PAGE_SIZE))
+		goto abort_request;
+
+	svm->snp_transaction_id = snp_transaction_get_id();
+	if (snp_transaction_is_stale(svm->snp_transaction_id)) {
+		vmm_ret = SNP_GUEST_VMM_ERR_BUSY;
+		goto abort_request;
+	}
+
+	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
+	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_EXT_GUEST_REQ;
+	vcpu->run->vmgexit.ext_guest_req.data_gpa = data_gpa;
+	vcpu->run->vmgexit.ext_guest_req.data_npages = data_npages;
+	vcpu->arch.complete_userspace_io = snp_complete_ext_guest_req;
+
+	return 0; /* forward request to userspace */
+
+abort_request:
+	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
+	return 1; /* resume guest */
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -4076,6 +4155,9 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		snp_handle_guest_req(svm, control->exit_info_1, control->exit_info_2);
 		ret = 1;
 		break;
+	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
+		ret = snp_begin_ext_guest_req(vcpu);
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8a8ee475ad86..28140bc8af27 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -303,6 +303,9 @@ struct vcpu_svm {
 
 	/* Guest GIF value, used when vGIF is not enabled */
 	bool guest_gif;
+
+	/* Transaction ID associated with SNP config updates */
+	u64 snp_transaction_id;
 };
 
 struct svm_cpu_data {
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index b75f2e7d4012..f1f7486a3dcf 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -72,6 +72,7 @@ static unsigned long snp_nr_leaked_pages;
 
 /* For synchronizing TCB/certificate updates with extended guest requests */
 DEFINE_MUTEX(snp_pause_attestation_lock);
+EXPORT_SYMBOL_GPL(snp_pause_attestation_lock);
 static u64 snp_transaction_id;
 static bool snp_attestation_paused;
 
@@ -611,3 +612,39 @@ void snp_resume_attestation(u64 *transaction_id)
 	mutex_unlock(&snp_pause_attestation_lock);
 }
 EXPORT_SYMBOL_GPL(snp_resume_attestation);
+
+u64 snp_transaction_get_id(void)
+{
+	u64 id;
+
+	mutex_lock(&snp_pause_attestation_lock);
+	id = snp_transaction_id;
+	mutex_unlock(&snp_pause_attestation_lock);
+
+	return id;
+}
+EXPORT_SYMBOL_GPL(snp_transaction_get_id);
+
+/* Must be called with snp_pause_attestation_lock held */
+bool __snp_transaction_is_stale(u64 transaction_id)
+{
+	lockdep_assert_held(&snp_pause_attestation_lock);
+
+	return (snp_attestation_paused ||
+		transaction_id != snp_transaction_id);
+}
+EXPORT_SYMBOL_GPL(__snp_transaction_is_stale);
+
+bool snp_transaction_is_stale(u64 transaction_id)
+{
+	bool stale;
+
+	mutex_lock(&snp_pause_attestation_lock);
+
+	stale = __snp_transaction_is_stale(transaction_id);
+
+	mutex_unlock(&snp_pause_attestation_lock);
+
+	return stale;
+}
+EXPORT_SYMBOL_GPL(snp_transaction_is_stale);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index e33c48bfbd67..585de3a2591e 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -138,6 +138,7 @@ struct kvm_xen_exit {
 struct kvm_user_vmgexit {
 #define KVM_USER_VMGEXIT_PSC_MSR	1
 #define KVM_USER_VMGEXIT_PSC		2
+#define KVM_USER_VMGEXIT_EXT_GUEST_REQ	3
 	__u32 type; /* KVM_USER_VMGEXIT_* type */
 	union {
 		struct {
@@ -151,6 +152,11 @@ struct kvm_user_vmgexit {
 			__u64 shared_gpa;
 			__u64 ret;
 		} psc;
+		struct {
+			__u64 data_gpa;
+			__u64 data_npages;
+			__u32 ret;
+		} ext_guest_req;
 	};
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (21 preceding siblings ...)
  2024-04-21 18:01 ` [PATCH v14 22/22] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth
@ 2024-04-23 16:21 ` Michael Roth
  2024-04-23 16:21   ` [PATCH v14 24/22] [SQUASH] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
                     ` (5 more replies)
  2024-04-23 16:31 ` [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
  23 siblings, 6 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-23 16:21 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

Terminate if a non-SNP guest attempts to register a GHCB page; this
is an SNP-only GHCB request.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 1cec466e593b..088eca85a6ac 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3970,6 +3970,9 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 				  GHCB_MSR_INFO_MASK, GHCB_MSR_INFO_POS);
 		break;
 	case GHCB_MSR_PREF_GPA_REQ:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto out_terminate;
+
 		set_ghcb_msr_bits(svm, GHCB_MSR_PREF_GPA_NONE, GHCB_MSR_GPA_VALUE_MASK,
 				  GHCB_MSR_GPA_VALUE_POS);
 		set_ghcb_msr_bits(svm, GHCB_MSR_PREF_GPA_RESP, GHCB_MSR_INFO_MASK,
@@ -3978,6 +3981,9 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 	case GHCB_MSR_REG_GPA_REQ: {
 		u64 gfn;
 
+		if (!sev_snp_guest(vcpu->kvm))
+			goto out_terminate;
+
 		gfn = get_ghcb_msr_bits(svm, GHCB_MSR_GPA_VALUE_MASK,
 					GHCB_MSR_GPA_VALUE_POS);
 
@@ -4004,12 +4010,7 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 		pr_info("SEV-ES guest requested termination: %#llx:%#llx\n",
 			reason_set, reason_code);
 
-		vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
-		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
-		vcpu->run->system_event.ndata = 1;
-		vcpu->run->system_event.data[0] = control->ghcb_gpa;
-
-		return 0;
+		goto out_terminate;
 	}
 	default:
 		/* Error, keep GHCB MSR value as-is */
@@ -4020,6 +4021,14 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 					    control->ghcb_gpa, ret);
 
 	return ret;
+
+out_terminate:
+	vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+	vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
+	vcpu->run->system_event.ndata = 1;
+	vcpu->run->system_event.data[0] = control->ghcb_gpa;
+
+	return 0;
 }
 
 int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 24/22] [SQUASH] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-23 16:21 ` [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
@ 2024-04-23 16:21   ` Michael Roth
  2024-04-23 16:21   ` [PATCH v14 25/22] [SQUASH] KVM: SEV: Add support to handle " Michael Roth
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-23 16:21 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

Terminate if a non-SNP guest attempts to issue a Page State Change
GHCB request; this is only allowed for SNP.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 088eca85a6ac..0d8fbd5e25fe 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3996,6 +3996,9 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 		break;
 	}
 	case GHCB_MSR_PSC_REQ:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto out_terminate;
+
 		ret = snp_begin_psc_msr(vcpu, control->ghcb_gpa);
 		break;
 	case GHCB_MSR_TERM_REQ: {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 25/22] [SQUASH] KVM: SEV: Add support to handle Page State Change VMGEXIT
  2024-04-23 16:21 ` [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
  2024-04-23 16:21   ` [PATCH v14 24/22] [SQUASH] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
@ 2024-04-23 16:21   ` Michael Roth
  2024-04-23 16:21   ` [PATCH v14 26/22] [SQUASH] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-23 16:21 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

Report an error to the guest if a non-SNP guest attempts to issue a
GHCB-based Page State Change.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 0d8fbd5e25fe..c00081248ffe 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3289,7 +3289,11 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_AP_JUMP_TABLE:
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 	case SVM_VMGEXIT_HV_FEATURES:
+		break;
 	case SVM_VMGEXIT_PSC:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto vmgexit_err;
+		break;
 	case SVM_VMGEXIT_TERM_REQUEST:
 	case SVM_VMGEXIT_GUEST_REQUEST:
 	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 26/22] [SQUASH] KVM: SEV: Add support for GHCB-based termination requests
  2024-04-23 16:21 ` [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
  2024-04-23 16:21   ` [PATCH v14 24/22] [SQUASH] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
  2024-04-23 16:21   ` [PATCH v14 25/22] [SQUASH] KVM: SEV: Add support to handle " Michael Roth
@ 2024-04-23 16:21   ` Michael Roth
  2024-04-23 16:21   ` [PATCH v14 27/22] [SQUASH] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-23 16:21 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

Move the case statement out of the SNP-specific block so it can be
handled the same way other SEV-ES requests are handled.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c00081248ffe..0e22f588dbe4 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3289,12 +3289,12 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_AP_JUMP_TABLE:
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 	case SVM_VMGEXIT_HV_FEATURES:
+	case SVM_VMGEXIT_TERM_REQUEST:
 		break;
 	case SVM_VMGEXIT_PSC:
 		if (!sev_snp_guest(vcpu->kvm))
 			goto vmgexit_err;
 		break;
-	case SVM_VMGEXIT_TERM_REQUEST:
 	case SVM_VMGEXIT_GUEST_REQUEST:
 	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
 		break;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 27/22] [SQUASH] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
  2024-04-23 16:21 ` [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
                     ` (2 preceding siblings ...)
  2024-04-23 16:21   ` [PATCH v14 26/22] [SQUASH] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
@ 2024-04-23 16:21   ` Michael Roth
  2024-04-23 16:21   ` [PATCH v14 28/22] [SQUASH] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST " Michael Roth
  2024-04-23 16:21   ` [PATCH v14 29/22] [SQUASH] KVM: SEV: Support SEV-SNP AP Creation " Michael Roth
  5 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-23 16:21 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

As with SVM_VMGEXIT_PSC, ensure that SVM_VMGEXIT_GUEST_REQUEST can only
be issued by an SNP guest.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 0e22f588dbe4..2b30b3b0eec8 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3292,10 +3292,10 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_TERM_REQUEST:
 		break;
 	case SVM_VMGEXIT_PSC:
+	case SVM_VMGEXIT_GUEST_REQUEST:
 		if (!sev_snp_guest(vcpu->kvm))
 			goto vmgexit_err;
 		break;
-	case SVM_VMGEXIT_GUEST_REQUEST:
 	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
 		break;
 	default:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 28/22] [SQUASH] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  2024-04-23 16:21 ` [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
                     ` (3 preceding siblings ...)
  2024-04-23 16:21   ` [PATCH v14 27/22] [SQUASH] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
@ 2024-04-23 16:21   ` Michael Roth
  2024-04-23 21:36     ` Jarkko Sakkinen
  2024-04-23 16:21   ` [PATCH v14 29/22] [SQUASH] KVM: SEV: Support SEV-SNP AP Creation " Michael Roth
  5 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-23 16:21 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

Ensure an error is returned if a non-SNP guest attempts to issue an
Extended Guest Request. Also add input validation for RAX/RBX.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2b30b3b0eec8..ff64ed8df301 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3297,6 +3297,11 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 			goto vmgexit_err;
 		break;
 	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto vmgexit_err;
+		if (!kvm_ghcb_rax_is_valid(svm) ||
+		    !kvm_ghcb_rbx_is_valid(svm))
+			goto vmgexit_err;
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v14 29/22] [SQUASH] KVM: SEV: Support SEV-SNP AP Creation NAE event
  2024-04-23 16:21 ` [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
                     ` (4 preceding siblings ...)
  2024-04-23 16:21   ` [PATCH v14 28/22] [SQUASH] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST " Michael Roth
@ 2024-04-23 16:21   ` Michael Roth
  5 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-23 16:21 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

Return an error if a non-SNP guest issues an AP Creation request.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index ff64ed8df301..1137a7f4136b 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3280,6 +3280,8 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 			goto vmgexit_err;
 		break;
 	case SVM_VMGEXIT_AP_CREATION:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto vmgexit_err;
 		if (lower_32_bits(control->exit_info_1) != SVM_VMGEXIT_AP_DESTROY)
 			if (!kvm_ghcb_rax_is_valid(svm))
 				goto vmgexit_err;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
  2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (22 preceding siblings ...)
  2024-04-23 16:21 ` [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
@ 2024-04-23 16:31 ` Michael Roth
  2024-04-24 16:51   ` Paolo Bonzini
  23 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-23 16:31 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

Quoting Michael Roth (2024-04-21 13:01:00)
> This patchset is also available at:
> 
>   https://github.com/amdese/linux/commits/snp-host-v14
> 
> and is based on commit 20cc50a0410f (just before the v13 SNP patches) from:
> 
>   https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=kvm-coco-queue
> 
> 
> Patch Layout
> ------------
> 
> 01-04: These patches add some basic infrastructure and introduces a new
>        KVM_X86_SNP_VM vm_type to handle differences verses the existing
>        KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types.
> 
> 05-07: These implement the KVM API to handle the creation of a
>        cryptographic launch context, encrypt/measure the initial image
>        into guest memory, and finalize it before launching it.
> 
> 08-13: These implement handling for various guest-generated events such
>        as page state changes, onlining of additional vCPUs, etc.
> 
> 14-17: These implement the gmem hooks needed to prepare gmem-allocated
>        pages before mapping them into guest private memory ranges as
>        well as cleaning them up prior to returning them to the host for
>        use as normal memory. Because this supplants certain activities
>        like issued WBINVDs during KVM MMU invalidations, there's also
>        a patch to avoid duplicating that work to avoid unecessary
>        overhead.
> 
> 18:    With all the core support in place, the patch adds a kvm_amd module
>        parameter to enable SNP support.
> 
> 19-22: These patches all deal with the servicing of guest requests to handle
>        things like attestation, as well as some related host-management
>        interfaces.

I just sent an additional set of fixups, patches 23-29. These add some
additional input validation on GHCB requests, mainly ensuring that
SNP-specific requests from non-SNP guests result in an error as soon as
they are received rather than reaching an error state indirectly further
into the call stack.

It's a small diff (included below), but a bit of a pain to squash in
patch by patch due to close proximity with each other, so I've pushed an
updated branch here that already has them squashed in:

  https://github.com/amdese/linux/commits/snp-host-v14b

If preferred I can also resubmit as v15, just let me know.

Full diff is below.

Thanks!

-Mike


diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 1cec466e593b..1137a7f4136b 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3280,6 +3280,8 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 			goto vmgexit_err;
 		break;
 	case SVM_VMGEXIT_AP_CREATION:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto vmgexit_err;
 		if (lower_32_bits(control->exit_info_1) != SVM_VMGEXIT_AP_DESTROY)
 			if (!kvm_ghcb_rax_is_valid(svm))
 				goto vmgexit_err;
@@ -3289,10 +3291,19 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_AP_JUMP_TABLE:
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 	case SVM_VMGEXIT_HV_FEATURES:
-	case SVM_VMGEXIT_PSC:
 	case SVM_VMGEXIT_TERM_REQUEST:
+		break;
+	case SVM_VMGEXIT_PSC:
 	case SVM_VMGEXIT_GUEST_REQUEST:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto vmgexit_err;
+		break;
 	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto vmgexit_err;
+		if (!kvm_ghcb_rax_is_valid(svm) ||
+		    !kvm_ghcb_rbx_is_valid(svm))
+			goto vmgexit_err;
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -3970,6 +3981,9 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 				  GHCB_MSR_INFO_MASK, GHCB_MSR_INFO_POS);
 		break;
 	case GHCB_MSR_PREF_GPA_REQ:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto out_terminate;
+
 		set_ghcb_msr_bits(svm, GHCB_MSR_PREF_GPA_NONE, GHCB_MSR_GPA_VALUE_MASK,
 				  GHCB_MSR_GPA_VALUE_POS);
 		set_ghcb_msr_bits(svm, GHCB_MSR_PREF_GPA_RESP, GHCB_MSR_INFO_MASK,
@@ -3978,6 +3992,9 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 	case GHCB_MSR_REG_GPA_REQ: {
 		u64 gfn;
 
+		if (!sev_snp_guest(vcpu->kvm))
+			goto out_terminate;
+
 		gfn = get_ghcb_msr_bits(svm, GHCB_MSR_GPA_VALUE_MASK,
 					GHCB_MSR_GPA_VALUE_POS);
 
@@ -3990,6 +4007,9 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 		break;
 	}
 	case GHCB_MSR_PSC_REQ:
+		if (!sev_snp_guest(vcpu->kvm))
+			goto out_terminate;
+
 		ret = snp_begin_psc_msr(vcpu, control->ghcb_gpa);
 		break;
 	case GHCB_MSR_TERM_REQ: {
@@ -4004,12 +4024,7 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 		pr_info("SEV-ES guest requested termination: %#llx:%#llx\n",
 			reason_set, reason_code);
 
-		vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
-		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
-		vcpu->run->system_event.ndata = 1;
-		vcpu->run->system_event.data[0] = control->ghcb_gpa;
-
-		return 0;
+		goto out_terminate;
 	}
 	default:
 		/* Error, keep GHCB MSR value as-is */
@@ -4020,6 +4035,14 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 					    control->ghcb_gpa, ret);
 
 	return ret;
+
+out_terminate:
+	vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+	vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
+	vcpu->run->system_event.ndata = 1;
+	vcpu->run->system_event.data[0] = control->ghcb_gpa;
+
+	return 0;
 }
 
 int sev_handle_vmgexit(struct kvm_vcpu *vcpu)

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 28/22] [SQUASH] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  2024-04-23 16:21   ` [PATCH v14 28/22] [SQUASH] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST " Michael Roth
@ 2024-04-23 21:36     ` Jarkko Sakkinen
  0 siblings, 0 replies; 56+ messages in thread
From: Jarkko Sakkinen @ 2024-04-23 21:36 UTC (permalink / raw)
  To: Michael Roth, kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Tue Apr 23, 2024 at 7:21 PM EEST, Michael Roth wrote:
> Ensure an error is returned if a non-SNP guest attempts to issue an
> Extended Guest Request. Also add input validation for RAX/RBX.
>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>  arch/x86/kvm/svm/sev.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 2b30b3b0eec8..ff64ed8df301 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3297,6 +3297,11 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
>  			goto vmgexit_err;
>  		break;
>  	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
> +		if (!sev_snp_guest(vcpu->kvm))
> +			goto vmgexit_err;
> +		if (!kvm_ghcb_rax_is_valid(svm) ||
> +		    !kvm_ghcb_rbx_is_valid(svm))
> +			goto vmgexit_err;

Hmm... maybe I'm ignoring something but why this is not just:

	if (!sev_snp_guest(vcpu->kvm) ||
	    !kvm_ghcb_rax_is_valid(svm) ||
	    !kvm_ghcb_rbx_is_valid(svm))

since they branch to the same location.

BR, Jarkko

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
  2024-04-23 16:31 ` [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
@ 2024-04-24 16:51   ` Paolo Bonzini
  0 siblings, 0 replies; 56+ messages in thread
From: Paolo Bonzini @ 2024-04-24 16:51 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Tue, Apr 23, 2024 at 6:32 PM Michael Roth <michael.roth@amd.com> wrote:
> I just sent an additional set of fixups, patches 23-29. These add some
> additional input validation on GHCB requests, mainly ensuring that
> SNP-specific requests from non-SNP guests result in an error as soon as
> they are received rather than reaching an error state indirectly further
> into the call stack.
>
> It's a small diff (included below), but a bit of a pain to squash in
> patch by patch due to close proximity with each other, so I've pushed an
> updated branch here that already has them squashed in:
>
>   https://github.com/amdese/linux/commits/snp-host-v14b

Thanks, I pushed that to kvm-coco-queue. There was a missing signoff -
I just added it since you actually added it in the past[1] and the
patch only differs in context.

Now off to getting those mm acks.

Paolo

[1] https://patchew.org/linux/20231230172351.574091-1-michael.roth@amd.com/20231230172351.574091-33-michael.roth@amd.com/


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 03/22] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
  2024-04-21 18:01 ` [PATCH v14 03/22] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests Michael Roth
@ 2024-04-24 20:21   ` Sean Christopherson
  2024-04-25 20:52     ` Michael Roth
  0 siblings, 1 reply; 56+ messages in thread
From: Sean Christopherson @ 2024-04-24 20:21 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Sun, Apr 21, 2024, Michael Roth wrote:
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 6e31cb408dd8..1d2264e93afe 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -33,9 +33,11 @@
>  #include "cpuid.h"
>  #include "trace.h"
>  
> -#define GHCB_VERSION_MAX	1ULL
> +#define GHCB_VERSION_MAX	2ULL
>  #define GHCB_VERSION_MIN	1ULL

This needs a userspace control.  Being unable to limit the GHCB version advertised
to the guest is going to break live migration of SEV-ES VMs, e.g. if a pool of
hosts has some kernels running this flavor of KVM, and some hosts running an
older KVM that doesn't support v2.
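
One possible shape (purely a sketch; field name, placement, and padding are
illustrative) would be to let userspace cap the advertised version via the
SEV init params:

	/* Sketch: userspace-provided cap on the GHCB version KVM advertises. */
	struct kvm_sev_init {
		__u64 vmsa_features;
		__u32 flags;
		__u16 ghcb_version;	/* 0 == use KVM's supported max */
		__u16 pad1;
		__u32 pad2[8];
	};

KVM would then clamp GHCB_VERSION_MAX to the userspace-supplied value when
negotiating the protocol version with the guest, so a migration pool can
advertise the lowest common denominator across hosts.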

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-21 18:01 ` [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
@ 2024-04-24 20:59   ` Sean Christopherson
  2024-04-25 22:00     ` Michael Roth
  0 siblings, 1 reply; 56+ messages in thread
From: Sean Christopherson @ 2024-04-24 20:59 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Sun, Apr 21, 2024, Michael Roth wrote:
> +static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
> +{
> +	u64 gpa = gfn_to_gpa(GHCB_MSR_PSC_REQ_TO_GFN(ghcb_msr));
> +	u8 op = GHCB_MSR_PSC_REQ_TO_OP(ghcb_msr);
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +
> +	if (op != SNP_PAGE_STATE_PRIVATE && op != SNP_PAGE_STATE_SHARED) {
> +		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
> +		return 1; /* resume guest */
> +	}
> +
> +	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
> +	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC_MSR;
> +	vcpu->run->vmgexit.psc_msr.gpa = gpa;
> +	vcpu->run->vmgexit.psc_msr.op = op;

Argh, no.

This is the same crud that TDX tried to push[*].  Use KVM's existing user exits,
and extend as *needed*.  There is no good reason page state change requests need
*two* exit reasons.  The *only* thing KVM supports right now is private<=>shared
conversions, and that can be handled with either KVM_HC_MAP_GPA_RANGE or
KVM_EXIT_MEMORY_FAULT.

The non-MSR flavor can batch requests, but I'm willing to bet that the overwhelming
majority of requests are contiguous, i.e. can be combined into a range by KVM,
and that handling any outliers by performing multiple exits to userspace will
provide sufficient performance.
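
E.g. a rough sketch of the coalescing (illustrative only; types are
simplified relative to the GHCB-defined struct psc_entry):

	struct psc_range { u64 gfn; u64 npages; u8 op; };

	/* Fold a run of contiguous, same-op entries into a single range. */
	static int psc_coalesce(const struct psc_entry *e, int nr,
				struct psc_range *range)
	{
		int i;

		range->gfn = e[0].gfn;
		range->npages = 1;
		range->op = e[0].operation;

		for (i = 1; i < nr; i++) {
			if (e[i].operation != range->op ||
			    e[i].gfn != range->gfn + range->npages)
				break;
			range->npages++;
		}

		return i;	/* entries consumed; one userspace exit per range */
	}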

And the non-MSR version that comes in a later patch is a complete mess.  It kicks
the PSC out to userspace without *any* validation.  As I complained in the TDX
thread, that will create an unmaintable ABI for KVM.

KVM needs to have its own, well-defined ABI.  Splitting functionality between
KVM and userspace at seemingly random points is not maintainable.

E.g. if/when KVM supports UNSMASH, upgrading to the newer KVM would arguably break
userspace as PSC requests that previously exited would suddenly be handled by
KVM.  Maybe.  It's impossible to review this because there's no KVM ABI, KVM is
little more than a dumb pipe parroting information to userspace.

I truly do not understand why we would even consider allowing this.  We push back
on people wanting new hypercalls for some specific use case, because we already
have generic ways to achieve things, but then CoCo comes along and we apparently
throw out any thought of maintainability.  I don't get it.

[*] https://lore.kernel.org/all/Zg18ul8Q4PGQMWam@google.com

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 05/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
  2024-04-21 18:01 ` [PATCH v14 05/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
@ 2024-04-24 23:26   ` Sean Christopherson
  0 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2024-04-24 23:26 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Sun, Apr 21, 2024, Michael Roth wrote:
> From: Brijesh Singh <brijesh.singh@amd.com>
> 
> KVM_SEV_SNP_LAUNCH_START begins the launch process for an SEV-SNP guest.
> The command initializes a cryptographic digest context used to construct
> the measurement of the guest. Other commands can then at that point be
> used to load/encrypt data into the guest's initial launch image.
> 
> For more information see the SEV-SNP specification.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Co-developed-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---

I somehow ended up in the v13 series and gave feedback on that version, but AFAICT
all of the feedback still applies to v14.

https://lore.kernel.org/all/Zil8MnPXkCbqw3Ka@google.com

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 06/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
  2024-04-21 18:01 ` [PATCH v14 06/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
@ 2024-04-24 23:58   ` Sean Christopherson
  0 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2024-04-24 23:58 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Sun, Apr 21, 2024, Michael Roth wrote:
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 9d08d1202544..d3ae4ded91df 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -258,6 +258,35 @@ static void sev_decommission(unsigned int handle)
>  	sev_guest_decommission(&decommission, NULL);
>  }
>  
> +static int snp_page_reclaim(u64 pfn)
> +{
> +	struct sev_data_snp_page_reclaim data = {0};
> +	int err, rc;
> +
> +	data.paddr = __sme_set(pfn << PAGE_SHIFT);
> +	rc = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
> +	if (WARN_ON_ONCE(rc)) {

This now WARNs here *and* in the caller, which is overkill.  Pick one.

> +		/*
> +		 * This shouldn't happen under normal circumstances, but if the
> +		 * reclaim failed, then the page is no longer safe to use.

Explain _why_ it's unsafe.  The code makes it quite clear that reclaim shouldn't
fail.  E.g. I assume it's bound to the ASID still?  Something else?

This is all messy, too.  snp_launch_update_vmsa() does RECLAIM but doesn't
convert the page back to shared, which seems wrong.  sev_gmem_post_populate()
open codes the calls separately, and then snp_cleanup_guest_buf() bundles them
together.

Yeesh, and the return types are all mixed.  snp_cleanup_guest_buf() returns a
bool on success, but the helpers it uses return 0/-errno.  Please convert
everything to return 0/-errno, boolean "error codes" are hard to follow and make
the code unnecessarily brittle.

> +	if (WARN_ON_ONCE(sev_populate_args->type != KVM_SEV_SNP_PAGE_TYPE_ZERO && !src)) {
> +		ret = -EINVAL;

Just return -EINVAL, no?

> +		goto out;
> +	}
> +
> +	for (gfn = gfn_start, i = 0; gfn < gfn_start + npages; gfn++, i++) {
> +		struct sev_data_snp_launch_update fw_args = {0};
> +		bool assigned;
> +		void *vaddr;
> +		int level;
> +
> +		if (!kvm_mem_is_private(kvm, gfn)) {
> +			pr_debug("%s: Failed to ensure GFN 0x%llx has private memory attribute set\n",
> +				 __func__, gfn);
> +			ret = -EINVAL;
> +			break;
> +		}
> +
> +		ret = snp_lookup_rmpentry((u64)pfn + i, &assigned, &level);
> +		if (ret || assigned) {
> +			pr_debug("%s: Failed to ensure GFN 0x%llx RMP entry is initial shared state, ret: %d assigned: %d\n",
> +				 __func__, gfn, ret, assigned);
> +			ret = -EINVAL;
> +			break;
> +		}
> +
> +		if (src) {
> +			vaddr = kmap_local_pfn(pfn + i);
> +			ret = copy_from_user(vaddr, src + i * PAGE_SIZE, PAGE_SIZE);
> +			if (ret) {
> +				pr_debug("Failed to copy source page into GFN 0x%llx\n", gfn);
> +				goto out_unmap;
> +			}
> +		}
> +
> +		ret = rmp_make_private(pfn + i, gfn << PAGE_SHIFT, PG_LEVEL_4K,
> +				       sev_get_asid(kvm), true);
> +		if (ret) {
> +			pr_debug("%s: Failed to mark RMP entry for GFN 0x%llx as private, ret: %d\n",
> +				 __func__, gfn, ret);
> +			goto out_unmap;
> +		}
> +
> +		n_private++;
> +
> +		fw_args.gctx_paddr = __psp_pa(sev->snp_context);
> +		fw_args.address = __sme_set(pfn_to_hpa(pfn + i));
> +		fw_args.page_size = PG_LEVEL_TO_RMP(PG_LEVEL_4K);
> +		fw_args.page_type = sev_populate_args->type;
> +		ret = __sev_issue_cmd(sev_populate_args->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE,
> +				      &fw_args, &sev_populate_args->fw_error);
> +		if (ret) {
> +			pr_debug("%s: SEV-SNP launch update failed, ret: 0x%x, fw_error: 0x%x\n",
> +				 __func__, ret, sev_populate_args->fw_error);
> +
> +			if (WARN_ON_ONCE(snp_page_reclaim(pfn + i)))
> +				goto out_unmap;
> +
> +			/*
> +			 * When invalid CPUID function entries are detected,
> +			 * firmware writes the expected values into the page and
> +			 * leaves it unencrypted so it can be used for debugging
> +			 * and error-reporting.
> +			 *
> +			 * Copy this page back into the source buffer so
> +			 * userspace can use this information to provide
> +			 * information on which CPUID leaves/fields failed CPUID
> +			 * validation.
> +			 */
> +			if (sev_populate_args->type == KVM_SEV_SNP_PAGE_TYPE_CPUID &&
> +			    sev_populate_args->fw_error == SEV_RET_INVALID_PARAM) {
> +				if (WARN_ON_ONCE(host_rmp_make_shared(pfn + i, PG_LEVEL_4K)))
> +					goto out_unmap;
> +
> +				if (copy_to_user(src + i * PAGE_SIZE, vaddr,
> +						 PAGE_SIZE))
> +					pr_debug("Failed to write CPUID page back to userspace\n");
> +
> +				/* PFN is hypervisor-owned at this point, skip cleanup for it. */
> +				n_private--;

If reclaim or make_shared fails, KVM will attempt make_shared again.  And I am
not a fan of keeping vaddr mapped after making the pfn private.  Functionally
it probably changes nothing, but conceptually it's odd.  And "mapping" is free
(AFAIK).  Oh, and vaddr is subtly guaranteed to be valid based on the type, which
is fun.

So why not unmap immediately after the first copy_from_user(), and then this
can be:


		if (ret)
			goto err;

	}

	return 0;

err:
	if (!<reclaim and make shared>() &&
	    sev_populate_args->type == KVM_SEV_SNP_PAGE_TYPE_CPUID &&
	    sev_populate_args->fw_error == SEV_RET_INVALID_PARAM) {
		vaddr = kmap_local(...);
		copy_to_user(...);
		kunmap_local(vaddr);
	}
	for (i = 0; i < n_private - 1; i++)
		<reclaim? and make shared()>

	return ret;
	
> +			}
> +		}
> +
> +out_unmap:
> +		kunmap_local(vaddr);
> +		if (ret)
> +			break;
> +	}
> +
> +out:
> +	if (ret) {
> +		pr_debug("%s: exiting with error ret %d, restoring %d gmem PFNs to shared.\n",
> +			 __func__, ret, n_private);
> +		for (i = 0; i < n_private; i++)
> +			WARN_ON_ONCE(host_rmp_make_shared(pfn + i, PG_LEVEL_4K));
> +	}
> +
> +	return ret;
> +}

...

> +	src = params.type == KVM_SEV_SNP_PAGE_TYPE_ZERO ? NULL : u64_to_user_ptr(params.uaddr);
> +
> +	count = kvm_gmem_populate(kvm, params.gfn_start, src, npages,
> +				  sev_gmem_post_populate, &sev_populate_args);
> +	if (count < 0) {
> +		argp->error = sev_populate_args.fw_error;
> +		pr_debug("%s: kvm_gmem_populate failed, ret %ld (fw_error %d)\n",
> +			 __func__, count, argp->error);
> +		ret = -EIO;
> +	} else if (count <= npages) {

This seems like excessive paranoia.

> +		params.gfn_start += count;
> +		params.len -= count * PAGE_SIZE;
> +		if (params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO)
> +			params.uaddr += count * PAGE_SIZE;
> +
> +		ret = copy_to_user(u64_to_user_ptr(argp->data), &params, sizeof(params))
> +		      ? -EIO : 0;

copy_to_user() failures typically return -EFAULT.  The style is not the most
readable.  Maybe this?

		ret = 0;
		if (copy_to_user(...))
			ret = -EFAULT;

> +	} else {
> +		WARN_ONCE(1, "Completed page count %ld exceeds requested amount %ld",
> +			  count, npages);
> +		ret = -EINVAL;
> +	}
> +
> +out:
> +	mutex_unlock(&kvm->slots_lock);
> +
> +	return ret;
> +}
> +
>  int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
>  {
>  	struct kvm_sev_cmd sev_cmd;
> @@ -2217,6 +2451,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
>  	case KVM_SEV_SNP_LAUNCH_START:
>  		r = snp_launch_start(kvm, &sev_cmd);
>  		break;
> +	case KVM_SEV_SNP_LAUNCH_UPDATE:
> +		r = snp_launch_update(kvm, &sev_cmd);
> +		break;
>  	default:
>  		r = -EINVAL;
>  		goto out;
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 22/22] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  2024-04-21 18:01 ` [PATCH v14 22/22] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth
@ 2024-04-25  0:10   ` Sean Christopherson
  2024-04-26 17:57     ` Michael Roth
  0 siblings, 1 reply; 56+ messages in thread
From: Sean Christopherson @ 2024-04-25  0:10 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Sun, Apr 21, 2024, Michael Roth wrote:
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 85099198a10f..6cf186ed8f66 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7066,6 +7066,7 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
>  		struct kvm_user_vmgexit {
>  		#define KVM_USER_VMGEXIT_PSC_MSR	1
>  		#define KVM_USER_VMGEXIT_PSC		2
> +		#define KVM_USER_VMGEXIT_EXT_GUEST_REQ	3

Assuming we can't massage this into a vendor-agnostic exit, there's gotta be
a better name than EXT_GUEST_REQ, which is completely meaningless to me and probably
most other people who aren't intimately familiar with the specs.  Request what?

>  			__u32 type; /* KVM_USER_VMGEXIT_* type */
>  			union {
>  				struct {
> @@ -3812,6 +3813,84 @@ static void snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp
>  	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
>  }
>  
> +static int snp_complete_ext_guest_req(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +	struct vmcb_control_area *control;
> +	struct kvm *kvm = vcpu->kvm;
> +	sev_ret_code fw_err = 0;
> +	int vmm_ret;
> +
> +	vmm_ret = vcpu->run->vmgexit.ext_guest_req.ret;
> +	if (vmm_ret) {
> +		if (vmm_ret == SNP_GUEST_VMM_ERR_INVALID_LEN)
> +			vcpu->arch.regs[VCPU_REGS_RBX] =
> +				vcpu->run->vmgexit.ext_guest_req.data_npages;
> +		goto abort_request;
> +	}
> +
> +	control = &svm->vmcb->control;
> +
> +	/*
> +	 * To avoid the message sequence number getting out of sync between the
> +	 * actual value seen by firmware verses the value expected by the guest,
> +	 * make sure attestations can't get paused on the write-side at this
> +	 * point by holding the lock for the entire duration of the firmware
> +	 * request so that there is no situation where SNP_GUEST_VMM_ERR_BUSY
> +	 * would need to be returned after firmware sees the request.
> +	 */
> +	mutex_lock(&snp_pause_attestation_lock);

Why is this in KVM?  IIUC, KVM only needs to get involved to translate GFNs to
PFNs; the rest can go in the sev-dev driver, no?  The whole split is weird,
seemingly due to KVM "needing" to take this lock.  E.g. why is core kernel code
in arch/x86/virt/svm/sev.c at all dealing with attestation goo, when pretty much
all of the actual usage is (or can be) in sev-dev.c?

> +
> +	if (__snp_transaction_is_stale(svm->snp_transaction_id))
> +		vmm_ret = SNP_GUEST_VMM_ERR_BUSY;
> +	else if (!__snp_handle_guest_req(kvm, control->exit_info_1,
> +					 control->exit_info_2, &fw_err))
> +		vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
> +
> +	mutex_unlock(&snp_pause_attestation_lock);
> +
> +abort_request:
> +	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
> +
> +	return 1; /* resume guest */
> +}
> +
> +static int snp_begin_ext_guest_req(struct kvm_vcpu *vcpu)
> +{
> +	int vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +	unsigned long data_npages;
> +	sev_ret_code fw_err;
> +	gpa_t data_gpa;
> +
> +	if (!sev_snp_guest(vcpu->kvm))
> +		goto abort_request;
> +
> +	data_gpa = vcpu->arch.regs[VCPU_REGS_RAX];
> +	data_npages = vcpu->arch.regs[VCPU_REGS_RBX];
> +
> +	if (!IS_ALIGNED(data_gpa, PAGE_SIZE))
> +		goto abort_request;
> +
> +	svm->snp_transaction_id = snp_transaction_get_id();
> +	if (snp_transaction_is_stale(svm->snp_transaction_id)) {

Why bother?  I assume this is rare, so why not handle it on the backend, i.e.
after userspace does its thing?  Then KVM doesn't even have to care about
checking for stale IDs, I think?

None of this makes much sense to my eyes, e.g. AFAICT, the only thing that can
pause attestation is userspace, yet the kernel is responsible for tracking whether
or not a transaction is fresh?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
  2024-04-21 18:01 ` [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands Michael Roth
@ 2024-04-25  0:15   ` Sean Christopherson
  2024-04-26 17:35     ` Michael Roth
  0 siblings, 1 reply; 56+ messages in thread
From: Sean Christopherson @ 2024-04-25  0:15 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Sun, Apr 21, 2024, Michael Roth wrote:
> These commands can be used to pause servicing of guest attestation
> requests. This is useful when updating the reported TCB or signing key with
> commands such as SNP_SET_CONFIG/SNP_COMMIT/SNP_VLEK_LOAD, since they may
> in turn require updates to userspace-supplied certificates, and if an
> attestation request happens to be in-flight at the time those updates
> are occurring there is potential for a guest to receive a certificate
> blob that is out of sync with the effective signing key for the
> attestation report.
> 
> These interfaces also provide some versatility with how similar
> firmware/certificate update activities can be handled in the future.

Wait, IIUC, this is using the kernel to get two userspace components to not
stomp over each other.   Why is this the kernel's problem to solve?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 17/22] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP
  2024-04-21 18:01 ` [PATCH v14 17/22] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
@ 2024-04-25  0:17   ` Sean Christopherson
  0 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2024-04-25  0:17 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Sun, Apr 21, 2024, Michael Roth wrote:
> From: Ashish Kalra <ashish.kalra@amd.com>
> 
> With SNP/guest_memfd, private/encrypted memory should not be mappable,
> and MMU notifications for HVA-mapped memory will only be relevant to
> unencrypted guest memory. Therefore, the rationale behind issuing a
> wbinvd_on_all_cpus() in sev_guest_memory_reclaimed() should not apply
> for SNP guests and can be ignored.
> 
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> [mdr: Add some clarifications in commit]
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>  arch/x86/kvm/svm/sev.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 243369e302f4..cf00a811aca5 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3042,7 +3042,14 @@ static void sev_flush_encrypted_page(struct kvm_vcpu *vcpu, void *va)
>  
>  void sev_guest_memory_reclaimed(struct kvm *kvm)
>  {
> -	if (!sev_guest(kvm))
> +	/*
> +	 * With SNP+gmem, private/encrypted memory should be

Not should, *is*.

> +	 * unreachable via the hva-based mmu notifiers. Additionally,
> +	 * for shared->private translations, H/W coherency will ensure
> +	 * first guest access to the page would clear out any existing
> +	 * dirty copies of that cacheline.

I don't see how this second part is relevant.

> +	 */
> +	if (!sev_guest(kvm) || sev_snp_guest(kvm))
>  		return;
>  
>  	wbinvd_on_all_cpus();
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 16/22] KVM: x86: Implement gmem hook for determining max NPT mapping level
  2024-04-21 18:01 ` [PATCH v14 16/22] KVM: x86: Implement gmem hook for determining max NPT mapping level Michael Roth
@ 2024-04-25  0:45   ` Sean Christopherson
  0 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2024-04-25  0:45 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Sun, Apr 21, 2024, Michael Roth wrote:
> ---
>  arch/x86/kvm/svm/sev.c | 32 ++++++++++++++++++++++++++++++++
>  arch/x86/kvm/svm/svm.c |  1 +
>  arch/x86/kvm/svm/svm.h |  7 +++++++
>  3 files changed, 40 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index ff9b8c68ae56..243369e302f4 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4528,3 +4528,35 @@ void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
>  		cond_resched();
>  	}
>  }
> +
> +/*
> + * Re-check whether an #NPF for a private/gmem page can still be serviced, and
> + * adjust maximum mapping level if needed.
> + */
> +int sev_gmem_validate_fault(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, bool is_private,

This is a misleading name.  The primary purpose is not to validate the fault, the
primary purpose is to get the max mapping level.  The fact that this can fail
should not dictate the name.

I also think we should skip the call if the max level is already PG_LEVEL_4K.
Something _could_ race and invalidate the RMP, but that's _exactly_ why KVM
guards the page fault path with mmu_invalidate_seq.

Actually, is returning an error in this case even correct?  Me thinks no.  If
something invalidates the RMP between kvm_gmem_get_pfn() and getting the mapping
level, then KVM should retry, which mmu_invalidate_seq handles.  Returning
-EINVAL and killing the VM is wrong.

And IMO, "gmem" shouldn't be in the name, this is not a hook from guest_memfd,
it's a hook for mapping private memory.  And as someone called out somewhere else,
the "private" parameter is pointless.  And for that matter, so is the gfn.

And even _if_ we want to return an error, we could overload the return code
to handle this, e.g. in the caller:
	 
	r = static_call(kvm_x86_max_private_mapping_level)(vcpu->kvm, fault->pfn);
	if (r < 0) {
		kvm_release_pfn_clean(fault->pfn);
		return r;
	}

	fault->max_level = min(fault->max_level, r);

but what I think we want is:

---
int sev_snp_private_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn)
{
	int level, rc;
	bool assigned;

	if (!sev_snp_guest(kvm))
		return 0;

	rc = snp_lookup_rmpentry(pfn, &assigned, &level);
	if (rc || !assigned)
		return PG_LEVEL_4K;

	return level;
}

static u8 kvm_max_private_mapping_level(struct kvm *kvm, kvm_pfn_t pfn,
					u8 max_level, int gmem_order)
{
	u8 req_max_level;

	if (max_level == PG_LEVEL_4K)
		return PG_LEVEL_4K;

	max_level = min(kvm_max_level_for_order(gmem_order), max_level);
	if (max_level == PG_LEVEL_4K)
		return PG_LEVEL_4K;

	/* A return of 0 means "no vendor-specific limit". */
	req_max_level = static_call(kvm_x86_private_max_mapping_level)(kvm, pfn);
	if (req_max_level)
		max_level = min(max_level, req_max_level);

	return max_level;
}

static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
				   struct kvm_page_fault *fault)
{
	struct kvm *kvm = vcpu->kvm;
	int max_order, r;

	if (!kvm_slot_can_be_private(fault->slot)) {
		kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
		return -EFAULT;
	}

	r = kvm_gmem_get_pfn(kvm, fault->slot, fault->gfn, &fault->pfn, &max_order);
	if (r) {
		kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
		return r;
	}

	fault->map_writable = !(fault->slot->flags & KVM_MEM_READONLY);
	fault->max_level = kvm_max_private_mapping_level(kvm, fault->pfn,
							 fault->max_level, max_order);
	return RET_PF_CONTINUE;
}

---

Side topic, the KVM_MEM_READONLY check is unnecessary, KVM doesn't allow RO memslots
to coincide with guest_memfd.  I missed that in commit e563592224e0 ("KVM: Make
KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY").

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 03/22] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
  2024-04-24 20:21   ` Sean Christopherson
@ 2024-04-25 20:52     ` Michael Roth
  2024-04-25 21:55       ` Sean Christopherson
  0 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-25 20:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Wed, Apr 24, 2024 at 01:21:49PM -0700, Sean Christopherson wrote:
> On Sun, Apr 21, 2024, Michael Roth wrote:
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 6e31cb408dd8..1d2264e93afe 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -33,9 +33,11 @@
> >  #include "cpuid.h"
> >  #include "trace.h"
> >  
> > -#define GHCB_VERSION_MAX	1ULL
> > +#define GHCB_VERSION_MAX	2ULL
> >  #define GHCB_VERSION_MIN	1ULL
> 
> This needs a userspace control.  Being unable to limit the GHCB version advertised
> to the guest is going to break live migration of SEV-ES VMs, e.g. if a pool of
> hosts has some kernels running this flavor of KVM, and some hosts running an
> older KVM that doesn't support v2.
> 

The requirements for implementing the non-SNP aspects of the GHCB
version 2 protocol are fairly minimal, and KVM_SEV_INIT2 is already
migration incompatible with older kernels running KVM_SEV_ES_INIT (e.g.
migrate to newer host, shutdown, start -> measurement failure). There
are QEMU patches here that allow for controlling this via QEMU versioned
machine types [1].

So I think it makes sense to go ahead and move to GHCB version 2 as the base
version for all SEV-ES/SNP guests created via KVM_SEV_INIT2, and leave
KVM_SEV_ES_INIT restricted to GHCB version 1.

This could be done in a pretty self-contained way for SEV-ES by applying
the following patches from this series which are the version 2 protocol
interfaces also applicable to SEV-ES:

  KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
  KVM: SEV: Add support to handle AP reset MSR protocol
  KVM: SEV: Add support for GHCB-based termination requests

And then applying the below patch on top to set GHCB version 1 or 2
accordingly for SEV-ES. (and relocating the GHCB_VERSION_MAX bump to the
below patch as well, although it's not really used at that point so
could also just be dropped completely).

Then in the future we can extend KVM_SEV_INIT2 to allow specifying
specific/newer versions of the GHCB protocol when that becomes needed.

If that sounds appropriate I can submit it as a separate standalone patchset
for SEV-ES and then make that a prereq for the remaining SNP-specific bits.

-Mike

[1] https://lore.kernel.org/kvm/20240409230743.962513-1-michael.roth@amd.com/


commit 114da695c065595f74fe8aa0e9203f3c65175a95
Author: Michael Roth <michael.roth@amd.com>
Date:   Thu Apr 25 14:42:17 2024 -0500

    KVM: SEV: Allow per-instance tracking of GHCB protocol version
    
    The GHCB protocol version may be different from one guest to the next.
    Add a field to track it and initialize it accordingly when KVM_SEV_INIT,
    KVM_SEV_ES_INIT, and KVM_SEV_INIT2 are called.
    
    Signed-off-by: Michael Roth <michael.roth@amd.com>

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 1137a7f4136b..5c16abc47541 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -310,7 +310,7 @@ static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
 
 static int __sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp,
 			    struct kvm_sev_init *data,
-			    unsigned long vm_type)
+			    unsigned long vm_type, u16 ghcb_version)
 {
 	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
 	struct sev_platform_init_args init_args = {0};
@@ -333,6 +333,7 @@ static int __sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp,
 	sev->active = true;
 	sev->es_active = es_active;
 	sev->vmsa_features = data->vmsa_features;
+	sev->ghcb_version = ghcb_version;
 
 	if (vm_type == KVM_X86_SNP_VM)
 		sev->vmsa_features |= SVM_SEV_FEAT_SNP_ACTIVE;
@@ -376,7 +377,13 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 		return -EINVAL;
 
 	vm_type = (argp->id == KVM_SEV_INIT ? KVM_X86_SEV_VM : KVM_X86_SEV_ES_VM);
-	return __sev_guest_init(kvm, argp, &data, vm_type);
+
+	/*
+	 * KVM_SEV_ES_INIT has been deprecated by KVM_SEV_INIT2, so it will
+	 * continue to only ever support the minimal GHCB protocol version.
+	 */
+	return __sev_guest_init(kvm, argp, &data, vm_type,
+				vm_type == KVM_X86_SEV_ES_VM ? GHCB_VERSION_MIN : 0);
 }
 
 static int sev_guest_init2(struct kvm *kvm, struct kvm_sev_cmd *argp)
@@ -395,7 +402,15 @@ static int sev_guest_init2(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	if (copy_from_user(&data, u64_to_user_ptr(argp->data), sizeof(data)))
 		return -EFAULT;
 
-	return __sev_guest_init(kvm, argp, &data, kvm->arch.vm_type);
+	/*
+	 * Currently KVM supports the full range of mandatory features defined by
+	 * version 2 of the GHCB protocol, so default to that for SEV/SNP guests
+	 * created via KVM_SEV_INIT2. Care should be taken that support for future
+	 * versions of the GHCB protocol is configurable via KVM_SEV_INIT2 to
+	 * allow limiting the guests to a particular version to support things
+	 * like live migration.
+	 */
+	return __sev_guest_init(kvm, argp, &data, kvm->arch.vm_type, 2);
 }
 
 static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error)
@@ -3906,6 +3921,7 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
 	struct kvm_vcpu *vcpu = &svm->vcpu;
+	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
 	u64 ghcb_info;
 	int ret = 1;
 
@@ -3916,7 +3932,7 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 
 	switch (ghcb_info) {
 	case GHCB_MSR_SEV_INFO_REQ:
-		set_ghcb_msr(svm, GHCB_MSR_SEV_INFO(GHCB_VERSION_MAX,
+		set_ghcb_msr(svm, GHCB_MSR_SEV_INFO(sev->ghcb_version,
 						    GHCB_VERSION_MIN,
 						    sev_enc_bit));
 		break;
@@ -4341,11 +4357,14 @@ void sev_init_vmcb(struct vcpu_svm *svm)
 
 void sev_es_vcpu_reset(struct vcpu_svm *svm)
 {
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
+
 	/*
 	 * Set the GHCB MSR value as per the GHCB specification when emulating
 	 * vCPU RESET for an SEV-ES guest.
 	 */
-	set_ghcb_msr(svm, GHCB_MSR_SEV_INFO(GHCB_VERSION_MAX,
+	set_ghcb_msr(svm, GHCB_MSR_SEV_INFO(sev->ghcb_version,
 					    GHCB_VERSION_MIN,
 					    sev_enc_bit));
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 28140bc8af27..229cb630b540 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -87,6 +87,7 @@ struct kvm_sev_info {
 	struct list_head regions_list;  /* List of registered regions */
 	u64 ap_jump_table;	/* SEV-ES AP Jump Table address */
 	u64 vmsa_features;
+	u64 ghcb_version;	/* Highest guest GHCB protocol version allowed */
 	struct kvm *enc_context_owner; /* Owner of copied encryption context */
 	struct list_head mirror_vms; /* List of VMs mirroring */
 	struct list_head mirror_entry; /* Use as a list entry of mirrors */

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 03/22] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
  2024-04-25 20:52     ` Michael Roth
@ 2024-04-25 21:55       ` Sean Christopherson
  0 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2024-04-25 21:55 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Thu, Apr 25, 2024, Michael Roth wrote:
> On Wed, Apr 24, 2024 at 01:21:49PM -0700, Sean Christopherson wrote:
> > On Sun, Apr 21, 2024, Michael Roth wrote:
> > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > index 6e31cb408dd8..1d2264e93afe 100644
> > > --- a/arch/x86/kvm/svm/sev.c
> > > +++ b/arch/x86/kvm/svm/sev.c
> > > @@ -33,9 +33,11 @@
> > >  #include "cpuid.h"
> > >  #include "trace.h"
> > >  
> > > -#define GHCB_VERSION_MAX	1ULL
> > > +#define GHCB_VERSION_MAX	2ULL
> > >  #define GHCB_VERSION_MIN	1ULL
> > 
> > This needs a userspace control.  Being unable to limit the GHCB version advertised
> > to the guest is going to break live migration of SEV-ES VMs, e.g. if a pool of
> > hosts has some kernels running this flavor of KVM, and some hosts running an
> > older KVM that doesn't support v2.
> > 
> 
> The requirements for implementing the non-SNP aspects of the GHCB
> version 2 protocol are fairly minimal, and KVM_SEV_INIT2 is already
> migration incompatible with older kernels running KVM_SEV_ES_INIT (e.g.
> migrate to newer host, shutdown, start -> measurement failure). There
> are QEMU patches here that allow for controlling this via QEMU versioned
> machine types [1].
> 
> So I think it makes sense to go ahead and move to GHCB version 2 as the base
> version for all SEV-ES/SNP guests created via KVM_SEV_INIT2, and leave
> KVM_SEV_ES_INIT restricted to GHCB version 1.

Hmm, I like that.  Dangle a carrot to get folks to switch to KVM_SEV_INIT2.

> This could be done in a pretty self-contained way for SEV-ES by applying
> the following patches from this series which are the version 2 protocol
> interfaces also applicable to SEV-ES:
> 
>   KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
>   KVM: SEV: Add support to handle AP reset MSR protocol
>   KVM: SEV: Add support for GHCB-based termination requests
> 
> And then applying the below patch on top to set GHCB version 1 or 2
> accordingly for SEV-ES. (and relocating the GHCB_VERSION_MAX bump to the
> below patch as well, although it's not really used at that point so
> could also just be dropped completely).
> 
> Then in the future we can extend KVM_SEV_INIT2 to allow specifying
> specific/newer versions of the GHCB protocol when that becomes needed.

Any reason not to let userspace restrict the GHCB protocol from the get-go?  It
seems inevitable that KVM will need to support that at some point, we'd have a
wee bit more confidence that we didn't botch the definition of KVM_SEV_INIT2 and
end up with KVM_SEV_INIT3, and in the unlikely event some poor provider gets into
a situation where guests are crashing because they don't handle v2 correctly,
userspace can work around the issue without needing to extend KVM's uAPI.
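
E.g. one possibility, purely illustrative, carving the version field out
of the existing pad space in struct kvm_sev_init (names and placement are
my guesses, not a final definition):

struct kvm_sev_init {
	__u64 vmsa_features;
	__u32 flags;
	__u16 ghcb_version;	/* 0 == let KVM pick its default/max */
	__u16 pad1;
	__u32 pad2[8];
};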

> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 28140bc8af27..229cb630b540 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -87,6 +87,7 @@ struct kvm_sev_info {
>  	struct list_head regions_list;  /* List of registered regions */
>  	u64 ap_jump_table;	/* SEV-ES AP Jump Table address */
>  	u64 vmsa_features;
> +	u64 ghcb_version;	/* Highest guest GHCB protocol version allowed */

This can/should be a u16, no?

>  	struct kvm *enc_context_owner; /* Owner of copied encryption context */
>  	struct list_head mirror_vms; /* List of VMs mirroring */
>  	struct list_head mirror_entry; /* Use as a list entry of mirrors */

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-24 20:59   ` Sean Christopherson
@ 2024-04-25 22:00     ` Michael Roth
  2024-04-25 22:13       ` Sean Christopherson
  0 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-25 22:00 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Wed, Apr 24, 2024 at 01:59:48PM -0700, Sean Christopherson wrote:
> On Sun, Apr 21, 2024, Michael Roth wrote:
> > +static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
> > +{
> > +	u64 gpa = gfn_to_gpa(GHCB_MSR_PSC_REQ_TO_GFN(ghcb_msr));
> > +	u8 op = GHCB_MSR_PSC_REQ_TO_OP(ghcb_msr);
> > +	struct vcpu_svm *svm = to_svm(vcpu);
> > +
> > +	if (op != SNP_PAGE_STATE_PRIVATE && op != SNP_PAGE_STATE_SHARED) {
> > +		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
> > +		return 1; /* resume guest */
> > +	}
> > +
> > +	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
> > +	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC_MSR;
> > +	vcpu->run->vmgexit.psc_msr.gpa = gpa;
> > +	vcpu->run->vmgexit.psc_msr.op = op;
> 
> Argh, no.
> 
> This is the same crud that TDX tried to push[*].  Use KVM's existing user exits,
> and extend as *needed*.  There is no good reason page state change requests need
> *two* exit reasons.  The *only* thing KVM supports right now is private<=>shared
> conversions, and that can be handled with either KVM_HC_MAP_GPA_RANGE or
> KVM_EXIT_MEMORY_FAULT.
> 
> The non-MSR flavor can batch requests, but I'm willing to bet that the overwhelming
> majority of requests are contiguous, i.e. can be combined into a range by KVM,
> and that handling any outliers by performing multiple exits to userspace will
> provide sufficient performance.

That does tend to be the case. We won't have as much granularity with
the per-entry error codes, but KVM_SET_MEMORY_ATTRIBUTES would be
expected to be for the entire range anyway, and if that fails for
whatever reason then we KVM_BUG_ON() anyway. We do have to handle cases
where the entries aren't contiguous however, which would
involve multiple KVM_EXIT_HYPERCALLs until everything is satisfied. But
not a huge deal since it doesn't seem to be a common case.
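
For the sake of discussion, a rough (untested) sketch of what that
coalescing could look like on the KVM side, with a made-up entry layout
rather than the actual GHCB-defined PSC buffer format:

struct psc_entry {
	u64 gfn;
	u8 op;		/* SNP_PAGE_STATE_PRIVATE or SNP_PAGE_STATE_SHARED */
};

/*
 * Find how many entries starting at 'idx' form a single contiguous run
 * with the same operation, so they can be reported to userspace as one
 * KVM_HC_MAP_GPA_RANGE exit; the caller loops until all entries are
 * satisfied.
 */
static int psc_run_length(const struct psc_entry *e, int idx, int count)
{
	int i = idx + 1;

	while (i < count && e[i].op == e[idx].op &&
	       e[i].gfn == e[idx].gfn + (i - idx))
		i++;

	return i - idx;
}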

KVM_HC_MAP_GPA_RANGE seems like a nice option because we'd also have the
flexibility to just issue that directly within a guest rather than
relying on SNP/TDX specific hcalls. I don't know if that approach is
practical for a real guest, but it could be useful for having re-usable
guest code in KVM selftests that "just works" for all variants of
SNP/TDX/sw-protected. (though we'd still want stuff that exercises
SNP/TDX->KVM_HC_MAP_GPA_RANGE translation).
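
E.g. guest code in a selftest could then be as simple as the following,
using the existing kvm_hypercall() helper from the selftests library
(gpa/npages being whatever range the test is converting):

	/* Guest side: convert a range to shared via the raw hypercall. */
	ret = kvm_hypercall(KVM_HC_MAP_GPA_RANGE, gpa, npages,
			    KVM_MAP_GPA_RANGE_DECRYPTED |
			    KVM_MAP_GPA_RANGE_PAGE_SZ_4K, 0);
	GUEST_ASSERT(!ret);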

I think there is some potential baggage there with the previous SEV
live migration use cases. There's some potential that existing guest kernels
will use it once it gets advertised and issue them alongside GHCB-based
page-state changes. It might make sense to use one of the reserved bits
to denote this flavor of KVM_HC_MAP_GPA_RANGE as being for
hardware/software-protected VMs and not interchangeable with calls that
were used for SEV live migration stuff.
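
E.g. something along these lines, next to the existing
KVM_MAP_GPA_RANGE_* flag bits (the name and bit position here are purely
illustrative):

/*
 * Hypothetical: flag a conversion as originating from a CoCo page state
 * change rather than the legacy SEV live migration path.
 */
#define KVM_MAP_GPA_RANGE_COCO		(1 << 8)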

If this seems reasonable I'll give it a go and see what it looks like.

> 
> And the non-MSR version that comes in later patch is a complete mess.  It kicks
> the PSC out to userspace without *any* validation.  As I complained in the TDX
> thread, that will create an unmaintable ABI for KVM.
> 
> KVM needs to have its own, well-defined ABI.  Splitting functionality between
> KVM and userspace at seemingly random points is not maintainable.
> 
E.g. if/when KVM supports UNSMASH, upgrading to the new KVM would arguably break
> userspace as PSC requests that previously exited would suddenly be handled by
> KVM.  Maybe.  It's impossible to review this because there's no KVM ABI, KVM is
> little more than a dumb pipe parroting information to userspace.

It leans on the GHCB spec to avoid re-inventing structs/documentation
for things like Page State Change buffers, but we do have as much control
as we want over how much we farm out versus lock into the KVM ABI. For
instance the accompanying Documentation/ update mentions we only send a
subset of GHCB requests that need to be handled by userspace, so we
could handle SMASH/UNSMASH in KVM without breaking expectations (or if
SMASH/UNSMASH were intermixed with PSCs, documentation that only PSC
opcodes could be updated by userspace).

But I'm certainly not arguing it wouldn't be better to have a
guest-agnostic alternative if we can reach an agreement on that, and
KVM_HC_MAP_GPA_RANGE seems like it could work.

-Mike

> 
> I truly do not understand why we would even consider allowing this.  We push back
> on people wanting new hypercalls for some specific use case, because we already
> have generic ways to achieve things, but then CoCo comes along and we apparently
> throw out any thought of maintainability.  I don't get it.
> 
> [*] https://lore.kernel.org/all/Zg18ul8Q4PGQMWam@google.com

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-25 22:00     ` Michael Roth
@ 2024-04-25 22:13       ` Sean Christopherson
  2024-04-26 17:16         ` Michael Roth
  0 siblings, 1 reply; 56+ messages in thread
From: Sean Christopherson @ 2024-04-25 22:13 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Thu, Apr 25, 2024, Michael Roth wrote:
> On Wed, Apr 24, 2024 at 01:59:48PM -0700, Sean Christopherson wrote:
> > On Sun, Apr 21, 2024, Michael Roth wrote:
> > > +static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
> > > +{
> > > +	u64 gpa = gfn_to_gpa(GHCB_MSR_PSC_REQ_TO_GFN(ghcb_msr));
> > > +	u8 op = GHCB_MSR_PSC_REQ_TO_OP(ghcb_msr);
> > > +	struct vcpu_svm *svm = to_svm(vcpu);
> > > +
> > > +	if (op != SNP_PAGE_STATE_PRIVATE && op != SNP_PAGE_STATE_SHARED) {
> > > +		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
> > > +		return 1; /* resume guest */
> > > +	}
> > > +
> > > +	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
> > > +	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC_MSR;
> > > +	vcpu->run->vmgexit.psc_msr.gpa = gpa;
> > > +	vcpu->run->vmgexit.psc_msr.op = op;
> > 
> > Argh, no.
> > 
> > This is the same crud that TDX tried to push[*].  Use KVM's existing user exits,
> > and extend as *needed*.  There is no good reason page state change requests need
> > *two* exit reasons.  The *only* thing KVM supports right now is private<=>shared
> > conversions, and that can be handled with either KVM_HC_MAP_GPA_RANGE or
> > KVM_EXIT_MEMORY_FAULT.
> > 
> > The non-MSR flavor can batch requests, but I'm willing to bet that the overwhelming
> > majority of requests are contiguous, i.e. can be combined into a range by KVM,
> > and that handling any outliers by performing multiple exits to userspace will
> > provide sufficient performance.
> 
> That does tend to be the case. We won't have as much granularity with
> the per-entry error codes, but KVM_SET_MEMORY_ATTRIBUTES would be
> expected to be for the entire range anyway, and if that fails for
> > whatever reason then we KVM_BUG_ON() anyway. We do have to handle cases
> > where the entries aren't contiguous however, which would
> involve multiple KVM_EXIT_HYPERCALLs until everything is satisfied. But
> not a huge deal since it doesn't seem to be a common case.

If it was less complex overall, I wouldn't be opposed to KVM marshalling everything
into a buffer, but I suspect it will be simpler to just have KVM loop until the
PSC request is complete.

> KVM_HC_MAP_GPA_RANGE seems like a nice option because we'd also have the
> flexibility to just issue that directly within a guest rather than
> relying on SNP/TDX specific hcalls. I don't know if that approach is
> practical for a real guest, but it could be useful for having re-usable
> guest code in KVM selftests that "just works" for all variants of
> SNP/TDX/sw-protected. (though we'd still want stuff that exercises
> SNP/TDX->KVM_HC_MAP_GPA_RANGE translation).
> 
> > I think there is some potential baggage there with the previous SEV
> live migration use cases. There's some potential that existing guest kernels
> will use it once it gets advertised and issue them alongside GHCB-based
> page-state changes. It might make sense to use one of the reserved bits
> to denote this flavor of KVM_HC_MAP_GPA_RANGE as being for
> hardware/software-protected VMs and not interchangeable with calls that
> were used for SEV live migration stuff.

I don't think I follow, what exactly wouldn't be interchangeable, and why?

> If this seems reasonable I'll give it a go and see what it looks like.
> 
> > 
> > And the non-MSR version that comes in later patch is a complete mess.  It kicks
> > the PSC out to userspace without *any* validation.  As I complained in the TDX
> > thread, that will create an unmaintable ABI for KVM.
> > 
> > KVM needs to have its own, well-defined ABI.  Splitting functionality between
> > KVM and userspace at seemingly random points is not maintainable.
> > 
> > E.g. if/when KVM supports UNSMASH, upgrading to the new KVM would arguably break
> > userspace as PSC requests that previously exited would suddenly be handled by
> > KVM.  Maybe.  It's impossible to review this because there's no KVM ABI, KVM is
> > little more than a dumb pipe parroting information to userspace.
> 
> It leans on the GHCB spec to avoid re-inventing structs/documentation
> for things like Page State Change buffers, but we do have as much control
> as we want over how much we farm out versus lock into the KVM ABI. For
> instance the accompanying Documentation/ update mentions we only send a
> subset of GHCB requests that need to be handled by userspace, so we
> could handle SMASH/UNSMASH in KVM without breaking expectations (or if
> SMASH/UNSMASH were intermixed with PSCs, documentation that only PSC
> opcodes could be updated by userspace).
> 
> But I'm certainly not arguing it wouldn't be better to have a
> guest-agnostic alternative if we can reach an agreement on that, and
> KVM_HC_MAP_GPA_RANGE seems like it could work.

Yeah, I want to at least _try_ to achieve common ground, because the basic
functionality of all this stuff is the exact same.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-25 22:13       ` Sean Christopherson
@ 2024-04-26 17:16         ` Michael Roth
  2024-04-26 20:14           ` Sean Christopherson
  0 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-26 17:16 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Thu, Apr 25, 2024 at 03:13:40PM -0700, Sean Christopherson wrote:
> On Thu, Apr 25, 2024, Michael Roth wrote:
> > On Wed, Apr 24, 2024 at 01:59:48PM -0700, Sean Christopherson wrote:
> > > On Sun, Apr 21, 2024, Michael Roth wrote:
> > > > +static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
> > > > +{
> > > > +	u64 gpa = gfn_to_gpa(GHCB_MSR_PSC_REQ_TO_GFN(ghcb_msr));
> > > > +	u8 op = GHCB_MSR_PSC_REQ_TO_OP(ghcb_msr);
> > > > +	struct vcpu_svm *svm = to_svm(vcpu);
> > > > +
> > > > +	if (op != SNP_PAGE_STATE_PRIVATE && op != SNP_PAGE_STATE_SHARED) {
> > > > +		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
> > > > +		return 1; /* resume guest */
> > > > +	}
> > > > +
> > > > +	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
> > > > +	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC_MSR;
> > > > +	vcpu->run->vmgexit.psc_msr.gpa = gpa;
> > > > +	vcpu->run->vmgexit.psc_msr.op = op;
> > > 
> > > Argh, no.
> > > 
> > > This is the same crud that TDX tried to push[*].  Use KVM's existing user exits,
> > > and extend as *needed*.  There is no good reason page state change requests need
> > > *two* exit reasons.  The *only* thing KVM supports right now is private<=>shared
> > > conversions, and that can be handled with either KVM_HC_MAP_GPA_RANGE or
> > > KVM_EXIT_MEMORY_FAULT.
> > > 
> > > The non-MSR flavor can batch requests, but I'm willing to bet that the overwhelming
> > > majority of requests are contiguous, i.e. can be combined into a range by KVM,
> > > and that handling any outliers by performing multiple exits to userspace will
> > > provide sufficient performance.
> > 
> > That does tend to be the case. We won't have as much granularity with
> > the per-entry error codes, but KVM_SET_MEMORY_ATTRIBUTES would be
> > expected to be for the entire range anyway, and if that fails for
> > > whatever reason then we KVM_BUG_ON() anyway. We do have to handle cases
> > > where the entries aren't contiguous however, which would
> > involve multiple KVM_EXIT_HYPERCALLs until everything is satisfied. But
> > not a huge deal since it doesn't seem to be a common case.
> 
> If it was less complex overall, I wouldn't be opposed to KVM marshalling everything
> into a buffer, but I suspect it will be simpler to just have KVM loop until the
> PSC request is complete.

Agreed. But *if* we decided to introduce a buffer, where would you
suggest adding it? The kvm_run union fields are set to 256 bytes, and
we'd need close to 4K to handle a full GHCB PSC buffer in 1 go. Would
additional storage at the end of struct kvm_run be acceptable?

> 
> > KVM_HC_MAP_GPA_RANGE seems like a nice option because we'd also have the
> > flexibility to just issue that directly within a guest rather than
> > relying on SNP/TDX specific hcalls. I don't know if that approach is
> > practical for a real guest, but it could be useful for having re-usable
> > guest code in KVM selftests that "just works" for all variants of
> > SNP/TDX/sw-protected. (though we'd still want stuff that exercises
> > SNP/TDX->KVM_HC_MAP_GPA_RANGE translation).
> > 
> > I think we'd there is some potential baggage there with the previous SEV
> > live migration use cases. There's some potential that existing guest kernels
> > will use it once it gets advertised and issue them alongside GHCB-based
> > page-state changes. It might make sense to use one of the reserved bits
> > to denote this flavor of KVM_HC_MAP_GPA_RANGE as being for
> > hardware/software-protected VMs and not interchangeable with calls that
> > were used for SEV live migration stuff.
> 
> I don't think I follow, what exactly wouldn't be interchangeable, and why?

For instance, if KVM_FEATURE_MIGRATION_CONTROL is advertised, then when
amd_enc_status_change_finish() is triggered as a result of
set_memory_encrypted(), we'd see

  1) a GHCB PSC for SNP, which will get forwarded to userspace via
     KVM_HC_MAP_GPA_RANGE
  2) KVM_HC_MAP_GPA_RANGE issued directly by the guest.

In that case, we'd be duplicating PSCs but it wouldn't necessarily hurt
anything. But ideally we'd be able to distinguish the 2 cases so we
could rightly treat 1) as only being expected for SNP, and 2) as only
being expected for SEV/SEV-ES.

I'm also concerned there's potential for more serious issues, for instance
kvm_init_platform() has a loop that resets all E820_TYPE_RAM ranges to
encrypted, which may step on PSCs that SNP had to do earlier on for
setting up shared pages for things like GHCB buffers. For the SEV live
migration use case, it's not a huge deal if a shared page gets processed
as private, it's just slower because it would need to go through the SEV
firmware APIs. There's also more leeway in that updates can happen at
opportunistic times since there's a KVM_MIGRATION_READY state that can
be set via MSR_KVM_MIGRATION_CONTROL once everything is set up.
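
(For reference, the guest side of that is just a single MSR write:

	/* arch/x86/kernel/kvm.c, once everything is consistent: */
	wrmsrl(MSR_KVM_MIGRATION_CONTROL, KVM_MIGRATION_READY);

so the host-side timing constraints are pretty loose there.)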

For SNP, all those associated KVM_HC_MAP_GPA_RANGE calls would have an
immediate effect, so they need to be in lock-step with the actual state of
each page before access, and there is no leeway for stuff like leaving
a shared page as private.

Also, a lot of that was implemented before lazy acceptance support, which
happens out-of-band versus these pv_ops.mmu.notify_page_* hooks, and kexec
support for SNP guests has some notion of tracking state across kexec
boundaries that might get clobbered too.

I haven't looked too closely though, but it seems like a good idea to at
least set a bit for kvm->arch.has_private_mem use cases. If it's not
needed then worst case we have a bit that userspace doesn't necessarily
need to care about, but for SNP just filtering out the duplicate
requests seems like enough to justify having it.

-Mike

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
  2024-04-25  0:15   ` Sean Christopherson
@ 2024-04-26 17:35     ` Michael Roth
  2024-04-26 19:57       ` Sean Christopherson
  0 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-26 17:35 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Wed, Apr 24, 2024 at 05:15:40PM -0700, Sean Christopherson wrote:
> On Sun, Apr 21, 2024, Michael Roth wrote:
> > These commands can be used to pause servicing of guest attestation
> > requests. This is useful when updating the reported TCB or signing key with
> > commands such as SNP_SET_CONFIG/SNP_COMMIT/SNP_VLEK_LOAD, since they may
> > in turn require updates to userspace-supplied certificates, and if an
> > attestation request happens to be in-flight at the time those updates
> > are occurring there is potential for a guest to receive a certificate
> > blob that is out of sync with the effective signing key for the
> > attestation report.
> > 
> > These interfaces also provide some versatility with how similar
> > firmware/certificate update activities can be handled in the future.
> 
> Wait, IIUC, this is using the kernel to get two userspace components to not
> stomp over each other.   Why is this the kernel's problem to solve?

It's not that they are stepping on each other, but that kernel and
userspace need to coordinate on updating 2 components whose updates need
to be atomic from a guest perspective. Take an update to VLEK key for
instance:

 1) management gets a new VLEK endorsement key from KDS along with
    associated certificate chain
 2) management uses SNP_VLEK_LOAD to update key
 3) management updates the certs at the path VMM will grab them
    from when the EXT_GUEST_REQUEST userspace exit is issued

If an attestation request comes in after 2), but before 3), then the
guest sees an attestation report signed with the new key, but still
gets the old certificate.

If you reverse the ordering:

 1) management gets a new VLEK endorsement key from KDS along with
    associated certificate chain
 2) management updates the certs at the path VMM will grab them
    from when the EXT_GUEST_REQUEST userspace exit is issued
 3) management uses SNP_VLEK_LOAD to update key

then an attestation request between 2) and 3) will result in the guest
getting the new cert, but getting an attestation report signed with an old
endorsement key.

Providing a way to pause guest attestation requests prior to 2), and
resume after 3), provides a straightforward way to make those updates
atomic to the guest.
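
In code terms, the management-side flow would look roughly like the
following fragment (error handling elided, any transaction ID payload for
the PAUSE/RESUME commands omitted for brevity, and vlek_wrapped /
update_certs_at_vmm_path() standing in for the actual key/cert handling):

	struct sev_issue_cmd cmd = {};
	int fd = open("/dev/sev", O_RDWR);

	cmd.cmd = SNP_PAUSE_ATTESTATION;	/* begin update transaction */
	ioctl(fd, SEV_ISSUE_CMD, &cmd);

	cmd.cmd = SNP_VLEK_LOAD;		/* swap in the new signing key */
	cmd.data = (__u64)(uintptr_t)&vlek_wrapped;
	ioctl(fd, SEV_ISSUE_CMD, &cmd);

	update_certs_at_vmm_path();		/* refresh cert blob for EXT_GUEST_REQUEST */

	cmd.cmd = SNP_RESUME_ATTESTATION;	/* end update transaction */
	cmd.data = 0;
	ioctl(fd, SEV_ISSUE_CMD, &cmd);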

-Mike

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 22/22] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  2024-04-25  0:10   ` Sean Christopherson
@ 2024-04-26 17:57     ` Michael Roth
  0 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-26 17:57 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Wed, Apr 24, 2024 at 05:10:31PM -0700, Sean Christopherson wrote:
> On Sun, Apr 21, 2024, Michael Roth wrote:
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 85099198a10f..6cf186ed8f66 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -7066,6 +7066,7 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
> >  		struct kvm_user_vmgexit {
> >  		#define KVM_USER_VMGEXIT_PSC_MSR	1
> >  		#define KVM_USER_VMGEXIT_PSC		2
> > +		#define KVM_USER_VMGEXIT_EXT_GUEST_REQ	3
> 
> Assuming we can't get massage this into a vendor agnostic exit, there's gotta be
> a better name than EXT_GUEST_REQ, which is completely meaningless to me and probably
> most other people that aren't intimately familiar with the specs.  Request what?

EXT_CERT_REQ maybe? That's effectively all it boils down to, fetching
the GHCB-defined certificate blob from userspace.
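
i.e. keep the same layout as the posted ext_guest_req bits, just renamed:

		#define KVM_USER_VMGEXIT_EXT_CERT_REQ	3
		...
				struct {
					__u64 data_gpa;		/* GPA of certificate buffer (RAX) */
					__u64 data_npages;	/* buffer size in pages (RBX) */
					__u32 ret;		/* SNP_GUEST_VMM_ERR_* from userspace */
				} ext_cert_req;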

> 
> >  			__u32 type; /* KVM_USER_VMGEXIT_* type */
> >  			union {
> >  				struct {
> > @@ -3812,6 +3813,84 @@ static void snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp
> >  	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
> >  }
> >  
> > +static int snp_complete_ext_guest_req(struct kvm_vcpu *vcpu)
> > +{
> > +	struct vcpu_svm *svm = to_svm(vcpu);
> > +	struct vmcb_control_area *control;
> > +	struct kvm *kvm = vcpu->kvm;
> > +	sev_ret_code fw_err = 0;
> > +	int vmm_ret;
> > +
> > +	vmm_ret = vcpu->run->vmgexit.ext_guest_req.ret;
> > +	if (vmm_ret) {
> > +		if (vmm_ret == SNP_GUEST_VMM_ERR_INVALID_LEN)
> > +			vcpu->arch.regs[VCPU_REGS_RBX] =
> > +				vcpu->run->vmgexit.ext_guest_req.data_npages;
> > +		goto abort_request;
> > +	}
> > +
> > +	control = &svm->vmcb->control;
> > +
> > +	/*
> > +	 * To avoid the message sequence number getting out of sync between the
> > +	 * actual value seen by firmware versus the value expected by the guest,
> > +	 * make sure attestations can't get paused on the write-side at this
> > +	 * point by holding the lock for the entire duration of the firmware
> > +	 * request so that there is no situation where SNP_GUEST_VMM_ERR_BUSY
> > +	 * would need to be returned after firmware sees the request.
> > +	 */
> > +	mutex_lock(&snp_pause_attestation_lock);
> 
> Why is this in KVM?  IIUC, KVM only needs to get involved to translate GFNs to
> PFNs, the rest can go in the sev-dev driver, no?  The whole split is weird,
> seemingly due to KVM "needing" to take this lock.  E.g. why is core kernel code
> in arch/x86/virt/svm/sev.c at all dealing with attestation goo, when pretty much
> all of the actual usage is (or can be) in sev-dev.c

It would be very tempting to have all this locking tucked away in the sev-dev
driver, but KVM is part of that sequence because it needs to handle fetching
the certificate that will be returned to the guest as part of the
attestation request. The transaction ID updates from PAUSE/RESUME
commands are technically enough to satisfy this requirement, in which case
KVM wouldn't need to take the lock directly, only check that the
transaction ID isn't stale and report -EBUSY/-EAGAIN to the guest if it
is.

But if the request actually gets sent to firmware, and then we decide
after that the transaction is stale and thus the attestation response
and certificate might not be in sync, then reporting -EBUSY/-EAGAIN will
permanently hose the guest because the message seq fields will be out of
sync between firmware and guest kernel. That's why we need to hold the
lock to actually block a transaction ID update from occurring once we've
committed to sending the attestation request to firmware.

> 
> > +
> > +	if (__snp_transaction_is_stale(svm->snp_transaction_id))
> > +		vmm_ret = SNP_GUEST_VMM_ERR_BUSY;
> > +	else if (!__snp_handle_guest_req(kvm, control->exit_info_1,
> > +					 control->exit_info_2, &fw_err))
> > +		vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
> > +
> > +	mutex_unlock(&snp_pause_attestation_lock);
> > +
> > +abort_request:
> > +	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
> > +
> > +	return 1; /* resume guest */
> > +}
> > +
> > +static int snp_begin_ext_guest_req(struct kvm_vcpu *vcpu)
> > +{
> > +	int vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
> > +	struct vcpu_svm *svm = to_svm(vcpu);
> > +	unsigned long data_npages;
> > +	sev_ret_code fw_err;
> > +	gpa_t data_gpa;
> > +
> > +	if (!sev_snp_guest(vcpu->kvm))
> > +		goto abort_request;
> > +
> > +	data_gpa = vcpu->arch.regs[VCPU_REGS_RAX];
> > +	data_npages = vcpu->arch.regs[VCPU_REGS_RBX];
> > +
> > +	if (!IS_ALIGNED(data_gpa, PAGE_SIZE))
> > +		goto abort_request;
> > +
> > +	svm->snp_transaction_id = snp_transaction_get_id();
> > +	if (snp_transaction_is_stale(svm->snp_transaction_id)) {
> 
> Why bother?  I assume this is rare, so why not handle it on the backend, i.e.
> after userspace does its thing?  Then KVM doesn't even have to care about
> checking for stale IDs, I think?

That's true, it's little more than a very minor performance optimization
to do it here rather than on the return path, so could definitely be
dropped.

> 
> None of this makes much sense to my eyes, e.g. AFAICT, the only thing that can
> pause attestation is userspace, yet the kernel is responsible for tracking whether
> or not a transaction is fresh?

Pausing is essentially the start of an update transaction initiated by
userspace. So the kernel is tracking whether there's a potential
transaction that has been started or completed since it first started
servicing the attestation request, otherwise there could be a mismatch
between the endorsement key used for the attestation report and the
certificate for the endorsement key that was fetched from userspace.

-Mike

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
  2024-04-26 17:35     ` Michael Roth
@ 2024-04-26 19:57       ` Sean Christopherson
  2024-04-26 21:46         ` Michael Roth
  0 siblings, 1 reply; 56+ messages in thread
From: Sean Christopherson @ 2024-04-26 19:57 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Fri, Apr 26, 2024, Michael Roth wrote:
> On Wed, Apr 24, 2024 at 05:15:40PM -0700, Sean Christopherson wrote:
> > On Sun, Apr 21, 2024, Michael Roth wrote:
> > > These commands can be used to pause servicing of guest attestation
> > > requests. This is useful when updating the reported TCB or signing key with
> > > commands such as SNP_SET_CONFIG/SNP_COMMIT/SNP_VLEK_LOAD, since they may
> > > in turn require updates to userspace-supplied certificates, and if an
> > > attestation request happens to be in-flight at the time those updates
> > > are occurring there is potential for a guest to receive a certificate
> > > blob that is out of sync with the effective signing key for the
> > > attestation report.
> > > 
> > > These interfaces also provide some versatility with how similar
> > > firmware/certificate update activities can be handled in the future.
> > 
> > Wait, IIUC, this is using the kernel to get two userspace components to not
> > stomp over each other.   Why is this the kernel's problem to solve?
> 
> It's not that they are stepping on each other, but that kernel and
> userspace need to coordinate on updating 2 components whose updates need
> to be atomic from a guest perspective. Take an update to VLEK key for
> instance:
> 
>  1) management gets a new VLEK endorsement key from KDS along with

What is "management"?  I assume its some userspace daemon?

>     associated certificate chain
>  2) management uses SNP_VLEK_LOAD to update key
>  3) management updates the certs at the path VMM will grab them
>     from when the EXT_GUEST_REQUEST userspace exit is issued
> 
> If an attestation request comes in after 2), but before 3), then the
> guest sees an attestation report signed with the new key, but still
> gets the old certificate.
> 
> If you reverse the ordering:
> 
>  1) management gets a new VLEK endorsement key from KDS along with
>     associated certificate chain
>  2) management updates the certs at the path VMM will grab them
>     from when the EXT_GUEST_REQUEST userspace exit is issued
>  3) management uses SNP_VLEK_LOAD to update key
> 
> then an attestation request between 2) and 3) will result in the guest
> getting the new cert, but getting an attestation report signed with an old
> endorsement key.
> 
> Providing a way to pause guest attestation requests prior to 2), and
> resume after 3), provides a straightforward way to make those updates
> atomic to the guest.

Assuming "management" is a userspace component, I still don't see why this
requires kernel involvement.  "management" can tell VMMs to pause attestation
without having to bounce through the kernel.  It doesn't even require a push
model, e.g. wrap/redirect the certs with a file that has a "pause" flag and a
sequence counter.
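
Something like the following, purely illustrative (no spec would define
this, it'd just be a private contract between "management" and the VMM):

	struct certs_wrapper {
		__u32 pause;	/* non-zero: VMM returns BUSY to the guest */
		__u64 seq;	/* bumped after each key+cert update */
		__u8 certs[];	/* GHCB-defined certificate table follows */
	};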

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-26 17:16         ` Michael Roth
@ 2024-04-26 20:14           ` Sean Christopherson
  2024-04-26 22:24             ` Michael Roth
  0 siblings, 1 reply; 56+ messages in thread
From: Sean Christopherson @ 2024-04-26 20:14 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Fri, Apr 26, 2024, Michael Roth wrote:
> On Thu, Apr 25, 2024 at 03:13:40PM -0700, Sean Christopherson wrote:
> > On Thu, Apr 25, 2024, Michael Roth wrote:
> > > On Wed, Apr 24, 2024 at 01:59:48PM -0700, Sean Christopherson wrote:
> > > > On Sun, Apr 21, 2024, Michael Roth wrote:
> > > > > +static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
> > > > > +{
> > > > > +	u64 gpa = gfn_to_gpa(GHCB_MSR_PSC_REQ_TO_GFN(ghcb_msr));
> > > > > +	u8 op = GHCB_MSR_PSC_REQ_TO_OP(ghcb_msr);
> > > > > +	struct vcpu_svm *svm = to_svm(vcpu);
> > > > > +
> > > > > +	if (op != SNP_PAGE_STATE_PRIVATE && op != SNP_PAGE_STATE_SHARED) {
> > > > > +		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
> > > > > +		return 1; /* resume guest */
> > > > > +	}
> > > > > +
> > > > > +	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
> > > > > +	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC_MSR;
> > > > > +	vcpu->run->vmgexit.psc_msr.gpa = gpa;
> > > > > +	vcpu->run->vmgexit.psc_msr.op = op;
> > > > 
> > > > Argh, no.
> > > > 
> > > > This is the same crud that TDX tried to push[*].  Use KVM's existing user exits,
> > > > and extend as *needed*.  There is no good reason page state change requests need
> > > > *two* exit reasons.  The *only* thing KVM supports right now is private<=>shared
> > > > conversions, and that can be handled with either KVM_HC_MAP_GPA_RANGE or
> > > > KVM_EXIT_MEMORY_FAULT.
> > > > 
> > > > The non-MSR flavor can batch requests, but I'm willing to bet that the overwhelming
> > > > majority of requests are contiguous, i.e. can be combined into a range by KVM,
> > > > and that handling any outliers by performing multiple exits to userspace will
> > > > provide sufficient performance.
> > > 
> > > That does tend to be the case. We won't have as much granularity with
> > > the per-entry error codes, but KVM_SET_MEMORY_ATTRIBUTES would be
> > > expected to be for the entire range anyway, and if that fails for
> > > > whatever reason then we KVM_BUG_ON() anyway. We do have to handle cases
> > > > where the entries aren't contiguous however, which would
> > > involve multiple KVM_EXIT_HYPERCALLs until everything is satisfied. But
> > > not a huge deal since it doesn't seem to be a common case.
> > 
> > If it was less complex overall, I wouldn't be opposed to KVM marshalling everything
> > into a buffer, but I suspect it will be simpler to just have KVM loop until the
> > PSC request is complete.
> 
> Agreed. But *if* we decided to introduce a buffer, where would you
> suggest adding it? The kvm_run union fields are set to 256 bytes, and
> we'd need close to 4K to handle a full GHCB PSC buffer in 1 go. Would
> additional storage at the end of struct kvm_run be acceptable?

Don't even need more memory, just use vcpu->arch.pio_data, which is always
allocated and is mmap()able by userspace via KVM_PIO_PAGE_OFFSET.
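
E.g. userspace can already get at that page with something like this
(vcpu_fd being the fd returned by KVM_CREATE_VCPU):

	long page_size = sysconf(_SC_PAGESIZE);
	void *pio_page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
			      MAP_SHARED, vcpu_fd, KVM_PIO_PAGE_OFFSET * page_size);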

> > > KVM_HC_MAP_GPA_RANGE seems like a nice option because we'd also have the
> > > flexibility to just issue that directly within a guest rather than
> > > relying on SNP/TDX specific hcalls. I don't know if that approach is
> > > practical for a real guest, but it could be useful for having re-usable
> > > guest code in KVM selftests that "just works" for all variants of
> > > SNP/TDX/sw-protected. (though we'd still want stuff that exercises
> > > SNP/TDX->KVM_HC_MAP_GPA_RANGE translation).
> > > 
> > > I think there is some potential baggage there with the previous SEV
> > > live migration use cases. There's some potential that existing guest kernels
> > > will use it once it gets advertised and issue them alongside GHCB-based
> > > page-state changes. It might make sense to use one of the reserved bits
> > > to denote this flavor of KVM_HC_MAP_GPA_RANGE as being for
> > > hardware/software-protected VMs and not interchangeable with calls that
> > > were used for SEV live migration stuff.
> > 
> > I don't think I follow, what exactly wouldn't be interchangeable, and why?
> 
> For instance, if KVM_FEATURE_MIGRATION_CONTROL is advertised, then when
> amd_enc_status_change_finish() is triggered as a result of
> set_memory_encrypted(), we'd see
> 
>   1) a GHCB PSC for SNP, which will get forwarded to userspace via
>      KVM_HC_MAP_GPA_RANGE
>   2) KVM_HC_MAP_GPA_RANGE issued directly by the guest.
> 
> In that case, we'd be duplicating PSCs but it wouldn't necessarily hurt
> anything. But ideally we'd be able to distinguish the 2 cases so we
> could rightly treat 1) as only being expected for SNP, and 2) as only
> being expected for SEV/SEV-ES.

Why would the guest issue both?  That's a guest bug.  Or if suppressing the second
hypercall is an issue, simply don't enumerate MIGRATION_CONTROL for SNP guests.
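
E.g. userspace can just mask the bit when building the guest CPUID,
something like this (find_cpuid_entry() standing in for whatever helper
the VMM uses):

	struct kvm_cpuid_entry2 *e = find_cpuid_entry(cpuid, KVM_CPUID_FEATURES, 0);

	/* Hide MIGRATION_CONTROL so the guest never tries to use it. */
	e->eax &= ~(1u << KVM_FEATURE_MIGRATION_CONTROL);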

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
  2024-04-26 19:57       ` Sean Christopherson
@ 2024-04-26 21:46         ` Michael Roth
  2024-04-27  0:10           ` Sean Christopherson
  0 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-26 21:46 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Larry.Dewey

On Fri, Apr 26, 2024 at 12:57:08PM -0700, Sean Christopherson wrote:
> On Fri, Apr 26, 2024, Michael Roth wrote:
> > On Wed, Apr 24, 2024 at 05:15:40PM -0700, Sean Christopherson wrote:
> > > On Sun, Apr 21, 2024, Michael Roth wrote:
> > > > These commands can be used to pause servicing of guest attestation
> > > > requests. This is useful when updating the reported TCB or signing key with
> > > > commands such as SNP_SET_CONFIG/SNP_COMMIT/SNP_VLEK_LOAD, since they may
> > > > in turn require updates to userspace-supplied certificates, and if an
> > > > attestation request happens to be in-flight at the time those updates
> > > > are occurring there is potential for a guest to receive a certificate
> > > > blob that is out of sync with the effective signing key for the
> > > > attestation report.
> > > > 
> > > > These interfaces also provide some versatility with how similar
> > > > firmware/certificate update activities can be handled in the future.
> > > 
> > > Wait, IIUC, this is using the kernel to get two userspace components to not
> > > stomp over each other.   Why is this the kernel's problem to solve?
> > 
> > It's not that they are stepping on each other, but that kernel and
> > userspace need to coordinate on updating 2 components whose updates need
> > to be atomic from a guest perspective. Take an update to VLEK key for
> > instance:
> > 
> >  1) management gets a new VLEK endorsement key from KDS along with
> 
> What is "management"?  I assume its some userspace daemon?

It could be a daemon depending on the cloud provider, but the main example
we have in mind is something more basic like virtee[1] being used to
interactively perform an update at the command-line. E.g. you point it
at the new VLEK, the new cert, and it will handle updating the certs at
some known location and issuing the SNP_VLEK_LOAD command. With this
interface, it can take the additional step of PAUSE'ing attestations
before performing either update to keep the 2 actions in sync with the
guest view.

[1] https://github.com/virtee/snphost

> 
> >     associated certificate chain
> >  2) management uses SNP_VLEK_LOAD to update key
> >  3) management updates the certs at the path VMM will grab them
> >     from when the EXT_GUEST_REQUEST userspace exit is issued
> > 
> > If an attestation request comes in after 2), but before 3), then the
> > guest sees an attestation report signed with the new key, but still
> > gets the old certificate.
> > 
> > If you reverse the ordering:
> > 
> >  1) management gets a new VLEK endorsement key from KDS along with
> >     associated certificate chain
> >  2) management updates the certs at the path VMM will grab them
> >     from when the EXT_GUEST_REQUEST userspace exit is issued
> >  3) management uses SNP_VLEK_LOAD to update key
> > 
> > then an attestation request between 2) and 3) will result in the guest
> > getting the new cert, but getting an attestation report signed with an old
> > endorsement key.
> > 
> > Providing a way to pause guest attestation requests prior to 2), and
> > resume after 3), provides a straightforward way to make those updates
> > atomic to the guest.
> 
> Assuming "management" is a userspace component, I still don't see why this
> requires kernel involvement.  "management" can tell VMMs to pause attestation
> without having to bounce through the kernel.  It doesn't even require a push

That would mean a tool like virtee above would need to issue kernel
commands like SNP_VLEK_LOAD to handle the key update, then implement some
VMM-specific hook to pause servicing of EXT_GUEST_REQ (or whatever we
end up calling it). QEMU could define events for this, and libvirt could
implement them, and virtee could interact with libvirt to issue them in
place of the PAUSE/RESUME approach here.

But SNP libvirt support is a ways out, and a QEMU event mechanism for
this would be a pain to use directly because you'd need some custom way
to enumerate all guests in order to issue the events. And maybe the
provider doesn't even use QEMU and has to invent something else. Or they
just decide to pause all guests before performing updates, but that
still means a potentially significant amount of downtime.

> without having to bounce through the kernel.  It doesn't even require a push
> model, e.g. wrap/redirect the certs with a file that has a "pause" flag and a
> sequence counter.

We could do something like flag the certificate file itself, and it does
sound less painful than the above. But what defines that spec? The GHCB
spec completely defines the current format of the certs blob, so if we
wrap that in another layer we need to extend the GHCB spec or have
something else be the authority on what that wrapper looks like, and
tools like virtee would need to be very selective about which VMMs they
can claim to support based on which file formats those VMMs support...
it just seems like a significant and unnecessary pain that every
userspace implementation will need to go through to achieve the same
basic functionality.

With PAUSE/RESUME, tools like virtee can be completely VMM-agnostic, and
more highly-integrated daemon-based approaches can still benefit from a
common mechanism that doesn't require significant coordination with VMM
processes. For something as important and basic as updating endorsement
keys while guests are running, it seems worthwhile to expose this minimal
level of control to userspace.
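
To make the intended usage concrete, here is a minimal sketch of the
management-side flow. It assumes the existing SEV_ISSUE_CMD ioctl on
/dev/sev plus the SNP_{PAUSE,RESUME}_ATTESTATION command values proposed
in this series; error handling and the actual cert rewrite are elided:

#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/psp-sev.h>

static int sev_cmd(int fd, uint32_t cmd, void *data)
{
	struct sev_issue_cmd input = {
		.cmd  = cmd,
		.data = (uint64_t)(unsigned long)data,
	};

	return ioctl(fd, SEV_ISSUE_CMD, &input);
}

/* Sketch only: SNP_{PAUSE,RESUME}_ATTESTATION are the command values
 * proposed in this series. wrapped_vlek is the wrapped key blob from KDS. */
static int update_vlek(void *wrapped_vlek)
{
	struct sev_user_data_snp_vlek_load load = {
		.len = sizeof(load),
		.vlek_wrapped_version = 0,
		.vlek_wrapped_address = (uint64_t)(unsigned long)wrapped_vlek,
	};
	int fd = open("/dev/sev", O_RDWR);

	sev_cmd(fd, SNP_PAUSE_ATTESTATION, NULL);  /* hold guest requests */
	sev_cmd(fd, SNP_VLEK_LOAD, &load);         /* swap the signing key */
	/* ... rewrite the cert blob at the VMM's known location ... */
	sev_cmd(fd, SNP_RESUME_ATTESTATION, NULL); /* release guest requests */
	close(fd);
	return 0;
}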

-Mike


* Re: [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-26 20:14           ` Sean Christopherson
@ 2024-04-26 22:24             ` Michael Roth
  2024-04-26 22:48               ` Michael Roth
  0 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-26 22:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Fri, Apr 26, 2024 at 01:14:32PM -0700, Sean Christopherson wrote:
> On Fri, Apr 26, 2024, Michael Roth wrote:
> > On Thu, Apr 25, 2024 at 03:13:40PM -0700, Sean Christopherson wrote:
> > > On Thu, Apr 25, 2024, Michael Roth wrote:
> > > > On Wed, Apr 24, 2024 at 01:59:48PM -0700, Sean Christopherson wrote:
> > > > > On Sun, Apr 21, 2024, Michael Roth wrote:
> > > > > > +static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
> > > > > > +{
> > > > > > +	u64 gpa = gfn_to_gpa(GHCB_MSR_PSC_REQ_TO_GFN(ghcb_msr));
> > > > > > +	u8 op = GHCB_MSR_PSC_REQ_TO_OP(ghcb_msr);
> > > > > > +	struct vcpu_svm *svm = to_svm(vcpu);
> > > > > > +
> > > > > > +	if (op != SNP_PAGE_STATE_PRIVATE && op != SNP_PAGE_STATE_SHARED) {
> > > > > > +		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
> > > > > > +		return 1; /* resume guest */
> > > > > > +	}
> > > > > > +
> > > > > > +	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
> > > > > > +	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC_MSR;
> > > > > > +	vcpu->run->vmgexit.psc_msr.gpa = gpa;
> > > > > > +	vcpu->run->vmgexit.psc_msr.op = op;
> > > > > 
> > > > > Argh, no.
> > > > > 
> > > > > This is the same crud that TDX tried to push[*].  Use KVM's existing user exits,
> > > > > and extend as *needed*.  There is no good reason page state change requests need
> > > > > *two* exit reasons.  The *only* thing KVM supports right now is private<=>shared
> > > > > conversions, and that can be handled with either KVM_HC_MAP_GPA_RANGE or
> > > > > KVM_EXIT_MEMORY_FAULT.
> > > > > 
> > > > > The non-MSR flavor can batch requests, but I'm willing to bet that the overwhelming
> > > > > majority of requests are contiguous, i.e. can be combined into a range by KVM,
> > > > > and that handling any outliers by performing multiple exits to userspace will
> > > > > provide sufficient performance.
> > > > 
> > > > That does tend to be the case. We won't have as much granularity with
> > > > the per-entry error codes, but KVM_SET_MEMORY_ATTRIBUTES would be
> > > > expected to be for the entire range anyway, and if that fails for
> > > > whatever reason then we KVM_BUG_ON() anyway. We do have to have handling
> > > > for cases where the entries aren't contiguous however, which would
> > > > involve multiple KVM_EXIT_HYPERCALLs until everything is satisfied. But
> > > > not a huge deal since it doesn't seem to be a common case.
> > > 
> > > If it was less complex overall, I wouldn't be opposed to KVM marshalling everything
> > > into a buffer, but I suspect it will be simpler to just have KVM loop until the
> > > PSC request is complete.
> > 
> > Agreed. But *if* we decided to introduce a buffer, where would you
> > suggest adding it? The kvm_run union fields are set to 256 bytes, and
> > we'd need close to 4K to handle a full GHCB PSC buffer in 1 go. Would
> > additional storage at the end of struct kvm_run be acceptable?
> 
> Don't even need more memory, just use vcpu->arch.pio_data, which is always
> allocated and is mmap()able by userspace via KVM_PIO_PAGE_OFFSET.

Nice, that seems like a good option if needed.
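
(For reference, mapping that page from userspace is just the standard
vcpu mmap layout; a quick sketch, nothing specific to this series:)

#include <sys/mman.h>
#include <unistd.h>
#include <linux/kvm.h>

/* The pio_data page sits KVM_PIO_PAGE_OFFSET pages into the vcpu mmap
 * region, immediately after the kvm_run page. */
static void *map_pio_page(int vcpu_fd)
{
	long psz = sysconf(_SC_PAGESIZE);

	return mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_SHARED,
		    vcpu_fd, KVM_PIO_PAGE_OFFSET * psz);
}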

> 
> > > > KVM_HC_MAP_GPA_RANGE seems like a nice option because we'd also have the
> > > > flexibility to just issue that directly within a guest rather than
> > > > relying on SNP/TDX specific hcalls. I don't know if that approach is
> > > > practical for a real guest, but it could be useful for having re-usable
> > > > guest code in KVM selftests that "just works" for all variants of
> > > > SNP/TDX/sw-protected. (though we'd still want stuff that exercises
> > > > SNP/TDX->KVM_HC_MAP_GPA_RANGE translation).
> > > > 
> > > > I think there is some potential baggage there with the previous SEV
> > > > live migration use cases. There's some potential that existing guest kernels
> > > > will use it once it gets advertised and issue it alongside GHCB-based
> > > > page-state changes. It might make sense to use one of the reserved bits
> > > > to denote this flavor of KVM_HC_MAP_GPA_RANGE as being for
> > > > hardware/software-protected VMs and not interchangeable with calls that
> > > > were used for SEV live migration stuff.
> > > 
> > > I don't think I follow, what exactly wouldn't be interchangeable, and why?
> > 
> > For instance, if KVM_FEATURE_MIGRATION_CONTROL is advertised, then when
> > amd_enc_status_change_finish() is triggered as a result of
> > set_memory_encrypted(), we'd see
> > 
> >   1) a GHCB PSC for SNP, which will get forwarded to userspace via
> >      KVM_HC_MAP_GPA_RANGE
> >   2) KVM_HC_MAP_GPA_RANGE issued directly by the guest.
> > 
> > In that case, we'd be duplicating PSCs but it wouldn't necessarily hurt
> > anything. But ideally we'd be able to distinguish the 2 cases so we
> > could rightly treat 1) as only being expected for SNP, and 2) as only
> > being expected for SEV/SEV-ES.
> 
> Why would the guest issue both?  That's a guest bug.  Or if suppressing the second
> hypercall is an issue, simply don't enumerate MIGRATION_CONTROL for SNP guests.

At the time of its inception, KVM_HC_MAP_GPA_RANGE was simply
KVM_HC_PAGE_ENC_STATUS and got a more generic name over the course of
development. But its purpose never changed: to inform the hypervisor of
the current encryption status of a GPA range so VMMs could build up a
list of shared guest regions that don't need to go through firmware for
migration. And it was and still is asynchronous to a degree, since the
migration control MSR signals when that list of shared pages is
usable.

These are very different semantics from the proposal to use
KVM_HC_MAP_GPA_RANGE as a means to set memory attributes via
KVM_SET_MEMORY_ATTRIBUTES, and the 2 purposes aren't necessarily
mutually exclusive of one another. It only really becomes a bug if we
begin to interpret the original use-case as something other than its
initial intent in the case of SNP.

But at the same time, it's hard to imagine this older SEV live migration
use-case being useful for SNP, since userspace will necessarily have all
the information it needs to determine what is/isn't shared without relying
on an additional hypercall.

So treating the older use case as specific to non-SNP and disallowing the
use of MIGRATION_CONTROL does seem reasonable. But it's really the CPUID
bit that advertises the hypercall; SEV just happens to only use it when
MIGRATION_CONTROL is also advertised. So we could disable that as well,
but I did like the idea of being able to handle guest-issued
KVM_HC_MAP_GPA_RANGE calls even with SNP/TDX enabled, which is less of an
option if we can't advertise KVM_HC_MAP_GPA_RANGE via CPUID. But I
suppose we could do that with KVM selftests, which is probably where
that's more likely to be useful.
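
For completeness, the VMM side of the KVM_HC_MAP_GPA_RANGE approach would
look roughly like the sketch below. It assumes the behavior discussed
above, where KVM forwards PSC requests as KVM_EXIT_HYPERCALL with
nr == KVM_HC_MAP_GPA_RANGE, new-enough headers for
KVM_SET_MEMORY_ATTRIBUTES, and 4K granularity; names are illustrative:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>
#include <linux/kvm_para.h>

static void handle_map_gpa_range(int vm_fd, struct kvm_run *run)
{
	uint64_t gpa    = run->hypercall.args[0];
	uint64_t npages = run->hypercall.args[1];
	uint64_t attrs  = run->hypercall.args[2];
	struct kvm_memory_attributes kattrs = {
		.address    = gpa,
		.size       = npages * 4096,
		.attributes = (attrs & KVM_MAP_GPA_RANGE_ENCRYPTED) ?
			      KVM_MEMORY_ATTRIBUTE_PRIVATE : 0,
	};

	/* Flip private/shared for the whole (contiguous) range. */
	ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, &kattrs);
	run->hypercall.ret = 0; /* success, as seen by the guest */
}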

-Mike


* Re: [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-26 22:24             ` Michael Roth
@ 2024-04-26 22:48               ` Michael Roth
  0 siblings, 0 replies; 56+ messages in thread
From: Michael Roth @ 2024-04-26 22:48 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Fri, Apr 26, 2024 at 05:24:57PM -0500, Michael Roth wrote:
> On Fri, Apr 26, 2024 at 01:14:32PM -0700, Sean Christopherson wrote:
> > On Fri, Apr 26, 2024, Michael Roth wrote:
> > > On Thu, Apr 25, 2024 at 03:13:40PM -0700, Sean Christopherson wrote:
> > > > On Thu, Apr 25, 2024, Michael Roth wrote:
> > > > > On Wed, Apr 24, 2024 at 01:59:48PM -0700, Sean Christopherson wrote:
> > > > > > On Sun, Apr 21, 2024, Michael Roth wrote:
> > > > > > > +static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
> > > > > > > +{
> > > > > > > +	u64 gpa = gfn_to_gpa(GHCB_MSR_PSC_REQ_TO_GFN(ghcb_msr));
> > > > > > > +	u8 op = GHCB_MSR_PSC_REQ_TO_OP(ghcb_msr);
> > > > > > > +	struct vcpu_svm *svm = to_svm(vcpu);
> > > > > > > +
> > > > > > > +	if (op != SNP_PAGE_STATE_PRIVATE && op != SNP_PAGE_STATE_SHARED) {
> > > > > > > +		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
> > > > > > > +		return 1; /* resume guest */
> > > > > > > +	}
> > > > > > > +
> > > > > > > +	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
> > > > > > > +	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC_MSR;
> > > > > > > +	vcpu->run->vmgexit.psc_msr.gpa = gpa;
> > > > > > > +	vcpu->run->vmgexit.psc_msr.op = op;
> > > > > > 
> > > > > > Argh, no.
> > > > > > 
> > > > > > This is the same crud that TDX tried to push[*].  Use KVM's existing user exits,
> > > > > > and extend as *needed*.  There is no good reason page state change requests need
> > > > > > *two* exit reasons.  The *only* thing KVM supports right now is private<=>shared
> > > > > > conversions, and that can be handled with either KVM_HC_MAP_GPA_RANGE or
> > > > > > KVM_EXIT_MEMORY_FAULT.
> > > > > > 
> > > > > > The non-MSR flavor can batch requests, but I'm willing to bet that the overwhelming
> > > > > > majority of requests are contiguous, i.e. can be combined into a range by KVM,
> > > > > > and that handling any outliers by performing multiple exits to userspace will
> > > > > > provide sufficient performance.
> > > > > 
> > > > > That does tend to be the case. We won't have as much granularity with
> > > > > the per-entry error codes, but KVM_SET_MEMORY_ATTRIBUTES would be
> > > > > expected to be for the entire range anyway, and if that fails for
> > > > > whatever reason then we KVM_BUG_ON() anyway. We do have to have handling
> > > > > for cases where the entries aren't contiguous however, which would
> > > > > involve multiple KVM_EXIT_HYPERCALLs until everything is satisfied. But
> > > > > not a huge deal since it doesn't seem to be a common case.
> > > > 
> > > > If it was less complex overall, I wouldn't be opposed to KVM marshalling everything
> > > > into a buffer, but I suspect it will be simpler to just have KVM loop until the
> > > > PSC request is complete.
> > > 
> > > Agreed. But *if* we decided to introduce a buffer, where would you
> > > suggest adding it? The kvm_run union fields are set to 256 bytes, and
> > > we'd need close to 4K to handle a full GHCB PSC buffer in 1 go. Would
> > > additional storage at the end of struct kvm_run be acceptable?
> > 
> > Don't even need more memory, just use vcpu->arch.pio_data, which is always
> > allocated and is mmap()able by userspace via KVM_PIO_PAGE_OFFSET.
> 
> Nice, that seems like a good option if needed.
> 
> > 
> > > > > KVM_HC_MAP_GPA_RANGE seems like a nice option because we'd also have the
> > > > > flexibility to just issue that directly within a guest rather than
> > > > > relying on SNP/TDX specific hcalls. I don't know if that approach is
> > > > > practical for a real guest, but it could be useful for having re-usable
> > > > > guest code in KVM selftests that "just works" for all variants of
> > > > > SNP/TDX/sw-protected. (though we'd still want stuff that exercises
> > > > > SNP/TDX->KVM_HC_MAP_GPA_RANGE translation).
> > > > > 
> > > > > I think there is some potential baggage there with the previous SEV
> > > > > live migration use cases. There's some potential that existing guest kernels
> > > > > will use it once it gets advertised and issue it alongside GHCB-based
> > > > > page-state changes. It might make sense to use one of the reserved bits
> > > > > to denote this flavor of KVM_HC_MAP_GPA_RANGE as being for
> > > > > hardware/software-protected VMs and not interchangeable with calls that
> > > > > were used for SEV live migration stuff.
> > > > 
> > > > I don't think I follow, what exactly wouldn't be interchangeable, and why?
> > > 
> > > For instance, if KVM_FEATURE_MIGRATION_CONTROL is advertised, then when
> > > amd_enc_status_change_finish() is triggered as a result of
> > > set_memory_encrypted(), we'd see
> > > 
> > >   1) a GHCB PSC for SNP, which will get forwarded to userspace via
> > >      KVM_HC_MAP_GPA_RANGE
> > >   2) KVM_HC_MAP_GPA_RANGE issued directly by the guest.
> > > 
> > > In that case, we'd be duplicating PSCs but it wouldn't necessarily hurt
> > > anything. But ideally we'd be able to distinguish the 2 cases so we
> > > could rightly treat 1) as only being expected for SNP, and 2) as only
> > > being expected for SEV/SEV-ES.
> > 
> > Why would the guest issue both?  That's a guest bug.  Or if suppressing the second
> > hypercall is an issue, simply don't enumerate MIGRATION_CONTROL for SNP guests.
> 
> At the time of its inception, KVM_HC_MAP_GPA_RANGE was simply
> KVM_HC_PAGE_ENC_STATUS and got a more generic name over the course of
> development. But its purpose never changed: to inform the hypervisor of
> the current encryption status of a GPA range so VMMs could build up a
> list of shared guest regions that don't need to go through firmware for
> migration. And it was and still is asynchronous to a degree, since the
> migration control MSR signals when that list of shared pages is
> usable.
> 
> These are very different semantics from the proposal to use
> KVM_HC_MAP_GPA_RANGE as a means to set memory attributes via
> KVM_SET_MEMORY_ATTRIBUTES, and the 2 purposes aren't necessarily
> mutually exclusive of one another. It only really becomes a bug if we
> begin to interpret the original use-case as something other than its
> initial intent in the case of SNP.
> 
> But at the same time, it's hard to imagine this older SEV live migration
> use-case being useful for SNP, since userspace will necessarily have all
> the information it needs to determine what is/isn't shared without relying
> on an additional hypercall.
> 
> So treating the older use case as specific to non-SNP and disallowing the
> use of MIGRATION_CONTROL does seem reasonable. But it's really the CPUID
> bit that advertises the hypercall; SEV just happens to only use it when
> MIGRATION_CONTROL is also advertised. So we could disable that as well,
> but I did like the idea of being able to handle guest-issued
> KVM_HC_MAP_GPA_RANGE calls even with SNP/TDX enabled, which is less of an
> option if we can't advertise KVM_HC_MAP_GPA_RANGE via CPUID. But I
> suppose we could do that with KVM selftests, which is probably where
> that's more likely to be useful.

Hmm, well... that assumes SNP/TDX guests agree to make those vCPU
registers available via the VMSA/etc. in those cases. So I suppose we'd
need some additional handling to support advertising KVM_HC_MAP_GPA_RANGE
via CPUID either way, and it is best not to advertise it to guests until
there's support and a solid use-case for it.

-Mike

> 
> -Mike
> 


* Re: [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
  2024-04-26 21:46         ` Michael Roth
@ 2024-04-27  0:10           ` Sean Christopherson
  2024-04-27  1:32             ` Michael Roth
  0 siblings, 1 reply; 56+ messages in thread
From: Sean Christopherson @ 2024-04-27  0:10 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Larry.Dewey

On Fri, Apr 26, 2024, Michael Roth wrote:
> On Fri, Apr 26, 2024 at 12:57:08PM -0700, Sean Christopherson wrote:
> > On Fri, Apr 26, 2024, Michael Roth wrote:
> > What is "management"?  I assume its some userspace daemon?
> 
> It could be a daemon depending on the cloud provider, but the main example
> we have in mind is something more basic like virtee[1] being used to
> interactively perform an update at the command-line. E.g. you point it
> at the new VLEK and the new cert, and it will handle updating the certs at
> some known location and issuing the SNP_VLEK_LOAD command. With this
  ^^^^^^^^^^^^^^^^^^^
> interface, it can take the additional step of PAUSE'ing attestations
> before performing either update to keep the 2 actions in sync with the
> guest view.

...

> > without having to bounce through the kernel.  It doesn't even require a push
> > model, e.g. wrap/redirect the certs with a file that has a "pause" flag and a
> > sequence counter.
> 
> We could do something like flag the certificate file itself, and it does
> sound less painful than the above. But what defines that spec?

Whoever defines "some known location".  And it doesn't need to be a file wrapper,
e.g. put the cert in a directory along with a lock.  Actually, IIUC, there doesn't
even need to be a separate lock file.  I know very little about userspace programming,
but common sense and a quick search tell me that file locks are a solved problem.

E.g. it took me ~5 minutes of Googling to come up with this, which AFAICT does
exactly what you want.

touch ~/vlek.cert
(
  flock -e 200
  echo "Locked the cert, sleeping for 10 seconds"
  sleep 10
  echo "Igor, it's alive!!!!!!"
) 200< ~/vlek.cert

touch ~/vlek.cert
(
  flock -s 201
  echo "Got me a shared lock, no updates for you!"
) 201< ~/vlek.cert


* Re: [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
  2024-04-27  0:10           ` Sean Christopherson
@ 2024-04-27  1:32             ` Michael Roth
  2024-04-29 14:27               ` Sean Christopherson
  0 siblings, 1 reply; 56+ messages in thread
From: Michael Roth @ 2024-04-27  1:32 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Larry.Dewey

On Fri, Apr 26, 2024 at 05:10:10PM -0700, Sean Christopherson wrote:
> On Fri, Apr 26, 2024, Michael Roth wrote:
> > On Fri, Apr 26, 2024 at 12:57:08PM -0700, Sean Christopherson wrote:
> > > On Fri, Apr 26, 2024, Michael Roth wrote:
> > > What is "management"?  I assume its some userspace daemon?
> > 
> > It could be a daemon depending on the cloud provider, but the main example
> > we have in mind is something more basic like virtee[1] being used to
> > interactively perform an update at the command-line. E.g. you point it
> > at the new VLEK and the new cert, and it will handle updating the certs at
> > some known location and issuing the SNP_VLEK_LOAD command. With this
>   ^^^^^^^^^^^^^^^^^^^
> > interface, it can take the additional step of PAUSE'ing attestations
> > before performing either update to keep the 2 actions in sync with the
> > guest view.
> 
> ...
> 
> > > without having to bounce through the kernel.  It doesn't even require a push
> > > model, e.g. wrap/redirect the certs with a file that has a "pause" flag and a
> > > sequence counter.
> > 
> > We could do something like flag the certificate file itself, and it does
> > sound less painful than the above. But what defines that spec?
> 
> Whoever defines "some known location".  And it doesn't need to be a file wrapper,

"some known location" is a necessary and simple parameter controlled by
the cloud provider, so it's easy to make a tool like virtee aware of those
what those parameters should be for any particular environment. But it's not
easy to make it aware of a particular providers internal way of synchronizing
guests and certs access. We'd be somewhat dependent of those providers either
providing hooks to allow for better integration, which is "work" that might
encourage them to just brew their own solutions, versus...

Providing a simple reference scheme that's clearly defined, and easily
adoptable across the board, which is "less work", and makes adoption of common
tools/libraries SNP/certs/key management easier because we don't need to
directly involve a provider's internal guest management mechanisms into those
tools.

> e.g. put the cert in a directory along with a lock.  Actually, IIUC, there doesn't
> even need to be a separate lock file.  I know very little about userspace programming,
> but common sense and a quick search tell me that file locks are a solved problem.
> 
> E.g. it took me ~5 minutes of Googling to come up with this, which AFAICT does
> exactly what you want.
> 
> touch ~/vlek.cert
> (
>   flock -e 200
>   echo "Locked the cert, sleeping for 10 seconds"
>   sleep 10
>   echo "Igor, it's alive!!!!!!"
> ) 200< ~/vlek.cert
> 
> touch ~/vlek.cert
> (
>   flock -s 201
>   echo "Got me a shared lock, no updates for you!"
> ) 201< ~/vlek.cert
> 

Hmm... I did completely miss this option. But I think there are still some
issues here. IIUC you're suggesting (for example):

  "Management":
  a) writelock vlek.cert
  b) perform SNP_VLEK_LOAD and update vlek.cert contents
  c) unlock vlek.cert

  "QEMU":
  a) readlock vlek.cert
  b) copy cert into guest buffer
  c) unlock vlek.cert

The issue is that after "QEMU" unlocks and returns the cert to KVM we'll
have:

  "KVM"
  a) return from EXT_GUEST_REQ exit to userspace
  b) issue the attestation request to firmware
  c) return the attestation report and cert to the guest

Between a) and b), "Management" can complete another entire update, but
the cert that gets passed back to the guest will be stale relative to the
key used to sign the attestation report.

If we need to take more time to explore other options, it's not
absolutely necessary to have the kernel solve this now. But every userspace
will need to solve it in some way, so it seemed like it might be nice to
have a simple reference implementation to start with.

-Mike


* Re: [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
  2024-04-27  1:32             ` Michael Roth
@ 2024-04-29 14:27               ` Sean Christopherson
  0 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2024-04-29 14:27 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Larry.Dewey

On Fri, Apr 26, 2024, Michael Roth wrote:
> On Fri, Apr 26, 2024 at 05:10:10PM -0700, Sean Christopherson wrote:
> > e.g. put the cert in a directory along with a lock.  Actually, IIUC, there doesn't
> > even need to be a separate lock file.  I know very little about userspace programming,
> > but common sense and a quick search tell me that file locks are a solved problem.
> > 
> > E.g. it took me ~5 minutes of Googling to come up with this, which AFAICT does
> > exactly what you want.
> > 
> > touch ~/vlek.cert
> > (
> >   flock -e 200
> >   echo "Locked the cert, sleeping for 10 seconds"
> >   sleep 10
> >   echo "Igor, it's alive!!!!!!"
> > ) 200< ~/vlek.cert
> > 
> > touch ~/vlek.cert
> > (
> >   flock -s 201
> >   echo "Got me a shared lock, no updates for you!"
> > ) 201< ~/vlek.cert
> > 
> 
> Hmm... I did completely miss this option. But I think there are still some
> issues here. IIUC you're suggesting (for example):
> 
>   "Management":
>   a) writelock vlek.cert
>   b) perform SNP_VLEK_LOAD and update vlek.cert contents
>   c) unlock vlek.cert
> 
>   "QEMU":
>   a) readlock vlek.cert
>   b) copy cert into guest buffer
>   c) unlock vlek.cert
> 
> The issue is that after "QEMU" unlocks and returns the cert to KVM we'll
> have:
> 
>   "KVM"
>   a) return from EXT_GUEST_REQ exit to userspace
>   b) issue the attestation request to firmware
>   c) return the attestation report and cert to the guest
> 
> Between a) and b), "Management" can complete another entire update, but
> the cert that gets passed back to the guest will be stale relative to the
> key used to sign the attestation report.

I was thinking userspace would hold the lock across SEV_CMD_SNP_GUEST_REQUEST.

   QEMU:
    a) readlock vlek.cert
    b) copy cert into guest buffer
    c) set kvm_run->immediate_exit
    d) invoke KVM_RUN
    e) KVM sends SEV_CMD_SNP_GUEST_REQUEST to PSP
    f) KVM exits to userspace with -EINTR
    g) unlock vlek.cert
    h) invoke KVM_RUN (resume the guest)
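
A rough sketch of that sequence from the VMM side (names are
illustrative, not actual QEMU code; it assumes KVM completes the pending
guest request before honoring immediate_exit, per steps e/f above):

#include <sys/file.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

static void complete_ext_guest_req(int vcpu_fd, struct kvm_run *run,
				   int certs_fd, void *guest_buf, size_t len)
{
	flock(certs_fd, LOCK_SH);        /* a) readlock vlek.cert          */
	read(certs_fd, guest_buf, len);  /* b) copy cert into guest buffer */

	run->immediate_exit = 1;         /* c) request an immediate exit   */
	ioctl(vcpu_fd, KVM_RUN, 0);      /* d-f) PSP request completes,
					  *      KVM returns with -EINTR   */
	run->immediate_exit = 0;
	flock(certs_fd, LOCK_UN);        /* g) unlock vlek.cert            */
	ioctl(vcpu_fd, KVM_RUN, 0);      /* h) resume the guest            */
}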

> If we need to take more time to explore other options, it's not
> absolutely necessary to have the kernel solve this now. But every userspace
> will need to solve it in some way, so it seemed like it might be nice to
> have a simple reference implementation to start with.

Shoving something into the kernel is not a "reference implementation", especially
not when it impacts the ABI of multiple subsystems.


end of thread, other threads:[~2024-04-29 14:27 UTC | newest]

Thread overview: 56+ messages
2024-04-21 18:01 [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
2024-04-21 18:01 ` [PATCH v14 01/22] KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y Michael Roth
2024-04-21 18:01 ` [PATCH v14 02/22] KVM: SEV: Add support to handle AP reset MSR protocol Michael Roth
2024-04-21 18:01 ` [PATCH v14 03/22] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests Michael Roth
2024-04-24 20:21   ` Sean Christopherson
2024-04-25 20:52     ` Michael Roth
2024-04-25 21:55       ` Sean Christopherson
2024-04-21 18:01 ` [PATCH v14 04/22] KVM: SEV: Add initial SEV-SNP support Michael Roth
2024-04-21 18:01 ` [PATCH v14 05/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
2024-04-24 23:26   ` Sean Christopherson
2024-04-21 18:01 ` [PATCH v14 06/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
2024-04-24 23:58   ` Sean Christopherson
2024-04-21 18:01 ` [PATCH v14 07/22] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command Michael Roth
2024-04-21 18:01 ` [PATCH v14 08/22] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
2024-04-21 18:01 ` [PATCH v14 09/22] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
2024-04-24 20:59   ` Sean Christopherson
2024-04-25 22:00     ` Michael Roth
2024-04-25 22:13       ` Sean Christopherson
2024-04-26 17:16         ` Michael Roth
2024-04-26 20:14           ` Sean Christopherson
2024-04-26 22:24             ` Michael Roth
2024-04-26 22:48               ` Michael Roth
2024-04-21 18:01 ` [PATCH v14 10/22] KVM: SEV: Add support to handle " Michael Roth
2024-04-21 18:01 ` [PATCH v14 11/22] KVM: SEV: Add support to handle RMP nested page faults Michael Roth
2024-04-21 18:01 ` [PATCH v14 12/22] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
2024-04-21 18:01 ` [PATCH v14 13/22] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
2024-04-21 18:01 ` [PATCH v14 14/22] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
2024-04-21 18:01 ` [PATCH v14 15/22] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
2024-04-21 18:01 ` [PATCH v14 16/22] KVM: x86: Implement gmem hook for determining max NPT mapping level Michael Roth
2024-04-25  0:45   ` Sean Christopherson
2024-04-21 18:01 ` [PATCH v14 17/22] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
2024-04-25  0:17   ` Sean Christopherson
2024-04-21 18:01 ` [PATCH v14 18/22] KVM: SVM: Add module parameter to enable SEV-SNP Michael Roth
2024-04-21 18:01 ` [PATCH v14 19/22] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
2024-04-21 18:01 ` [PATCH v14 20/22] crypto: ccp: Add the SNP_VLEK_LOAD command Michael Roth
2024-04-21 18:01 ` [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands Michael Roth
2024-04-25  0:15   ` Sean Christopherson
2024-04-26 17:35     ` Michael Roth
2024-04-26 19:57       ` Sean Christopherson
2024-04-26 21:46         ` Michael Roth
2024-04-27  0:10           ` Sean Christopherson
2024-04-27  1:32             ` Michael Roth
2024-04-29 14:27               ` Sean Christopherson
2024-04-21 18:01 ` [PATCH v14 22/22] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth
2024-04-25  0:10   ` Sean Christopherson
2024-04-26 17:57     ` Michael Roth
2024-04-23 16:21 ` [PATCH v14 23/22] [SQUASH] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
2024-04-23 16:21   ` [PATCH v14 24/22] [SQUASH] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
2024-04-23 16:21   ` [PATCH v14 25/22] [SQUASH] KVM: SEV: Add support to handle " Michael Roth
2024-04-23 16:21   ` [PATCH v14 26/22] [SQUASH] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
2024-04-23 16:21   ` [PATCH v14 27/22] [SQUASH] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
2024-04-23 16:21   ` [PATCH v14 28/22] [SQUASH] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST " Michael Roth
2024-04-23 21:36     ` Jarkko Sakkinen
2024-04-23 16:21   ` [PATCH v14 29/22] [SQUASH] KVM: SEV: Support SEV-SNP AP Creation " Michael Roth
2024-04-23 16:31 ` [PATCH v14 00/22] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
2024-04-24 16:51   ` Paolo Bonzini
