Linux-HyperV Archive mirror
From: Michael Kelley <mikelley@microsoft.com>
To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	luto@kernel.org, peterz@infradead.org, thomas.lendacky@amd.com,
	sathyanarayanan.kuppuswamy@linux.intel.com,
	kirill.shutemov@linux.intel.com, seanjc@google.com,
	rick.p.edgecombe@intel.com, linux-kernel@vger.kernel.org,
	linux-hyperv@vger.kernel.org, x86@kernel.org
Cc: mikelley@microsoft.com
Subject: [PATCH 0/5] x86/coco: Mark CoCo VM pages not present when changing encrypted state
Date: Fri, 29 Sep 2023 11:19:04 -0700
Message-ID: <1696011549-28036-1-git-send-email-mikelley@microsoft.com>

In a CoCo VM when a page transitions from encrypted to decrypted, or vice
versa, attributes in the PTE must be updated *and* the hypervisor must
be notified of the change. Because there are two separate steps, there's
a window where the settings are inconsistent.  Normally the code that
initiates the transition (via set_memory_decrypted() or
set_memory_encrypted()) ensures that the memory is not being accessed
during a transition, so the window of inconsistency is not a problem.
However, load_unaligned_zeropad() can read arbitrary memory pages at
arbitrary times, which could read a transitioning page during the
window.  In such a case, CoCo VM specific exceptions are taken
(depending on the CoCo architecture in use).  Current code in those
exception handlers recovers and does "fixup" on the result returned by
load_unaligned_zeropad().  Unfortunately, this exception handling can't
work in paravisor scenarios (TDX Partitioning and SEV-SNP in vTOM mode).
The exceptions would need to be forwarded from the paravisor to the
Linux guest, but there's no architectural spec for how to do that.
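
For readers unfamiliar with the fixup behavior referred to above, here is
a minimal user-space sketch of what load_unaligned_zeropad() does. It is
a toy model, not the kernel code: ToyMemory, PageFault and the byte-wise
read_word() are hypothetical names invented for illustration, and the
sketch assumes the first byte at the requested address is always mapped.

```python
PAGE_SIZE = 4096
WORD = 8  # bytes in one machine word (assumed 64-bit)


class PageFault(Exception):
    """Stands in for a page fault on a not-present page."""


class ToyMemory:
    """Hypothetical sparse memory: page number -> page contents."""

    def __init__(self):
        self.pages = {}

    def map_page(self, pfn, fill):
        # 'fill' is a single byte repeated across the whole page.
        self.pages[pfn] = bytearray(fill * PAGE_SIZE)

    def read_word(self, addr):
        out = bytearray()
        for a in range(addr, addr + WORD):
            pfn, off = divmod(a, PAGE_SIZE)
            if pfn not in self.pages:
                raise PageFault(a)
            out.append(self.pages[pfn][off])
        return bytes(out)


def load_unaligned_zeropad(mem, addr):
    """Read WORD bytes at addr; zero-pad if the read runs into a
    not-present page."""
    try:
        return mem.read_word(addr)
    except PageFault:
        # Fixup: reload the aligned word that contains addr (it cannot
        # cross a page boundary), keep the bytes from addr onward, and
        # zero-pad the rest.
        offset = addr % WORD
        word = mem.read_word(addr - offset)
        return word[offset:] + bytes(offset)
```

The point of the sketch is that the stray read into the next page is
harmless as long as the failure shows up as an ordinary page fault.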

Fortunately, there's a simpler way to solve the problem by changing
the core transition code in __set_memory_enc_pgtable() to do the
following:

1.  Remove aliasing mappings
2.  Flush the data cache if needed
3.  Remove the PRESENT bit from the PTEs of all transitioning pages
4.  Set/clear the encryption attribute as appropriate
5.  Flush the TLB so the changed encryption attribute isn't visible
6.  Notify the hypervisor of the encryption status change
7.  Add back the PRESENT bit, making the changed attribute visible
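
The ordering in Steps 3 thru 7 can be checked with a small toy model. The
sketch below is an illustration only, not the code in
__set_memory_enc_pgtable(): Pte, ToyPage, access() and the result strings
are hypothetical, the "TLB" is a single cached copy of the PTE, and Steps
1 and 2 are not modeled. A stray access is probed after every step; it
either succeeds, takes a normal page fault, or (if the PTE attribute and
the hypervisor's view disagree) takes a CoCo-specific exception.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Pte:
    present: bool = True
    encrypted: bool = True


@dataclass
class ToyPage:
    pte: Pte = field(default_factory=Pte)
    tlb: Optional[Pte] = None      # cached copy of the PTE, if any
    host_encrypted: bool = True    # hypervisor's view of the page

    def access(self) -> str:
        """Model a stray read such as load_unaligned_zeropad()."""
        if self.tlb is None:
            if not self.pte.present:
                return "page_fault"        # ordinary fixup path
            self.tlb = Pte(self.pte.present, self.pte.encrypted)
        if self.tlb.encrypted != self.host_encrypted:
            return "coco_exception"        # stands in for #VE / #VC
        return "ok"

    def flush_tlb(self) -> None:
        self.tlb = None


def naive_transition(page: ToyPage, encrypted: bool) -> list:
    """Two-step transition with a window of inconsistency."""
    results = []
    page.pte.encrypted = encrypted
    page.flush_tlb()
    results.append(page.access())          # probe inside the window
    page.host_encrypted = encrypted
    results.append(page.access())
    return results


def sequenced_transition(page: ToyPage, encrypted: bool) -> list:
    """Steps 3 thru 7 from the list above, probing after each step."""
    results = []
    page.pte.present = False               # Step 3
    results.append(page.access())
    page.pte.encrypted = encrypted         # Step 4
    results.append(page.access())
    page.flush_tlb()                       # Step 5
    results.append(page.access())
    page.host_encrypted = encrypted        # Step 6
    results.append(page.access())
    page.pte.present = True                # Step 7
    results.append(page.access())
    return results
```

In this model the sequenced version never reports "coco_exception",
while the naive version does so for a probe that lands between the PTE
update and the hypervisor notification.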

With this approach, load_unaligned_zeropad() just takes its normal
page-fault-based fixup path if it touches a page that is transitioning.
As a result, load_unaligned_zeropad() and CoCo VM page transitioning
are completely decoupled.  CoCo VM page transitions can proceed
without needing to handle architecture-specific exceptions and fix
things up. This decoupling reduces the complexity due to separate
TDX and SEV-SNP fixup paths, and gives more freedom to revise and
introduce new capabilities in future versions of the TDX and SEV-SNP
architectures. Paravisor scenarios work properly without needing
to forward exceptions.

This patch set is a follow-up to an RFC patch and discussion.[1]
Compared with the RFC patch, the steps listed above are optimized for
better performance and particularly for fewer TLB flushes.

Patch 1 handles implications of the hypervisor callbacks in Step 6
needing to do virt-to-phys translations on pages that are temporarily
marked not present.

Patch 2 is a performance optimization so that Step 7 doesn't generate
a TLB flush.
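
As a rough illustration of why that matters, the sketch below counts
flushes for a sequence of PTE changes, with and without skipping the
flush when the old PTE was not present. It is a simplified model under
stated assumptions, not the logic in set_memory.c: it pretends every PTE
change would otherwise trigger its own flush, ENC is a hypothetical bit
position for the encryption attribute, and needs_tlb_flush() and
count_flushes() are made-up helpers. Only PRESENT as bit 0 reflects the
real x86 PTE layout.

```python
PRESENT = 1 << 0    # bit 0 of an x86 PTE
ENC = 1 << 51       # hypothetical position of the encryption attribute


def needs_tlb_flush(old_pte: int) -> bool:
    """A PTE that was not present has nothing to invalidate."""
    return bool(old_pte & PRESENT)


def count_flushes(changes, skip_not_present: bool) -> int:
    """Count flushes for a list of (old_pte, new_pte) changes."""
    flushes = 0
    for old_pte, new_pte in changes:
        if old_pte == new_pte:
            continue                       # nothing changed
        if skip_not_present and not needs_tlb_flush(old_pte):
            continue                       # old PTE was not present
        flushes += 1
    return flushes


# Encrypted -> decrypted, expressed as three separate PTE changes.
CHANGES = [
    (PRESENT | ENC, ENC),   # clear PRESENT
    (ENC, 0),               # clear the encryption attribute
    (0, PRESENT),           # add PRESENT back (Step 7)
]
```

Under these assumptions only the first change, where the old PTE was
present, still needs a flush when the skip is enabled.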

Patch 3 is the core change that implements Steps 1 through 7. It also
simplifies the associated TDX, SEV-SNP, and Hyper-V vTOM callbacks.

Patch 4 is a somewhat tangential cleanup that removes an unnecessary
wrapper function in the path for doing a transition.

Patch 5 adds comments describing the implications of errors when
doing a transition.  These implications are discussed in the email
thread for the RFC patch.

With this change, the #VE and #VC exception handlers should no longer
be triggered for load_unaligned_zeropad() accesses, and the existing
code in those handlers to do the "fixup" shouldn't be needed. But I
have not removed that code in this patch set. Kirill Shutemov wants
to keep the code for TDX #VE, so the code for #VC on the SEV-SNP
side has also been kept.

This patch set is based on the linux-next20230921 code tree.

[1] https://lore.kernel.org/lkml/1688661719-60329-1-git-send-email-mikelley@microsoft.com/

Michael Kelley (5):
  x86/coco: Use slow_virt_to_phys() in page transition hypervisor
    callbacks
  x86/mm: Don't do a TLB flush if changing a PTE that isn't marked
    present
  x86/mm: Mark CoCo VM pages not present while changing encrypted state
  x86/mm: Remove unnecessary call layer for __set_memory_enc_pgtable()
  x86/mm: Add comments about errors in
    set_memory_decrypted()/encrypted()

 arch/x86/coco/tdx/tdx.c       |  66 +-----------------------
 arch/x86/hyperv/ivm.c         |  15 +++---
 arch/x86/kernel/sev.c         |   8 ++-
 arch/x86/kernel/x86_init.c    |   4 --
 arch/x86/mm/mem_encrypt_amd.c |  27 +++-------
 arch/x86/mm/pat/set_memory.c  | 114 +++++++++++++++++++++++++++++-------------
 6 files changed, 102 insertions(+), 132 deletions(-)

-- 
1.8.3.1


Thread overview: 13+ messages
2023-09-29 18:19 Michael Kelley [this message]
2023-09-29 18:19 ` [PATCH 1/5] x86/coco: Use slow_virt_to_phys() in page transition hypervisor callbacks Michael Kelley
2023-10-02 15:52   ` Tom Lendacky
2023-09-29 18:19 ` [PATCH 2/5] x86/mm: Don't do a TLB flush if changing a PTE that isn't marked present Michael Kelley
2023-09-29 18:19 ` [PATCH 3/5] x86/mm: Mark CoCo VM pages not present while changing encrypted state Michael Kelley
2023-09-29 23:13   ` kernel test robot
2023-10-02 16:35   ` Tom Lendacky
2023-10-02 18:59     ` Tom Lendacky
2023-10-02 20:43       ` Michael Kelley (LINUX)
2023-10-17  0:35         ` Michael Kelley (LINUX)
2023-09-29 18:19 ` [PATCH 4/5] x86/mm: Remove unnecessary call layer for __set_memory_enc_pgtable() Michael Kelley
2023-09-29 18:19 ` [PATCH 5/5] x86/mm: Add comments about errors in set_memory_decrypted()/encrypted() Michael Kelley
2023-10-02 15:42 ` [PATCH 0/5] x86/coco: Mark CoCo VM pages not present when changing encrypted state Tom Lendacky
