[PATCH v3 0/3] TDX Guest Quote generation support

* [PATCH v3 0/3] TDX Guest Quote generation support
@ 2023-05-14  7:23 Kuppuswamy Sathyanarayanan
  2023-05-14  7:23 ` [PATCH v3 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
                   ` (4 more replies)
  0 siblings, 5 replies; 41+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-05-14  7:23 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc

Hi All,

In a TDX guest, the attestation process is used to verify the
trustworthiness of the guest to other entities before secrets are
provisioned to it.

The TDX guest attestation process consists of two steps:

1. TDREPORT generation
2. Quote generation.

The first step (TDREPORT generation) involves getting the TDX guest
measurement data in the format of a TDREPORT, which is then used to
validate the authenticity of the TDX guest. The second step involves
sending the TDREPORT to a Quoting Enclave (QE) server to generate a
remotely verifiable Quote. By design, a TDREPORT can only be verified
on the local platform. To support remote verification of the TDREPORT,
TDX leverages the Intel SGX Quoting Enclave to verify the TDREPORT
locally and convert it to a remotely verifiable Quote. Although
attestation software can use communication methods like TCP/IP or
vsock to send the TDREPORT to the QE, not all platforms support these
communication models. So the TDX GHCI specification [1] defines a
method for Quote generation via hypercalls. See the discussions from
Google [2] and Alibaba [3], which clarify the need for hypercall-based
Quote generation support. This patch set adds this support.

Support for TDREPORT generation already exists in the TDX guest driver.
This patch set extends the same driver to add Quote generation
support.

Following are the details of the patch set:

Patch 1/3 -> Adds event notification IRQ support.
Patch 2/3 -> Adds Quote generation support.
Patch 3/3 -> Adds selftest support for Quote generation feature.

[1] https://cdrdv2.intel.com/v1/dl/getContent/726790, section titled "TDG.VP.VMCALL<GetQuote>".
[2] https://lore.kernel.org/lkml/CAAYXXYxxs2zy_978GJDwKfX5Hud503gPc8=1kQ-+JwG_kA79mg@mail.gmail.com/
[3] https://lore.kernel.org/lkml/a69faebb-11e8-b386-d591-dbd08330b008@linux.alibaba.com/

Kuppuswamy Sathyanarayanan (3):
  x86/tdx: Add TDX Guest event notify interrupt support
  virt: tdx-guest: Add Quote generation support
  selftests/tdx: Test GetQuote TDX attestation feature

 Documentation/virt/coco/tdx-guest.rst        |  11 ++
 arch/x86/coco/tdx/tdx.c                      | 194 +++++++++++++++++++
 arch/x86/include/asm/tdx.h                   |   8 +
 drivers/virt/coco/tdx-guest/tdx-guest.c      | 175 ++++++++++++++++-
 include/uapi/linux/tdx-guest.h               |  44 +++++
 tools/testing/selftests/tdx/tdx_guest_test.c |  65 ++++++-
 6 files changed, 490 insertions(+), 7 deletions(-)

-- 
2.34.1

* [PATCH v3 1/3] x86/tdx: Add TDX Guest event notify interrupt support
  2023-05-14  7:23 [PATCH v3 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
@ 2023-05-14  7:23 ` Kuppuswamy Sathyanarayanan
  2023-06-12 12:49   ` Huang, Kai
  2023-08-23 20:47   ` Thomas Gleixner
  2023-05-14  7:23 ` [PATCH v3 2/3] virt: tdx-guest: Add Quote generation support Kuppuswamy Sathyanarayanan
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 41+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-05-14  7:23 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc

Host-guest event notification via a configured interrupt vector is
useful in cases where a guest makes an asynchronous request and needs a
callback from the host to indicate completion, or where the host needs
to notify the guest about events like device removal. One example is
the completion callback required by the asynchronous GetQuote
hypercall.

In a TDX guest, the SetupEventNotifyInterrupt hypercall can be used by
the guest to specify which interrupt vector the VMM should use as the
event-notify vector. Details about the SetupEventNotifyInterrupt
hypercall can be found in the TDX Guest-Host Communication Interface
(GHCI) Specification, section "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".

By design, the VMM posts the event completion IRQ to the same CPU on
which the SetupEventNotifyInterrupt hypercall request was received.
So allocate an IRQ vector from "x86_vector_domain", and set the CPU
affinity of the IRQ vector to the CPU on which the
SetupEventNotifyInterrupt hypercall is made.

Add tdx_register_event_irq_cb()/tdx_unregister_event_irq_cb()
interfaces to allow drivers to register/unregister event notification
handlers.
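
The register/unregister/dispatch pattern described above can be
sketched in plain userspace C. This is only an illustrative model, not
the kernel code: it uses a hand-rolled singly linked list and no
locking, and the names register_cb(), unregister_cb(), dispatch_event()
and count_event() are hypothetical. Unlike the kernel interface, the
sketch reports -1 when no matching entry is found.

```c
#include <assert.h>
#include <stdlib.h>

/* Same callback signature as tdx_event_irq_cb_t in this patch. */
typedef int (*event_cb_t)(void *);

/* Hypothetical userspace stand-in for struct event_irq_entry. */
struct cb_entry {
	event_cb_t handler;
	void *data;
	struct cb_entry *next;
};

static struct cb_entry *cb_list;

/* Append a (handler, data) pair; returns 0, or -1 if allocation fails. */
int register_cb(event_cb_t handler, void *data)
{
	struct cb_entry *entry = calloc(1, sizeof(*entry));
	struct cb_entry **pos = &cb_list;

	if (!entry)
		return -1;

	entry->handler = handler;
	entry->data = data;

	/* Walk to the tail so handlers run in registration order. */
	while (*pos)
		pos = &(*pos)->next;
	*pos = entry;

	return 0;
}

/* Remove the first entry matching both handler and data. */
int unregister_cb(event_cb_t handler, void *data)
{
	struct cb_entry **pos;

	for (pos = &cb_list; *pos; pos = &(*pos)->next) {
		struct cb_entry *entry = *pos;

		if (entry->handler == handler && entry->data == data) {
			*pos = entry->next;
			free(entry);
			return 0;
		}
	}

	return -1;
}

/* Model of the IRQ handler body: call every registered handler once. */
int dispatch_event(void)
{
	struct cb_entry *entry;
	int called = 0;

	for (entry = cb_list; entry; entry = entry->next) {
		if (entry->handler) {
			entry->handler(entry->data);
			called++;
		}
	}

	return called;
}

/* Example handler: counts the notifications it has seen. */
int count_event(void *data)
{
	assert(data != NULL);
	(*(int *)data)++;
	return 0;
}
```

Matching on both the handler and its data pointer lets one handler
function be registered several times with different contexts.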

Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Erdem Aktas <erdemaktas@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v1:
 * Used early_initcall() instead of arch_initcall() to trigger
   tdx_event_irq_init().
 * Removed unused headers and included headers for spinlock and list
   explicitly.
 * Since during the early_initcall() only one CPU would be enabled, remove
   CPU locking logic (like using set_cpus_allowed_ptr() or get_cpu())

Changes since v2:
 * Fixed commit log and comments as per review suggestions.
 * Used __irq_domain_alloc_irqs() instead of irq_domain_alloc_irqs() to
   pass affinity mask.
 * Used smp_processor_id() instead of hardcoding CPU0 as per review
   suggestion.

 arch/x86/coco/tdx/tdx.c    | 159 +++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/tdx.h |   6 ++
 2 files changed, 165 insertions(+)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index e146b599260f..49d7e11066c1 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -7,12 +7,17 @@
 #include <linux/cpufeature.h>
 #include <linux/export.h>
 #include <linux/io.h>
+#include <linux/spinlock.h>
+#include <linux/list.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
 #include <asm/coco.h>
 #include <asm/tdx.h>
 #include <asm/vmx.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
 #include <asm/pgtable.h>
+#include <asm/irqdomain.h>
 
 /* TDX module Call Leaf IDs */
 #define TDX_GET_INFO			1
@@ -27,6 +32,7 @@
 /* TDX hypercall Leaf IDs */
 #define TDVMCALL_MAP_GPA		0x10001
 #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
+#define TDVMCALL_SETUP_NOTIFY_INTR	0x10004
 
 /* MMIO direction */
 #define EPT_READ	0
@@ -51,6 +57,16 @@
 
 #define TDREPORT_SUBTYPE_0	0
 
+struct event_irq_entry {
+	tdx_event_irq_cb_t handler;
+	void *data;
+	struct list_head head;
+};
+
+static int tdx_event_irq __ro_after_init;
+static LIST_HEAD(event_irq_cb_list);
+static DEFINE_SPINLOCK(event_irq_cb_lock);
+
 /*
  * Wrapper for standard use of __tdx_hypercall with no output aside from
  * return code.
@@ -873,3 +889,146 @@ void __init tdx_early_init(void)
 
 	pr_info("Guest detected\n");
 }
+
+static irqreturn_t tdx_event_irq_handler(int irq, void *dev_id)
+{
+	struct event_irq_entry *entry;
+
+	spin_lock(&event_irq_cb_lock);
+	list_for_each_entry(entry, &event_irq_cb_list, head) {
+		if (entry->handler)
+			entry->handler(entry->data);
+	}
+	spin_unlock(&event_irq_cb_lock);
+
+	return IRQ_HANDLED;
+}
+
+/**
+ * tdx_event_irq_init() - Register IRQ for event notification from the VMM to
+ *			  the TDX Guest.
+ *
+ * Use SetupEventNotifyInterrupt TDVMCALL to register the event notification
+ * IRQ with the VMM, which is used by the VMM to notify the TDX guest when
+ * needed, for instance, when VMM finishes the GetQuote request from the TDX
+ * guest. The VMM always notifies the TDX guest via the same CPU that calls
+ * the SetupEventNotifyInterrupt TDVMCALL. Allocate an IRQ/vector from the
+ * x86_vector_domain and pin it on the same CPU on which TDVMCALL is called.
+ * For simplicity, use early_initcall() to allow both IRQ allocation and
+ * TDVMCALL to use BSP.
+ */
+static int __init tdx_event_irq_init(void)
+{
+	struct irq_affinity_desc desc = {};
+	struct irq_alloc_info info;
+	struct irq_cfg *cfg;
+	int irq;
+
+	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
+		return 0;
+
+	init_irq_alloc_info(&info, NULL);
+
+	cpumask_set_cpu(smp_processor_id(), &desc.mask);
+
+	irq = __irq_domain_alloc_irqs(x86_vector_domain, -1, 1,
+				      cpu_to_node(smp_processor_id()), &info,
+				      false, &desc);
+	if (irq <= 0) {
+		pr_err("Event notification IRQ allocation failed %d\n", irq);
+		return -EIO;
+	}
+
+	irq_set_handler(irq, handle_edge_irq);
+
+	/*
+	 * The IRQ cannot be migrated because VMM always notifies the TDX
+	 * guest on the same CPU on which the SetupEventNotifyInterrupt
+	 * TDVMCALL is called. Set the IRQ with IRQF_NOBALANCING to prevent
+	 * its affinity from being changed.
+	 */
+	if (request_irq(irq, tdx_event_irq_handler, IRQF_NOBALANCING,
+			"tdx_event_irq", NULL)) {
+		pr_err("Event notification IRQ request failed\n");
+		goto err_free_domain_irqs;
+	}
+
+	cfg = irq_cfg(irq);
+
+	if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, cfg->vector, 0, 0, 0)) {
+		pr_err("Event notification hypercall failed\n");
+		goto err_free_irqs;
+	}
+
+	tdx_event_irq = irq;
+
+	return 0;
+
+err_free_irqs:
+	free_irq(irq, NULL);
+err_free_domain_irqs:
+	irq_domain_free_irqs(irq, 1);
+
+	return -EIO;
+}
+early_initcall(tdx_event_irq_init);
+
+/**
+ * tdx_register_event_irq_cb() - Register TDX event IRQ callback handler.
+ * @handler: Address of driver specific event IRQ callback handler. Handler
+ *           will be called in IRQ context and hence cannot sleep.
+ * @data: Context data to be passed to the callback handler.
+ *
+ * Return: 0 on success or standard error code on other failures.
+ */
+int tdx_register_event_irq_cb(tdx_event_irq_cb_t handler, void *data)
+{
+	struct event_irq_entry *entry;
+	unsigned long flags;
+
+	if (tdx_event_irq <= 0)
+		return -EIO;
+
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return -ENOMEM;
+
+	entry->data = data;
+	entry->handler = handler;
+
+	spin_lock_irqsave(&event_irq_cb_lock, flags);
+	list_add_tail(&entry->head, &event_irq_cb_list);
+	spin_unlock_irqrestore(&event_irq_cb_lock, flags);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(tdx_register_event_irq_cb);
+
+/**
+ * tdx_unregister_event_irq_cb() - Unregister TDX event IRQ callback handler.
+ * @handler: Address of driver specific event IRQ callback handler.
+ * @data: Context data to be passed to the callback handler.
+ *
+ * Return: 0 on success or -EIO if event IRQ is not allocated.
+ */
+int tdx_unregister_event_irq_cb(tdx_event_irq_cb_t handler, void *data)
+{
+	struct event_irq_entry *entry;
+	unsigned long flags;
+
+	if (tdx_event_irq <= 0)
+		return -EIO;
+
+	spin_lock_irqsave(&event_irq_cb_lock, flags);
+	list_for_each_entry(entry, &event_irq_cb_list, head) {
+		if (entry->handler == handler && entry->data == data) {
+			list_del(&entry->head);
+			kfree(entry);
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&event_irq_cb_lock, flags);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(tdx_unregister_event_irq_cb);
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 28d889c9aa16..8807fe1b1f3f 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -53,6 +53,8 @@ struct ve_info {
 
 #ifdef CONFIG_INTEL_TDX_GUEST
 
+typedef int (*tdx_event_irq_cb_t)(void *);
+
 void __init tdx_early_init(void);
 
 /* Used to communicate with the TDX module */
@@ -69,6 +71,10 @@ bool tdx_early_handle_ve(struct pt_regs *regs);
 
 int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport);
 
+int tdx_register_event_irq_cb(tdx_event_irq_cb_t handler, void *data);
+
+int tdx_unregister_event_irq_cb(tdx_event_irq_cb_t handler, void *data);
+
 #else
 
 static inline void tdx_early_init(void) { };
-- 
2.34.1


* [PATCH v3 2/3] virt: tdx-guest: Add Quote generation support
  2023-05-14  7:23 [PATCH v3 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
  2023-05-14  7:23 ` [PATCH v3 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
@ 2023-05-14  7:23 ` Kuppuswamy Sathyanarayanan
  2023-06-12 12:50   ` Huang, Kai
  2023-05-14  7:23 ` [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature Kuppuswamy Sathyanarayanan
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 41+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-05-14  7:23 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc

In a TDX guest, the attestation process is used to verify the
trustworthiness of the guest to other entities before secrets are
provisioned to it. The first step in the attestation process is
TDREPORT generation, which involves getting the guest measurement data
in the format of a TDREPORT, which is then used to validate the
authenticity of the TDX guest. By design, a TDREPORT is
integrity-protected and can only be verified on the local machine.

To support remote verification of the TDREPORT (in an SGX-based
attestation), the TDREPORT needs to be sent to the SGX Quoting Enclave
(QE) to convert it to a remotely verifiable Quote. By design, the SGX
QE can only run outside of the TDX guest (i.e. in a host process or in
a normal VM), and the guest can use communication channels like vsock
or TCP/IP to send the TDREPORT to the QE. For security reasons,
however, the TDX guest may not support these communication channels.
To handle such cases, TDX defines a GetQuote hypercall which the guest
can use to request that the host VMM communicate with the SGX QE. More
details about the GetQuote hypercall can be found in TDX Guest-Host
Communication Interface (GHCI) for Intel TDX 1.0, section titled
"TDG.VP.VMCALL<GetQuote>".

Add support for the TDX_CMD_GET_QUOTE IOCTL to allow an attestation
agent to submit GetQuote requests from user space via the GetQuote
hypercall.

Since GetQuote is an asynchronous hypercall, the VMM uses the callback
interrupt vector configured by the SetupEventNotifyInterrupt hypercall
to notify the guest about Quote generation completion or failure. So
register an IRQ handler for it.
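
As a rough illustration of the completion condition, the following
userspace sketch mirrors the check made in the callback handler of this
patch: a request counts as finished once it is marked valid and the VMM
has replaced the in-flight status with a final one. The status values
are the ones this patch adds to the UAPI header; the helper names
quote_request_done() and quote_result() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Status codes as added to include/uapi/linux/tdx-guest.h by this patch. */
#define GET_QUOTE_SUCCESS		0ULL
#define GET_QUOTE_IN_FLIGHT		0xffffffffffffffffULL
#define GET_QUOTE_ERROR			0x8000000000000000ULL
#define GET_QUOTE_SERVICE_UNAVAILABLE	0x8000000000000001ULL

/*
 * True once a valid request has left the in-flight state, whether the
 * final status is success or an error.
 */
bool quote_request_done(bool request_valid, uint64_t status)
{
	return request_valid && status != GET_QUOTE_IN_FLIGHT;
}

/* Hypothetical helper: map a final status to 0 (success) or -1 (failure). */
int quote_result(uint64_t status)
{
	/* Callers are expected to wait until the request is done. */
	assert(status != GET_QUOTE_IN_FLIGHT);

	return status == GET_QUOTE_SUCCESS ? 0 : -1;
}
```

Note that an error status also ends the wait; only the in-flight value
keeps the request pending.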

The GetQuote TDVMCALL requires the TD guest to pass a 4K-aligned
shared buffer with TDREPORT data as input; the VMM then uses the same
buffer to return the TD Quote after successful Quote generation. To
create the shared buffer, allocate a sufficiently large buffer and mark
it shared using set_memory_decrypted() in tdx_guest_init(). This buffer
is reused for GetQuote requests in the TDX_CMD_GET_QUOTE IOCTL handler.
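
The size arithmetic behind the shared buffer can be sketched in
userspace as follows. The sketch assumes 4 KiB pages and redefines
PAGE_SHIFT, PAGE_SIZE and PAGE_ALIGN locally (in the kernel they come
from its own headers); shared_page_count() and request_len_ok() are
hypothetical helper names that mirror the page-count computation and
the length check in the IOCTL handler.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on x86. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	((size_t)1 << PAGE_SHIFT)
#define PAGE_ALIGN(len)	(((len) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/* Same value as in the driver: four pages, i.e. 16K with 4 KiB pages. */
#define GET_QUOTE_MAX_SIZE	(4 * PAGE_SIZE)

/* Number of pages to convert to shared for a buffer of len bytes. */
size_t shared_page_count(size_t len)
{
	assert(len > 0);

	return PAGE_ALIGN(len) >> PAGE_SHIFT;
}

/* A request fits if it is non-empty and no larger than the buffer. */
int request_len_ok(size_t req_len, size_t buf_len)
{
	return req_len != 0 && req_len <= buf_len;
}
```

With these definitions a 16K buffer covers four pages, and a request one
byte larger than the buffer is rejected.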

Although this method reserves a fixed chunk of memory for GetQuote
requests, such one-time allocation is preferable to the alternative
choice of repeatedly allocating/freeing the shared buffer in the
TDX_CMD_GET_QUOTE IOCTL handler, which will damage the direct map
(because the sharing/unsharing process modifies the direct map). This
allocation model is similar to that used by the AMD SEV guest driver.

Since the Quote generation process is not time-critical or frequently
used, the current version does not support parallel GetQuote requests.

Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Erdem Aktas <erdemaktas@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v2:
 * Fixed commit log and comments as per review suggestion.
 * Renamed struct tdx_quote_hdr -> struct tdx_quote_buf.
 * Used alloc_pages_exact() instead of alloc_pages() in
   alloc_shared_pages().
 * Clear quote buffer before each re-use in tdx_get_quote().

Changes since v1:
 * Removed platform bus device support.
 * Instead of allocating the shared buffers using DMA APIs in IOCTL
   handler, allocated it once in tdx_guest_init() and re-used it in
   GetQuote IOCTL handler.
 * To simplify the design, removed the support for parallel GetQuote
   requests. It can be added when there is a real requirement for it.
 * Fixed commit log and comments to reflect the latest changes.

 Documentation/virt/coco/tdx-guest.rst   |  11 ++
 arch/x86/coco/tdx/tdx.c                 |  35 +++++
 arch/x86/include/asm/tdx.h              |   2 +
 drivers/virt/coco/tdx-guest/tdx-guest.c | 175 +++++++++++++++++++++++-
 include/uapi/linux/tdx-guest.h          |  44 ++++++
 5 files changed, 266 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/coco/tdx-guest.rst b/Documentation/virt/coco/tdx-guest.rst
index 46e316db6bb4..35e829466265 100644
--- a/Documentation/virt/coco/tdx-guest.rst
+++ b/Documentation/virt/coco/tdx-guest.rst
@@ -42,6 +42,17 @@ ABI. However, in the future, if the TDX Module supports more than one subtype,
 a new IOCTL CMD will be created to handle it. To keep the IOCTL naming
 consistent, a subtype index is added as part of the IOCTL CMD.
 
+2.2 TDX_CMD_GET_QUOTE
+----------------------
+
+:Input parameters: struct tdx_quote_req
+:Output: Return 0 on success, -EIO on TDCALL failure or standard error number
+         on common failures. Upon successful execution, Quote data is copied
+         to tdx_quote_req.buf.
+
+The TDX_CMD_GET_QUOTE IOCTL can be used by attestation software to generate
+Quote for the given TDREPORT using TDG.VP.VMCALL<GetQuote> hypercall.
+
 Reference
 ---------
 
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 49d7e11066c1..7e0af10515a6 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -33,6 +33,7 @@
 #define TDVMCALL_MAP_GPA		0x10001
 #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
 #define TDVMCALL_SETUP_NOTIFY_INTR	0x10004
+#define TDVMCALL_GET_QUOTE		0x10002
 
 /* MMIO direction */
 #define EPT_READ	0
@@ -198,6 +199,40 @@ static void __noreturn tdx_panic(const char *msg)
 		__tdx_hypercall(&args);
 }
 
+/**
+ * tdx_hcall_get_quote() - Wrapper to request TD Quote using GetQuote
+ *                         hypercall.
+ * @buf: Address of the directly mapped shared kernel buffer which
+ *	 contains TDREPORT data. The same buffer will be used by
+ *	 VMM to store the generated TD Quote output.
+ * @size: size of the tdquote buffer (4KB-aligned).
+ *
+ * Refer to section titled "TDG.VP.VMCALL<GetQuote>" in the TDX GHCI
+ * v1.0 specification for more information on GetQuote hypercall.
+ * It is used in the TDX guest driver module to get the TD Quote.
+ *
+ * Return 0 on success or error code on failure.
+ */
+int tdx_hcall_get_quote(u8 *buf, size_t size)
+{
+	struct tdx_hypercall_args args = {0};
+
+	args.r10 = TDX_HYPERCALL_STANDARD;
+	args.r11 = TDVMCALL_GET_QUOTE;
+	/* Since buf is a shared memory, set the shared (decrypted) bits */
+	args.r12 = cc_mkdec(virt_to_phys(buf));
+	args.r13 = size;
+
+	/*
+	 * Pass the physical address of TDREPORT to the VMM and
+	 * trigger the Quote generation. It is not a blocking
+	 * call, hence completion of this request will be notified to
+	 * the TD guest via a callback interrupt.
+	 */
+	return __tdx_hypercall(&args);
+}
+EXPORT_SYMBOL_GPL(tdx_hcall_get_quote);
+
 static void tdx_parse_tdinfo(u64 *cc_mask)
 {
 	struct tdx_module_output out;
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 8807fe1b1f3f..254468448a1b 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -75,6 +75,8 @@ int tdx_register_event_irq_cb(tdx_event_irq_cb_t handler, void *data);
 
 int tdx_unregister_event_irq_cb(tdx_event_irq_cb_t handler, void *data);
 
+int tdx_hcall_get_quote(u8 *buf, size_t size);
+
 #else
 
 static inline void tdx_early_init(void) { };
diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
index 5e44a0fa69bd..388491fa63a1 100644
--- a/drivers/virt/coco/tdx-guest/tdx-guest.c
+++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
@@ -12,12 +12,106 @@
 #include <linux/mod_devicetable.h>
 #include <linux/string.h>
 #include <linux/uaccess.h>
+#include <linux/set_memory.h>
 
 #include <uapi/linux/tdx-guest.h>
 
 #include <asm/cpu_device_id.h>
 #include <asm/tdx.h>
 
+/*
+ * Intel's SGX QE implementation generally uses Quote size less
+ * than 8K; Use 16K as MAX size to handle future updates and other
+ * 3rd party implementations.
+ */
+#define GET_QUOTE_MAX_SIZE		(4 * PAGE_SIZE)
+
+/**
+ * struct quote_entry - Quote request struct
+ * @valid: Flag to check validity of the GetQuote request.
+ * @buf: Kernel buffer to share data with VMM (size is page aligned).
+ * @buf_len: Size of the buf in bytes.
+ * @compl: Completion object to track completion of GetQuote request.
+ */
+struct quote_entry {
+	bool valid;
+	void *buf;
+	size_t buf_len;
+	struct completion compl;
+};
+
+/* Quote data entry */
+static struct quote_entry *qentry;
+
+/* Lock to serialize quote requests */
+static DEFINE_MUTEX(quote_lock);
+
+static int quote_cb_handler(void *dev_id)
+{
+	struct quote_entry *entry = dev_id;
+	struct tdx_quote_buf *quote_buf = entry->buf;
+
+	if (entry->valid && quote_buf->status != GET_QUOTE_IN_FLIGHT)
+		complete(&entry->compl);
+
+	return 0;
+}
+
+static void free_shared_pages(void *buf, size_t len)
+{
+	unsigned int count = PAGE_ALIGN(len) >> PAGE_SHIFT;
+
+	set_memory_encrypted((unsigned long)buf, count);
+
+	free_pages_exact(buf, PAGE_ALIGN(len));
+}
+
+static void *alloc_shared_pages(size_t len)
+{
+	unsigned int count = PAGE_ALIGN(len) >> PAGE_SHIFT;
+	void *addr;
+	int ret;
+
+	addr = alloc_pages_exact(len, GFP_KERNEL);
+	if (!addr)
+		return NULL;
+
+	ret = set_memory_decrypted((unsigned long)addr, count);
+	if (ret) {
+		free_pages_exact(addr, PAGE_ALIGN(len));
+		return NULL;
+	}
+
+	return addr;
+}
+
+static struct quote_entry *alloc_quote_entry(size_t len)
+{
+	struct quote_entry *entry = NULL;
+
+	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return NULL;
+
+	entry->buf = alloc_shared_pages(len);
+	if (!entry->buf) {
+		kfree(entry);
+		return NULL;
+	}
+
+	entry->buf_len = PAGE_ALIGN(len);
+	init_completion(&entry->compl);
+	entry->valid = false;
+
+	return entry;
+}
+
+static void free_quote_entry(struct quote_entry *entry)
+{
+	free_shared_pages(entry->buf, entry->buf_len);
+	kfree(entry);
+}
+
 static long tdx_get_report0(struct tdx_report_req __user *req)
 {
 	u8 *reportdata, *tdreport;
@@ -53,12 +147,65 @@ static long tdx_get_report0(struct tdx_report_req __user *req)
 	return ret;
 }
 
+static long tdx_get_quote(struct tdx_quote_req __user *ureq)
+{
+	struct tdx_quote_req req;
+	long ret;
+
+	if (copy_from_user(&req, ureq, sizeof(req)))
+		return -EFAULT;
+
+	mutex_lock(&quote_lock);
+
+	if (!req.len || req.len > qentry->buf_len) {
+		ret = -EINVAL;
+		goto quote_failed;
+	}
+
+	memset(qentry->buf, 0, qentry->buf_len);
+	reinit_completion(&qentry->compl);
+	qentry->valid = true;
+
+	if (copy_from_user(qentry->buf, (void __user *)req.buf, req.len)) {
+		ret = -EFAULT;
+		goto quote_failed;
+	}
+
+	/* Submit GetQuote Request using GetQuote hypercall */
+	ret = tdx_hcall_get_quote(qentry->buf, qentry->buf_len);
+	if (ret) {
+		pr_err("GetQuote hypercall failed, status:%lx\n", ret);
+		ret = -EIO;
+		goto quote_failed;
+	}
+
+	/*
+	 * Although the GHCI specification does not state explicitly that
+	 * the VMM must not wait indefinitely for the Quote request to be
+	 * completed, a sane VMM should always notify the guest after a
+	 * certain time, regardless of whether the Quote generation is
+	 * successful or not.  For now just assume the VMM will do so.
+	 */
+	wait_for_completion(&qentry->compl);
+
+	if (copy_to_user((void __user *)req.buf, qentry->buf, req.len))
+		ret = -EFAULT;
+
+quote_failed:
+	qentry->valid = false;
+	mutex_unlock(&quote_lock);
+
+	return ret;
+}
+
 static long tdx_guest_ioctl(struct file *file, unsigned int cmd,
 			    unsigned long arg)
 {
 	switch (cmd) {
 	case TDX_CMD_GET_REPORT0:
 		return tdx_get_report0((struct tdx_report_req __user *)arg);
+	case TDX_CMD_GET_QUOTE:
+		return tdx_get_quote((struct tdx_quote_req __user *)arg);
 	default:
 		return -ENOTTY;
 	}
@@ -84,15 +231,41 @@ MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
 
 static int __init tdx_guest_init(void)
 {
+	int ret;
+
 	if (!x86_match_cpu(tdx_guest_ids))
 		return -ENODEV;
 
-	return misc_register(&tdx_misc_dev);
+	ret = misc_register(&tdx_misc_dev);
+	if (ret)
+		return ret;
+
+	qentry = alloc_quote_entry(GET_QUOTE_MAX_SIZE);
+	if (!qentry) {
+		pr_err("Failed to allocate Quote buffer\n");
+		ret = -ENOMEM;
+		goto free_misc;
+	}
+
+	ret = tdx_register_event_irq_cb(quote_cb_handler, qentry);
+	if (ret)
+		goto free_quote;
+
+	return 0;
+
+free_quote:
+	free_quote_entry(qentry);
+free_misc:
+	misc_deregister(&tdx_misc_dev);
+
+	return ret;
 }
 module_init(tdx_guest_init);
 
 static void __exit tdx_guest_exit(void)
 {
+	tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
+	free_quote_entry(qentry);
 	misc_deregister(&tdx_misc_dev);
 }
 module_exit(tdx_guest_exit);
diff --git a/include/uapi/linux/tdx-guest.h b/include/uapi/linux/tdx-guest.h
index a6a2098c08ff..8a6ade299090 100644
--- a/include/uapi/linux/tdx-guest.h
+++ b/include/uapi/linux/tdx-guest.h
@@ -17,6 +17,12 @@
 /* Length of TDREPORT used in TDG.MR.REPORT TDCALL */
 #define TDX_REPORT_LEN                  1024
 
+/* TD Quote status codes */
+#define GET_QUOTE_SUCCESS               0
+#define GET_QUOTE_IN_FLIGHT             0xffffffffffffffff
+#define GET_QUOTE_ERROR                 0x8000000000000000
+#define GET_QUOTE_SERVICE_UNAVAILABLE   0x8000000000000001
+
 /**
  * struct tdx_report_req - Request struct for TDX_CMD_GET_REPORT0 IOCTL.
  *
@@ -30,6 +36,36 @@ struct tdx_report_req {
 	__u8 tdreport[TDX_REPORT_LEN];
 };
 
+/* struct tdx_quote_buf: Format of Quote request buffer.
+ * @version: Quote format version, filled by TD.
+ * @status: Status code of Quote request, filled by VMM.
+ * @in_len: Length of TDREPORT, filled by TD.
+ * @out_len: Length of Quote data, filled by VMM.
+ * @data: Quote data on output or TDREPORT on input.
+ *
+ * More details of Quote request buffer can be found in TDX
+ * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
+ * section titled "TDG.VP.VMCALL<GetQuote>"
+ */
+struct tdx_quote_buf {
+	__u64 version;
+	__u64 status;
+	__u32 in_len;
+	__u32 out_len;
+	__u64 data[];
+};
+
+/* struct tdx_quote_req: Request struct for TDX_CMD_GET_QUOTE IOCTL.
+ * @buf: Address of user buffer in the format of struct tdx_quote_buf.
+ *	 Upon successful completion of IOCTL, output is copied back to
+ *	 the same buffer (in struct tdx_quote_buf.data).
+ * @len: Length of the Quote buffer.
+ */
+struct tdx_quote_req {
+	__u64 buf;
+	__u64 len;
+};
+
 /*
  * TDX_CMD_GET_REPORT0 - Get TDREPORT0 (a.k.a. TDREPORT subtype 0) using
  *                       TDCALL[TDG.MR.REPORT]
@@ -39,4 +75,12 @@ struct tdx_report_req {
  */
 #define TDX_CMD_GET_REPORT0              _IOWR('T', 1, struct tdx_report_req)
 
+/*
+ * TDX_CMD_GET_QUOTE - Get TD Guest Quote from QE/QGS using GetQuote
+ *		       TDVMCALL.
+ *
+ * Returns 0 on success or standard errno on other failures.
+ */
+#define TDX_CMD_GET_QUOTE		_IOWR('T', 2, struct tdx_quote_req)
+
 #endif /* _UAPI_LINUX_TDX_GUEST_H_ */
-- 
2.34.1


* [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-05-14  7:23 [PATCH v3 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
  2023-05-14  7:23 ` [PATCH v3 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
  2023-05-14  7:23 ` [PATCH v3 2/3] virt: tdx-guest: Add Quote generation support Kuppuswamy Sathyanarayanan
@ 2023-05-14  7:23 ` Kuppuswamy Sathyanarayanan
  2023-06-12 19:03   ` Dan Williams
  2023-05-24 21:33 ` [PATCH v3 0/3] TDX Guest Quote generation support Chong Cai
  2023-06-24  4:05 ` Dan Williams
  4 siblings, 1 reply; 41+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-05-14  7:23 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc

In a TDX guest, the second stage of the attestation process is Quote
generation. This process is required to convert the locally generated
TDREPORT into a remotely verifiable Quote. It involves sending the
TDREPORT data to a Quoting Enclave (QE), which verifies the integrity
of the TDREPORT and signs it with an attestation key.

Intel's TDX attestation driver exposes the TDX_CMD_GET_QUOTE IOCTL to
allow the user agent to get the TD Quote.

Add a kernel selftest module to verify the Quote generation feature.

TD Quote generation involves the following steps:

* Get the TDREPORT data using the TDX_CMD_GET_REPORT0 IOCTL.
* Embed the TDREPORT data in the quote buffer and request quote
  generation via the TDX_CMD_GET_QUOTE IOCTL.
* Upon completion of the GetQuote request, check that the status
  field of the Quote header is zero (GET_QUOTE_SUCCESS) to make sure
  the generated quote is valid.
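
The buffer preparation in the second step can be sketched without
access to the device node. The struct below mirrors the layout of
struct tdx_quote_buf from this series, and QUOTE_SIZE matches the value
used by the selftest; the helper build_quote_request() is a
hypothetical name, and the sketch stops before the IOCTL itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TDX_REPORT_LEN	1024	/* from include/uapi/linux/tdx-guest.h */
#define QUOTE_SIZE	8192	/* same value the selftest uses */

/* Userspace mirror of struct tdx_quote_buf from this series. */
struct quote_buf {
	uint64_t version;
	uint64_t status;
	uint32_t in_len;
	uint32_t out_len;
	uint64_t data[];
};

/*
 * Hypothetical helper: allocate a request buffer, fill the header and
 * embed the TDREPORT. Returns NULL on allocation failure; on success
 * the total buffer size is stored in *total_len and the caller frees
 * the buffer.
 */
struct quote_buf *build_quote_request(const uint8_t *tdreport,
				      size_t *total_len)
{
	size_t len = sizeof(struct quote_buf) + QUOTE_SIZE;
	struct quote_buf *buf;

	assert(tdreport != NULL && total_len != NULL);

	buf = calloc(1, len);
	if (!buf)
		return NULL;

	buf->version = 1;
	buf->status = 0;		/* GET_QUOTE_SUCCESS */
	buf->in_len = TDX_REPORT_LEN;
	buf->out_len = 0;
	memcpy(buf->data, tdreport, TDX_REPORT_LEN);

	*total_len = len;
	return buf;
}
```

The returned pointer and length would then go into the buf and len
fields of struct tdx_quote_req before issuing TDX_CMD_GET_QUOTE.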

Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Erdem Aktas <erdemaktas@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v2:
 * Adapted to struct tdx_quote_hdr -> struct tdx_quote_buf rename.

 tools/testing/selftests/tdx/tdx_guest_test.c | 65 ++++++++++++++++++--
 1 file changed, 59 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/tdx/tdx_guest_test.c b/tools/testing/selftests/tdx/tdx_guest_test.c
index 81d8cb88ea1a..0b4b4293b9cb 100644
--- a/tools/testing/selftests/tdx/tdx_guest_test.c
+++ b/tools/testing/selftests/tdx/tdx_guest_test.c
@@ -18,6 +18,7 @@
 #define TDX_GUEST_DEVNAME "/dev/tdx_guest"
 #define HEX_DUMP_SIZE 8
 #define DEBUG 0
+#define QUOTE_SIZE 8192
 
 /**
  * struct tdreport_type - Type header of TDREPORT_STRUCT.
@@ -128,21 +129,29 @@ static void print_array_hex(const char *title, const char *prefix_str,
 	printf("\n");
 }
 
+/* Helper function to get TDREPORT */
+long get_tdreport0(int devfd, struct tdx_report_req *req)
+{
+	int i;
+
+	/* Generate sample report data */
+	for (i = 0; i < TDX_REPORTDATA_LEN; i++)
+		req->reportdata[i] = i;
+
+	return ioctl(devfd, TDX_CMD_GET_REPORT0, req);
+}
+
 TEST(verify_report)
 {
 	struct tdx_report_req req;
 	struct tdreport *tdreport;
-	int devfd, i;
+	int devfd;
 
 	devfd = open(TDX_GUEST_DEVNAME, O_RDWR | O_SYNC);
 	ASSERT_LT(0, devfd);
 
-	/* Generate sample report data */
-	for (i = 0; i < TDX_REPORTDATA_LEN; i++)
-		req.reportdata[i] = i;
-
 	/* Get TDREPORT */
-	ASSERT_EQ(0, ioctl(devfd, TDX_CMD_GET_REPORT0, &req));
+	ASSERT_EQ(0, get_tdreport0(devfd, &req));
 
 	if (DEBUG) {
 		print_array_hex("\n\t\tTDX report data\n", "",
@@ -160,4 +169,48 @@ TEST(verify_report)
 	ASSERT_EQ(0, close(devfd));
 }
 
+TEST(verify_quote)
+{
+	struct tdx_quote_buf *quote_buf;
+	struct tdx_report_req rep_req;
+	struct tdx_quote_req req;
+	__u64 quote_buf_size;
+	int devfd;
+
+	/* Open attestation device */
+	devfd = open(TDX_GUEST_DEVNAME, O_RDWR | O_SYNC);
+
+	ASSERT_LT(0, devfd);
+
+	/* Add size for quote header */
+	quote_buf_size = sizeof(*quote_buf) + QUOTE_SIZE;
+
+	/* Allocate quote buffer */
+	quote_buf = (struct tdx_quote_buf *)malloc(quote_buf_size);
+	ASSERT_NE(NULL, quote_buf);
+
+	/* Initialize GetQuote header */
+	quote_buf->version = 1;
+	quote_buf->status  = GET_QUOTE_SUCCESS;
+	quote_buf->in_len  = TDX_REPORT_LEN;
+	quote_buf->out_len = 0;
+
+	/* Get TDREPORT data */
+	ASSERT_EQ(0, get_tdreport0(devfd, &rep_req));
+
+	/* Fill GetQuote request */
+	memcpy(quote_buf->data, rep_req.tdreport, TDX_REPORT_LEN);
+	req.buf	  = (__u64)quote_buf;
+	req.len	  = quote_buf_size;
+
+	ASSERT_EQ(0, ioctl(devfd, TDX_CMD_GET_QUOTE, &req));
+
+	/* Check whether GetQuote request is successful */
+	EXPECT_EQ(0, quote_buf->status);
+
+	free(quote_buf);
+
+	ASSERT_EQ(0, close(devfd));
+}
+
 TEST_HARNESS_MAIN
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 0/3] TDX Guest Quote generation support
  2023-05-14  7:23 [PATCH v3 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
                   ` (2 preceding siblings ...)
  2023-05-14  7:23 ` [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature Kuppuswamy Sathyanarayanan
@ 2023-05-24 21:33 ` Chong Cai
  2023-05-25 22:55   ` Sathyanarayanan Kuppuswamy
  2023-06-24  4:05 ` Dan Williams
  4 siblings, 1 reply; 41+ messages in thread
From: Chong Cai @ 2023-05-24 21:33 UTC (permalink / raw
  To: Kuppuswamy Sathyanarayanan
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet, H . Peter Anvin, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Qinkun Bao, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc

Tested-by: Qinkun Bao <qinkun@google.com>

Thanks, Sathyanarayanan, for the new patch! This patch is critical for
our use case.
We built a guest image with the patch and verified that it works for us
when using a host kernel built from the https://github.com/intel/tdx repo.

On Sun, May 14, 2023 at 12:24 AM Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
>
> Hi All,
>
> In TDX guest, the attestation process is used to verify the TDX guest
> trustworthiness to other entities before provisioning secrets to the
> guest.
>
> The TDX guest attestation process consists of two steps:
>
> 1. TDREPORT generation
> 2. Quote generation.
>
> The First step (TDREPORT generation) involves getting the TDX guest
> measurement data in the format of TDREPORT which is further used to
> validate the authenticity of the TDX guest. The second step involves
> sending the TDREPORT to a Quoting Enclave (QE) server to generate a
> remotely verifiable Quote. TDREPORT by design can only be verified on
> the local platform. To support remote verification of the TDREPORT,
> TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
> locally and convert it to a remotely verifiable Quote. Although
> attestation software can use communication methods like TCP/IP or
> vsock to send the TDREPORT to QE, not all platforms support these
> communication models. So TDX GHCI specification [1] defines a method
> for Quote generation via hypercalls. Please check the discussion from
> Google [2] and Alibaba [3] which clarifies the need for hypercall based
> Quote generation support. This patch set adds this support.
>
> Support for TDREPORT generation already exists in the TDX guest driver.
> This patchset extends the same driver to add the Quote generation
> support.
>
> Following are the details of the patch set:
>
> Patch 1/3 -> Adds event notification IRQ support.
> Patch 2/3 -> Adds Quote generation support.
> Patch 3/3 -> Adds selftest support for Quote generation feature.
>
> [1] https://cdrdv2.intel.com/v1/dl/getContent/726790, section titled "TDG.VP.VMCALL<GetQuote>".
> [2] https://lore.kernel.org/lkml/CAAYXXYxxs2zy_978GJDwKfX5Hud503gPc8=1kQ-+JwG_kA79mg@mail.gmail.com/
> [3] https://lore.kernel.org/lkml/a69faebb-11e8-b386-d591-dbd08330b008@linux.alibaba.com/
>
> Kuppuswamy Sathyanarayanan (3):
>   x86/tdx: Add TDX Guest event notify interrupt support
>   virt: tdx-guest: Add Quote generation support
>   selftests/tdx: Test GetQuote TDX attestation feature
>
>  Documentation/virt/coco/tdx-guest.rst        |  11 ++
>  arch/x86/coco/tdx/tdx.c                      | 194 +++++++++++++++++++
>  arch/x86/include/asm/tdx.h                   |   8 +
>  drivers/virt/coco/tdx-guest/tdx-guest.c      | 175 ++++++++++++++++-
>  include/uapi/linux/tdx-guest.h               |  44 +++++
>  tools/testing/selftests/tdx/tdx_guest_test.c |  65 ++++++-
>  6 files changed, 490 insertions(+), 7 deletions(-)
>
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 0/3] TDX Guest Quote generation support
  2023-05-24 21:33 ` [PATCH v3 0/3] TDX Guest Quote generation support Chong Cai
@ 2023-05-25 22:55   ` Sathyanarayanan Kuppuswamy
  0 siblings, 0 replies; 41+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-05-25 22:55 UTC (permalink / raw
  To: Chong Cai
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet, H . Peter Anvin, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Qinkun Bao, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc

Hi,

On 5/24/23 2:33 PM, Chong Cai wrote:
> Tested-by: Qinkun Bao <qinkun@google.com>
> 
> Thanks Sathyanarayanan for the new patch! This patch is critical for
> our use case.
> We built a guest image with the patch, and verified it works for us,
> when using a host kernel built with https://github.com/intel/tdx repo.

Qinkun Bao/Chong Cai, thanks for testing it. I really appreciate the help.

Dave/Boris, could you please take a look at this patch set?

> 
> On Sun, May 14, 2023 at 12:24 AM Kuppuswamy Sathyanarayanan
> <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
>>
>> Hi All,
>>
>> In TDX guest, the attestation process is used to verify the TDX guest
>> trustworthiness to other entities before provisioning secrets to the
>> guest.
>>
>> The TDX guest attestation process consists of two steps:
>>
>> 1. TDREPORT generation
>> 2. Quote generation.
>>
>> The First step (TDREPORT generation) involves getting the TDX guest
>> measurement data in the format of TDREPORT which is further used to
>> validate the authenticity of the TDX guest. The second step involves
>> sending the TDREPORT to a Quoting Enclave (QE) server to generate a
>> remotely verifiable Quote. TDREPORT by design can only be verified on
>> the local platform. To support remote verification of the TDREPORT,
>> TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
>> locally and convert it to a remotely verifiable Quote. Although
>> attestation software can use communication methods like TCP/IP or
>> vsock to send the TDREPORT to QE, not all platforms support these
>> communication models. So TDX GHCI specification [1] defines a method
>> for Quote generation via hypercalls. Please check the discussion from
>> Google [2] and Alibaba [3] which clarifies the need for hypercall based
>> Quote generation support. This patch set adds this support.
>>
>> Support for TDREPORT generation already exists in the TDX guest driver.
>> This patchset extends the same driver to add the Quote generation
>> support.
>>
>> Following are the details of the patch set:
>>
>> Patch 1/3 -> Adds event notification IRQ support.
>> Patch 2/3 -> Adds Quote generation support.
>> Patch 3/3 -> Adds selftest support for Quote generation feature.
>>
>> [1] https://cdrdv2.intel.com/v1/dl/getContent/726790, section titled "TDG.VP.VMCALL<GetQuote>".
>> [2] https://lore.kernel.org/lkml/CAAYXXYxxs2zy_978GJDwKfX5Hud503gPc8=1kQ-+JwG_kA79mg@mail.gmail.com/
>> [3] https://lore.kernel.org/lkml/a69faebb-11e8-b386-d591-dbd08330b008@linux.alibaba.com/
>>
>> Kuppuswamy Sathyanarayanan (3):
>>   x86/tdx: Add TDX Guest event notify interrupt support
>>   virt: tdx-guest: Add Quote generation support
>>   selftests/tdx: Test GetQuote TDX attestation feature
>>
>>  Documentation/virt/coco/tdx-guest.rst        |  11 ++
>>  arch/x86/coco/tdx/tdx.c                      | 194 +++++++++++++++++++
>>  arch/x86/include/asm/tdx.h                   |   8 +
>>  drivers/virt/coco/tdx-guest/tdx-guest.c      | 175 ++++++++++++++++-
>>  include/uapi/linux/tdx-guest.h               |  44 +++++
>>  tools/testing/selftests/tdx/tdx_guest_test.c |  65 ++++++-
>>  6 files changed, 490 insertions(+), 7 deletions(-)
>>
>> --
>> 2.34.1
>>

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/3] x86/tdx: Add TDX Guest event notify interrupt support
  2023-05-14  7:23 ` [PATCH v3 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
@ 2023-06-12 12:49   ` Huang, Kai
  2023-08-23 20:47   ` Thomas Gleixner
  1 sibling, 0 replies; 41+ messages in thread
From: Huang, Kai @ 2023-06-12 12:49 UTC (permalink / raw
  To: corbet@lwn.net, dave.hansen@linux.intel.com, bp@alien8.de,
	shuah@kernel.org, sathyanarayanan.kuppuswamy@linux.intel.com,
	tglx@linutronix.de, x86@kernel.org, mingo@redhat.com
  Cc: linux-kselftest@vger.kernel.org, Yu, Guorui, qinkun@apache.org,
	wander@redhat.com, hpa@zytor.com, chongc@google.com, Aktas, Erdem,
	kirill.shutemov@linux.intel.com, Luck, Tony,
	dionnaglaze@google.com, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, Du, Fan

On Sun, 2023-05-14 at 00:23 -0700, Kuppuswamy Sathyanarayanan wrote:
> Host-guest event notification via configured interrupt vector is useful
> in cases where a guest makes an asynchronous request and needs a
> callback from the host to indicate the completion or to let the host
> notify the guest about events like device removal. One usage example is,
> callback requirement of GetQuote asynchronous hypercall.
> 
> In TDX guest, SetupEventNotifyInterrupt hypercall can be used by the
> guest to specify which interrupt vector to use as an event-notify
> vector from the VMM. Details about the SetupEventNotifyInterrupt
> hypercall can be found in TDX Guest-Host Communication Interface
> (GHCI) Specification, section "VP.VMCALL<SetupEventNotifyInterrupt>".
> 
> As per design, VMM will post the event completion IRQ using the same
> CPU on which SetupEventNotifyInterrupt hypercall request is received.
> So allocate an IRQ vector from "x86_vector_domain", and set the CPU
> affinity of the IRQ vector to the CPU on which
> SetupEventNotifyInterrupt hypercall is made.
> 
> Add tdx_register_event_irq_cb()/tdx_unregister_event_irq_cb()
> interfaces to allow drivers to register/unregister event notification
> handlers.
> 
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Andi Kleen <ak@linux.intel.com>
> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Reviewed-by: Erdem Aktas <erdemaktas@google.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Acked-by: Wander Lairson Costa <wander@redhat.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 

Acked-by: Kai Huang <kai.huang@intel.com>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/3] virt: tdx-guest: Add Quote generation support
  2023-05-14  7:23 ` [PATCH v3 2/3] virt: tdx-guest: Add Quote generation support Kuppuswamy Sathyanarayanan
@ 2023-06-12 12:50   ` Huang, Kai
  0 siblings, 0 replies; 41+ messages in thread
From: Huang, Kai @ 2023-06-12 12:50 UTC (permalink / raw
  To: corbet@lwn.net, dave.hansen@linux.intel.com, bp@alien8.de,
	shuah@kernel.org, sathyanarayanan.kuppuswamy@linux.intel.com,
	tglx@linutronix.de, x86@kernel.org, mingo@redhat.com
  Cc: linux-kselftest@vger.kernel.org, Yu, Guorui, qinkun@apache.org,
	wander@redhat.com, hpa@zytor.com, chongc@google.com, Aktas, Erdem,
	kirill.shutemov@linux.intel.com, Luck, Tony,
	dionnaglaze@google.com, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, Du, Fan

On Sun, 2023-05-14 at 00:23 -0700, Kuppuswamy Sathyanarayanan wrote:
> In TDX guest, the attestation process is used to verify the TDX guest
> trustworthiness to other entities before provisioning secrets to the
> guest. The First step in the attestation process is TDREPORT
> generation, which involves getting the guest measurement data in the
> format of TDREPORT, which is further used to validate the authenticity
> of the TDX guest. TDREPORT by design is integrity-protected and can
> only be verified on the local machine.
> 
> To support remote verification of the TDREPORT (in a SGX-based
> attestation), the TDREPORT needs to be sent to the SGX Quoting Enclave
> (QE) to convert it to a remote verifiable Quote. SGX QE by design can
> only run outside of the TDX guest (i.e. in a host process or in a
> normal VM) and guest can use communication channels like vsock or
> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> TDX guest may not support these communication channels. To handle such
> cases, TDX defines a GetQuote hypercall which can be used by the guest
> to request the host VMM to communicate with the SGX QE. More details
> about GetQuote hypercall can be found in TDX Guest-Host Communication
> Interface (GHCI) for Intel TDX 1.0, section titled
> "TDG.VP.VMCALL<GetQuote>".
> 
> Add support for TDX_CMD_GET_QUOTE IOCTL to allow an attestation agent
> to submit GetQuote requests from the user space using GetQuote
> hypercall.
> 
> Since GetQuote is an asynchronous request hypercall, VMM will use the
> callback interrupt vector configured by the SetupEventNotifyInterrupt
> hypercall to notify the guest about Quote generation completion or
> failure. So register an IRQ handler for it.
> 
> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> with TDREPORT data as input, which is further used by the VMM to copy
> the TD Quote result after successful Quote generation. To create the
> shared buffer, allocate a large enough memory and mark it shared using
> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> for GetQuote requests in TDX_CMD_GET_QUOTE IOCTL handler.
> 
> Although this method reserves a fixed chunk of memory for GetQuote
> requests, such one-time allocation is preferable to the alternative
> choice of repeatedly allocating/freeing the shared buffer in the
> TDX_CMD_GET_QUOTE IOCTL handler, which will damage the direct map
> (because the sharing/unsharing process modifies the direct map). This
> allocation model is similar to that used by the AMD SEV guest driver.
> 
> Since the Quote generation process is not time-critical or frequently
> used, the current version does not support parallel GetQuote requests.
> 
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Andi Kleen <ak@linux.intel.com>
> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Reviewed-by: Erdem Aktas <erdemaktas@google.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 

Acked-by: Kai Huang <kai.huang@intel.com>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-05-14  7:23 ` [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature Kuppuswamy Sathyanarayanan
@ 2023-06-12 19:03   ` Dan Williams
  2023-06-19  5:38     ` Sathyanarayanan Kuppuswamy
                       ` (3 more replies)
  0 siblings, 4 replies; 41+ messages in thread
From: Dan Williams @ 2023-06-12 19:03 UTC (permalink / raw
  To: Kuppuswamy Sathyanarayanan, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc, dhowells, brijesh.singh, atishp

[ add David, Brijesh, and Atish]

Kuppuswamy Sathyanarayanan wrote:
> In TDX guest, the second stage of the attestation process is Quote
> generation. This process is required to convert the locally generated
> TDREPORT into a remotely verifiable Quote. It involves sending the
> TDREPORT data to a Quoting Enclave (QE) which will verify the
> integrity of the TDREPORT and sign it with an attestation key.
> 
> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> allow the user agent to get the TD Quote.
> 
> Add a kernel selftest module to verify the Quote generation feature.
> 
> TD Quote generation involves following steps:
> 
> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> * Embed the TDREPORT data in quote buffer and request for quote
>   generation via TDX_CMD_GET_QUOTE IOCTL request.
> * Upon completion of the GetQuote request, check for non zero value
>   in the status field of Quote header to make sure the generated
>   quote is valid.

What this cover letter does not say is that this is adding another
instance of the same pattern as SNP_GET_REPORT.

Linux is best served when multiple vendors trying to do similar
operations are brought together behind a common ABI. We see this in the
history of wrangling SCSI vendors behind common interfaces. Now multiple
confidential computing vendors are trying to develop similar flows with
differentiated formats, where that differentiation need not leak over
the ABI boundary.

My observation of SNP_GET_REPORT and TDX_CMD_GET_REPORT is that they are
both passing blobs across the user/kernel and platform/kernel boundary
for the purposes of unlocking other resources. To me that is a flow that
the Keys subsystem has infrastructure to handle. It has the concept of
upcalls and asynchronous population of blobs by handles and mechanisms
to protect and cache those communications. Linux / the Keys subsystem
could benefit from the enhancements it would need to cover these 2
cases. Specifically, when ARM and RISC-V arrive with similar
communications with platform TSMs (Trusted Security Modules), they can
build upon the same infrastructure.

David, am I reaching with that association? My strawman mapping of
TDX_CMD_GET_QUOTE to request_key() is something like:

request_key(coco_quote, "description", "<uuencoded tdreport>")

Here there is a common key_type for all vendors. The description and
arguments have room for vendor differentiation when doing the upcall to
the platform TSM, but userspace never needs to contend with the
different vendor formats; that is all handled internally to the kernel.

At this point I am just looking for confirmation that the "every vendor
invents a new character device + ioctl" approach does not scale and
that a deeper conversation is needed. Keys is a plausible solution to
that ABI proliferation problem.
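
As a rough illustration of the strawman above, the sketch below shows how
a userspace agent might issue such a request through the existing
request_key(2) system call. It is only a sketch under stated assumptions:
the "coco_quote" key type does not exist in any kernel today, the helper
names hex_encode() and request_coco_quote() are made up for illustration,
and hex encoding stands in for the uuencoding mentioned above. On a
current kernel the request is expected to fail with errno set.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() */
#endif
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Special keyring ID from <linux/keyctl.h>, repeated to stay self-contained. */
#ifndef KEY_SPEC_SESSION_KEYRING
#define KEY_SPEC_SESSION_KEYRING (-3)
#endif

/*
 * Hex-encode len bytes of src into dst as a NUL-terminated string.
 * Returns 0 on success, -1 if dst is too small.
 */
int hex_encode(const unsigned char *src, size_t len,
	       char *dst, size_t dst_len)
{
	static const char digits[] = "0123456789abcdef";
	size_t i;

	if (dst_len < 2 * len + 1)
		return -1;

	for (i = 0; i < len; i++) {
		dst[2 * i]     = digits[src[i] >> 4];
		dst[2 * i + 1] = digits[src[i] & 0xf];
	}
	dst[2 * len] = '\0';
	return 0;
}

/*
 * Ask the kernel for a key of the HYPOTHETICAL type "coco_quote".
 * Returns the key serial number on success, or -1 with errno set
 * (the expected outcome wherever that key type is not registered).
 */
long request_coco_quote(const char *description, const char *callout)
{
	return syscall(SYS_request_key, "coco_quote", description, callout,
		       KEY_SPEC_SESSION_KEYRING);
}
```

A caller would hex-encode the TDREPORT, pass the result as the callout
string, and, if the request ever succeeded, read the key payload back
with keyctl(2). Vendor-specific details would then live in the
description and callout strings rather than in per-vendor ioctl structs.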

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-12 19:03   ` Dan Williams
@ 2023-06-19  5:38     ` Sathyanarayanan Kuppuswamy
  2023-06-22 23:31     ` Erdem Aktas
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 41+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-06-19  5:38 UTC (permalink / raw
  To: Dan Williams, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze, Chong Cai,
	Qinkun Bao, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc, dhowells, brijesh.singh, atishp

Hi Dan,

On 6/12/23 12:03 PM, Dan Williams wrote:
> [ add David, Brijesh, and Atish]
> 
> Kuppuswamy Sathyanarayanan wrote:
>> In TDX guest, the second stage of the attestation process is Quote
>> generation. This process is required to convert the locally generated
>> TDREPORT into a remotely verifiable Quote. It involves sending the
>> TDREPORT data to a Quoting Enclave (QE) which will verify the
>> integrity of the TDREPORT and sign it with an attestation key.
>>
>> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
>> allow the user agent to get the TD Quote.
>>
>> Add a kernel selftest module to verify the Quote generation feature.
>>
>> TD Quote generation involves following steps:
>>
>> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
>> * Embed the TDREPORT data in quote buffer and request for quote
>>   generation via TDX_CMD_GET_QUOTE IOCTL request.
>> * Upon completion of the GetQuote request, check for non zero value
>>   in the status field of Quote header to make sure the generated
>>   quote is valid.
> 
> What this cover letter does not say is that this is adding another
> instance of the similar pattern as SNP_GET_REPORT.
> 
> Linux is best served when multiple vendors trying to do similar
> operations are brought together behind a common ABI. We see this in the
> history of wrangling SCSI vendors behind common interfaces. Now multiple
> confidential computing vendors trying to develop similar flows with
> differentiated formats where that differentiation need not leak over the
> ABI boundary.
> 
> My observation of SNP_GET_REPORT and TDX_CMD_GET_REPORT is that they are
> both passing blobs across the user/kernel and platform/kernel boundary
> for the purposes of unlocking other resources. To me that is a flow that
> the Keys subsystem has infrastructure to handle. It has the concept of
> upcalls and asynchronous population of blobs by handles and mechanisms
> to protect and cache those communications. Linux / the Keys subsystem
> could benefit from the enhancements it would need to cover these 2
> cases. Specifically, the benefit that when ARM and RISC-V arrive with
> similar communications with platform TSMs (Trusted Security Module) they
> can build upon the same infrastructure.
> 
> David, am I reaching with that association? My strawman mapping of
> TDX_CMD_GET_QUOTE to request_key() is something like:
> 
> request_key(coco_quote, "description", "<uuencoded tdreport>")
> 
> Where this is a common key_type for all vendors, but the description and
> arguments have room for vendor differentiation when doing the upcall to
> the platform TSM, but userspace never needs to contend with the
> different vendor formats, that is all handled internally to the kernel.
> 
> At this point I am just looking for confirmation that the "every vendor
> invent a new character device + ioctl" does not scale and a deeper
> conversation is needed. Keys is a plausible solution to that ABI
> proliferation problem.

I agree that vendor-specific interfaces do not scale, and that ABI generalization
will benefit future vendors who require similar feature support. However, such
generalization, in my opinion, will make more sense if the requirements at the top
level are also generalized. Currently, each vendor has its own attestation flow,
and the user ABI each one introduced includes a lot of vendor-specific information to
support it. IMO, it is difficult to hide this vendor-specific information from the
user without generalizing the high-level attestation flow.

I have included the attestation IOCTL interfaces used by S390, AMD SEV and TDX
below for reference. As you can see, each of these ARCHs uses very different
input and output data formats. Each contains a lot of vendor-specific information
(like the vmpl field in struct snp_report_req or the meas_addr and arcb_addr fields
in struct uvio_attest). The only thing I see in common is the use of input and
output blobs.

Even if we just generalize the ABI now, I'm not sure whether the unified ABI we create
will meet the requirements of RISC-V or ARM platforms when they introduce their
own attestation flows in the future. My thinking is that without some sort of
arch-agnostic high-level attestation workflow, ABI generalization has very little
benefit.

======================================================================
// Following are the S390 specific attestation structs
drivers/s390/char/uvdevice.c

struct uvio_ioctl_cb {
        __u32 flags;
        __u16 uv_rc;                    /* UV header rc value */
        __u16 uv_rrc;                   /* UV header rrc value */
        __u64 argument_addr;            /* Userspace address of uvio argument */
        __u32 argument_len;
        __u8  reserved14[0x40 - 0x14];  /* must be zero */
};

#define UVIO_ATT_USER_DATA_LEN          0x100
#define UVIO_ATT_UID_LEN                0x10


struct uvio_attest {
        __u64 arcb_addr;                                /* 0x0000 */
        __u64 meas_addr;                                /* 0x0008 */
        __u64 add_data_addr;                            /* 0x0010 */
        __u8  user_data[UVIO_ATT_USER_DATA_LEN];        /* 0x0018 */
        __u8  config_uid[UVIO_ATT_UID_LEN];             /* 0x0118 */
        __u32 arcb_len;                                 /* 0x0128 */
        __u32 meas_len;                                 /* 0x012c */
        __u32 add_data_len;                             /* 0x0130 */
        __u16 user_data_len;                            /* 0x0134 */
        __u16 reserved136;                              /* 0x0136 */
};


#define UVIO_DEVICE_NAME "uv"
#define UVIO_TYPE_UVC 'u'

#define UVIO_IOCTL_ATT _IOWR(UVIO_TYPE_UVC, 0x01, struct uvio_ioctl_cb)


// Following are TDX specific interfaces
drivers/virt/coco/tdx-guest/tdx-guest.c

/**
 * struct tdx_report_req - Request struct for TDX_CMD_GET_REPORT0 IOCTL.
 *
 * @reportdata: User buffer with REPORTDATA to be included into TDREPORT.
 *              Typically it can be some nonce provided by attestation
 *              service, so the generated TDREPORT can be uniquely verified.
 * @tdreport: User buffer to store TDREPORT output from TDCALL[TDG.MR.REPORT].
 */
struct tdx_report_req {
        __u8 reportdata[TDX_REPORTDATA_LEN];
        __u8 tdreport[TDX_REPORT_LEN];
};



/* struct tdx_quote_buf: Format of Quote request buffer.
 * @version: Quote format version, filled by TD.
 * @status: Status code of Quote request, filled by VMM.
 * @in_len: Length of TDREPORT, filled by TD.
 * @out_len: Length of Quote data, filled by VMM.
 * @data: Quote data on output or TDREPORT on input.
 *
 * More details of Quote request buffer can be found in TDX
 * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
 * section titled "TDG.VP.VMCALL<GetQuote>"
 */
struct tdx_quote_buf {
        __u64 version;
        __u64 status;
        __u32 in_len;
        __u32 out_len;
        __u64 data[];
};

/* struct tdx_quote_req: Request struct for TDX_CMD_GET_QUOTE IOCTL.
 * @buf: Address of user buffer in the format of struct tdx_quote_buf.
 *       Upon successful completion of IOCTL, output is copied back to
 *       the same buffer (in struct tdx_quote_buf.data).
 * @len: Length of the Quote buffer.
 */
struct tdx_quote_req {
        __u64 buf;
        __u64 len;
};

/*
 * TDX_CMD_GET_REPORT0 - Get TDREPORT0 (a.k.a. TDREPORT subtype 0) using
 *                       TDCALL[TDG.MR.REPORT]
 *
 * Return 0 on success, -EIO on TDCALL execution failure, and
 * standard errno on other general error cases.
 */
#define TDX_CMD_GET_REPORT0              _IOWR('T', 1, struct tdx_report_req)

/*
 * TDX_CMD_GET_QUOTE - Get TD Guest Quote from QE/QGS using GetQuote
 *                     TDVMCALL.
 *
 * Returns 0 on success or standard errno on other failures.
 */
#define TDX_CMD_GET_QUOTE               _IOWR('T', 2, struct tdx_quote_req)


// Following are AMD SEV specific interfaces
drivers/virt/coco/sev-guest/sev-guest.c

struct snp_report_req {
        /* user data that should be included in the report */
        __u8 user_data[64];

        /* The vmpl level to be included in the report */
        __u32 vmpl;

        /* Must be zero filled */
        __u8 rsvd[28];
};

struct snp_report_resp {
        /* response data, see SEV-SNP spec for the format */
        __u8 data[4000];
};

struct snp_guest_request_ioctl {
        /* message version number (must be non-zero) */
        __u8 msg_version;

        /* Request and response structure address */
        __u64 req_data;
        __u64 resp_data;

        /* bits[63:32]: VMM error code, bits[31:0] firmware error code (see psp-sev.h) */
        union {
                __u64 exitinfo2;
                struct {
                        __u32 fw_error;
                        __u32 vmm_error;
                };
        };
};

/* Get SNP attestation report */
#define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_guest_request_ioctl)
======================================================================
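
For reference, the TDX request layout listed above can be exercised
without the device node. The sketch below assumes the v3 structure
definitions from this series (they may change in later revisions) and
fills a struct tdx_quote_buf the way the field comments describe: version
and in_len are set by the TD, while status and out_len are left at zero
for the VMM. The helper name prepare_quote_req() and the 8192-byte
QUOTE_SIZE (borrowed from the selftest) are illustrative choices, not
part of the ABI.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TDX_REPORT_LEN 1024	/* as in include/uapi/linux/tdx-guest.h */
#define QUOTE_SIZE     8192	/* output space, as chosen by the selftest */

/* Local copies of the layouts quoted above, using <stdint.h> types. */
struct tdx_quote_buf {
	uint64_t version;
	uint64_t status;
	uint32_t in_len;
	uint32_t out_len;
	uint64_t data[];
};

struct tdx_quote_req {
	uint64_t buf;
	uint64_t len;
};

/*
 * Allocate and fill a GetQuote request around an existing TDREPORT.
 * On success returns the buffer (caller frees it) and fills *req;
 * returns NULL if the allocation fails.
 */
struct tdx_quote_buf *prepare_quote_req(const uint8_t *tdreport,
					 struct tdx_quote_req *req)
{
	size_t len = sizeof(struct tdx_quote_buf) + QUOTE_SIZE;
	struct tdx_quote_buf *qbuf = calloc(1, len);

	if (!qbuf)
		return NULL;

	qbuf->version = 1;		/* filled by the TD */
	qbuf->in_len  = TDX_REPORT_LEN;	/* filled by the TD */
	/* status and out_len stay 0; the VMM fills them in */
	memcpy(qbuf->data, tdreport, TDX_REPORT_LEN);

	req->buf = (uint64_t)(uintptr_t)qbuf;
	req->len = len;
	return qbuf;
}
```

The filled req would then be handed to ioctl(devfd, TDX_CMD_GET_QUOTE,
&req), and a status of 0 in the header afterwards is what the selftest
treats as success.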

In addition to GetReport support, each of these guest attestation drivers includes
IOCTLs to handle vendor-specific needs (such as Derived key support in the
SEV driver or RTMR Extend support in the TDX guest driver). So we cannot
completely unify all IOCTL interfaces in these drivers.

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-12 19:03   ` Dan Williams
  2023-06-19  5:38     ` Sathyanarayanan Kuppuswamy
@ 2023-06-22 23:31     ` Erdem Aktas
  2023-06-22 23:44       ` Huang, Kai
  2023-06-23 22:27     ` Dan Williams
       [not found]     ` <CAAYXXYyK4g9k7a78CU9w6Sn9KTBdoNLOu9gcgrSHJfp+3-tO=w@mail.gmail.com>
  3 siblings, 1 reply; 41+ messages in thread
From: Erdem Aktas @ 2023-06-22 23:31 UTC (permalink / raw
  To: Dan Williams
  Cc: Kuppuswamy Sathyanarayanan, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet,
	H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Dionna Amalie Glaze, Chong Cai, Qinkun Bao,
	Guorui Yu, Du Fan, linux-kernel, linux-kselftest, linux-doc,
	dhowells, brijesh.singh, atishp

On Mon, Jun 12, 2023 at 12:03 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> [ add David, Brijesh, and Atish]
>
> Kuppuswamy Sathyanarayanan wrote:
> > In TDX guest, the second stage of the attestation process is Quote
> > generation. This process is required to convert the locally generated
> > TDREPORT into a remotely verifiable Quote. It involves sending the
> > TDREPORT data to a Quoting Enclave (QE) which will verify the
> > integrity of the TDREPORT and sign it with an attestation key.
> >
> > Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> > allow the user agent to get the TD Quote.
> >
> > Add a kernel selftest module to verify the Quote generation feature.
> >
> > TD Quote generation involves following steps:
> >
> > * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> > * Embed the TDREPORT data in quote buffer and request for quote
> >   generation via TDX_CMD_GET_QUOTE IOCTL request.
> > * Upon completion of the GetQuote request, check for non zero value
> >   in the status field of Quote header to make sure the generated
> >   quote is valid.
>
> What this cover letter does not say is that this is adding another
> instance of the similar pattern as SNP_GET_REPORT.
>
> Linux is best served when multiple vendors trying to do similar
> operations are brought together behind a common ABI. We see this in the
> history of wrangling SCSI vendors behind common interfaces. Now multiple

Compared to the number of SCSI vendors, the number of CPU vendors for
confidential computing seems manageable to me. Is this really a good
comparison?

> confidential computing vendors trying to develop similar flows with
> differentiated formats where that differentiation need not leak over the
> ABI boundary.

<Just my personal opinion below>
I agree with this statement at a high level, but it is also somewhat
surprising to me after all the discussion that happened around this topic.
Honestly, I feel like there are multiple versions of "Intel" working
in different directions.

If we want multiple vendors trying to do similar things behind a
common ABI, it should start with the spec. Since this comment is
coming from Intel, I wonder if there is any plan to combine the GHCB
and GHCI interfaces under a common ABI in the future, or why it did not
even happen in the first place.

What I see is that Intel has the GETQUOTE TDVMCALL interface in its spec
and again Intel does not really want to provide support for it in
Linux. It feels really frustrating.

>
> My observation of SNP_GET_REPORT and TDX_CMD_GET_REPORT is that they are
> both passing blobs across the user/kernel and platform/kernel boundary
> for the purposes of unlocking other resources. To me that is a flow that
> the Keys subsystem has infrastructure to handle. It has the concept of
> upcalls and asynchronous population of blobs by handles and mechanisms
> to protect and cache those communications. Linux / the Keys subsystem
> could benefit from the enhancements it would need to cover these 2
> cases. Specifically, the benefit that when ARM and RISC-V arrive with
> similar communications with platform TSMs (Trusted Security Module) they
> can build upon the same infrastructure.
>
> David, am I reaching with that association? My strawman mapping of
> TDX_CMD_GET_QUOTE to request_key() is something like:
>
> request_key(coco_quote, "description", "<uuencoded tdreport>")
>
> Where this is a common key_type for all vendors, but the description and
> arguments have room for vendor differentiation when doing the upcall to
> the platform TSM, but userspace never needs to contend with the
> different vendor formats, that is all handled internally to the kernel.

I think the problem definition here is not accurate. With AMD SNP,
guests need to do a hypercall to KVM and KVM needs to issue a
SNP_GUEST_REQUEST(MSG_REPORT_REQ) to the SP firmware. In TDX, guests
need to do a TDCALL to TDXMODULE to get the TDREPORT and then it needs
to get that report delivered to the host userspace to get the TDQUOTE
generated by the SGX quoting enclave. Also TDQUOTE is designed to work
async while the SNP_GUEST_REQUESTS are blocking vmcalls.

Those are completely different flows. Are you suggesting that Intel
should also come down to a single call to get the TDQUOTE like AMD
SNP?

The TDCALL interface asking for the TDREPORT is already there. AMD
does not need to ask for the report and the quote separately.

Here, the problem was that Intel (or "upstream community") did not
want to implement/accept hypercall for TDQUOTE which would be handled
by the user space VMM. The alternative implementation (using vsock)
does not work for many use cases including ours. I do not see how your
suggestion addresses the problem that this patch was trying to solve.

So while I like the suggested direction, I am not sure how feasible it
is to come up with a common ABI, even for just 2 vendors (AMD and
Intel), without spec changes, which is a multi-year effort imho.

>
> At this point I am just looking for confirmation that the "every vendor
> invent a new character device + ioctl" does not scale and a deeper
> conversation is needed. Keys is a plausible solution to that ABI
> proliferation problem.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-22 23:31     ` Erdem Aktas
@ 2023-06-22 23:44       ` Huang, Kai
  2023-06-23 22:31         ` Dan Williams
  0 siblings, 1 reply; 41+ messages in thread
From: Huang, Kai @ 2023-06-22 23:44 UTC (permalink / raw
  To: Williams, Dan J, Aktas, Erdem
  Cc: corbet@lwn.net, Du, Fan, shuah@kernel.org, Luck, Tony,
	dave.hansen@linux.intel.com, brijesh.singh@amd.com,
	dionnaglaze@google.com, qinkun@apache.org,
	kirill.shutemov@linux.intel.com, mingo@redhat.com,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	linux-doc@vger.kernel.org, wander@redhat.com, atishp@rivosinc.com,
	hpa@zytor.com, chongc@google.com, bp@alien8.de,
	linux-kselftest@vger.kernel.org,
	sathyanarayanan.kuppuswamy@linux.intel.com, dhowells@redhat.com,
	Yu, Guorui, x86@kernel.org

On Thu, 2023-06-22 at 16:31 -0700, Erdem Aktas wrote:
> So while I like the suggested direction, I am not sure how much it is
> possible to come up with a common ABI even with just only for 2
> vendors (AMD and Intel) without doing spec changes which is a multi
> year effort imho.

I don't want to intervene in the discussion around whether this direction is
correct or not, however I want to say request_key() may not be the right place
to fit a Quote (or a remotely verifiable data blob in general for attestation).

> request_key(coco_quote, "description", "<uuencoded tdreport>")

Although both a key and a Quote are data blobs in some way, a Quote certainly
is not a key but has much more information.  The man page of request_key()
seems to suggest it's just for keys:

       request_key - request a key from the kernel's key management
       facility

So IMHO using request_key() to fit Quote may cause bigger confusion.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-12 19:03   ` Dan Williams
  2023-06-19  5:38     ` Sathyanarayanan Kuppuswamy
  2023-06-22 23:31     ` Erdem Aktas
@ 2023-06-23 22:27     ` Dan Williams
  2023-06-26  3:05       ` Sathyanarayanan Kuppuswamy
  2023-06-28  2:47       ` Huang, Kai
       [not found]     ` <CAAYXXYyK4g9k7a78CU9w6Sn9KTBdoNLOu9gcgrSHJfp+3-tO=w@mail.gmail.com>
  3 siblings, 2 replies; 41+ messages in thread
From: Dan Williams @ 2023-06-23 22:27 UTC (permalink / raw
  To: Dan Williams, Kuppuswamy Sathyanarayanan, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, Shuah Khan,
	Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc, dhowells, brijesh.singh, atishp,
	gregkh

Dan Williams wrote:
> [ add David, Brijesh, and Atish]
> 
> Kuppuswamy Sathyanarayanan wrote:
> > In TDX guest, the second stage of the attestation process is Quote
> > generation. This process is required to convert the locally generated
> > TDREPORT into a remotely verifiable Quote. It involves sending the
> > TDREPORT data to a Quoting Enclave (QE) which will verify the
> > integrity of the TDREPORT and sign it with an attestation key.
> > 
> > Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> > allow the user agent to get the TD Quote.
> > 
> > Add a kernel selftest module to verify the Quote generation feature.
> > 
> > TD Quote generation involves following steps:
> > 
> > * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> > * Embed the TDREPORT data in quote buffer and request for quote
> >   generation via TDX_CMD_GET_QUOTE IOCTL request.
> > * Upon completion of the GetQuote request, check for non zero value
> >   in the status field of Quote header to make sure the generated
> >   quote is valid.
> 
> What this cover letter does not say is that this is adding another
> instance of the similar pattern as SNP_GET_REPORT.
> 
> Linux is best served when multiple vendors trying to do similar
> operations are brought together behind a common ABI. We see this in the
> history of wrangling SCSI vendors behind common interfaces. Now multiple
> confidential computing vendors trying to develop similar flows with
> differentiated formats where that differentiation need not leak over the
> ABI boundary.
[..]

Below is a rough mock up of this approach to demonstrate the direction.
Again, the goal is to define an ABI that can support any vendor's
arch-specific attestation method and key provisioning flows without
leaking vendor-specific details, or confidential material over the
user/kernel ABI.

The observation is that there are a sufficient number of attestation
flows available to review where Linux can define a superset ABI to
contain them all. The other observation is that the implementations have
features that may cross-pollinate over time. For example the SEV
privilege level consideration ("vmpl"), and the TDX RTMR (think TPM
PCRs) mechanisms address generic Confidential Computing use cases.

Vendor specific ioctls for all of this feels like surrender when Linux
already has the keys subsystem which has plenty of degrees of freedom
for tracking blobs with signatures and using those blobs to instantiate
other blobs. It already serves as the ABI wrapping various TPM
implementations and marshaling keys for storage encryption and other use
cases that intersect Confidential Computing.

The benefit of deprecating vendor-specific abstraction layers in
userspace is secondary. The primary benefit is collaboration. It enables
kernel developers from various architectures to collaborate on common
infrastructure. If, referring back to my previous example, SEV adopts an
RTMR-like mechanism and TDX adopts a vmpl-like mechanism it would be
unfortunate if those efforts were siloed, duplicated, and needlessly
differentiated to userspace. So while there are arguably a manageable
number of basic arch attestation methods the planned expansion of those
to build incremental functionality is where I believe we, as a
community, will be glad that we invested in a "Linux format" for all of
this.

An example, to show what the strawman patch below enables: (req_key is
the sample program from "man 2 request_key")

# ./req_key guest_attest guest_attest:0:0-$desc $(cat user_data | base64)
Key ID is 10e2f3a7
# keyctl pipe 0x10e2f3a7 | hexdump -C
00000000  54 44 58 20 47 65 6e 65  72 61 74 65 64 20 51 75  |TDX Generated Qu|
00000010  6f 74 65 00 00 00 00 00  00 00 00 00 00 00 00 00  |ote.............|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00004000

This is the kernel instantiating a TDX Quote without the TDREPORT
implementation detail ever leaving the kernel. Now, this is only the
top-half of what is needed. The missing bottom half takes that material
and uses it to instantiate derived key material like the storage
decryption key internal to the kernel. See "The Process" in
Documentation/security/keys/request-key.rst for how the Keys subsystem
handles the "keys for keys" use case.
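
For readers who want to experiment with the description format without
the mockup applied, here is a minimal userspace sketch of it. It assumes
the "guest_attest:<level>:<id>-<suffix>" shape used in the req_key
invocation above and mirrors the sscanf() check in
guest_attest_vet_description(). The helper names
build_guest_attest_desc() and vet_guest_attest_desc() are hypothetical
and are not part of the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format "guest_attest:<level>:<id>-<suffix>" into buf.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int build_guest_attest_desc(char *buf, size_t len, int level,
				   unsigned long long id, const char *suffix)
{
	int n = snprintf(buf, len, "guest_attest:%d:%llu-%s", level, id, suffix);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Userspace mirror of the check the mockup performs in
 * guest_attest_vet_description(): both numeric fields must parse.
 */
static int vet_guest_attest_desc(const char *desc, int *level,
				 unsigned long long *id)
{
	if (sscanf(desc, "guest_attest:%d:%llu", level, id) != 2)
		return -1;
	return 0;
}
```

Anything after the second numeric field is ignored by the check, which
is why a free-form suffix such as "-$desc" passes.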

---
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index f79ab13a5c28..0f775847028e 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -54,4 +54,8 @@ source "drivers/virt/coco/sev-guest/Kconfig"
 
 source "drivers/virt/coco/tdx-guest/Kconfig"
 
+config GUEST_ATTEST
+	tristate
+	select KEYS
+
 endif
diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index e9aa6fc96fab..66f6b838f8f4 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -12,3 +12,4 @@ obj-$(CONFIG_ACRN_HSM)		+= acrn/
 obj-$(CONFIG_EFI_SECRET)	+= coco/efi_secret/
 obj-$(CONFIG_SEV_GUEST)		+= coco/sev-guest/
 obj-$(CONFIG_INTEL_TDX_GUEST)	+= coco/tdx-guest/
+obj-$(CONFIG_GUEST_ATTEST)	+= coco/guest-attest/
diff --git a/drivers/virt/coco/guest-attest/Makefile b/drivers/virt/coco/guest-attest/Makefile
new file mode 100644
index 000000000000..5581c5a27588
--- /dev/null
+++ b/drivers/virt/coco/guest-attest/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_GUEST_ATTEST) += guest_attest.o
+guest_attest-y := key.o
diff --git a/drivers/virt/coco/guest-attest/key.c b/drivers/virt/coco/guest-attest/key.c
new file mode 100644
index 000000000000..2a494b6dd7a7
--- /dev/null
+++ b/drivers/virt/coco/guest-attest/key.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/seq_file.h>
+#include <linux/key-type.h>
+#include <linux/module.h>
+#include <linux/base64.h>
+
+#include <keys/request_key_auth-type.h>
+#include <keys/user-type.h>
+
+#include "guest-attest.h"
+
+static LIST_HEAD(guest_attest_list);
+static DECLARE_RWSEM(guest_attest_rwsem);
+
+static struct guest_attest_ops *fetch_ops(void)
+{
+	return list_first_entry_or_null(&guest_attest_list,
+					struct guest_attest_ops, list);
+}
+
+static struct guest_attest_ops *get_ops(void)
+{
+	down_read(&guest_attest_rwsem);
+	return fetch_ops();
+}
+
+static void put_ops(void)
+{
+	up_read(&guest_attest_rwsem);
+}
+
+int register_guest_attest_ops(struct guest_attest_ops *ops)
+{
+	struct guest_attest_ops *conflict;
+	int rc;
+
+	down_write(&guest_attest_rwsem);
+	conflict = fetch_ops();
+	if (conflict) {
+		pr_err("\"%s\" ops already registered\n", conflict->name);
+		rc = -EEXIST;
+		goto out;
+	}
+	list_add(&ops->list, &guest_attest_list);
+	try_module_get(ops->module);
+	rc = 0;
+out:
+	up_write(&guest_attest_rwsem);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(register_guest_attest_ops);
+
+void unregister_guest_attest_ops(struct guest_attest_ops *ops)
+{
+	down_write(&guest_attest_rwsem);
+	list_del(&ops->list);
+	up_write(&guest_attest_rwsem);
+	module_put(ops->module);
+}
+EXPORT_SYMBOL_GPL(unregister_guest_attest_ops);
+
+static int __guest_attest_request_key(struct key *key, int level,
+				      struct key *dest_keyring,
+				      const char *callout_info, int callout_len,
+				      struct key *authkey)
+{
+	struct guest_attest_ops *ops;
+	void *payload = NULL;
+	int rc, payload_len;
+
+	ops = get_ops();
+	if (!ops)
+		return -ENOKEY;
+
+	payload = kzalloc(max(GUEST_ATTEST_DATALEN, callout_len), GFP_KERNEL);
+	if (!payload) {
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	payload_len = base64_decode(callout_info, callout_len, payload);
+	if (payload_len < 0 || payload_len > GUEST_ATTEST_DATALEN) {
+		rc = -EINVAL;
+		goto out;
+	}
+
+	rc = ops->request_attest(key, level, dest_keyring, payload, payload_len,
+				 authkey);
+out:
+	kfree(payload);
+	put_ops();
+	return rc;
+}
+
+static int guest_attest_request_key(struct key *authkey, void *data)
+{
+	struct request_key_auth *rka = get_request_key_auth(authkey);
+	struct key *key = rka->target_key;
+	unsigned long long id;
+	int rc, level;
+
+	pr_debug("desc: %s op: %s callout: %s\n", key->description, rka->op,
+		 rka->callout_info ? (char *)rka->callout_info : "\"none\"");
+
+	if (sscanf(key->description, "guest_attest:%d:%llu", &level, &id) != 2)
+		return -EINVAL;
+
+	if (!rka->callout_info) {
+		rc = -EINVAL;
+		goto out;
+	}
+
+	rc = __guest_attest_request_key(key, level, rka->dest_keyring,
+					rka->callout_info, rka->callout_len,
+					authkey);
+out:
+	complete_request_key(authkey, rc);
+	return rc;
+}
+
+static int guest_attest_vet_description(const char *desc)
+{
+	unsigned long long id;
+	int level;
+
+	if (sscanf(desc, "guest_attest:%d:%llu", &level, &id) != 2)
+		return -EINVAL;
+	return 0;
+}
+
+static struct key_type key_type_guest_attest = {
+	.name = "guest_attest",
+	.preparse = user_preparse,
+	.free_preparse = user_free_preparse,
+	.instantiate = generic_key_instantiate,
+	.revoke = user_revoke,
+	.destroy = user_destroy,
+	.describe = user_describe,
+	.read = user_read,
+	.vet_description = guest_attest_vet_description,
+	.request_key = guest_attest_request_key,
+};
+
+static int __init guest_attest_init(void)
+{
+	return register_key_type(&key_type_guest_attest);
+}
+
+static void __exit guest_attest_exit(void)
+{
+	unregister_key_type(&key_type_guest_attest);
+}
+
+module_init(guest_attest_init);
+module_exit(guest_attest_exit);
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/virt/coco/tdx-guest/Kconfig b/drivers/virt/coco/tdx-guest/Kconfig
index 14246fc2fb02..9a1ec85369fe 100644
--- a/drivers/virt/coco/tdx-guest/Kconfig
+++ b/drivers/virt/coco/tdx-guest/Kconfig
@@ -1,6 +1,7 @@
 config TDX_GUEST_DRIVER
 	tristate "TDX Guest driver"
 	depends on INTEL_TDX_GUEST
+	select GUEST_ATTEST
 	help
 	  The driver provides userspace interface to communicate with
 	  the TDX module to request the TDX guest details like attestation
diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
index 388491fa63a1..65b5aab284d9 100644
--- a/drivers/virt/coco/tdx-guest/tdx-guest.c
+++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
@@ -13,11 +13,13 @@
 #include <linux/string.h>
 #include <linux/uaccess.h>
 #include <linux/set_memory.h>
+#include <linux/key-type.h>
 
 #include <uapi/linux/tdx-guest.h>
 
 #include <asm/cpu_device_id.h>
 #include <asm/tdx.h>
+#include "../guest-attest/guest-attest.h"
 
 /*
  * Intel's SGX QE implementation generally uses Quote size less
@@ -229,6 +231,62 @@ static const struct x86_cpu_id tdx_guest_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
 
+static int tdx_request_attest(struct key *key, int level,
+			      struct key *dest_keyring, void *payload,
+			      int payload_len, struct key *authkey)
+{
+	u8 *tdreport;
+	long ret;
+
+	tdreport = kzalloc(TDX_REPORT_LEN, GFP_KERNEL);
+	if (!tdreport)
+		return -ENOMEM;
+
+	/* Generate TDREPORT0 using "TDG.MR.REPORT" TDCALL */
+	ret = tdx_mcall_get_report0(payload, tdreport);
+	if (ret)
+		goto out;
+
+	mutex_lock(&quote_lock);
+
+	memset(qentry->buf, 0, qentry->buf_len);
+	reinit_completion(&qentry->compl);
+	qentry->valid = true;
+
+	/* Submit GetQuote Request using GetQuote hypercall */
+	ret = tdx_hcall_get_quote(qentry->buf, qentry->buf_len);
+	if (ret) {
+		pr_err("GetQuote hypercall failed, status:%lx\n", ret);
+		ret = -EIO;
+		goto quote_failed;
+	}
+
+	/*
+	 * Although the GHCI specification does not state explicitly that
+	 * the VMM must not wait indefinitely for the Quote request to be
+	 * completed, a sane VMM should always notify the guest after a
+	 * certain time, regardless of whether the Quote generation is
+	 * successful or not.  For now just assume the VMM will do so.
+	 */
+	wait_for_completion(&qentry->compl);
+
+	ret = key_instantiate_and_link(key, qentry->buf, qentry->buf_len,
+				       dest_keyring, authkey);
+
+quote_failed:
+	qentry->valid = false;
+	mutex_unlock(&quote_lock);
+out:
+	kfree(tdreport);
+	return ret;
+}
+
+static struct guest_attest_ops tdx_attest_ops = {
+	.name = KBUILD_MODNAME,
+	.module = THIS_MODULE,
+	.request_attest = tdx_request_attest,
+};
+
 static int __init tdx_guest_init(void)
 {
 	int ret;
@@ -251,8 +309,14 @@ static int __init tdx_guest_init(void)
 	if (ret)
 		goto free_quote;
 
+	ret = register_guest_attest_ops(&tdx_attest_ops);
+	if (ret)
+		goto free_irq;
+
 	return 0;
 
+free_irq:
+	tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
 free_quote:
 	free_quote_entry(qentry);
 free_misc:
@@ -264,6 +328,7 @@ module_init(tdx_guest_init);
 
 static void __exit tdx_guest_exit(void)
 {
+	unregister_guest_attest_ops(&tdx_attest_ops);
 	tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
 	free_quote_entry(qentry);
 	misc_deregister(&tdx_misc_dev);

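The diff above includes "guest-attest.h" but does not carry that file.
The following is only a guess at its shape, inferred from how key.c and
tdx-guest.c use it. The field order and the GUEST_ATTEST_DATALEN value
(64 here) are assumptions. The non-kernel branch exists solely so the
sketch compiles outside the kernel tree.

```c
#ifdef __KERNEL__
#include <linux/list.h>
#else
/* Userspace stand-in so the sketch builds without kernel headers. */
struct list_head {
	struct list_head *next, *prev;
};
#endif

struct key;
struct module;

/* Assumption: upper bound on the decoded callout payload, in bytes. */
#define GUEST_ATTEST_DATALEN 64

/* Inferred from the ->name, ->module, ->list and ->request_attest uses. */
struct guest_attest_ops {
	const char *name;
	struct module *module;
	struct list_head list;
	int (*request_attest)(struct key *key, int level,
			      struct key *dest_keyring, void *payload,
			      int payload_len, struct key *authkey);
};

int register_guest_attest_ops(struct guest_attest_ops *ops);
void unregister_guest_attest_ops(struct guest_attest_ops *ops);
```
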
^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-22 23:44       ` Huang, Kai
@ 2023-06-23 22:31         ` Dan Williams
  0 siblings, 0 replies; 41+ messages in thread
From: Dan Williams @ 2023-06-23 22:31 UTC (permalink / raw
  To: Huang, Kai, Williams, Dan J, Aktas, Erdem
  Cc: corbet@lwn.net, Du, Fan, shuah@kernel.org, Luck, Tony,
	dave.hansen@linux.intel.com, brijesh.singh@amd.com,
	dionnaglaze@google.com, qinkun@apache.org,
	kirill.shutemov@linux.intel.com, mingo@redhat.com,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	linux-doc@vger.kernel.org, wander@redhat.com, atishp@rivosinc.com,
	hpa@zytor.com, chongc@google.com, bp@alien8.de,
	linux-kselftest@vger.kernel.org,
	sathyanarayanan.kuppuswamy@linux.intel.com, dhowells@redhat.com,
	Yu, Guorui, x86@kernel.org

Huang, Kai wrote:
> On Thu, 2023-06-22 at 16:31 -0700, Erdem Aktas wrote:
> > So while I like the suggested direction, I am not sure how much it is
> > possible to come up with a common ABI even with just only for 2
> > vendors (AMD and Intel) without doing spec changes which is a multi
> > year effort imho.
> 
> I don't want to intervene the discussion around whether this direction is
> correct or not, however I want to say request_key() may not be the right place
> to fit Quote (or remote verifiable data blob in general for attestation).
> 
> > request_key(coco_quote, "description", "<uuencoded tdreport>")
> 
> Although both key and Quote are data blob in some way, Quote certainly is not a
> key but have much more information.  The man page of request_key() seems to
> suggest it's just for key:
> 
>        request_key - request a key from the kernel's key management
>        facility
> 

Read further in that man page and see the example of generic user
defined value stored as a "key". A "key" is just a blob that has meaning
to access other resources / instantiate other keys.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
       [not found]     ` <CAAYXXYyK4g9k7a78CU9w6Sn9KTBdoNLOu9gcgrSHJfp+3-tO=w@mail.gmail.com>
@ 2023-06-23 22:49       ` Dan Williams
  2023-08-23  8:25       ` Thomas Gleixner
  1 sibling, 0 replies; 41+ messages in thread
From: Dan Williams @ 2023-06-23 22:49 UTC (permalink / raw
  To: Erdem Aktas, Dan Williams
  Cc: Kuppuswamy Sathyanarayanan, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet,
	H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Dionna Amalie Glaze, Chong Cai, Qinkun Bao,
	Guorui Yu, Du Fan, linux-kernel, linux-kselftest, linux-doc,
	dhowells, brijesh.singh, atishp

Erdem Aktas wrote:
> On Mon, Jun 12, 2023 at 12:03 PM Dan Williams <dan.j.williams@intel.com>
> wrote:
> 
> > [ add David, Brijesh, and Atish]
> >
> > Kuppuswamy Sathyanarayanan wrote:
> > > In TDX guest, the second stage of the attestation process is Quote
> > > generation. This process is required to convert the locally generated
> > > TDREPORT into a remotely verifiable Quote. It involves sending the
> > > TDREPORT data to a Quoting Enclave (QE) which will verify the
> > > integrity of the TDREPORT and sign it with an attestation key.
> > >
> > > Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> > > allow the user agent to get the TD Quote.
> > >
> > > Add a kernel selftest module to verify the Quote generation feature.
> > >
> > > TD Quote generation involves following steps:
> > >
> > > * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> > > * Embed the TDREPORT data in quote buffer and request for quote
> > >   generation via TDX_CMD_GET_QUOTE IOCTL request.
> > > * Upon completion of the GetQuote request, check for non zero value
> > >   in the status field of Quote header to make sure the generated
> > >   quote is valid.
> >
> > What this cover letter does not say is that this is adding another
> > instance of the similar pattern as SNP_GET_REPORT.
> >
> > Linux is best served when multiple vendors trying to do similar
> > operations are brought together behind a common ABI. We see this in the
> > history of wrangling SCSI vendors behind common interfaces.
> 
> Compared to the number of SCSI vendors, I think the number of CPU vendors
> for confidential computing seems manageable to me. Is this really a good
> comparison?

Fair enough, and prompted by this I talk a bit more about the
motivations and benefits of a Keys abstraction for attestation here:

https://lore.kernel.org/all/64961c3baf8ce_142af829436@dwillia2-xfh.jf.intel.com.notmuch/

> > Now multiple
> > confidential computing vendors trying to develop similar flows with
> > differentiated formats where that differentiation need not leak over the
> > ABI boundary.
> >
> 
> <Just my personal opinion below>
> I agree with this statement in the high level but it is also somehow
> surprising for me after all the discussion happened around this topic.
> Honestly, I feel like there are multiple versions of "Intel"  working in
> different directions.

This proposal was sent while firmly wearing my Linux community hat. I
agree, the timing here is unfortunate.

> If we want multiple vendors trying to do the similar things behind a common
> ABI, it should start with the spec. Since this comment is coming from
> Intel, I wonder if there is any plan to combine the GHCB and GHCI
> interfaces under common ABI in the future or why it did not even happen in
> the first place.

Per above comment about firmly wearing my Linux hat I am coming at this
purely from the perspective of what do we do now as a community that
continues to see these implementations proliferate and grow more
features. Common specs are great, but I agree with you, it is too late
for that, but I hope that as Linux asserts "this is what it should look
like" it starts to influence future IP innovation, and attestation
service providers, to accommodate the kernel's ABI momentum.

> What I see is that Intel has GETQUOTE TDVMCALL interface in its spec and
> again Intel does not really want to provide support for it in linux. It
> feels really frustrating.

I am aware of how frustrating late feedback can be. I am also encouraged
by some of the conversations and investigations that have already
happened around how Keys fits what these attestation solutions need.

> > My observation of SNP_GET_REPORT and TDX_CMD_GET_REPORT is that they are
> > both passing blobs across the user/kernel and platform/kernel boundary
> > for the purposes of unlocking other resources. To me that is a flow that
> > the Keys subsystem has infrastructure to handle. It has the concept of
> > upcalls and asynchronous population of blobs by handles and mechanisms
> > to protect and cache those communications. Linux / the Keys subsystem
> > could benefit from the enhancements it would need to cover these 2
> > cases. Specifically, the benefit that when ARM and RISC-V arrive with
> > similar communications with platform TSMs (Trusted Security Module) they
> > can build upon the same infrastructure.
> >
> > David, am I reaching with that association? My strawman mapping of
> > TDX_CMD_GET_QUOTE to request_key() is something like:
> >
> > request_key(coco_quote, "description", "<uuencoded tdreport>")
> >
> > Where this is a common key_type for all vendors, but the description and
> > arguments have room for vendor differentiation when doing the upcall to
> > the platform TSM, but userspace never needs to contend with the
> > different vendor formats, that is all handled internally to the kernel.
> >
> I think the problem definition here is not accurate. With AMD SNP, guests
> need to do a hypercall to KVM and KVM needs to issue
> a SNP_GUEST_REQUEST(MSG_REPORT_REQ) to the SP firmware. In TDX, guests
> need to do a TDCALL to TDXMODULE to get the TDREPORT and then it needs to
> get that report delivered to the host userspace to get the TDQUOTE
> generated by the SGX quoting enclave. Also TDQUOTE is designed to work
> async while the SNP_GUEST_REQUESTS are blocking vmcalls.
> 
> Those are completely different flows. Are you suggesting that intel should
> also come down to a single call to get the TDQUOTE like AMD SNP?

The Keys subsystem supports async instantiation of key material with
usermode upcalls if necessary. So I do not see a problem supporting
these flows behind a common key type.
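
As a rough illustration of that point, the toy userspace model below
puts a blocking backend and a deferred-completion backend behind one
entry point. Every name in it (attest_req, attest_backend,
request_evidence, and so on) is hypothetical; it is not the Keys API and
not code from the mockup, and the deferred completion is delivered
inline where a real implementation would wait for an event.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical request object: evidence is valid once done is set. */
struct attest_req {
	int done;
	char evidence[32];
};

struct attest_backend {
	const char *name;
	/* Start a request; may complete it before returning. */
	void (*submit)(struct attest_req *req);
	/* Deliver a deferred completion; NULL for blocking backends. */
	void (*notify)(struct attest_req *req);
};

/* Blocking style: the result is ready when submit() returns. */
static void blocking_submit(struct attest_req *req)
{
	strcpy(req->evidence, "report");
	req->done = 1;
}

/* Async style: submit() only queues the request. */
static void async_submit(struct attest_req *req)
{
	req->done = 0;
}

/* Models the later completion event for the async style. */
static void async_notify(struct attest_req *req)
{
	strcpy(req->evidence, "quote");
	req->done = 1;
}

static const struct attest_backend blocking_backend = {
	"blocking", blocking_submit, NULL
};
static const struct attest_backend async_backend = {
	"async", async_submit, async_notify
};

/* Common entry point: callers do not see which style the backend uses. */
static const char *request_evidence(const struct attest_backend *be,
				    struct attest_req *req)
{
	be->submit(req);
	/* Stand-in for waiting on a completion: deliver the event inline. */
	if (!req->done && be->notify)
		be->notify(req);
	return req->done ? req->evidence : NULL;
}
```

The caller-visible contract is the same in both cases: submit, then
read the populated blob once the request is marked done.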

> The TDCALL interface asking for the TDREPORT is already there. AMD does not
> need to ask the report and the quote separately.
> 
> Here, the problem was that Intel (upstream) did not want to implement
> hypercall for TDQUOTE which would be handled by the user space VMM. The
> alternative implementation (using vsock) does not work for many use cases
> including ours. I do not see how your suggestion addresses the problem that
> this patch was trying to solve.

Perhaps the strawman mockup makes it more clear:

https://lore.kernel.org/all/64961c3baf8ce_142af829436@dwillia2-xfh.jf.intel.com.notmuch/

> So while I like the suggested direction, I am not sure how much it is
> possible to come up with a common ABI even with just only for 2 vendors
> (AMD and Intel) without doing spec changes which is a multi year effort
> imho.

I agree, hardware spec changes are out of scope for this effort, but
Keys might require some additional flows to be built up in the kernel
that could be previously handled in userspace. I.e. the "bottom half"
that I reference in the mockup.

This is something we went through with using "encrypted-keys" for
nvdimm. Instead of an ioctl to inject a secret key over the user/kernel
boundary, a key server needs to store a serialized version of the
encrypted key blob and pass that into the kernel.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [PATCH v3 0/3] TDX Guest Quote generation support
  2023-05-14  7:23 [PATCH v3 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
                   ` (3 preceding siblings ...)
  2023-05-24 21:33 ` [PATCH v3 0/3] TDX Guest Quote generation support Chong Cai
@ 2023-06-24  4:05 ` Dan Williams
  2023-06-25 20:21   ` Dan Williams
  2023-06-26  3:07   ` Sathyanarayanan Kuppuswamy
  4 siblings, 2 replies; 41+ messages in thread
From: Dan Williams @ 2023-06-24  4:05 UTC (permalink / raw
  To: Kuppuswamy Sathyanarayanan, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc

Kuppuswamy Sathyanarayanan wrote:
> Hi All,
> 
> In TDX guest, the attestation process is used to verify the TDX guest
> trustworthiness to other entities before provisioning secrets to the
> guest.
> 
> The TDX guest attestation process consists of two steps:
> 
> 1. TDREPORT generation
> 2. Quote generation.
> 
> The First step (TDREPORT generation) involves getting the TDX guest
> measurement data in the format of TDREPORT which is further used to
> validate the authenticity of the TDX guest. The second step involves
> sending the TDREPORT to a Quoting Enclave (QE) server to generate a
> remotely verifiable Quote. TDREPORT by design can only be verified on
> the local platform. To support remote verification of the TDREPORT,
> TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
> locally and convert it to a remotely verifiable Quote. Although
> attestation software can use communication methods like TCP/IP or
> vsock to send the TDREPORT to QE, not all platforms support these
> communication models. So TDX GHCI specification [1] defines a method
> for Quote generation via hypercalls. Please check the discussion from
> Google [2] and Alibaba [3] which clarifies the need for hypercall based
> Quote generation support. This patch set adds this support.
> 
> Support for TDREPORT generation already exists in the TDX guest driver. 
> This patchset extends the same driver to add the Quote generation
> support.

I missed that the TDREPORT ioctl() and this character device are already
upstream. The TDREPORT ioctl() if it is only needed for quote generation
seems a waste because it just retrieves a blob that needs to be turned
around and injected back into the kernel to generate a quote.

An ABI wants to care about the abstractions around what the hardware
mechanism enables. The TD quote is not even at the end of that chain of
what the ABI needs to offer. The guest wants to use the TD quote to access
/ unlock other resources, just like the SEV report is used to
"...provide the VM with secrets, such as a disk decryption key, or other
keys required for operation".

That's where the ABI line needs to be drawn. I.e. for the guest to be
able to request the distributions of keys to unlock resources by a
key-type and key-descriptor. Enable userspace to interrogate an
attestation object without blobs needing to traverse the kernel. If the
remote service needs more than just a blob and signature to validate the
state of the guest then provide a facility to interrogate that property of
quote / report in a common way versus the ABI risk of conveying vendor
specific binary data formats in the kernel ABI.


* RE: [PATCH v3 0/3] TDX Guest Quote generation support
  2023-06-24  4:05 ` Dan Williams
@ 2023-06-25 20:21   ` Dan Williams
  2023-06-26  3:07   ` Sathyanarayanan Kuppuswamy
  1 sibling, 0 replies; 41+ messages in thread
From: Dan Williams @ 2023-06-25 20:21 UTC (permalink / raw
  To: Dan Williams, Kuppuswamy Sathyanarayanan, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, Shuah Khan,
	Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc, gregkh, dhowells

Dan Williams wrote:
> Kuppuswamy Sathyanarayanan wrote:
> > Hi All,
> > 
> > In TDX guest, the attestation process is used to verify the TDX guest
> > trustworthiness to other entities before provisioning secrets to the
> > guest.
> > 
> > The TDX guest attestation process consists of two steps:
> > 
> > 1. TDREPORT generation
> > 2. Quote generation.
> > 
> > The First step (TDREPORT generation) involves getting the TDX guest
> > measurement data in the format of TDREPORT which is further used to
> > validate the authenticity of the TDX guest. The second step involves
> > sending the TDREPORT to a Quoting Enclave (QE) server to generate a
> > remotely verifiable Quote. TDREPORT by design can only be verified on
> > the local platform. To support remote verification of the TDREPORT,
> > TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
> > locally and convert it to a remotely verifiable Quote. Although
> > attestation software can use communication methods like TCP/IP or
> > vsock to send the TDREPORT to QE, not all platforms support these
> > communication models. So TDX GHCI specification [1] defines a method
> > for Quote generation via hypercalls. Please check the discussion from
> > Google [2] and Alibaba [3] which clarifies the need for hypercall based
> > Quote generation support. This patch set adds this support.
> > 
> > Support for TDREPORT generation already exists in the TDX guest driver. 
> > This patchset extends the same driver to add the Quote generation
> > support.
> 
> I missed that the TDREPORT ioctl() and this character device are already
> upstream. The TDREPORT ioctl() if it is only needed for quote generation
> seems a waste because it just retrieves a blob that needs to be turned
> around and injected back into the kernel to generate a quote.
> 
> An ABI wants to care about the abstractions around what the hardware
> mechanism enables. The TD quote is not even at the end of that chain of
> what the ABI needs to offer. The guest wants to use the TD quote to access
> / unlock other resources, just like the SEV report is used to
> "...provide the VM with secrets, such as a disk decryption key, or other
> keys required for operation".
> 
> That's where the ABI line needs to be drawn. I.e. for the guest to be
> able to request the distributions of keys to unlock resources by a
> key-type and key-descriptor. Enable userspace to interrogate an
> attestation object without blobs needing to traverse the kernel. If the
> remote service needs more than just a blob and signature to validate the
> state of the guest then provide a facility to interrogate that property of
> quote / report in a common way versus the ABI risk of conveying vendor
> specific binary data formats in the kernel ABI.

A proposal for how this space moves forward:

1/ Stop accepting new arch specific ioctls in this space and revert the
   Intel TDREPORT ioctl if its only reason for existing is "quote"
   generation.

2/ Define a container format / envelope for platform-provided
   attestation evidence.

   The observation here is that although it is too late to unify the
   evidence formats across vendors, they appear to share the common form of
   a blob with an ECDSA signature. That reduces the minimum viable
   attestation service to something that can generically verify an
   evidence-blob signature.

3/ Define a key-description format that considers a superset of the
   platform needs. For example a 'privilege-level' concept can map to
   'vmpl' on AMD, but be ignored for now for Intel.

4/ For in-progress enabling concepts like runtime measurement registers,
   look to reuse / abstract that behind the Keys subsystem's existing
   support for managing TPM PCRs.

5/ Deprecate the multiple arch specific attestation ioctl interfaces in
   favor of this unified conveyance method.
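
To make 2/ and 3/ a bit more concrete, here is a minimal userspace sketch.
It is an illustration under stated assumptions, not a proposed ABI: the
names `struct attest_envelope`, `struct attest_key_desc`, `enum
attest_provider`, and `attest_format_desc()` are all hypothetical. The only
piece borrowed from the strawman earlier in this thread is the
"guest_attest:<level>:<id>" description shape.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical vendor tags; illustration only, not an existing kernel enum. */
enum attest_provider {
	ATTEST_PROVIDER_TDX,
	ATTEST_PROVIDER_SEV_SNP,
};

/*
 * Hypothetical envelope: the vendor evidence stays an opaque blob, and the
 * only common properties are its length and the signature that covers it.
 */
struct attest_envelope {
	enum attest_provider provider;
	const uint8_t *evidence;	/* vendor-defined, not parsed here */
	size_t evidence_len;
	const uint8_t *signature;	/* e.g. an ECDSA signature over evidence */
	size_t signature_len;
};

/* Hypothetical key description: a superset of what each platform needs. */
struct attest_key_desc {
	int privilege_level;		/* 'vmpl' on AMD, ignored on Intel for now */
	unsigned long long id;
};

/*
 * Format a description as "guest_attest:<level>:<id>".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int attest_format_desc(const struct attest_key_desc *desc,
			      char *buf, size_t len)
{
	int n = snprintf(buf, len, "guest_attest:%d:%llu",
			 desc->privilege_level, desc->id);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The point of the split is that a generic verifier only ever needs the
envelope fields, while anything platform-specific stays inside the opaque
evidence blob.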


* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-23 22:27     ` Dan Williams
@ 2023-06-26  3:05       ` Sathyanarayanan Kuppuswamy
  2023-06-26 18:57         ` Dionna Amalie Glaze
  2023-06-27 23:44         ` Dan Williams
  2023-06-28  2:47       ` Huang, Kai
  1 sibling, 2 replies; 41+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-06-26  3:05 UTC (permalink / raw
  To: Dan Williams, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze, Chong Cai,
	Qinkun Bao, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc, dhowells, brijesh.singh, atishp, gregkh

Hi Dan,

On 6/23/23 3:27 PM, Dan Williams wrote:
> Dan Williams wrote:
>> [ add David, Brijesh, and Atish]
>>
>> Kuppuswamy Sathyanarayanan wrote:
>>> In TDX guest, the second stage of the attestation process is Quote
>>> generation. This process is required to convert the locally generated
>>> TDREPORT into a remotely verifiable Quote. It involves sending the
>>> TDREPORT data to a Quoting Enclave (QE) which will verify the
>>> integrity of the TDREPORT and sign it with an attestation key.
>>>
>>> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
>>> allow the user agent to get the TD Quote.
>>>
>>> Add a kernel selftest module to verify the Quote generation feature.
>>>
>>> TD Quote generation involves following steps:
>>>
>>> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
>>> * Embed the TDREPORT data in quote buffer and request for quote
>>>   generation via TDX_CMD_GET_QUOTE IOCTL request.
>>> * Upon completion of the GetQuote request, check for non zero value
>>>   in the status field of Quote header to make sure the generated
>>>   quote is valid.
>>
>> What this cover letter does not say is that this is adding another
>> instance of the similar pattern as SNP_GET_REPORT.
>>
>> Linux is best served when multiple vendors trying to do similar
>> operations are brought together behind a common ABI. We see this in the
>> history of wrangling SCSI vendors behind common interfaces. Now multiple
>> confidential computing vendors trying to develop similar flows with
>> differentiated formats where that differentiation need not leak over the
>> ABI boundary.
> [..]
> 
> Below is a rough mock up of this approach to demonstrate the direction.
> Again, the goal is to define an ABI that can support any vendor's
> arch-specific attestation method and key provisioning flows without
> leaking vendor-specific details, or confidential material over the
> user/kernel ABI.

Thanks for working on this mock code and helping out. It gives me the
general idea about your proposal.

> 
> The observation is that there are a sufficient number of attestation
> flows available to review where Linux can define a superset ABI to
> contain them all. The other observation is that the implementations have
> features that may cross-pollinate over time. For example the SEV
> privilege level consideration ("vmpl"), and the TDX RTMR (think TPM
> PCRs) mechanisms address generic Confidential Computing use cases.


I agree with your point about VMPL and RTMR feature cases. This observation
is valid for AMD SEV and TDX attestation flows. But I am not sure whether
it will hold true for other vendor implementations. Our sample set is not
good enough to make this conclusion. The reason for my concern is, if you
check the ABI interface used in the S390 arch attestation driver
(drivers/s390/char/uvdevice.c), you would notice that there is a significant
difference between the ABI used in that driver and SEV/TDX drivers. The S390
driver attestation request appears to accept two data blobs as input, as well
as a variety of vendor-specific header configurations.

Maybe the s390 attestation model is a special case, but I think we should
consider this issue. Since we don't have a common spec, there is a chance that
any superset ABI we define now may not meet future vendor requirements. One way
to handle it is to leave enough space in the generic ABI for future vendor
requirements.

I think it would be better if other vendors (like ARM or RISC-V) could comment
and confirm whether this proposal meets their needs.

> 
> Vendor specific ioctls for all of this feels like surrender when Linux
> already has the keys subsystem which has plenty of degrees of freedom
> for tracking blobs with signatures and using those blobs to instantiate
> other blobs. It already serves as the ABI wrapping various TPM
> implementations and marshaling keys for storage encryption and other use
> cases that intersect Confidential Computing.
> 
> The benefit of deprecating vendor-specific abstraction layers in
> userspace is secondary. The primary benefit is collaboration. It enables
> kernel developers from various architectures to collaborate on common
> infrastructure. If, referring back to my previous example, SEV adopts an
> RTMR-like mechanism and TDX adopts a vmpl-like mechanism it would be
> unfortunate if those efforts were siloed, duplicated, and needlessly
> differentiated to userspace. So while there are arguably a manageable
> number of basic arch attestation methods the planned expansion of those
> to build incremental functionality is where I believe we, as a
> community, will be glad that we invested in a "Linux format" for all of
> this.
> 
> An example, to show what the strawman patch below enables: (req_key is
> the sample program from "man 2 request_key")
> 
> # ./req_key guest_attest guest_attest:0:0-$desc $(cat user_data | base64)
> Key ID is 10e2f3a7
> # keyctl pipe 0x10e2f3a7 | hexdump -C
> 00000000  54 44 58 20 47 65 6e 65  72 61 74 65 64 20 51 75  |TDX Generated Qu|
> 00000010  6f 74 65 00 00 00 00 00  00 00 00 00 00 00 00 00  |ote.............|
> 00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00004000
> 
> This is the kernel instantiating a TDX Quote without the TDREPORT
> implementation detail ever leaving the kernel. Now, this is only the

IIUC, the idea here is to cache the quote data and return it to the user whenever
possible, right? If yes, I think such optimization may not be very useful for our
case. AFAIK, the quote data will change whenever there is a change in the guest
measurement data. Since a generated quote will not stay valid for long,
and quote generation requests are expected to be infrequent, we may not
get much benefit from caching the quote data. I think we can keep this logic simple
by directly retrieving the quote data from the quoting enclave whenever there is a
request from the user.

> top-half of what is needed. The missing bottom half takes that material
> and uses it to instantiate derived key material like the storage
> decryption key internal to the kernel. See "The Process" in
> Documentation/security/keys/request-key.rst for how the Keys subsystem
> handles the "keys for keys" use case.

This is only useful for key-server use case, right? Attestation can also be
used for use cases like pattern matching or uploading some secure data, etc.
Since key-server is not the only use case, does it make sense to support
this derived key feature?

> 
> ---
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index f79ab13a5c28..0f775847028e 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -54,4 +54,8 @@ source "drivers/virt/coco/sev-guest/Kconfig"
>  
>  source "drivers/virt/coco/tdx-guest/Kconfig"
>  
> +config GUEST_ATTEST
> +	tristate
> +	select KEYS
> +
>  endif
> diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
> index e9aa6fc96fab..66f6b838f8f4 100644
> --- a/drivers/virt/Makefile
> +++ b/drivers/virt/Makefile
> @@ -12,3 +12,4 @@ obj-$(CONFIG_ACRN_HSM)		+= acrn/
>  obj-$(CONFIG_EFI_SECRET)	+= coco/efi_secret/
>  obj-$(CONFIG_SEV_GUEST)		+= coco/sev-guest/
>  obj-$(CONFIG_INTEL_TDX_GUEST)	+= coco/tdx-guest/
> +obj-$(CONFIG_GUEST_ATTEST)	+= coco/guest-attest/
> diff --git a/drivers/virt/coco/guest-attest/Makefile b/drivers/virt/coco/guest-attest/Makefile
> new file mode 100644
> index 000000000000..5581c5a27588
> --- /dev/null
> +++ b/drivers/virt/coco/guest-attest/Makefile
> @@ -0,0 +1,2 @@
> +obj-$(CONFIG_GUEST_ATTEST) += guest_attest.o
> +guest_attest-y := key.o
> diff --git a/drivers/virt/coco/guest-attest/key.c b/drivers/virt/coco/guest-attest/key.c
> new file mode 100644
> index 000000000000..2a494b6dd7a7
> --- /dev/null
> +++ b/drivers/virt/coco/guest-attest/key.c
> @@ -0,0 +1,159 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +#include <linux/seq_file.h>
> +#include <linux/key-type.h>
> +#include <linux/module.h>
> +#include <linux/base64.h>
> +
> +#include <keys/request_key_auth-type.h>
> +#include <keys/user-type.h>
> +
> +#include "guest-attest.h"

Can you share your guest-attest.h?

> +
> +static LIST_HEAD(guest_attest_list);
> +static DECLARE_RWSEM(guest_attest_rwsem);
> +
> +static struct guest_attest_ops *fetch_ops(void)
> +{
> +	return list_first_entry_or_null(&guest_attest_list,
> +					struct guest_attest_ops, list);
> +}
> +
> +static struct guest_attest_ops *get_ops(void)
> +{
> +	down_read(&guest_attest_rwsem);
> +	return fetch_ops();
> +}
> +
> +static void put_ops(void)
> +{
> +	up_read(&guest_attest_rwsem);
> +}
> +
> +int register_guest_attest_ops(struct guest_attest_ops *ops)
> +{
> +	struct guest_attest_ops *conflict;
> +	int rc;
> +
> +	down_write(&guest_attest_rwsem);
> +	conflict = fetch_ops();
> +	if (conflict) {
> +		pr_err("\"%s\" ops already registered\n", conflict->name);
> +		rc = -EEXIST;
> +		goto out;
> +	}
> +	list_add(&ops->list, &guest_attest_list);
> +	try_module_get(ops->module);
> +	rc = 0;
> +out:
> +	up_write(&guest_attest_rwsem);
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(register_guest_attest_ops);
> +
> +void unregister_guest_attest_ops(struct guest_attest_ops *ops)
> +{
> +	down_write(&guest_attest_rwsem);
> +	list_del(&ops->list);
> +	up_write(&guest_attest_rwsem);
> +	module_put(ops->module);
> +}
> +EXPORT_SYMBOL_GPL(unregister_guest_attest_ops);
> +
> +static int __guest_attest_request_key(struct key *key, int level,
> +				      struct key *dest_keyring,
> +				      const char *callout_info, int callout_len,
> +				      struct key *authkey)
> +{
> +	struct guest_attest_ops *ops;
> +	void *payload = NULL;
> +	int rc = -ENOKEY, payload_len;
> +
> +	ops = get_ops();
> +	if (!ops)
> +		goto out;	/* get_ops() took the rwsem; drop it via put_ops() */
> +
> +	payload = kzalloc(max(GUEST_ATTEST_DATALEN, callout_len), GFP_KERNEL);
> +	if (!payload) {
> +		rc = -ENOMEM;
> +		goto out;
> +	}

Is the idea to get the values like vmpl part of the payload?

> +
> +	payload_len = base64_decode(callout_info, callout_len, payload);
> +	if (payload_len < 0 || payload_len > GUEST_ATTEST_DATALEN) {
> +		rc = -EINVAL;
> +		goto out;
> +	}
> +
> +	rc = ops->request_attest(key, level, dest_keyring, payload, payload_len,
> +				 authkey);
> +out:
> +	kfree(payload);
> +	put_ops();
> +	return rc;
> +}
> +
> +static int guest_attest_request_key(struct key *authkey, void *data)
> +{
> +	struct request_key_auth *rka = get_request_key_auth(authkey);
> +	struct key *key = rka->target_key;
> +	unsigned long long id;
> +	int rc, level;
> +
> +	pr_debug("desc: %s op: %s callout: %s\n", key->description, rka->op,
> +		 rka->callout_info ? (char *)rka->callout_info : "\"none\"");
> +
> +	if (sscanf(key->description, "guest_attest:%d:%llu", &level, &id) != 2)
> +		return -EINVAL;
> +

Can you explain some details about the id and level? It is not very clear why
we need it.
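
For anyone following along, the description check in the quoted hunk can be
tried in userspace. This is only a sketch of my reading of the strawman:
`parse_guest_attest_desc()` is a hypothetical helper name, and the format
string is copied from the sscanf() call above.

```c
#include <stdio.h>

/*
 * Userspace copy of the description check from the quoted strawman: accept
 * only "guest_attest:<level>:<id>". Returns 0 on success, -1 otherwise.
 */
static int parse_guest_attest_desc(const char *desc, int *level,
				   unsigned long long *id)
{
	if (sscanf(desc, "guest_attest:%d:%llu", level, id) != 2)
		return -1;
	return 0;
}
```

Note that sscanf() stops after the second conversion, so trailing text after
the id is ignored. That is why a description like "guest_attest:0:0-$desc"
from the req_key example passes this check.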

> +	if (!rka->callout_info) {
> +		rc = -EINVAL;
> +		goto out;
> +	}
> +
> +	rc = __guest_attest_request_key(key, level, rka->dest_keyring,
> +					rka->callout_info, rka->callout_len,
> +					authkey);
> +out:
> +	complete_request_key(authkey, rc);
> +	return rc;
> +}
> +
> +static int guest_attest_vet_description(const char *desc)
> +{
> +	unsigned long long id;
> +	int level;
> +
> +	if (sscanf(desc, "guest_attest:%d:%llu", &level, &id) != 2)
> +		return -EINVAL;
> +	return 0;
> +}
> +
> +static struct key_type key_type_guest_attest = {
> +	.name = "guest_attest",
> +	.preparse = user_preparse,
> +	.free_preparse = user_free_preparse,
> +	.instantiate = generic_key_instantiate,
> +	.revoke = user_revoke,
> +	.destroy = user_destroy,
> +	.describe = user_describe,
> +	.read = user_read,
> +	.vet_description = guest_attest_vet_description,
> +	.request_key = guest_attest_request_key,
> +};
> +
> +static int __init guest_attest_init(void)
> +{
> +	return register_key_type(&key_type_guest_attest);
> +}
> +
> +static void __exit guest_attest_exit(void)
> +{
> +	unregister_key_type(&key_type_guest_attest);
> +}
> +
> +module_init(guest_attest_init);
> +module_exit(guest_attest_exit);
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/virt/coco/tdx-guest/Kconfig b/drivers/virt/coco/tdx-guest/Kconfig
> index 14246fc2fb02..9a1ec85369fe 100644
> --- a/drivers/virt/coco/tdx-guest/Kconfig
> +++ b/drivers/virt/coco/tdx-guest/Kconfig
> @@ -1,6 +1,7 @@
>  config TDX_GUEST_DRIVER
>  	tristate "TDX Guest driver"
>  	depends on INTEL_TDX_GUEST
> +	select GUEST_ATTEST
>  	help
>  	  The driver provides userspace interface to communicate with
>  	  the TDX module to request the TDX guest details like attestation
> diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
> index 388491fa63a1..65b5aab284d9 100644
> --- a/drivers/virt/coco/tdx-guest/tdx-guest.c
> +++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
> @@ -13,11 +13,13 @@
>  #include <linux/string.h>
>  #include <linux/uaccess.h>
>  #include <linux/set_memory.h>
> +#include <linux/key-type.h>
>  
>  #include <uapi/linux/tdx-guest.h>
>  
>  #include <asm/cpu_device_id.h>
>  #include <asm/tdx.h>
> +#include "../guest-attest/guest-attest.h"
>  
>  /*
>   * Intel's SGX QE implementation generally uses Quote size less
> @@ -229,6 +231,62 @@ static const struct x86_cpu_id tdx_guest_ids[] = {
>  };
>  MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
>  
> +static int tdx_request_attest(struct key *key, int level,
> +			      struct key *dest_keyring, void *payload,
> +			      int payload_len, struct key *authkey)
> +{
> +	u8 *tdreport;
> +	long ret;
> +
> +	tdreport = kzalloc(TDX_REPORT_LEN, GFP_KERNEL);
> +	if (!tdreport)
> +		return -ENOMEM;
> +
> +	/* Generate TDREPORT0 using "TDG.MR.REPORT" TDCALL */
> +	ret = tdx_mcall_get_report0(payload, tdreport);
> +	if (ret)
> +		goto out;
> +
> +	mutex_lock(&quote_lock);
> +
> +	memset(qentry->buf, 0, qentry->buf_len);
> +	reinit_completion(&qentry->compl);
> +	qentry->valid = true;
> +
> +	/* Submit GetQuote Request using GetQuote hypercall */
> +	ret = tdx_hcall_get_quote(qentry->buf, qentry->buf_len);
> +	if (ret) {
> +		pr_err("GetQuote hypercall failed, status:%lx\n", ret);
> +		ret = -EIO;
> +		goto quote_failed;
> +	}
> +
> +	/*
> +	 * Although the GHCI specification does not state explicitly that
> +	 * the VMM must not wait indefinitely for the Quote request to be
> +	 * completed, a sane VMM should always notify the guest after a
> +	 * certain time, regardless of whether the Quote generation is
> +	 * successful or not.  For now just assume the VMM will do so.
> +	 */
> +	wait_for_completion(&qentry->compl);
> +
> +	ret = key_instantiate_and_link(key, qentry->buf, qentry->buf_len,
> +				       dest_keyring, authkey);
> +
> +quote_failed:
> +	qentry->valid = false;
> +	mutex_unlock(&quote_lock);
> +out:
> +	kfree(tdreport);
> +	return ret;
> +}
> +
> +static struct guest_attest_ops tdx_attest_ops = {
> +	.name = KBUILD_MODNAME,
> +	.module = THIS_MODULE,
> +	.request_attest = tdx_request_attest,
> +};
> +
>  static int __init tdx_guest_init(void)
>  {
>  	int ret;
> @@ -251,8 +309,14 @@ static int __init tdx_guest_init(void)
>  	if (ret)
>  		goto free_quote;
>  
> +	ret = register_guest_attest_ops(&tdx_attest_ops);
> +	if (ret)
> +		goto free_irq;
> +
>  	return 0;
>  
> +free_irq:
> +	tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
>  free_quote:
>  	free_quote_entry(qentry);
>  free_misc:
> @@ -264,6 +328,7 @@ module_init(tdx_guest_init);
>  
>  static void __exit tdx_guest_exit(void)
>  {
> +	unregister_guest_attest_ops(&tdx_attest_ops);
>  	tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
>  	free_quote_entry(qentry);
>  	misc_deregister(&tdx_misc_dev);

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


* Re: [PATCH v3 0/3] TDX Guest Quote generation support
  2023-06-24  4:05 ` Dan Williams
  2023-06-25 20:21   ` Dan Williams
@ 2023-06-26  3:07   ` Sathyanarayanan Kuppuswamy
  2023-06-26  4:31     ` Dan Williams
  1 sibling, 1 reply; 41+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-06-26  3:07 UTC (permalink / raw
  To: Dan Williams, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze, Chong Cai,
	Qinkun Bao, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc



On 6/23/23 9:05 PM, Dan Williams wrote:
> Kuppuswamy Sathyanarayanan wrote:
>> Hi All,
>>
>> In TDX guest, the attestation process is used to verify the TDX guest
>> trustworthiness to other entities before provisioning secrets to the
>> guest.
>>
>> The TDX guest attestation process consists of two steps:
>>
>> 1. TDREPORT generation
>> 2. Quote generation.
>>
>> The First step (TDREPORT generation) involves getting the TDX guest
>> measurement data in the format of TDREPORT which is further used to
>> validate the authenticity of the TDX guest. The second step involves
>> sending the TDREPORT to a Quoting Enclave (QE) server to generate a
>> remotely verifiable Quote. TDREPORT by design can only be verified on
>> the local platform. To support remote verification of the TDREPORT,
>> TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
>> locally and convert it to a remotely verifiable Quote. Although
>> attestation software can use communication methods like TCP/IP or
>> vsock to send the TDREPORT to QE, not all platforms support these
>> communication models. So TDX GHCI specification [1] defines a method
>> for Quote generation via hypercalls. Please check the discussion from
>> Google [2] and Alibaba [3] which clarifies the need for hypercall based
>> Quote generation support. This patch set adds this support.
>>
>> Support for TDREPORT generation already exists in the TDX guest driver. 
>> This patchset extends the same driver to add the Quote generation
>> support.
> 
> I missed that the TDREPORT ioctl() and this character device are already
> upstream. The TDREPORT ioctl() if it is only needed for quote generation
> seems a waste because it just retrieves a blob that needs to be turned
> around and injected back into the kernel to generate a quote.

Although the end goal is to generate the quote, the method the user chooses to
achieve it may differ for a variety of reasons. In this case, we're trying to
support the use case where the user will use methods like TCP/IP or vsock to
generate the Quote. They can use the GET_REPORT IOCTL to get the TDREPORT and
send it to the quoting enclave via the above-mentioned methods.  TDVMCALL-based
quote generation is intended for users who, for a variety of security reasons, do
not wish to use the methods described above.



-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


* Re: [PATCH v3 0/3] TDX Guest Quote generation support
  2023-06-26  3:07   ` Sathyanarayanan Kuppuswamy
@ 2023-06-26  4:31     ` Dan Williams
  2023-06-27  7:50       ` Chong Cai
  0 siblings, 1 reply; 41+ messages in thread
From: Dan Williams @ 2023-06-26  4:31 UTC (permalink / raw
  To: Sathyanarayanan Kuppuswamy, Dan Williams, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, Shuah Khan,
	Jonathan Corbet
  Cc: H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze, Chong Cai,
	Qinkun Bao, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc

Sathyanarayanan Kuppuswamy wrote:
> 
> 
> On 6/23/23 9:05 PM, Dan Williams wrote:
> > Kuppuswamy Sathyanarayanan wrote:
> >> Hi All,
> >>
> >> In TDX guest, the attestation process is used to verify the TDX guest
> >> trustworthiness to other entities before provisioning secrets to the
> >> guest.
> >>
> >> The TDX guest attestation process consists of two steps:
> >>
> >> 1. TDREPORT generation
> >> 2. Quote generation.
> >>
> >> The First step (TDREPORT generation) involves getting the TDX guest
> >> measurement data in the format of TDREPORT which is further used to
> >> validate the authenticity of the TDX guest. The second step involves
> >> sending the TDREPORT to a Quoting Enclave (QE) server to generate a
> >> remotely verifiable Quote. TDREPORT by design can only be verified on
> >> the local platform. To support remote verification of the TDREPORT,
> >> TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
> >> locally and convert it to a remotely verifiable Quote. Although
> >> attestation software can use communication methods like TCP/IP or
> >> vsock to send the TDREPORT to QE, not all platforms support these
> >> communication models. So TDX GHCI specification [1] defines a method
> >> for Quote generation via hypercalls. Please check the discussion from
> >> Google [2] and Alibaba [3] which clarifies the need for hypercall based
> >> Quote generation support. This patch set adds this support.
> >>
> >> Support for TDREPORT generation already exists in the TDX guest driver. 
> >> This patchset extends the same driver to add the Quote generation
> >> support.
> > 
> > I missed that the TDREPORT ioctl() and this character device are already
> > upstream. The TDREPORT ioctl() if it is only needed for quote generation
> > seems a waste because it just retrieves a blob that needs to be turned
> > around and injected back into the kernel to generate a quote.
> 
> Although the end goal is to generate the quote, the method the user chooses to
> achieve it may differ for a variety of reasons. In this case, we're trying to
> support the use case where the user will use methods like TCP/IP or vsock to
> generate the Quote. They can use the GET_REPORT IOCTL to get the TDREPORT and
> send it to the quoting enclave via the above-mentioned methods.  TDVMCALL-based
> quote generation is intended for users who, for a variety of security reasons, do
> not wish to use the methods described above.

This flexibility could be supported with keys if necessary, although I
would want to hear strong reasons, not a "variety of reasons", why
everyone cannot use a unified approach. ABI proliferation has a
maintenance cost and a collaboration cost. It is within the kernel
community's right to judge the cost of ABI flexibility and opt for a
constrained implementation if that cost is too high.

What I would ask of those who absolutely cannot support the TDVMCALL
method is to contribute a solution that intercepts the "upcall" to the
platform "guest_attest_ops" and turns it into a typical keys upcall to
userspace that can use the report data with a vsock tunnel.

That way the end result is still the same, a key established with the
TDX Quote evidence contained within a Linux-defined envelope.
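
As a rough illustration of that shape, here is a userspace sketch of a
single entry point that prefers a platform provider and otherwise hands the
request to an upcall path. Everything named here (`attest_fn`, `struct
attest_path`, `attest_request()`, and the two stubs) is hypothetical and
only stands in for the real "guest_attest_ops" and keys upcall machinery.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider callback: fill 'out', return its length or -errno. */
typedef int (*attest_fn)(const void *report_data, size_t len,
			 void *out, size_t out_len);

struct attest_path {
	attest_fn platform;	/* e.g. a hypercall-backed provider, may be NULL */
	attest_fn upcall;	/* e.g. a userspace helper using vsock, may be NULL */
};

/*
 * Try the platform provider first. If none is registered, or it reports the
 * operation as unsupported, hand the request to the upcall path. Either way
 * the caller gets its evidence through the same entry point.
 */
static int attest_request(const struct attest_path *path,
			  const void *report_data, size_t len,
			  void *out, size_t out_len)
{
	int rc = -ENOENT;

	if (path->platform) {
		rc = path->platform(report_data, len, out, out_len);
		if (rc != -EOPNOTSUPP)
			return rc;
	}
	if (path->upcall)
		rc = path->upcall(report_data, len, out, out_len);
	return rc;
}

/* Stub: a platform without hypercall-based quote generation. */
static int stub_unsupported(const void *report_data, size_t len,
			    void *out, size_t out_len)
{
	(void)report_data; (void)len; (void)out; (void)out_len;
	return -EOPNOTSUPP;
}

/* Stub: an upcall that just echoes the report data back as "evidence". */
static int stub_upcall(const void *report_data, size_t len,
		       void *out, size_t out_len)
{
	if (len > out_len)
		return -EINVAL;
	memcpy(out, report_data, len);
	return (int)len;
}
```

The transport choice stays behind the entry point, so userspace sees the
same result whichever path produced the evidence.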


* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-26  3:05       ` Sathyanarayanan Kuppuswamy
@ 2023-06-26 18:57         ` Dionna Amalie Glaze
  2023-06-27  0:39           ` Sathyanarayanan Kuppuswamy
  2023-06-28  0:11           ` Dan Williams
  2023-06-27 23:44         ` Dan Williams
  1 sibling, 2 replies; 41+ messages in thread
From: Dionna Amalie Glaze @ 2023-06-26 18:57 UTC (permalink / raw
  To: Sathyanarayanan Kuppuswamy
  Cc: Dan Williams, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, Shuah Khan, Jonathan Corbet, H . Peter Anvin,
	Kirill A . Shutemov, Tony Luck, Wander Lairson Costa, Erdem Aktas,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc, dhowells, brijesh.singh, atishp,
	gregkh, linux-coco, joey.gouly

On Sun, Jun 25, 2023 at 8:06 PM Sathyanarayanan Kuppuswamy
<sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
>
> Hi Dan,
>
> On 6/23/23 3:27 PM, Dan Williams wrote:
> > Dan Williams wrote:
> >> [ add David, Brijesh, and Atish]
> >>
> >> Kuppuswamy Sathyanarayanan wrote:
> >>> In TDX guest, the second stage of the attestation process is Quote
> >>> generation. This process is required to convert the locally generated
> >>> TDREPORT into a remotely verifiable Quote. It involves sending the
> >>> TDREPORT data to a Quoting Enclave (QE) which will verify the
> >>> integrity of the TDREPORT and sign it with an attestation key.
> >>>
> >>> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> >>> allow the user agent to get the TD Quote.
> >>>
> >>> Add a kernel selftest module to verify the Quote generation feature.
> >>>
> >>> TD Quote generation involves following steps:
> >>>
> >>> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> >>> * Embed the TDREPORT data in quote buffer and request for quote
> >>>   generation via TDX_CMD_GET_QUOTE IOCTL request.
> >>> * Upon completion of the GetQuote request, check for non zero value
> >>>   in the status field of Quote header to make sure the generated
> >>>   quote is valid.
> >>
> >> What this cover letter does not say is that this is adding another
> >> instance of the similar pattern as SNP_GET_REPORT.
> >>
> >> Linux is best served when multiple vendors trying to do similar
> >> operations are brought together behind a common ABI. We see this in the
> >> history of wrangling SCSI vendors behind common interfaces. Now multiple
> >> confidential computing vendors trying to develop similar flows with
> >> differentiated formats where that differentiation need not leak over the
> >> ABI boundary.
> > [..]
> >
> > Below is a rough mock up of this approach to demonstrate the direction.
> > Again, the goal is to define an ABI that can support any vendor's
> > arch-specific attestation method and key provisioning flows without
> > leaking vendor-specific details, or confidential material over the
> > user/kernel ABI.
>
> Thanks for working on this mock code and helping out. It gives me the
> general idea about your proposal.
>
> >
> > The observation is that there are a sufficient number of attestation
> > flows available to review where Linux can define a superset ABI to
> > contain them all. The other observation is that the implementations have
> > features that may cross-pollinate over time. For example the SEV
> > privilege level consideration ("vmpl"), and the TDX RTMR (think TPM
> > PCRs) mechanisms address generic Confidential Computing use cases.
>
>
> I agree with your point about VMPL and RTMR feature cases. This observation
> is valid for AMD SEV and TDX attestation flows. But I am not sure whether
> it will hold true for other vendor implementations. Our sample set is not
> good enough to make this conclusion. The reason for my concern is, if you
> check the ABI interface used in the S390 arch attestation driver
> (drivers/s390/char/uvdevice.c), you would notice that there is a significant
> difference between the ABI used in that driver and SEV/TDX drivers. The S390
> driver attestation request appears to accept two data blobs as input, as well
> as a variety of vendor-specific header configurations.
>
> Maybe the s390 attestation model is a special case, but I think we should consider
> this issue. Since we don't have a common spec, there is a chance that any
> superset ABI we define now may not meet future vendor requirements. One way to
> handle it is to leave enough space in the generic ABI to handle future vendor
> requirements.
>
> I think it would be better if other vendors (like ARM or RISC-V) can comment and
> confirm whether this proposal meets their demands.
>

The VMPL-based separation that will house the supervisor module known
as SVSM can have protocols that implement a TPM command interface, or
an RTMR-extension interface, and will also need to have an
SVSM-specific protocol attestation report format to keep the secure
chain of custody apparent. We'd have different formats and protocols
in the kernel, at least, to speak to each technology. I'm not sure
it's worth the trouble of papering over all the... 3-4 technologies
with similar but still weirdly different formats and ways of doing
things with an abstracted attestation ABI, especially since the output
all has to be interpreted in an architecture-specific way anyway.

ARM's Confidential Computing Realm Management Extensions (RME) seems
to be going along the lines of a runtime measurement register model
with their hardware enforced security. The number of registers isn't
prescribed in the spec.

+Joey Gouly +linux-coco@lists.linux.dev as far as RME is concerned, do
you know who would be best to weigh in on this discussion of a unified
attestation model?

> >
> > Vendor specific ioctls for all of this feels like surrender when Linux
> > already has the keys subsystem which has plenty of degrees of freedom
> > for tracking blobs with signatures and using those blobs to instantiate
> > other blobs. It already serves as the ABI wrapping various TPM
> > implementations and marshaling keys for storage encryption and other use
> > cases that intersect Confidential Computing.
> >
> > The benefit of deprecating vendor-specific abstraction layers in
> > userspace is secondary. The primary benefit is collaboration. It enables
> > kernel developers from various architectures to collaborate on common
> > infrastructure. If, referring back to my previous example, SEV adopts an
> > RTMR-like mechanism and TDX adopts a vmpl-like mechanism it would be
> > unfortunate if those efforts were siloed, duplicated, and needlessly
> > differentiated to userspace. So while there are arguably a manageable
> > number of basic arch attestation methods the planned expansion of those
> > to build incremental functionality is where I believe we, as a
> > community, will be glad that we invested in a "Linux format" for all of
> > this.
> >
> > An example, to show what the strawman patch below enables: (req_key is
> > the sample program from "man 2 request_key")
> >
> > # ./req_key guest_attest guest_attest:0:0-$desc $(cat user_data | base64)
> > Key ID is 10e2f3a7
> > # keyctl pipe 0x10e2f3a7 | hexdump -C
> > 00000000  54 44 58 20 47 65 6e 65  72 61 74 65 64 20 51 75  |TDX Generated Qu|
> > 00000010  6f 74 65 00 00 00 00 00  00 00 00 00 00 00 00 00  |ote.............|
> > 00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 00004000
> >
> > This is the kernel instantiating a TDX Quote without the TDREPORT
> > implementation detail ever leaving the kernel. Now, this is only the
>
> IIUC, the idea here is to cache the quote data and return it to the user whenever
> possible, right? If yes, I think such optimization may not be very useful for our
> case. AFAIK, the quote data will change whenever there is a change in the guest
> measurement data. Since the generated quote will not stay valid for long,
> and the frequency of quote generation requests is expected to be low, we may not
> get much benefit from caching the quote data. I think we can keep this logic simple
> by directly retrieving the quote data from the quoting enclave whenever there is a
> request from the user.
>
> > top-half of what is needed. The missing bottom half takes that material
> > and uses it to instantiate derived key material like the storage
> > decryption key internal to the kernel. See "The Process" in
> > Documentation/security/keys/request-key.rst for how the Keys subsystem
> > handles the "keys for keys" use case.
>
> This is only useful for key-server use case, right? Attestation can also be
> used for use cases like pattern matching or uploading some secure data, etc.
> Since key-server is not the only use case, does it make sense to support
> this derived key feature?
>
> >
> > ---
> > diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> > index f79ab13a5c28..0f775847028e 100644
> > --- a/drivers/virt/Kconfig
> > +++ b/drivers/virt/Kconfig
> > @@ -54,4 +54,8 @@ source "drivers/virt/coco/sev-guest/Kconfig"
> >
> >  source "drivers/virt/coco/tdx-guest/Kconfig"
> >
> > +config GUEST_ATTEST
> > +     tristate
> > +     select KEYS
> > +
> >  endif
> > diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
> > index e9aa6fc96fab..66f6b838f8f4 100644
> > --- a/drivers/virt/Makefile
> > +++ b/drivers/virt/Makefile
> > @@ -12,3 +12,4 @@ obj-$(CONFIG_ACRN_HSM)              += acrn/
> >  obj-$(CONFIG_EFI_SECRET)     += coco/efi_secret/
> >  obj-$(CONFIG_SEV_GUEST)              += coco/sev-guest/
> >  obj-$(CONFIG_INTEL_TDX_GUEST)        += coco/tdx-guest/
> > +obj-$(CONFIG_GUEST_ATTEST)   += coco/guest-attest/
> > diff --git a/drivers/virt/coco/guest-attest/Makefile b/drivers/virt/coco/guest-attest/Makefile
> > new file mode 100644
> > index 000000000000..5581c5a27588
> > --- /dev/null
> > +++ b/drivers/virt/coco/guest-attest/Makefile
> > @@ -0,0 +1,2 @@
> > +obj-$(CONFIG_GUEST_ATTEST) += guest_attest.o
> > +guest_attest-y := key.o
> > diff --git a/drivers/virt/coco/guest-attest/key.c b/drivers/virt/coco/guest-attest/key.c
> > new file mode 100644
> > index 000000000000..2a494b6dd7a7
> > --- /dev/null
> > +++ b/drivers/virt/coco/guest-attest/key.c
> > @@ -0,0 +1,159 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
> > +
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +#include <linux/seq_file.h>
> > +#include <linux/key-type.h>
> > +#include <linux/module.h>
> > +#include <linux/base64.h>
> > +
> > +#include <keys/request_key_auth-type.h>
> > +#include <keys/user-type.h>
> > +
> > +#include "guest-attest.h"
>
> Can you share your guest-attest.h?
>
> > +
> > +static LIST_HEAD(guest_attest_list);
> > +static DECLARE_RWSEM(guest_attest_rwsem);
> > +
> > +static struct guest_attest_ops *fetch_ops(void)
> > +{
> > +     return list_first_entry_or_null(&guest_attest_list,
> > +                                     struct guest_attest_ops, list);
> > +}
> > +
> > +static struct guest_attest_ops *get_ops(void)
> > +{
> > +     down_read(&guest_attest_rwsem);
> > +     return fetch_ops();
> > +}
> > +
> > +static void put_ops(void)
> > +{
> > +     up_read(&guest_attest_rwsem);
> > +}
> > +
> > +int register_guest_attest_ops(struct guest_attest_ops *ops)
> > +{
> > +     struct guest_attest_ops *conflict;
> > +     int rc;
> > +
> > +     down_write(&guest_attest_rwsem);
> > +     conflict = fetch_ops();
> > +     if (conflict) {
> > +             pr_err("\"%s\" ops already registered\n", conflict->name);
> > +             rc = -EEXIST;
> > +             goto out;
> > +     }
> > +     list_add(&ops->list, &guest_attest_list);
> > +     try_module_get(ops->module);
> > +     rc = 0;
> > +out:
> > +     up_write(&guest_attest_rwsem);
> > +     return rc;
> > +}
> > +EXPORT_SYMBOL_GPL(register_guest_attest_ops);
> > +
> > +void unregister_guest_attest_ops(struct guest_attest_ops *ops)
> > +{
> > +     down_write(&guest_attest_rwsem);
> > +     list_del(&ops->list);
> > +     up_write(&guest_attest_rwsem);
> > +     module_put(ops->module);
> > +}
> > +EXPORT_SYMBOL_GPL(unregister_guest_attest_ops);
> > +
> > +static int __guest_attest_request_key(struct key *key, int level,
> > +                                   struct key *dest_keyring,
> > +                                   const char *callout_info, int callout_len,
> > +                                   struct key *authkey)
> > +{
> > +     struct guest_attest_ops *ops;
> > +     void *payload = NULL;
> > +     int rc, payload_len;
> > +
> > +     ops = get_ops();
> > +     if (!ops)
> > +             return -ENOKEY;
> > +
> > +     payload = kzalloc(max(GUEST_ATTEST_DATALEN, callout_len), GFP_KERNEL);
> > +     if (!payload) {
> > +             rc = -ENOMEM;
> > +             goto out;
> > +     }
>
> Is the idea to get the values like vmpl part of the payload?
>
> > +
> > +     payload_len = base64_decode(callout_info, callout_len, payload);
> > +     if (payload_len < 0 || payload_len > GUEST_ATTEST_DATALEN) {
> > +             rc = -EINVAL;
> > +             goto out;
> > +     }
> > +
> > +     rc = ops->request_attest(key, level, dest_keyring, payload, payload_len,
> > +                              authkey);
> > +out:
> > +     kfree(payload);
> > +     put_ops();
> > +     return rc;
> > +}
> > +
> > +static int guest_attest_request_key(struct key *authkey, void *data)
> > +{
> > +     struct request_key_auth *rka = get_request_key_auth(authkey);
> > +     struct key *key = rka->target_key;
> > +     unsigned long long id;
> > +     int rc, level;
> > +
> > +     pr_debug("desc: %s op: %s callout: %s\n", key->description, rka->op,
> > +              rka->callout_info ? (char *)rka->callout_info : "\"none\"");
> > +
> > +     if (sscanf(key->description, "guest_attest:%d:%llu", &level, &id) != 2)
> > +             return -EINVAL;
> > +
>
> Can you explain some details about the id and level? It is not very clear why
> we need it.
>
> > +     if (!rka->callout_info) {
> > +             rc = -EINVAL;
> > +             goto out;
> > +     }
> > +
> > +     rc = __guest_attest_request_key(key, level, rka->dest_keyring,
> > +                                     rka->callout_info, rka->callout_len,
> > +                                     authkey);
> > +out:
> > +     complete_request_key(authkey, rc);
> > +     return rc;
> > +}
> > +
> > +static int guest_attest_vet_description(const char *desc)
> > +{
> > +     unsigned long long id;
> > +     int level;
> > +
> > +     if (sscanf(desc, "guest_attest:%d:%llu", &level, &id) != 2)
> > +             return -EINVAL;
> > +     return 0;
> > +}
> > +
> > +static struct key_type key_type_guest_attest = {
> > +     .name = "guest_attest",
> > +     .preparse = user_preparse,
> > +     .free_preparse = user_free_preparse,
> > +     .instantiate = generic_key_instantiate,
> > +     .revoke = user_revoke,
> > +     .destroy = user_destroy,
> > +     .describe = user_describe,
> > +     .read = user_read,
> > +     .vet_description = guest_attest_vet_description,
> > +     .request_key = guest_attest_request_key,
> > +};
> > +
> > +static int __init guest_attest_init(void)
> > +{
> > +     return register_key_type(&key_type_guest_attest);
> > +}
> > +
> > +static void __exit guest_attest_exit(void)
> > +{
> > +     unregister_key_type(&key_type_guest_attest);
> > +}
> > +
> > +module_init(guest_attest_init);
> > +module_exit(guest_attest_exit);
> > +MODULE_LICENSE("GPL v2");
> > diff --git a/drivers/virt/coco/tdx-guest/Kconfig b/drivers/virt/coco/tdx-guest/Kconfig
> > index 14246fc2fb02..9a1ec85369fe 100644
> > --- a/drivers/virt/coco/tdx-guest/Kconfig
> > +++ b/drivers/virt/coco/tdx-guest/Kconfig
> > @@ -1,6 +1,7 @@
> >  config TDX_GUEST_DRIVER
> >       tristate "TDX Guest driver"
> >       depends on INTEL_TDX_GUEST
> > +     select GUEST_ATTEST
> >       help
> >         The driver provides userspace interface to communicate with
> >         the TDX module to request the TDX guest details like attestation
> > diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
> > index 388491fa63a1..65b5aab284d9 100644
> > --- a/drivers/virt/coco/tdx-guest/tdx-guest.c
> > +++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
> > @@ -13,11 +13,13 @@
> >  #include <linux/string.h>
> >  #include <linux/uaccess.h>
> >  #include <linux/set_memory.h>
> > +#include <linux/key-type.h>
> >
> >  #include <uapi/linux/tdx-guest.h>
> >
> >  #include <asm/cpu_device_id.h>
> >  #include <asm/tdx.h>
> > +#include "../guest-attest/guest-attest.h"
> >
> >  /*
> >   * Intel's SGX QE implementation generally uses Quote size less
> > @@ -229,6 +231,62 @@ static const struct x86_cpu_id tdx_guest_ids[] = {
> >  };
> >  MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
> >
> > +static int tdx_request_attest(struct key *key, int level,
> > +                           struct key *dest_keyring, void *payload,
> > +                           int payload_len, struct key *authkey)
> > +{
> > +     u8 *tdreport;
> > +     long ret;
> > +
> > +     tdreport = kzalloc(TDX_REPORT_LEN, GFP_KERNEL);
> > +     if (!tdreport)
> > +             return -ENOMEM;
> > +
> > +     /* Generate TDREPORT0 using "TDG.MR.REPORT" TDCALL */
> > +     ret = tdx_mcall_get_report0(payload, tdreport);
> > +     if (ret)
> > +             goto out;
> > +
> > +     mutex_lock(&quote_lock);
> > +
> > +     memset(qentry->buf, 0, qentry->buf_len);
> > +     reinit_completion(&qentry->compl);
> > +     qentry->valid = true;
> > +
> > +     /* Submit GetQuote Request using GetQuote hypercall */
> > +     ret = tdx_hcall_get_quote(qentry->buf, qentry->buf_len);
> > +     if (ret) {
> > +             pr_err("GetQuote hypercall failed, status:%lx\n", ret);
> > +             ret = -EIO;
> > +             goto quote_failed;
> > +     }
> > +
> > +     /*
> > +      * Although the GHCI specification does not state explicitly that
> > +      * the VMM must not wait indefinitely for the Quote request to be
> > +      * completed, a sane VMM should always notify the guest after a
> > +      * certain time, regardless of whether the Quote generation is
> > +      * successful or not.  For now just assume the VMM will do so.
> > +      */
> > +     wait_for_completion(&qentry->compl);
> > +
> > +     ret = key_instantiate_and_link(key, qentry->buf, qentry->buf_len,
> > +                                    dest_keyring, authkey);
> > +
> > +quote_failed:
> > +     qentry->valid = false;
> > +     mutex_unlock(&quote_lock);
> > +out:
> > +     kfree(tdreport);
> > +     return ret;
> > +}
> > +
> > +static struct guest_attest_ops tdx_attest_ops = {
> > +     .name = KBUILD_MODNAME,
> > +     .module = THIS_MODULE,
> > +     .request_attest = tdx_request_attest,
> > +};
> > +
> >  static int __init tdx_guest_init(void)
> >  {
> >       int ret;
> > @@ -251,8 +309,14 @@ static int __init tdx_guest_init(void)
> >       if (ret)
> >               goto free_quote;
> >
> > +     ret = register_guest_attest_ops(&tdx_attest_ops);
> > +     if (ret)
> > +             goto free_irq;
> > +
> >       return 0;
> >
> > +free_irq:
> > +     tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
> >  free_quote:
> >       free_quote_entry(qentry);
> >  free_misc:
> > @@ -264,6 +328,7 @@ module_init(tdx_guest_init);
> >
> >  static void __exit tdx_guest_exit(void)
> >  {
> > +     unregister_guest_attest_ops(&tdx_attest_ops);
> >       tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
> >       free_quote_entry(qentry);
> >       misc_deregister(&tdx_misc_dev);
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
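
For anyone following the strawman's input handling, here is a minimal
userspace sketch of the two checks it applies before calling into the
vendor driver: the key description must match "guest_attest:<level>:<id>",
and the base64-decoded callout data must fit in GUEST_ATTEST_DATALEN.
guest-attest.h was not posted, so the value 64 below is only an assumption,
and the helper names are hypothetical. libc sscanf() stands in for the
kernel's sscanf(), which may differ in corner cases.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: the real value lives in the unposted guest-attest.h. */
#define GUEST_ATTEST_DATALEN 64

/*
 * Same format string as guest_attest_vet_description() in the strawman.
 * Returns 0 when both fields parse, -1 otherwise.
 */
static int vet_description(const char *desc, int *level,
			   unsigned long long *id)
{
	if (sscanf(desc, "guest_attest:%d:%llu", level, id) != 2)
		return -1;
	return 0;
}

/*
 * Upper bound on the bytes produced by decoding n base64 characters.
 * It never exceeds n, which is why a buffer of callout_len bytes is
 * large enough to hold the decoded callout data.
 */
static size_t base64_max_decoded_len(size_t n)
{
	return n * 3 / 4;
}

/* Same bound check the strawman applies to base64_decode()'s result. */
static int payload_len_ok(long payload_len)
{
	return payload_len >= 0 && payload_len <= GUEST_ATTEST_DATALEN;
}
```

With these helpers, a description such as "guest_attest:0:0-$desc" from the
req_key example parses to level 0 and id 0, and anything after the id is
ignored by the format string. A description that does not start with
"guest_attest:" is rejected.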



--
-Dionna Glaze, PhD (she/her)


* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-26 18:57         ` Dionna Amalie Glaze
@ 2023-06-27  0:39           ` Sathyanarayanan Kuppuswamy
  2023-06-28 15:41             ` Samuel Ortiz
  2023-06-28  0:11           ` Dan Williams
  1 sibling, 1 reply; 41+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-06-27  0:39 UTC (permalink / raw
  To: Dionna Amalie Glaze
  Cc: Dan Williams, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, Shuah Khan, Jonathan Corbet, H . Peter Anvin,
	Kirill A . Shutemov, Tony Luck, Wander Lairson Costa, Erdem Aktas,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc, dhowells, brijesh.singh, atishp,
	gregkh, linux-coco, joey.gouly, Atish Kumar Patra

+Atish

Atish, any comments on this topic from RISC-V?

On 6/26/23 11:57 AM, Dionna Amalie Glaze wrote:
> On Sun, Jun 25, 2023 at 8:06 PM Sathyanarayanan Kuppuswamy
> <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
>>
>> Hi Dan,
>>
>> On 6/23/23 3:27 PM, Dan Williams wrote:
>>> Dan Williams wrote:
>>>> [ add David, Brijesh, and Atish]
>>>>
>>>> Kuppuswamy Sathyanarayanan wrote:
>>>>> In TDX guest, the second stage of the attestation process is Quote
>>>>> generation. This process is required to convert the locally generated
>>>>> TDREPORT into a remotely verifiable Quote. It involves sending the
>>>>> TDREPORT data to a Quoting Enclave (QE) which will verify the
>>>>> integrity of the TDREPORT and sign it with an attestation key.
>>>>>
>>>>> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
>>>>> allow the user agent to get the TD Quote.
>>>>>
>>>>> Add a kernel selftest module to verify the Quote generation feature.
>>>>>
>>>>> TD Quote generation involves following steps:
>>>>>
>>>>> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
>>>>> * Embed the TDREPORT data in quote buffer and request for quote
>>>>>   generation via TDX_CMD_GET_QUOTE IOCTL request.
>>>>> * Upon completion of the GetQuote request, check for non zero value
>>>>>   in the status field of Quote header to make sure the generated
>>>>>   quote is valid.
>>>>
>>>> What this cover letter does not say is that this is adding another
>>>> instance of the similar pattern as SNP_GET_REPORT.
>>>>
>>>> Linux is best served when multiple vendors trying to do similar
>>>> operations are brought together behind a common ABI. We see this in the
>>>> history of wrangling SCSI vendors behind common interfaces. Now multiple
>>>> confidential computing vendors trying to develop similar flows with
>>>> differentiated formats where that differentiation need not leak over the
>>>> ABI boundary.
>>> [..]
>>>
>>> Below is a rough mock up of this approach to demonstrate the direction.
>>> Again, the goal is to define an ABI that can support any vendor's
>>> arch-specific attestation method and key provisioning flows without
>>> leaking vendor-specific details, or confidential material over the
>>> user/kernel ABI.
>>
>> Thanks for working on this mock code and helping out. It gives me the
>> general idea about your proposal.
>>
>>>
>>> The observation is that there are a sufficient number of attestation
>>> flows available to review where Linux can define a superset ABI to
>>> contain them all. The other observation is that the implementations have
>>> features that may cross-pollinate over time. For example the SEV
>>> privilege level consideration ("vmpl"), and the TDX RTMR (think TPM
>>> PCRs) mechanisms address generic Confidential Computing use cases.
>>
>>
>> I agree with your point about VMPL and RTMR feature cases. This observation
>> is valid for AMD SEV and TDX attestation flows. But I am not sure whether
>> it will hold true for other vendor implementations. Our sample set is not
>> good enough to make this conclusion. The reason for my concern is, if you
>> check the ABI interface used in the S390 arch attestation driver
>> (drivers/s390/char/uvdevice.c), you would notice that there is a significant
>> difference between the ABI used in that driver and SEV/TDX drivers. The S390
>> driver attestation request appears to accept two data blobs as input, as well
>> as a variety of vendor-specific header configurations.
>>
>> Maybe the s390 attestation model is a special case, but I think we should consider
>> this issue. Since we don't have a common spec, there is a chance that any
>> superset ABI we define now may not meet future vendor requirements. One way to
>> handle it is to leave enough space in the generic ABI to handle future vendor
>> requirements.
>>
>> I think it would be better if other vendors (like ARM or RISC-V) can comment and
>> confirm whether this proposal meets their demands.
>>
> 
> The VMPL-based separation that will house the supervisor module known
> as SVSM can have protocols that implement a TPM command interface, or
> an RTMR-extension interface, and will also need to have an
> SVSM-specific protocol attestation report format to keep the secure
> chain of custody apparent. We'd have different formats and protocols
> in the kernel, at least, to speak to each technology. I'm not sure
> it's worth the trouble of papering over all the... 3-4 technologies
> with similar but still weirdly different formats and ways of doing
> things with an abstracted attestation ABI, especially since the output
> all has to be interpreted in an architecture-specific way anyway.
> 
> ARM's Confidential Computing Realm Management Extensions (RME) seems
> to be going along the lines of a runtime measurement register model
> with their hardware enforced security. The number of registers isn't
> prescribed in the spec.
> 
> +Joey Gouly +linux-coco@lists.linux.dev as far as RME is concerned, do
> you know who would be best to weigh in on this discussion of a unified
> attestation model?


> 
>>>
>>> Vendor specific ioctls for all of this feels like surrender when Linux
>>> already has the keys subsystem which has plenty of degrees of freedom
>>> for tracking blobs with signatures and using those blobs to instantiate
>>> other blobs. It already serves as the ABI wrapping various TPM
>>> implementations and marshaling keys for storage encryption and other use
>>> cases that intersect Confidential Computing.
>>>
>>> The benefit of deprecating vendor-specific abstraction layers in
>>> userspace is secondary. The primary benefit is collaboration. It enables
>>> kernel developers from various architectures to collaborate on common
>>> infrastructure. If, referring back to my previous example, SEV adopts an
>>> RTMR-like mechanism and TDX adopts a vmpl-like mechanism it would be
>>> unfortunate if those efforts were siloed, duplicated, and needlessly
>>> differentiated to userspace. So while there are arguably a manageable
>>> number of basic arch attestation methods the planned expansion of those
>>> to build incremental functionality is where I believe we, as a
>>> community, will be glad that we invested in a "Linux format" for all of
>>> this.
>>>
>>> An example, to show what the strawman patch below enables: (req_key is
>>> the sample program from "man 2 request_key")
>>>
>>> # ./req_key guest_attest guest_attest:0:0-$desc $(cat user_data | base64)
>>> Key ID is 10e2f3a7
>>> # keyctl pipe 0x10e2f3a7 | hexdump -C
>>> 00000000  54 44 58 20 47 65 6e 65  72 61 74 65 64 20 51 75  |TDX Generated Qu|
>>> 00000010  6f 74 65 00 00 00 00 00  00 00 00 00 00 00 00 00  |ote.............|
>>> 00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>>> *
>>> 00004000
>>>
>>> This is the kernel instantiating a TDX Quote without the TDREPORT
>>> implementation detail ever leaving the kernel. Now, this is only the
>>
>> IIUC, the idea here is to cache the quote data and return it to the user whenever
>> possible, right? If yes, I think such optimization may not be very useful for our
>> case. AFAIK, the quote data will change whenever there is a change in the guest
>> measurement data. Since the generated quote will not stay valid for long,
>> and the frequency of quote generation requests is expected to be low, we may not
>> get much benefit from caching the quote data. I think we can keep this logic simple
>> by directly retrieving the quote data from the quoting enclave whenever there is a
>> request from the user.
>>
>>> top-half of what is needed. The missing bottom half takes that material
>>> and uses it to instantiate derived key material like the storage
>>> decryption key internal to the kernel. See "The Process" in
>>> Documentation/security/keys/request-key.rst for how the Keys subsystem
>>> handles the "keys for keys" use case.
>>
>> This is only useful for key-server use case, right? Attestation can also be
>> used for use cases like pattern matching or uploading some secure data, etc.
>> Since key-server is not the only use case, does it make sense to support
>> this derived key feature?
>>
>>>
>>> ---
>>> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
>>> index f79ab13a5c28..0f775847028e 100644
>>> --- a/drivers/virt/Kconfig
>>> +++ b/drivers/virt/Kconfig
>>> @@ -54,4 +54,8 @@ source "drivers/virt/coco/sev-guest/Kconfig"
>>>
>>>  source "drivers/virt/coco/tdx-guest/Kconfig"
>>>
>>> +config GUEST_ATTEST
>>> +     tristate
>>> +     select KEYS
>>> +
>>>  endif
>>> diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
>>> index e9aa6fc96fab..66f6b838f8f4 100644
>>> --- a/drivers/virt/Makefile
>>> +++ b/drivers/virt/Makefile
>>> @@ -12,3 +12,4 @@ obj-$(CONFIG_ACRN_HSM)              += acrn/
>>>  obj-$(CONFIG_EFI_SECRET)     += coco/efi_secret/
>>>  obj-$(CONFIG_SEV_GUEST)              += coco/sev-guest/
>>>  obj-$(CONFIG_INTEL_TDX_GUEST)        += coco/tdx-guest/
>>> +obj-$(CONFIG_GUEST_ATTEST)   += coco/guest-attest/
>>> diff --git a/drivers/virt/coco/guest-attest/Makefile b/drivers/virt/coco/guest-attest/Makefile
>>> new file mode 100644
>>> index 000000000000..5581c5a27588
>>> --- /dev/null
>>> +++ b/drivers/virt/coco/guest-attest/Makefile
>>> @@ -0,0 +1,2 @@
>>> +obj-$(CONFIG_GUEST_ATTEST) += guest_attest.o
>>> +guest_attest-y := key.o
>>> diff --git a/drivers/virt/coco/guest-attest/key.c b/drivers/virt/coco/guest-attest/key.c
>>> new file mode 100644
>>> index 000000000000..2a494b6dd7a7
>>> --- /dev/null
>>> +++ b/drivers/virt/coco/guest-attest/key.c
>>> @@ -0,0 +1,159 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
>>> +
>>> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>> +#include <linux/seq_file.h>
>>> +#include <linux/key-type.h>
>>> +#include <linux/module.h>
>>> +#include <linux/base64.h>
>>> +
>>> +#include <keys/request_key_auth-type.h>
>>> +#include <keys/user-type.h>
>>> +
>>> +#include "guest-attest.h"
>>
>> Can you share your guest-attest.h?
>>
>>> +
>>> +static LIST_HEAD(guest_attest_list);
>>> +static DECLARE_RWSEM(guest_attest_rwsem);
>>> +
>>> +static struct guest_attest_ops *fetch_ops(void)
>>> +{
>>> +     return list_first_entry_or_null(&guest_attest_list,
>>> +                                     struct guest_attest_ops, list);
>>> +}
>>> +
>>> +static struct guest_attest_ops *get_ops(void)
>>> +{
>>> +     down_read(&guest_attest_rwsem);
>>> +     return fetch_ops();
>>> +}
>>> +
>>> +static void put_ops(void)
>>> +{
>>> +     up_read(&guest_attest_rwsem);
>>> +}
>>> +
>>> +int register_guest_attest_ops(struct guest_attest_ops *ops)
>>> +{
>>> +     struct guest_attest_ops *conflict;
>>> +     int rc;
>>> +
>>> +     down_write(&guest_attest_rwsem);
>>> +     conflict = fetch_ops();
>>> +     if (conflict) {
>>> +             pr_err("\"%s\" ops already registered\n", conflict->name);
>>> +             rc = -EEXIST;
>>> +             goto out;
>>> +     }
>>> +     list_add(&ops->list, &guest_attest_list);
>>> +     try_module_get(ops->module);
>>> +     rc = 0;
>>> +out:
>>> +     up_write(&guest_attest_rwsem);
>>> +     return rc;
>>> +}
>>> +EXPORT_SYMBOL_GPL(register_guest_attest_ops);
>>> +
>>> +void unregister_guest_attest_ops(struct guest_attest_ops *ops)
>>> +{
>>> +     down_write(&guest_attest_rwsem);
>>> +     list_del(&ops->list);
>>> +     up_write(&guest_attest_rwsem);
>>> +     module_put(ops->module);
>>> +}
>>> +EXPORT_SYMBOL_GPL(unregister_guest_attest_ops);
>>> +
>>> +static int __guest_attest_request_key(struct key *key, int level,
>>> +                                   struct key *dest_keyring,
>>> +                                   const char *callout_info, int callout_len,
>>> +                                   struct key *authkey)
>>> +{
>>> +     struct guest_attest_ops *ops;
>>> +     void *payload = NULL;
>>> +     int rc, payload_len;
>>> +
>>> +     ops = get_ops();
>>> +     if (!ops)
>>> +             return -ENOKEY;
>>> +
>>> +     payload = kzalloc(max(GUEST_ATTEST_DATALEN, callout_len), GFP_KERNEL);
>>> +     if (!payload) {
>>> +             rc = -ENOMEM;
>>> +             goto out;
>>> +     }
>>
>> Is the idea to get the values like vmpl part of the payload?
>>
>>> +
>>> +     payload_len = base64_decode(callout_info, callout_len, payload);
>>> +     if (payload_len < 0 || payload_len > GUEST_ATTEST_DATALEN) {
>>> +             rc = -EINVAL;
>>> +             goto out;
>>> +     }
>>> +
>>> +     rc = ops->request_attest(key, level, dest_keyring, payload, payload_len,
>>> +                              authkey);
>>> +out:
>>> +     kfree(payload);
>>> +     put_ops();
>>> +     return rc;
>>> +}
>>> +
>>> +static int guest_attest_request_key(struct key *authkey, void *data)
>>> +{
>>> +     struct request_key_auth *rka = get_request_key_auth(authkey);
>>> +     struct key *key = rka->target_key;
>>> +     unsigned long long id;
>>> +     int rc, level;
>>> +
>>> +     pr_debug("desc: %s op: %s callout: %s\n", key->description, rka->op,
>>> +              rka->callout_info ? (char *)rka->callout_info : "\"none\"");
>>> +
>>> +     if (sscanf(key->description, "guest_attest:%d:%llu", &level, &id) != 2)
>>> +             return -EINVAL;
>>> +
>>
>> Can you explain some details about the id and level? It is not very clear why
>> we need it.
>>
>>> +     if (!rka->callout_info) {
>>> +             rc = -EINVAL;
>>> +             goto out;
>>> +     }
>>> +
>>> +     rc = __guest_attest_request_key(key, level, rka->dest_keyring,
>>> +                                     rka->callout_info, rka->callout_len,
>>> +                                     authkey);
>>> +out:
>>> +     complete_request_key(authkey, rc);
>>> +     return rc;
>>> +}
>>> +
>>> +static int guest_attest_vet_description(const char *desc)
>>> +{
>>> +     unsigned long long id;
>>> +     int level;
>>> +
>>> +     if (sscanf(desc, "guest_attest:%d:%llu", &level, &id) != 2)
>>> +             return -EINVAL;
>>> +     return 0;
>>> +}
>>> +
>>> +static struct key_type key_type_guest_attest = {
>>> +     .name = "guest_attest",
>>> +     .preparse = user_preparse,
>>> +     .free_preparse = user_free_preparse,
>>> +     .instantiate = generic_key_instantiate,
>>> +     .revoke = user_revoke,
>>> +     .destroy = user_destroy,
>>> +     .describe = user_describe,
>>> +     .read = user_read,
>>> +     .vet_description = guest_attest_vet_description,
>>> +     .request_key = guest_attest_request_key,
>>> +};
>>> +
>>> +static int __init guest_attest_init(void)
>>> +{
>>> +     return register_key_type(&key_type_guest_attest);
>>> +}
>>> +
>>> +static void __exit guest_attest_exit(void)
>>> +{
>>> +     unregister_key_type(&key_type_guest_attest);
>>> +}
>>> +
>>> +module_init(guest_attest_init);
>>> +module_exit(guest_attest_exit);
>>> +MODULE_LICENSE("GPL v2");
>>> diff --git a/drivers/virt/coco/tdx-guest/Kconfig b/drivers/virt/coco/tdx-guest/Kconfig
>>> index 14246fc2fb02..9a1ec85369fe 100644
>>> --- a/drivers/virt/coco/tdx-guest/Kconfig
>>> +++ b/drivers/virt/coco/tdx-guest/Kconfig
>>> @@ -1,6 +1,7 @@
>>>  config TDX_GUEST_DRIVER
>>>       tristate "TDX Guest driver"
>>>       depends on INTEL_TDX_GUEST
>>> +     select GUEST_ATTEST
>>>       help
>>>         The driver provides userspace interface to communicate with
>>>         the TDX module to request the TDX guest details like attestation
>>> diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
>>> index 388491fa63a1..65b5aab284d9 100644
>>> --- a/drivers/virt/coco/tdx-guest/tdx-guest.c
>>> +++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
>>> @@ -13,11 +13,13 @@
>>>  #include <linux/string.h>
>>>  #include <linux/uaccess.h>
>>>  #include <linux/set_memory.h>
>>> +#include <linux/key-type.h>
>>>
>>>  #include <uapi/linux/tdx-guest.h>
>>>
>>>  #include <asm/cpu_device_id.h>
>>>  #include <asm/tdx.h>
>>> +#include "../guest-attest/guest-attest.h"
>>>
>>>  /*
>>>   * Intel's SGX QE implementation generally uses Quote size less
>>> @@ -229,6 +231,62 @@ static const struct x86_cpu_id tdx_guest_ids[] = {
>>>  };
>>>  MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
>>>
>>> +static int tdx_request_attest(struct key *key, int level,
>>> +                           struct key *dest_keyring, void *payload,
>>> +                           int payload_len, struct key *authkey)
>>> +{
>>> +     u8 *tdreport;
>>> +     long ret;
>>> +
>>> +     tdreport = kzalloc(TDX_REPORT_LEN, GFP_KERNEL);
>>> +     if (!tdreport)
>>> +             return -ENOMEM;
>>> +
>>> +     /* Generate TDREPORT0 using "TDG.MR.REPORT" TDCALL */
>>> +     ret = tdx_mcall_get_report0(payload, tdreport);
>>> +     if (ret)
>>> +             goto out;
>>> +
>>> +     mutex_lock(&quote_lock);
>>> +
>>> +     memset(qentry->buf, 0, qentry->buf_len);
>>> +     reinit_completion(&qentry->compl);
>>> +     qentry->valid = true;
>>> +
>>> +     /* Submit GetQuote Request using GetQuote hypercall */
>>> +     ret = tdx_hcall_get_quote(qentry->buf, qentry->buf_len);
>>> +     if (ret) {
>>> +             pr_err("GetQuote hypercall failed, status:%lx\n", ret);
>>> +             ret = -EIO;
>>> +             goto quote_failed;
>>> +     }
>>> +
>>> +     /*
>>> +      * Although the GHCI specification does not state explicitly that
>>> +      * the VMM must not wait indefinitely for the Quote request to be
>>> +      * completed, a sane VMM should always notify the guest after a
>>> +      * certain time, regardless of whether the Quote generation is
>>> +      * successful or not.  For now just assume the VMM will do so.
>>> +      */
>>> +     wait_for_completion(&qentry->compl);
>>> +
>>> +     ret = key_instantiate_and_link(key, qentry->buf, qentry->buf_len,
>>> +                                    dest_keyring, authkey);
>>> +
>>> +quote_failed:
>>> +     qentry->valid = false;
>>> +     mutex_unlock(&quote_lock);
>>> +out:
>>> +     kfree(tdreport);
>>> +     return ret;
>>> +}
>>> +
>>> +static struct guest_attest_ops tdx_attest_ops = {
>>> +     .name = KBUILD_MODNAME,
>>> +     .module = THIS_MODULE,
>>> +     .request_attest = tdx_request_attest,
>>> +};
>>> +
>>>  static int __init tdx_guest_init(void)
>>>  {
>>>       int ret;
>>> @@ -251,8 +309,14 @@ static int __init tdx_guest_init(void)
>>>       if (ret)
>>>               goto free_quote;
>>>
>>> +     ret = register_guest_attest_ops(&tdx_attest_ops);
>>> +     if (ret)
>>> +             goto free_irq;
>>> +
>>>       return 0;
>>>
>>> +free_irq:
>>> +     tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
>>>  free_quote:
>>>       free_quote_entry(qentry);
>>>  free_misc:
>>> @@ -264,6 +328,7 @@ module_init(tdx_guest_init);
>>>
>>>  static void __exit tdx_guest_exit(void)
>>>  {
>>> +     unregister_guest_attest_ops(&tdx_attest_ops);
>>>       tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
>>>       free_quote_entry(qentry);
>>>       misc_deregister(&tdx_misc_dev);
>>
>> --
>> Sathyanarayanan Kuppuswamy
>> Linux Kernel Developer
> 
> 
> 
> --
> -Dionna Glaze, PhD (she/her)

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 0/3] TDX Guest Quote generation support
  2023-06-26  4:31     ` Dan Williams
@ 2023-06-27  7:50       ` Chong Cai
  2023-08-23  7:33         ` Thomas Gleixner
  0 siblings, 1 reply; 41+ messages in thread
From: Chong Cai @ 2023-06-27  7:50 UTC (permalink / raw
  To: Dan Williams
  Cc: Sathyanarayanan Kuppuswamy, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet,
	H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Qinkun Bao, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc

On Sun, Jun 25, 2023 at 9:32 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Sathyanarayanan Kuppuswamy wrote:
> >
> >
> > On 6/23/23 9:05 PM, Dan Williams wrote:
> > > Kuppuswamy Sathyanarayanan wrote:
> > >> Hi All,
> > >>
> > >> In TDX guest, the attestation process is used to verify the TDX guest
> > >> trustworthiness to other entities before provisioning secrets to the
> > >> guest.
> > >>
> > >> The TDX guest attestation process consists of two steps:
> > >>
> > >> 1. TDREPORT generation
> > >> 2. Quote generation.
> > >>
> > >> The First step (TDREPORT generation) involves getting the TDX guest
> > >> measurement data in the format of TDREPORT which is further used to
> > >> validate the authenticity of the TDX guest. The second step involves
> > >> sending the TDREPORT to a Quoting Enclave (QE) server to generate a
> > >> remotely verifiable Quote. TDREPORT by design can only be verified on
> > >> the local platform. To support remote verification of the TDREPORT,
> > >> TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
> > >> locally and convert it to a remotely verifiable Quote. Although
> > >> attestation software can use communication methods like TCP/IP or
> > >> vsock to send the TDREPORT to QE, not all platforms support these
> > >> communication models. So TDX GHCI specification [1] defines a method
> > >> for Quote generation via hypercalls. Please check the discussion from
> > >> Google [2] and Alibaba [3] which clarifies the need for hypercall based
> > >> Quote generation support. This patch set adds this support.
> > >>
> > >> Support for TDREPORT generation already exists in the TDX guest driver.
> > >> This patchset extends the same driver to add the Quote generation
> > >> support.
> > >
> > > I missed that the TDREPORT ioctl() and this character device are already
> > > upstream. The TDREPORT ioctl() if it is only needed for quote generation
> > > seems a waste because it just retrieves a blob that needs to be turned
> > > around and injected back into the kernel to generate a quote.
> >
> > Although the end goal is to generate the quote, the method the user chooses to
> > achieve it may differ for a variety of reasons. In this case, we're trying to
> > support the use case where the user will use methods like TCP/IP or vsock to
> > generate the Quote. They can use the GET_REPORT IOCTL to get the TDREPORT and
> > send it to the quoting enclave via the above-mentioned methods.  TDVMCALL-based
> > quote generation is intended for users who, for a variety of security reasons, do
> > not wish to use the methods described above.
>
> This flexibility could be supported with keys if necessary, although I
> would want to hear strong reasons not a "variety of reasons" why
> everyone cannot use a unified approach. ABI proliferation has a
> maintenance cost and a collaboration cost. It is within the kernel
> community's right to judge the cost of ABI flexibility and opt for a
> constrained implementation if that cost is too high.
>
> What I would ask of those who absolutely cannot support the TDVMCALL
> method is to contribute a solution that intercepts the "upcall" to the
> platform "guest_attest_ops" and turn it into a typical keys upcall to
> userspace that can use the report data with a vsock tunnel.
>
> That way the end result is still the same, a key established with the
> TDX Quote evidence contained within a Linux-defined envelope.

I agree a unified ABI across vendors would be ideal in the long term.
However, it sounds like a non-trivial task and could take quite some
time to achieve.
Given there's already an AMD equivalent approach upstreamed, can we
also allow this TDVMCALL patch as an intermediate step to unblock
various TDX attestation use cases while targeting a unified ABI? The
TDVMCALL here is quite isolated and serves a very specific purpose, so
it should pose very low risk to other kernel features and be easy to
revert in the future.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-26  3:05       ` Sathyanarayanan Kuppuswamy
  2023-06-26 18:57         ` Dionna Amalie Glaze
@ 2023-06-27 23:44         ` Dan Williams
  1 sibling, 0 replies; 41+ messages in thread
From: Dan Williams @ 2023-06-27 23:44 UTC (permalink / raw
  To: Sathyanarayanan Kuppuswamy, Dan Williams, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, Shuah Khan,
	Jonathan Corbet
  Cc: H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze, Chong Cai,
	Qinkun Bao, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc, dhowells, brijesh.singh, atishp, gregkh

Sathyanarayanan Kuppuswamy wrote:
> Hi Dan,
> 
> On 6/23/23 3:27 PM, Dan Williams wrote:
> > Dan Williams wrote:
> >> [ add David, Brijesh, and Atish]
> >>
> >> Kuppuswamy Sathyanarayanan wrote:
> >>> In TDX guest, the second stage of the attestation process is Quote
> >>> generation. This process is required to convert the locally generated
> >>> TDREPORT into a remotely verifiable Quote. It involves sending the
> >>> TDREPORT data to a Quoting Enclave (QE) which will verify the
> >>> integrity of the TDREPORT and sign it with an attestation key.
> >>>
> >>> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> >>> allow the user agent to get the TD Quote.
> >>>
> >>> Add a kernel selftest module to verify the Quote generation feature.
> >>>
> >>> TD Quote generation involves following steps:
> >>>
> >>> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> >>> * Embed the TDREPORT data in quote buffer and request for quote
> >>>   generation via TDX_CMD_GET_QUOTE IOCTL request.
> >>> * Upon completion of the GetQuote request, check for non zero value
> >>>   in the status field of Quote header to make sure the generated
> >>>   quote is valid.
> >>
> >> What this cover letter does not say is that this is adding another
> >> instance of the similar pattern as SNP_GET_REPORT.
> >>
> >> Linux is best served when multiple vendors trying to do similar
> >> operations are brought together behind a common ABI. We see this in the
> >> history of wrangling SCSI vendors behind common interfaces. Now multiple
> >> confidential computing vendors trying to develop similar flows with
> >> differentiated formats where that differentiation need not leak over the
> >> ABI boundary.
> > [..]
> > 
> > Below is a rough mock up of this approach to demonstrate the direction.
> > Again, the goal is to define an ABI that can support any vendor's
> > arch-specific attestation method and key provisioning flows without
> > leaking vendor-specific details, or confidential material over the
> > user/kernel ABI.
> 
> Thanks for working on this mock code and helping out. It gives me the
> general idea about your proposal.
> 
> > 
> > The observation is that there are a sufficient number of attestation
> > flows available to review where Linux can define a superset ABI to
> > contain them all. The other observation is that the implementations have
> > features that may cross-pollinate over time. For example the SEV
> > privilege level consideration ("vmpl"), and the TDX RTMR (think TPM
> > PCRs) mechanisms address generic Confidential Computing use cases.
> 
> 
> I agree with your point about VMPL and RTMR feature cases. This observation
> is valid for AMD SEV and TDX attestation flows. But I am not sure whether
> it will hold true for other vendor implementations. Our sample set is not
> good enough to make this conclusion. The reason for my concern is, if you
> check the ABI interface used in the S390 arch attestation driver
> (drivers/s390/char/uvdevice.c), you would notice that there is a significant
> difference between the ABI used in that driver and SEV/TDX drivers. The S390
> driver attestation request appears to accept two data blobs as input, as well
> as a variety of vendor-specific header configurations.

I would need more time to investigate. It's also the case that if both
major x86 vendors plus ARM and/or RISC-V can all get behind the same
frontend then that is already a success in my mind.

> Maybe the s390 attestation model is a special case, but I think we should
> consider this issue. Since we don't have a common spec, there is a chance that
> any superset ABI we define now may not meet future vendor requirements. One way
> to handle it is to leave enough space in the generic ABI to handle future vendor
> requirements.

Perhaps, but the goal here is to clearly indicate "this is how Linux
conveys confidential computing attestation concepts". If there is future
vendor innovation in this space it needs to consider how it meets the
established needs of Linux, not the other way round.

> I think it would be better if other vendors (like ARM or RISC) can comment and
> confirm whether this proposal meets their demands.

The more participation the better. Open source definitely involves a
component of speaking up while the definition of things is still
malleable, or catching issues before they go upstream.

> > Vendor specific ioctls for all of this feels like surrender when Linux
> > already has the keys subsystem which has plenty of degrees of freedom
> > for tracking blobs with signatures and using those blobs to instantiate
> > other blobs. It already serves as the ABI wrapping various TPM
> > implementations and marshaling keys for storage encryption and other use
> > cases that intersect Confidential Computing.
> > 
> > The benefit of deprecating vendor-specific abstraction layers in
> > userspace is secondary. The primary benefit is collaboration. It enables
> > kernel developers from various architectures to collaborate on common
> > infrastructure. If, referring back to my previous example, SEV adopts an
> > RTMR-like mechanism and TDX adopts a vmpl-like mechanism it would be
> > unfortunate if those efforts were siloed, duplicated, and needlessly
> > differentiated to userspace. So while there are arguably a manageable
> > number of basic arch attestation methods the planned expansion of those
> > to build incremental functionality is where I believe we, as a
> > community, will be glad that we invested in a "Linux format" for all of
> > this.
> > 
> > An example, to show what the strawman patch below enables: (req_key is
> > the sample program from "man 2 request_key")
> > 
> > # ./req_key guest_attest guest_attest:0:0-$desc $(cat user_data | base64)
> > Key ID is 10e2f3a7
> > # keyctl pipe 0x10e2f3a7 | hexdump -C
> > 00000000  54 44 58 20 47 65 6e 65  72 61 74 65 64 20 51 75  |TDX Generated Qu|
> > 00000010  6f 74 65 00 00 00 00 00  00 00 00 00 00 00 00 00  |ote.............|
> > 00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 00004000
> > 
> > This is the kernel instantiating a TDX Quote without the TDREPORT
> > implementation detail ever leaving the kernel. Now, this is only the
> 
> IIUC, the idea here is to cache the quote data and return it to the user whenever
> possible, right? If yes, I think such optimization may not be very useful for our
> case. AFAIK, the quote data will change whenever there is a change in the guest
> measurement data. Since the validity of the generated quote will not be long,
> and the frequency of quote generation requests is expected to be less, we may not
> get much benefit from caching the quote data. I think we can keep this logic simple
> by directly retrieving the quote data from the quoting enclave whenever there is a
> request from the user.

The Keys subsystem already supports the concept of keys that expire
immediately, so no need for special consideration here that I can see.

> > top-half of what is needed. The missing bottom half takes that material
> > and uses it to instantiate derived key material like the storage
> > decryption key internal to the kernel. See "The Process" in
> > Documentation/security/keys/request-key.rst for how the Keys subsystem
> > handles the "keys for keys" use case.
> 
> This is only useful for key-server use case, right? Attestation can also be
> used for use cases like pattern matching or uploading some secure data, etc.
> Since key-server is not the only use case, does it make sense to support
> this derived key feature?

The Keys subsystem is just a way for both userspace and kernel space to
request the instantiation of blobs that mediate access to another
resource be it another key or something else. So key-server is only one
example client.

The other reason for defining a common frontend is so the kernel can
understand and mediate resource access as a kernel is wont to do. The
ioctl() approach blinds the kernel and requires userspace to repeatedly
solve problems in vendor-specific ways. The Keys proposal also has the
property of not requiring userspace round trips for things like
unlocking storage. The driver can just do request_key() and kick off
the process on its own.
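
For comparison, the userspace side of the same request (roughly what the
req_key sample does in the example above) can be sketched as below. This
is only an illustration: the "guest_attest" key type exists only in the
mock-up patch, and format_attest_desc() / request_attest_key() are
hypothetical helper names, not part of any existing interface.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same value as KEY_SPEC_SESSION_KEYRING in <linux/keyctl.h>. */
#define SESSION_KEYRING (-3)

/*
 * Build "guest_attest:<level>:<id>", the description layout used by the
 * mock-up. Returns 0 on success, -1 if the buffer is too small.
 */
int format_attest_desc(char *buf, size_t len, int level,
		       unsigned long long id)
{
	int n = snprintf(buf, len, "guest_attest:%d:%llu", level, id);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Issue request_key(2) for a "guest_attest" key (hypothetical key type
 * from the mock-up). The callout info carries the base64 encoded user
 * data. Returns the key serial number, or -1 with errno set.
 */
long request_attest_key(int level, unsigned long long id,
			const char *b64_payload)
{
	char desc[64];

	if (format_attest_desc(desc, sizeof(desc), level, id))
		return -1;

	return syscall(SYS_request_key, "guest_attest", desc, b64_payload,
		       (long)SESSION_KEYRING);
}
```

On a kernel without the mock-up applied, request_attest_key() is expected
to fail, since no such key type is registered.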

> 
> > 
> > ---
> > diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> > index f79ab13a5c28..0f775847028e 100644
> > --- a/drivers/virt/Kconfig
> > +++ b/drivers/virt/Kconfig
> > @@ -54,4 +54,8 @@ source "drivers/virt/coco/sev-guest/Kconfig"
> >  
> >  source "drivers/virt/coco/tdx-guest/Kconfig"
> >  
> > +config GUEST_ATTEST
> > +	tristate
> > +	select KEYS
> > +
> >  endif
> > diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
> > index e9aa6fc96fab..66f6b838f8f4 100644
> > --- a/drivers/virt/Makefile
> > +++ b/drivers/virt/Makefile
> > @@ -12,3 +12,4 @@ obj-$(CONFIG_ACRN_HSM)		+= acrn/
> >  obj-$(CONFIG_EFI_SECRET)	+= coco/efi_secret/
> >  obj-$(CONFIG_SEV_GUEST)		+= coco/sev-guest/
> >  obj-$(CONFIG_INTEL_TDX_GUEST)	+= coco/tdx-guest/
> > +obj-$(CONFIG_GUEST_ATTEST)	+= coco/guest-attest/
> > diff --git a/drivers/virt/coco/guest-attest/Makefile b/drivers/virt/coco/guest-attest/Makefile
> > new file mode 100644
> > index 000000000000..5581c5a27588
> > --- /dev/null
> > +++ b/drivers/virt/coco/guest-attest/Makefile
> > @@ -0,0 +1,2 @@
> > +obj-$(CONFIG_GUEST_ATTEST) += guest_attest.o
> > +guest_attest-y := key.o
> > diff --git a/drivers/virt/coco/guest-attest/key.c b/drivers/virt/coco/guest-attest/key.c
> > new file mode 100644
> > index 000000000000..2a494b6dd7a7
> > --- /dev/null
> > +++ b/drivers/virt/coco/guest-attest/key.c
> > @@ -0,0 +1,159 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
> > +
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +#include <linux/seq_file.h>
> > +#include <linux/key-type.h>
> > +#include <linux/module.h>
> > +#include <linux/base64.h>
> > +
> > +#include <keys/request_key_auth-type.h>
> > +#include <keys/user-type.h>
> > +
> > +#include "guest-attest.h"
> 
> Can you share your guest-attest.h?

Apologies, missed a 'git add':

/* SPDX-License-Identifier: GPL-2.0-only */
/* Copyright(c) 2023 Intel Corporation. */

#ifndef __GUEST_ATTEST_H__
#define __GUEST_ATTEST_H__
#include <linux/list.h>

/*
 * arch specific ops, only one is expected to be registered at a time
 * i.e. either SEV or TDX, but not both
 */
struct guest_attest_ops {
        const char *name;
        struct module *module;
        struct list_head list;
        int (*request_attest)(struct key *key, int level,
                              struct key *dest_keyring, void *payload,
                              int payload_len, struct key *authkey);
};

#define GUEST_ATTEST_DATALEN 64

int register_guest_attest_ops(struct guest_attest_ops *ops);
void unregister_guest_attest_ops(struct guest_attest_ops *ops);

#endif /*__GUEST_ATTEST_H__ */
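
To illustrate the "only one registered at a time" contract in the header
comment, here is a stripped-down userspace sketch. It assumes a single
slot instead of the list plus rwsem in key.c, does no locking, and uses
hypothetical names (attest_ops, register_ops, unregister_ops); it only
shows the expected -EEXIST behavior when a second backend shows up.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced stand-in for struct guest_attest_ops; only the name is kept. */
struct attest_ops {
	const char *name;
};

/* The single registration slot: NULL means no backend is registered. */
static struct attest_ops *registered;

/* Returns 0 on success, -EEXIST if another backend already holds the slot. */
int register_ops(struct attest_ops *ops)
{
	if (registered)
		return -EEXIST;
	registered = ops;
	return 0;
}

/* Clears the slot, but only if @ops is the backend that holds it. */
void unregister_ops(struct attest_ops *ops)
{
	if (registered == ops)
		registered = NULL;
}
```

With this shape, a TDX backend registering first makes a later SEV
registration fail until the TDX one unregisters.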


> 
> > +
> > +static LIST_HEAD(guest_attest_list);
> > +static DECLARE_RWSEM(guest_attest_rwsem);
> > +
> > +static struct guest_attest_ops *fetch_ops(void)
> > +{
> > +	return list_first_entry_or_null(&guest_attest_list,
> > +					struct guest_attest_ops, list);
> > +}
> > +
> > +static struct guest_attest_ops *get_ops(void)
> > +{
> > +	down_read(&guest_attest_rwsem);
> > +	return fetch_ops();
> > +}
> > +
> > +static void put_ops(void)
> > +{
> > +	up_read(&guest_attest_rwsem);
> > +}
> > +
> > +int register_guest_attest_ops(struct guest_attest_ops *ops)
> > +{
> > +	struct guest_attest_ops *conflict;
> > +	int rc;
> > +
> > +	down_write(&guest_attest_rwsem);
> > +	conflict = fetch_ops();
> > +	if (conflict) {
> > +		pr_err("\"%s\" ops already registered\n", conflict->name);
> > +		rc = -EEXIST;
> > +		goto out;
> > +	}
> > +	list_add(&ops->list, &guest_attest_list);
> > +	try_module_get(ops->module);
> > +	rc = 0;
> > +out:
> > +	up_write(&guest_attest_rwsem);
> > +	return rc;
> > +}
> > +EXPORT_SYMBOL_GPL(register_guest_attest_ops);
> > +
> > +void unregister_guest_attest_ops(struct guest_attest_ops *ops)
> > +{
> > +	down_write(&guest_attest_rwsem);
> > +	list_del(&ops->list);
> > +	up_write(&guest_attest_rwsem);
> > +	module_put(ops->module);
> > +}
> > +EXPORT_SYMBOL_GPL(unregister_guest_attest_ops);
> > +
> > +static int __guest_attest_request_key(struct key *key, int level,
> > +				      struct key *dest_keyring,
> > +				      const char *callout_info, int callout_len,
> > +				      struct key *authkey)
> > +{
> > +	struct guest_attest_ops *ops;
> > +	void *payload = NULL;
> > +	int rc, payload_len;
> > +
> > +	ops = get_ops();
> > +	if (!ops)
> > +		return -ENOKEY;
> > +
> > +	payload = kzalloc(max(GUEST_ATTEST_DATALEN, callout_len), GFP_KERNEL);
> > +	if (!payload) {
> > +		rc = -ENOMEM;
> > +		goto out;
> > +	}
> 
> Is the idea to get the values like vmpl part of the payload?

No, to me vmpl likely needs to be conveyed in the key-description.
The payload is simply the 64 bytes that both SEV and TDX take as input for
the attestation request. The AMD specification seems to imply that
payload is itself a public key? If that is the expectation then it may
be more appropriate to make that a separate retrieval vs something
passed in directly from userspace.
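
To make the size contract concrete, the decode-and-bound step can be
sketched in plain userspace C as below. This is not the kernel's
base64_decode(); it is a minimal decoder for standard padded base64,
and b64_val() / decode_attest_payload() / ATTEST_DATALEN are names made
up for this sketch. The point is only that anything decoding to more
than 64 bytes is rejected.

```c
#include <stddef.h>
#include <string.h>

#define ATTEST_DATALEN 64 /* mirrors GUEST_ATTEST_DATALEN in the mock-up */

/* Map one base64 character to its 6-bit value, or -1 if invalid. */
int b64_val(char c)
{
	if (c >= 'A' && c <= 'Z')
		return c - 'A';
	if (c >= 'a' && c <= 'z')
		return c - 'a' + 26;
	if (c >= '0' && c <= '9')
		return c - '0' + 52;
	if (c == '+')
		return 62;
	if (c == '/')
		return 63;
	return -1;
}

/*
 * Decode padded base64 from @in into @out, which must hold at least
 * ATTEST_DATALEN bytes. Returns the decoded length, or -1 if the input
 * is malformed or would decode to more than ATTEST_DATALEN bytes.
 */
int decode_attest_payload(const char *in, unsigned char *out)
{
	size_t len = strlen(in), i;
	int n = 0;

	if (len % 4)
		return -1;

	for (i = 0; i < len; i += 4) {
		unsigned int v = 0;
		int pad = 0, j;

		for (j = 0; j < 4; j++) {
			char c = in[i + j];
			int d = 0;

			if (c == '=') {
				/* '=' may only end the final group */
				if (i + 4 != len || j < 2)
					return -1;
				pad++;
			} else {
				if (pad)
					return -1; /* data after padding */
				d = b64_val(c);
				if (d < 0)
					return -1;
			}
			v = (v << 6) | (unsigned int)d;
		}

		/* Bound check before writing this group's bytes. */
		if (n + 3 - pad > ATTEST_DATALEN)
			return -1;
		out[n++] = (v >> 16) & 0xff;
		if (pad < 2)
			out[n++] = (v >> 8) & 0xff;
		if (pad < 1)
			out[n++] = v & 0xff;
	}
	return n;
}
```

A 64-byte nonce encodes to 88 base64 characters including padding, so
that is the longest callout string this check would accept.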

> 
> > +
> > +	payload_len = base64_decode(callout_info, callout_len, payload);
> > +	if (payload_len < 0 || payload_len > GUEST_ATTEST_DATALEN) {
> > +		rc = -EINVAL;
> > +		goto out;
> > +	}
> > +
> > +	rc = ops->request_attest(key, level, dest_keyring, payload, payload_len,
> > +				 authkey);
> > +out:
> > +	kfree(payload);
> > +	put_ops();
> > +	return rc;
> > +}
> > +
> > +static int guest_attest_request_key(struct key *authkey, void *data)
> > +{
> > +	struct request_key_auth *rka = get_request_key_auth(authkey);
> > +	struct key *key = rka->target_key;
> > +	unsigned long long id;
> > +	int rc, level;
> > +
> > +	pr_debug("desc: %s op: %s callout: %s\n", key->description, rka->op,
> > +		 rka->callout_info ? (char *)rka->callout_info : "\"none\"");
> > +
> > +	if (sscanf(key->description, "guest_attest:%d:%llu", &level, &id) != 2)
> > +		return -EINVAL;
> > +
> 
> Can you explain some details about the id and level? It is not very clear why
> we need it.

@level is my mockup of the vmpl concept, and @id is something free-form
for now whose meaning is known to the requester of the attestation. This
id-scheme would follow a Linux-defined format. More discussion is needed
here about what attestation data is used for and opportunities to define
a common naming scheme. For example, what is the common name of the
guest_attest-key that supports retrieving the storage decryption key
from the key-server?
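
As a concrete reading of the description format, the check done in
guest_attest_vet_description() can be mirrored in userspace like this.
The "guest_attest:<level>:<id>" layout is from the mock-up; struct
attest_desc and parse_attest_desc() are hypothetical names used only
for this sketch.

```c
#include <stdio.h>

/* Parsed form of a "guest_attest:<level>:<id>" key description. */
struct attest_desc {
	int level;             /* mock-up of the vmpl concept */
	unsigned long long id; /* free-form, defined by the requester */
};

/*
 * Returns 0 and fills @out when @desc matches the expected layout,
 * -1 otherwise. Trailing text after the id (e.g. "-foo") is ignored,
 * just as the sscanf() in the mock-up ignores it.
 */
int parse_attest_desc(const char *desc, struct attest_desc *out)
{
	if (sscanf(desc, "guest_attest:%d:%llu", &out->level, &out->id) != 2)
		return -1;
	return 0;
}
```

Note that a description such as "guest_attest:0:0-$desc" from the
req_key example passes this check, because parsing stops at the '-'.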

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-26 18:57         ` Dionna Amalie Glaze
  2023-06-27  0:39           ` Sathyanarayanan Kuppuswamy
@ 2023-06-28  0:11           ` Dan Williams
  2023-06-28  1:36             ` Dionna Amalie Glaze
  2023-06-28 15:24             ` Samuel Ortiz
  1 sibling, 2 replies; 41+ messages in thread
From: Dan Williams @ 2023-06-28  0:11 UTC (permalink / raw
  To: Dionna Amalie Glaze, Sathyanarayanan Kuppuswamy
  Cc: Dan Williams, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, Shuah Khan, Jonathan Corbet, H . Peter Anvin,
	Kirill A . Shutemov, Tony Luck, Wander Lairson Costa, Erdem Aktas,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc, dhowells, brijesh.singh, atishp,
	gregkh, linux-coco, joey.gouly

Dionna Amalie Glaze wrote:
> On Sun, Jun 25, 2023 at 8:06 PM Sathyanarayanan Kuppuswamy
> <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
[..]
> > Hi Dan,
> >
> > On 6/23/23 3:27 PM, Dan Williams wrote:
> > > Dan Williams wrote:
> > >> [ add David, Brijesh, and Atish]
> > >>
> > >> Kuppuswamy Sathyanarayanan wrote:
> > >>> In TDX guest, the second stage of the attestation process is Quote
> > >>> generation. This process is required to convert the locally generated
> > >>> TDREPORT into a remotely verifiable Quote. It involves sending the
> > >>> TDREPORT data to a Quoting Enclave (QE) which will verify the
> > >>> integrity of the TDREPORT and sign it with an attestation key.
> > >>>
> > >>> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> > >>> allow the user agent to get the TD Quote.
> > >>>
> > >>> Add a kernel selftest module to verify the Quote generation feature.
> > >>>
> > >>> TD Quote generation involves following steps:
> > >>>
> > >>> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> > >>> * Embed the TDREPORT data in quote buffer and request for quote
> > >>>   generation via TDX_CMD_GET_QUOTE IOCTL request.
> > >>> * Upon completion of the GetQuote request, check for non zero value
> > >>>   in the status field of Quote header to make sure the generated
> > >>>   quote is valid.
> > >>
> > >> What this cover letter does not say is that this is adding another
> > >> instance of the similar pattern as SNP_GET_REPORT.
> > >>
> > >> Linux is best served when multiple vendors trying to do similar
> > >> operations are brought together behind a common ABI. We see this in the
> > >> history of wrangling SCSI vendors behind common interfaces. Now multiple
> > >> confidential computing vendors trying to develop similar flows with
> > >> differentiated formats where that differentiation need not leak over the
> > >> ABI boundary.
> > > [..]
> > >
> > > Below is a rough mock up of this approach to demonstrate the direction.
> > > Again, the goal is to define an ABI that can support any vendor's
> > > arch-specific attestation method and key provisioning flows without
> > > leaking vendor-specific details, or confidential material over the
> > > user/kernel ABI.
> >
> > Thanks for working on this mock code and helping out. It gives me the
> > general idea about your proposal.
> >
> > >
> > > The observation is that there are a sufficient number of attestation
> > > flows available to review where Linux can define a superset ABI to
> > > contain them all. The other observation is that the implementations have
> > > > features that may cross-pollinate over time. For example the SEV
> > > > privilege level consideration ("vmpl"), and the TDX RTMR (think TPM
> > > PCRs) mechanisms address generic Confidential Computing use cases.
> >
> >
> > I agree with your point about VMPL and RTMR feature cases. This observation
> > is valid for AMD SEV and TDX attestation flows. But I am not sure whether
> > it will hold true for other vendor implementations. Our sample set is not
> > good enough to make this conclusion. The reason for my concern is, if you
> > check the ABI interface used in the S390 arch attestation driver
> > (drivers/s390/char/uvdevice.c), you would notice that there is a significant
> > difference between the ABI used in that driver and SEV/TDX drivers. The S390
> > driver attestation request appears to accept two data blobs as input, as well
> > as a variety of vendor-specific header configurations.
> >
> > Maybe the s390 attestation model is a special case, but I think we should
> > consider this issue. Since we don't have a common spec, there is a chance that
> > any superset ABI we define now may not meet future vendor requirements. One way
> > to handle it is to leave enough space in the generic ABI to handle future vendor
> > requirements.
> >
> > I think it would be better if other vendors (like ARM or RISC) can comment and
> > confirm whether this proposal meets their demands.
> >
> 
> The VMPL-based separation that will house the supervisor module known
> as SVSM can have protocols that implement a TPM command interface, or
> an RTMR-extension interface, and will also need to have an
> SVSM-specific protocol attestation report format to keep the secure
> chain of custody apparent. We'd have different formats and protocols
> in the kernel, at least, to speak to each technology. 

That's where I hope the line can be drawn, i.e. that all of this vendor
differentiation really only matters inside the kernel in the end.

> I'm not sure it's worth the trouble of papering over all the... 3-4
> technologies with similar but still weirdly different formats and ways
> of doing things with an abstracted attestation ABI, especially since
> the output all has to be interpreted in an architecture-specific way
> anyway.

This is where I need help. Can you identify where the following
assertion falls over:

"The minimum viable key-server is one that can generically validate a
blob with an ECDSA signature".

I.e. the fact that SEV and TDX send different length blobs is less
important than validating that signature.

If it is always the case that specific fields in the blob need to be
decoded then yes, that weakens the assertion. However, maybe that means
that kernel code parses the blob and conveys that parsed info along with
vendor attestation payload all signed by a Linux key. I.e. still allow
for a unified output format + signed vendor blob and provide a path to
keep all the vendor specific handling internal to the kernel.

> ARM's Confidential Computing Realm Management Extensions (RME) seems
> to be going along the lines of a runtime measurement register model
> with their hardware enforced security. The number of registers isn't
> prescribed in the spec.
> 
> +Joey Gouly +linux-coco@lists.linux.dev as far as RME is concerned, do
> you know who would be best to weigh in on this discussion of a unified
> attestation model?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  0:11           ` Dan Williams
@ 2023-06-28  1:36             ` Dionna Amalie Glaze
  2023-06-28  2:16               ` Huang, Kai
                                 ` (2 more replies)
  2023-06-28 15:24             ` Samuel Ortiz
  1 sibling, 3 replies; 41+ messages in thread
From: Dionna Amalie Glaze @ 2023-06-28  1:36 UTC (permalink / raw
  To: Dan Williams
  Cc: Sathyanarayanan Kuppuswamy, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet,
	H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Chong Cai, Qinkun Bao,
	Guorui Yu, Du Fan, linux-kernel, linux-kselftest, linux-doc,
	dhowells, brijesh.singh, atishp, gregkh, linux-coco, joey.gouly

On Tue, Jun 27, 2023 at 5:13 PM Dan Williams <dan.j.williams@intel.com> wrote:
> [..]
> >
> > The VMPL-based separation that will house the supervisor module known
> > as SVSM can have protocols that implement a TPM command interface, or
> > an RTMR-extension interface, and will also need to have an
> > SVSM-specific protocol attestation report format to keep the secure
> > chain of custody apparent. We'd have different formats and protocols
> > in the kernel, at least, to speak to each technology.
>
> That's where I hope the line can be drawn, i.e. that all of this vendor
> differentiation really only matters inside the kernel in the end.
>
> > I'm not sure it's worth the trouble of papering over all the... 3-4
> > technologies with similar but still weirdly different formats and ways
> > of doing things with an abstracted attestation ABI, especially since
> > the output all has to be interpreted in an architecture-specific way
> > anyway.
>
> This is where I need help. Can you identify where the following
> assertion falls over:
>
> "The minimum viable key-server is one that can generically validate a
> blob with an ECDSA signature".
>
> I.e. the fact that SEV and TDX send different length blobs is less
> important than validating that signature.
>
> If it is always the case that specific fields in the blob need to be
> decoded then yes, that weakens the assertion. However, maybe that means
> that kernel code parses the blob and conveys that parsed info along with
> vendor attestation payload all signed by a Linux key. I.e. still allow
> for a unified output format + signed vendor blob and provide a path to
> keep all the vendor specific handling internal to the kernel.
>

All the specific fields of the blob have to be decoded and subjected
to an acceptance policy. That policy will almost always be different
across different platforms and VM owners. I wrote all of
github.com/google/go-sev-guest, including the verification and
validation logic, and it's going to get more complicated, and the
sources of the data that provide validators with notions of what
values can be trusted will be varied. The formats are not
standardized. The Confidential Computing Consortium should be working
toward that, but it's a slow process. There's IETF RATS. There's
in-toto.io attestations. There's Azure's JWT thing. There's a signed
serialized protocol buffer that I've decided is what Google is going
to produce while we figure out all the "right" formats to use. There
will be factions and absolute gridlock for multiple years if we
require solidifying an abstraction for the kernel to manage all this
logic before passing a report on to user space.
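
To make the "signature alone is not enough" point concrete, here is a
minimal Python sketch of an acceptance policy applied after signature
verification has already succeeded. Everything in it is an assumption
for illustration: the field names (measurement, tcb_version,
debug_enabled) and the Policy shape are hypothetical and do not match
any real SEV-SNP or TDX report layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodedReport:
    # Hypothetical fields; real reports carry many more, in vendor-specific layouts.
    platform: str
    measurement: bytes
    tcb_version: int
    debug_enabled: bool


@dataclass(frozen=True)
class Policy:
    # Hypothetical per-owner policy; each VM owner would supply their own values.
    allowed_measurements: frozenset
    min_tcb_version: int
    allow_debug: bool = False


def accept(report: DecodedReport, policy: Policy) -> bool:
    """Return True only if every decoded field satisfies the owner's policy."""
    if report.measurement not in policy.allowed_measurements:
        return False
    if report.tcb_version < policy.min_tcb_version:
        return False
    if report.debug_enabled and not policy.allow_debug:
        return False
    return True


good = bytes.fromhex("aa" * 48)
report = DecodedReport("snp", good, tcb_version=7, debug_enabled=False)
strict = Policy(frozenset({good}), min_tcb_version=8)
lenient = Policy(frozenset({good}), min_tcb_version=5)
print(accept(report, strict), accept(report, lenient))
```

The same validly signed report is rejected by one owner and accepted by
another, which is why the decision cannot be reduced to a generic
signature check.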

Now, not only are the field contents important, the certificates of
the keys that signed the report are important. Each platform has its
own special x509v3 extensions and key hierarchy to express what parts
of the report should be what value if signed by this key, and in TDX's
case there are extra endpoints that you need to query to determine if
there's an active CVE on the associated TCB version. This is how they
avoid adding every cpu's key to the leaf certificate's CRL.

You really shouldn't be putting attestation validation logic in the
kernel. It belongs outside of the VM entirely with the party that will
only release access keys to the VM if it can prove it's running the
software it claims, on the platform it claims. I think Windows puts a
remote procedure call in their guest attestation driver to the Azure
attestation service, and that is an anti-pattern in my mind.

> > ARM's Confidential Computing Realm Management Extensions (RME) seems
> > to be going along the lines of a runtime measurement register model
> > with their hardware enforced security. The number of registers isn't
> > prescribed in the spec.
> >
> > +Joey Gouly +linux-coco@lists.linux.dev as far as RME is concerned, do
> > you know who would be best to weigh in on this discussion of a unified
> > attestation model?

-- 
-Dionna Glaze, PhD (she/her)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  1:36             ` Dionna Amalie Glaze
@ 2023-06-28  2:16               ` Huang, Kai
  2023-06-28  6:46                 ` gregkh
  2023-06-28  2:52               ` Dan Williams
  2023-06-28 15:31               ` Samuel Ortiz
  2 siblings, 1 reply; 41+ messages in thread
From: Huang, Kai @ 2023-06-28  2:16 UTC (permalink / raw
  To: Williams, Dan J, dionnaglaze@google.com
  Cc: corbet@lwn.net, Aktas, Erdem, linux-coco@lists.linux.dev,
	shuah@kernel.org, Du, Fan, Luck, Tony,
	dave.hansen@linux.intel.com, brijesh.singh@amd.com,
	joey.gouly@arm.com, qinkun@apache.org,
	kirill.shutemov@linux.intel.com, mingo@redhat.com,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	linux-doc@vger.kernel.org, wander@redhat.com, atishp@rivosinc.com,
	hpa@zytor.com, chongc@google.com, bp@alien8.de,
	gregkh@linuxfoundation.org, linux-kselftest@vger.kernel.org,
	sathyanarayanan.kuppuswamy@linux.intel.com, dhowells@redhat.com,
	Yu, Guorui, x86@kernel.org

On Tue, 2023-06-27 at 18:36 -0700, Dionna Amalie Glaze wrote:
> On Tue, Jun 27, 2023 at 5:13 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > [..]
> > > 
> > > The VMPL-based separation that will house the supervisor module known
> > > as SVSM can have protocols that implement a TPM command interface, or
> > > an RTMR-extension interface, and will also need to have an
> > > SVSM-specific protocol attestation report format to keep the secure
> > > chain of custody apparent. We'd have different formats and protocols
> > > in the kernel, at least, to speak to each technology.
> > 
> > That's where I hope the line can be drawn, i.e. that all of this vendor
> > differentiation really only matters inside the kernel in the end.
> > 
> > > I'm not sure it's worth the trouble of papering over all the... 3-4
> > > technologies with similar but still weirdly different formats and ways
> > > of doing things with an abstracted attestation ABI, especially since
> > > the output all has to be interpreted in an architecture-specific way
> > > anyway.
> > 
> > This is where I need help. Can you identify where the following
> > assertion falls over:
> > 
> > "The minimum viable key-server is one that can generically validate a
> > blob with an ECDSA signature".
> > 
> > I.e. the fact that SEV and TDX send different length blobs is less
> > important than validating that signature.
> > 
> > If it is always the case that specific fields in the blob need to be
> > decoded then yes, that weakens the assertion. However, maybe that means
> > that kernel code parses the blob and conveys that parsed info along with
> > vendor attestation payload all signed by a Linux key. I.e. still allow
> > for a unified output format + signed vendor blob and provide a path to
> > keep all the vendor specific handling internal to the kernel.
> > 
> 
> All the specific fields of the blob have to be decoded and subjected
> to an acceptance policy. That policy will most always be different
> across different platforms and VM owners. I wrote all of
> github.com/google/go-sev-guest, including the verification and
> validation logic, and it's going to get more complicated, and the
> sources of the data that provide validators with notions of what
> values can be trusted will be varied. The formats are not
> standardized. The Confidential Computing Consortium should be working
> toward that, but it's a slow process. There's IETF RATS. There's
> in-toto.io attestations. There's Azure's JWT thing. There's a signed
> serialized protocol buffer that I've decided is what Google is going
> to produce while we figure out all the "right" formats to use. There
> will be factions and absolute gridlock for multiple years if we
> require solidifying an abstraction for the kernel to manage all this
> logic before passing a report on to user space.
> 
> Now, not only are the field contents important, the certificates of
> the keys that signed the report are important. Each platform has its
> own special x509v3 extensions and key hierarchy to express what parts
> of the report should be what value if signed by this key, and in TDX's
> case there are extra endpoints that you need to query to determine if
> there's an active CVE on the associated TCB version. This is how they
> avoid adding every cpu's key to the leaf certificate's CRL.
> 
> You really shouldn't be putting attestation validation logic in the
> kernel.

Agreed.  The data blob for remote verification should be just some data blob to
the kernel.  I think the kernel shouldn't even try to understand which
architecture the data blob is for.  From the kernel's perspective, it should be
just some data blob that the kernel gets from hardware/firmware, or whatever is
embedded in the root-of-trust in the hardware, after taking some input from
userspace for the unique identity of the blob that can be used to, e.g.,
mitigate replay attacks.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-23 22:27     ` Dan Williams
  2023-06-26  3:05       ` Sathyanarayanan Kuppuswamy
@ 2023-06-28  2:47       ` Huang, Kai
  1 sibling, 0 replies; 41+ messages in thread
From: Huang, Kai @ 2023-06-28  2:47 UTC (permalink / raw
  To: corbet@lwn.net, dave.hansen@linux.intel.com, x86@kernel.org,
	shuah@kernel.org, sathyanarayanan.kuppuswamy@linux.intel.com,
	Williams, Dan J, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de
  Cc: Aktas, Erdem, Du, Fan, Luck, Tony, dionnaglaze@google.com,
	qinkun@apache.org, kirill.shutemov@linux.intel.com,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	atishp@rivosinc.com, wander@redhat.com, hpa@zytor.com,
	chongc@google.com, gregkh@linuxfoundation.org,
	linux-kselftest@vger.kernel.org, dhowells@redhat.com,
	brijesh.singh@amd.com, Yu, Guorui

On Fri, 2023-06-23 at 15:27 -0700, Dan Williams wrote:
> An example, to show what the strawman patch below enables: (req_key is
> the sample program from "man 2 request_key")
> 
> # ./req_key guest_attest guest_attest:0:0-$desc $(cat user_data | base64)
> Key ID is 10e2f3a7
> # keyctl pipe 0x10e2f3a7 | hexdump -C
> 00000000  54 44 58 20 47 65 6e 65  72 61 74 65 64 20 51 75  |TDX Generated Qu|
> 00000010  6f 74 65 00 00 00 00 00  00 00 00 00 00 00 00 00  |ote.............|
> 00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00004000
> 
> This is the kernel instantiating a TDX Quote without the TDREPORT
> implementation detail ever leaving the kernel. 

There might be one small issue here.  The generated Quote contains a
userspace-provided 64-byte REPORTDATA (which originally comes from userspace
when generating the TDREPORT), which is supposed to be used by the attestation
service to uniquely identify this Quote and mitigate replay attacks.  For
instance, the REPORTDATA could be per-TLS-session data provided by the
attestation service.
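
As a rough sketch of how such a per-session value could be derived and
later checked by the attestation service, assuming the 64-byte
REPORTDATA size used by TDX: the helper names, the length-prefixed
encoding, and the choice of SHA-512 below are all hypothetical, not
something defined by the TDX spec or this patch set.

```python
import hashlib
import hmac
import os

REPORTDATA_LEN = 64  # size of the TDX REPORTDATA field in bytes


def make_report_data(nonce: bytes, session_binding: bytes) -> bytes:
    """Hash a verifier nonce and a session binding into a 64-byte value."""
    # Length-prefix the nonce so (nonce, binding) pairs cannot collide by
    # shifting bytes from one input into the other.
    material = len(nonce).to_bytes(4, "big") + nonce + session_binding
    return hashlib.sha512(material).digest()


def is_fresh(report_data_in_quote: bytes, expected_nonce: bytes,
             session_binding: bytes) -> bool:
    """Verifier side: check that the Quote carries the value for this session."""
    expected = make_report_data(expected_nonce, session_binding)
    return hmac.compare_digest(report_data_in_quote, expected)


nonce = os.urandom(32)            # issued by the attestation service
binding = b"hypothetical-tls-exporter-value"
report_data = make_report_data(nonce, binding)
print(len(report_data), is_fresh(report_data, nonce, binding))
```

A Quote replayed from an earlier session would carry a REPORTDATA
derived from a different nonce and fail the is_fresh() check.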

I don't know whether other archs have a similar field in their Quote-like blobs,
but I believe this is a reasonable thing in general.

IIUC, one problem with using the above request_key() to generate the Quote is
that other userspace processes are potentially able to see it, while I believe
this REPORTDATA is only supposed to be visible to the application which is
responsible for talking to the attestation service.

I am not sure whether this is a risk, but using an ioctl() should be able to
avoid it.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  1:36             ` Dionna Amalie Glaze
  2023-06-28  2:16               ` Huang, Kai
@ 2023-06-28  2:52               ` Dan Williams
  2023-06-29 16:25                 ` Dionna Amalie Glaze
  2023-06-28 15:31               ` Samuel Ortiz
  2 siblings, 1 reply; 41+ messages in thread
From: Dan Williams @ 2023-06-28  2:52 UTC (permalink / raw
  To: Dionna Amalie Glaze, Dan Williams
  Cc: Sathyanarayanan Kuppuswamy, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet,
	H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Chong Cai, Qinkun Bao,
	Guorui Yu, Du Fan, linux-kernel, linux-kselftest, linux-doc,
	dhowells, brijesh.singh, atishp, gregkh, linux-coco, joey.gouly

Dionna Amalie Glaze wrote:
> On Tue, Jun 27, 2023 at 5:13 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > [..]
> > >
> > > The VMPL-based separation that will house the supervisor module known
> > > as SVSM can have protocols that implement a TPM command interface, or
> > > an RTMR-extension interface, and will also need to have an
> > > SVSM-specific protocol attestation report format to keep the secure
> > > chain of custody apparent. We'd have different formats and protocols
> > > in the kernel, at least, to speak to each technology.
> >
> > That's where I hope the line can be drawn, i.e. that all of this vendor
> > differentiation really only matters inside the kernel in the end.
> >
> > > I'm not sure it's worth the trouble of papering over all the... 3-4
> > > technologies with similar but still weirdly different formats and ways
> > > of doing things with an abstracted attestation ABI, especially since
> > > the output all has to be interpreted in an architecture-specific way
> > > anyway.
> >
> > This is where I need help. Can you identify where the following
> > assertion falls over:
> >
> > "The minimum viable key-server is one that can generically validate a
> > blob with an ECDSA signature".
> >
> > I.e. the fact that SEV and TDX send different length blobs is less
> > important than validating that signature.
> >
> > If it is always the case that specific fields in the blob need to be
> > decoded then yes, that weakens the assertion. However, maybe that means
> > that kernel code parses the blob and conveys that parsed info along with
> > vendor attestation payload all signed by a Linux key. I.e. still allow
> > for a unified output format + signed vendor blob and provide a path to
> > keep all the vendor specific handling internal to the kernel.
> >

First, thank you for engaging, it speeds up the iteration. This
confirmed my worry that the secondary goal of this proposal, a common
verification implementation, is indeed unachievable in the near term. A
few clarifying questions below, but I will let this go.

The primary goal, achievable on a short runway, is more for kernel
developers. It is to have a common infrastructure for marshaling vendor
payloads, provide a mechanism to facilitate kernel initiated requests to
a key-server, and to deploy a common frontend for concepts like runtime
measurement (likely as another backend to what Keys already understands
for various TPM PCR implementations).

> All the specific fields of the blob have to be decoded and subjected
> to an acceptance policy. That policy will most always be different
> across different platforms and VM owners. I wrote all of
> github.com/google/go-sev-guest, including the verification and
> validation logic, and it's going to get more complicated, and the
> sources of the data that provide validators with notions of what
> values can be trusted will be varied.

Can you provide an example? I ask only to include it in the kernel
commit log for a crisp explanation why this proposed Keys format will
continue to convey a raw vendor blob with no kernel abstraction as part
of its payload for the foreseeable future.

> The formats are not standardized. The Confidential Computing
> Consortium should be working toward that, but it's a slow process.
> There's IETF RATS. There's in-toto.io attestations. There's Azure's
> JWT thing. There's a signed serialized protocol buffer that I've
> decided is what Google is going to produce while we figure out all the
> "right" formats to use. There will be factions and absolute gridlock
> for multiple years if we require solidifying an abstraction for the
> kernel to manage all this logic before passing a report on to user
> space.

Understood. When that standardization process completes, my expectation
is that it slots into the common conveyance method, with no need to
rewrite code that already knows how to interface with Keys to get
attestation evidence.

> Now, not only are the field contents important, the certificates of
> the keys that signed the report are important. Each platform has its
> own special x509v3 extensions and key hierarchy to express what parts
> of the report should be what value if signed by this key, and in TDX's
> case there are extra endpoints that you need to query to determine if
> there's an active CVE on the associated TCB version. This is how they
> avoid adding every cpu's key to the leaf certificate's CRL.
> 
> You really shouldn't be putting attestation validation logic in the
> kernel.

It was less putting validation logic in the kernel, and more hoping for
a way to abstract some common parsing in advance of a true standard
attestation format, but point taken.

> It belongs outside of the VM entirely with the party that will
> only release access keys to the VM if it can prove it's running the
> software it claims, on the platform it claims. I think Windows puts a
> remote procedure call in their guest attestation driver to the Azure
> attestation service, and that is an anti-pattern in my mind.

I cannot speak to the Windows implementation, but the Linux Keys
subsystem is there to handle Key construction that may be requested by
userspace or the kernel and may be serviced by built-in keys,
device/platform instantiated keys, or keys retrieved via an upcall to
userspace.

The observation is that existing calls to request_key() in the kernel
likely have reason to be serviced by a confidential computing key server
somewhere in the chain. So, might as well enlighten the Keys subsystem
to retrieve this information and skip round trips to userspace to run
vendor-specific ioctls. Make the kernel as self-sufficient as possible,
and make SEV, TDX, etc. developers talk more to each other about their
needs.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  2:16               ` Huang, Kai
@ 2023-06-28  6:46                 ` gregkh
  2023-06-28  8:56                   ` Huang, Kai
  0 siblings, 1 reply; 41+ messages in thread
From: gregkh @ 2023-06-28  6:46 UTC (permalink / raw
  To: Huang, Kai
  Cc: Williams, Dan J, dionnaglaze@google.com, corbet@lwn.net,
	Aktas, Erdem, linux-coco@lists.linux.dev, shuah@kernel.org,
	Du, Fan, Luck, Tony, dave.hansen@linux.intel.com,
	brijesh.singh@amd.com, joey.gouly@arm.com, qinkun@apache.org,
	kirill.shutemov@linux.intel.com, mingo@redhat.com,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	linux-doc@vger.kernel.org, wander@redhat.com, atishp@rivosinc.com,
	hpa@zytor.com, chongc@google.com, bp@alien8.de,
	linux-kselftest@vger.kernel.org,
	sathyanarayanan.kuppuswamy@linux.intel.com, dhowells@redhat.com,
	Yu, Guorui, x86@kernel.org

On Wed, Jun 28, 2023 at 02:16:45AM +0000, Huang, Kai wrote:
> > You really shouldn't be putting attestation validation logic in the
> > kernel.
> 
> Agreed.  The data blob for remote verification should be just some data blob to
> the kernel.  I think the kernel shouldn't even try to understand the data blob
> is for which architecture.  From the kernel's perspective, it should be just
> some data blob that the kernel gets from hardware/firmware or whatever embedded
> in the root-of-trust in the hardware after taking some input from usrspace for
> the unique identity of the blob that can be used to, e.g., mitigate replay-
> attack, etc.

Great, then use the common "data blob" api that we have had in the kernel
for a very long time now, the "firmware download" api, or the sysfs
binary file api.  Both of them just use the kernel as a pass-through and
do not touch the data at all.  No need for crazy custom ioctls and all
that mess :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  6:46                 ` gregkh
@ 2023-06-28  8:56                   ` Huang, Kai
  2023-06-28  9:02                     ` gregkh
  0 siblings, 1 reply; 41+ messages in thread
From: Huang, Kai @ 2023-06-28  8:56 UTC (permalink / raw
  To: gregkh@linuxfoundation.org
  Cc: corbet@lwn.net, linux-coco@lists.linux.dev, shuah@kernel.org,
	Yu, Guorui, Luck, Tony, dave.hansen@linux.intel.com,
	joey.gouly@arm.com, dionnaglaze@google.com, qinkun@apache.org,
	kirill.shutemov@linux.intel.com, mingo@redhat.com, x86@kernel.org,
	linux-kernel@vger.kernel.org, Du, Fan, tglx@linutronix.de,
	linux-doc@vger.kernel.org, wander@redhat.com, atishp@rivosinc.com,
	Aktas, Erdem, hpa@zytor.com, chongc@google.com, bp@alien8.de,
	linux-kselftest@vger.kernel.org,
	sathyanarayanan.kuppuswamy@linux.intel.com, brijesh.singh@amd.com,
	Williams, Dan J, dhowells@redhat.com

On Wed, 2023-06-28 at 08:46 +0200, gregkh@linuxfoundation.org wrote:
> On Wed, Jun 28, 2023 at 02:16:45AM +0000, Huang, Kai wrote:
> > > You really shouldn't be putting attestation validation logic in the
> > > kernel.
> > 
> > Agreed.  The data blob for remote verification should be just some data blob to
> > the kernel.  I think the kernel shouldn't even try to understand the data blob
> > is for which architecture.  From the kernel's perspective, it should be just
> > some data blob that the kernel gets from hardware/firmware or whatever embedded
> > in the root-of-trust in the hardware after taking some input from usrspace for
> > the unique identity of the blob that can be used to, e.g., mitigate replay-
> > attack, etc.
> 
> Great, then use the common "data blob" api that we have in the kernel
> for a very long time now, the "firwmare download" api, or the sysfs
> binary file api.  Both of them just use the kernel as a pass-through and
> do not touch the data at all.  No need for crazy custom ioctls and all
> that mess :)
> 

I guess I was speaking from the "kernel shouldn't try to parse attestation data
blob" perspective.  Looking at AMD's attestation flow (of which I have no deep
understanding), the assumption of "one remotely verifiable data blob" isn't even
true -- AMD can return the "attestation report" (remotely verifiable) and the
"certificate" to verify it separately:

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/snp-attestation.html

On the other hand, AFAICT Intel SGX-based attestation doesn't have a mechanism
"for the kernel" to return certificate(s), but chooses to embed the
certificate(s) in the Quote itself.  I believe we can add such a mechanism
(e.g., another TDVMCALL) for the kernel to get certificate(s) separately, but
AFAICT it doesn't exist yet.

Btw, getting the "remotely verifiable blob" is only one step of the attestation
flow.  For instance, before the blob can be generated, there must be a step to
establish the attestation key between the machine and the attestation service.
And the flow to do this could be very different between vendors too.

That being said, while I believe all those differences can be unified in some
way, I think the question is whether it is worth the effort to try to unify the
attestation flow for all vendors.  As Erdem Aktas mentioned earlier, "the
number of CPU vendors for confidential computing seems manageable".

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  8:56                   ` Huang, Kai
@ 2023-06-28  9:02                     ` gregkh
  2023-06-28  9:45                       ` Huang, Kai
  0 siblings, 1 reply; 41+ messages in thread
From: gregkh @ 2023-06-28  9:02 UTC (permalink / raw
  To: Huang, Kai
  Cc: corbet@lwn.net, linux-coco@lists.linux.dev, shuah@kernel.org,
	Yu, Guorui, Luck, Tony, dave.hansen@linux.intel.com,
	joey.gouly@arm.com, dionnaglaze@google.com, qinkun@apache.org,
	kirill.shutemov@linux.intel.com, mingo@redhat.com, x86@kernel.org,
	linux-kernel@vger.kernel.org, Du, Fan, tglx@linutronix.de,
	linux-doc@vger.kernel.org, wander@redhat.com, atishp@rivosinc.com,
	Aktas, Erdem, hpa@zytor.com, chongc@google.com, bp@alien8.de,
	linux-kselftest@vger.kernel.org,
	sathyanarayanan.kuppuswamy@linux.intel.com, brijesh.singh@amd.com,
	Williams, Dan J, dhowells@redhat.com

On Wed, Jun 28, 2023 at 08:56:30AM +0000, Huang, Kai wrote:
> On Wed, 2023-06-28 at 08:46 +0200, gregkh@linuxfoundation.org wrote:
> > On Wed, Jun 28, 2023 at 02:16:45AM +0000, Huang, Kai wrote:
> > > > You really shouldn't be putting attestation validation logic in the
> > > > kernel.
> > > 
> > > Agreed.  The data blob for remote verification should be just some data blob to
> > > the kernel.  I think the kernel shouldn't even try to understand the data blob
> > > is for which architecture.  From the kernel's perspective, it should be just
> > > some data blob that the kernel gets from hardware/firmware or whatever embedded
> > > in the root-of-trust in the hardware after taking some input from usrspace for
> > > the unique identity of the blob that can be used to, e.g., mitigate replay-
> > > attack, etc.
> > 
> > Great, then use the common "data blob" api that we have in the kernel
> > for a very long time now, the "firwmare download" api, or the sysfs
> > binary file api.  Both of them just use the kernel as a pass-through and
> > do not touch the data at all.  No need for crazy custom ioctls and all
> > that mess :)
> > 
> 
> I guess I was talking about from "kernel shouldn't try to parse attestation data
> blob" perspective.  Looking at AMD's attestation flow (I have no deep
> understanding of AMD's attestation flow), the assumption of "one remote
> verifiable data blob" isn't even true -- AMD can return "attestation report"
> (remote verifiable) and the "certificate" to verify it separately:
> 
> https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/snp-attestation.html
> 
> On the other hand, AFAICT Intel SGX-based attestation doesn't have a mechanism
> "for the kernel" to return certificate(s), but choose to embed the
> certificate(s) to the Quote itself.  I believe we can add such mechanism (e.g.,
> another TDVMCALL) for the kernel to get certificate(s) separately, but AFAICT it
> doesn't exist yet.
> 
> Btw, getting "remote verifiable blob" is only one step of the attestation flow.
> For instance, before the blob can be generated, there must be a step to
> establish the attestation key between the machine and the attestation service. 
> And the flow to do this could be very different between vendors too.
> 
> That being said, while I believe all those differences can be unified in some
> way, I think the question is whether it is worth to put such effort to try to
> unify attestation flow for all vendors.  As Erdem Aktas mentioned earlier, "the
> number of CPU vendors for confidential computing seems manageable".

So you think that there should be a custom user/kernel api for every
single different CPU vendor?  That's not how kernel development works,
sorry.  Let's try to unify them to make both the kernel, and userspace,
sane.

And Dan is right, if this is handling keys, then the key subsystem needs
to be used here instead of custom ioctls.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  9:02                     ` gregkh
@ 2023-06-28  9:45                       ` Huang, Kai
  0 siblings, 0 replies; 41+ messages in thread
From: Huang, Kai @ 2023-06-28  9:45 UTC (permalink / raw
  To: gregkh@linuxfoundation.org
  Cc: corbet@lwn.net, linux-coco@lists.linux.dev, dhowells@redhat.com,
	shuah@kernel.org, brijesh.singh@amd.com, Luck, Tony,
	joey.gouly@arm.com, dave.hansen@linux.intel.com,
	dionnaglaze@google.com, qinkun@apache.org,
	linux-kernel@vger.kernel.org, mingo@redhat.com, Williams, Dan J,
	kirill.shutemov@linux.intel.com, tglx@linutronix.de,
	linux-doc@vger.kernel.org, wander@redhat.com, atishp@rivosinc.com,
	Du, Fan, hpa@zytor.com, chongc@google.com, bp@alien8.de,
	linux-kselftest@vger.kernel.org, Aktas, Erdem,
	sathyanarayanan.kuppuswamy@linux.intel.com, Yu, Guorui,
	x86@kernel.org

On Wed, 2023-06-28 at 11:02 +0200, gregkh@linuxfoundation.org wrote:
> On Wed, Jun 28, 2023 at 08:56:30AM +0000, Huang, Kai wrote:
> > On Wed, 2023-06-28 at 08:46 +0200, gregkh@linuxfoundation.org wrote:
> > > On Wed, Jun 28, 2023 at 02:16:45AM +0000, Huang, Kai wrote:
> > > > > You really shouldn't be putting attestation validation logic in the
> > > > > kernel.
> > > > 
> > > > Agreed.  The data blob for remote verification should be just some data blob to
> > > > the kernel.  I think the kernel shouldn't even try to understand the data blob
> > > > is for which architecture.  From the kernel's perspective, it should be just
> > > > some data blob that the kernel gets from hardware/firmware or whatever embedded
> > > > in the root-of-trust in the hardware after taking some input from usrspace for
> > > > the unique identity of the blob that can be used to, e.g., mitigate replay-
> > > > attack, etc.
> > > 
> > > Great, then use the common "data blob" api that we have in the kernel
> > > for a very long time now, the "firwmare download" api, or the sysfs
> > > binary file api.  Both of them just use the kernel as a pass-through and
> > > do not touch the data at all.  No need for crazy custom ioctls and all
> > > that mess :)
> > > 
> > 
> > I guess I was talking about from "kernel shouldn't try to parse attestation data
> > blob" perspective.  Looking at AMD's attestation flow (I have no deep
> > understanding of AMD's attestation flow), the assumption of "one remote
> > verifiable data blob" isn't even true -- AMD can return "attestation report"
> > (remote verifiable) and the "certificate" to verify it separately:
> > 
> > https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/snp-attestation.html
> > 
> > On the other hand, AFAICT Intel SGX-based attestation doesn't have a mechanism
> > "for the kernel" to return certificate(s), but choose to embed the
> > certificate(s) to the Quote itself.  I believe we can add such mechanism (e.g.,
> > another TDVMCALL) for the kernel to get certificate(s) separately, but AFAICT it
> > doesn't exist yet.
> > 
> > Btw, getting "remote verifiable blob" is only one step of the attestation flow.
> > For instance, before the blob can be generated, there must be a step to
> > establish the attestation key between the machine and the attestation service. 
> > And the flow to do this could be very different between vendors too.
> > 
> > That being said, while I believe all those differences can be unified in some
> > way, I think the question is whether it is worth to put such effort to try to
> > unify attestation flow for all vendors.  As Erdem Aktas mentioned earlier, "the
> > number of CPU vendors for confidential computing seems manageable".
> 
> So you think that there should be a custom user/kernel api for every
> single different CPU vendor?  That's not how kernel development works,
> sorry.  Let's try to unify them to make both the kernel, and userspace,
> sane.
> 
> And Dan is right, if this is handling keys, then the key subsystem needs
> to be used here instead of custom ioctls.
> 

Sure.  I have no objection to this.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  0:11           ` Dan Williams
  2023-06-28  1:36             ` Dionna Amalie Glaze
@ 2023-06-28 15:24             ` Samuel Ortiz
  1 sibling, 0 replies; 41+ messages in thread
From: Samuel Ortiz @ 2023-06-28 15:24 UTC (permalink / raw
  To: Dan Williams
  Cc: Dionna Amalie Glaze, Sathyanarayanan Kuppuswamy, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, Shuah Khan,
	Jonathan Corbet, H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Chong Cai, Qinkun Bao,
	Guorui Yu, Du Fan, linux-kernel, linux-kselftest, linux-doc,
	dhowells, brijesh.singh, atishp, gregkh, linux-coco, joey.gouly

On Tue, Jun 27, 2023 at 05:11:07PM -0700, Dan Williams wrote:
> Dionna Amalie Glaze wrote:
> > On Sun, Jun 25, 2023 at 8:06 PM Sathyanarayanan Kuppuswamy
> > <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
> [..]
> > > Hi Dan,
> > >
> > > On 6/23/23 3:27 PM, Dan Williams wrote:
> > > > Dan Williams wrote:
> > > >> [ add David, Brijesh, and Atish]
> > > >>
> > > >> Kuppuswamy Sathyanarayanan wrote:
> > > >>> In TDX guest, the second stage of the attestation process is Quote
> > > >>> generation. This process is required to convert the locally generated
> > > >>> TDREPORT into a remotely verifiable Quote. It involves sending the
> > > >>> TDREPORT data to a Quoting Enclave (QE) which will verify the
> > > >>> integrity of the TDREPORT and sign it with an attestation key.
> > > >>>
> > > >>> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> > > >>> allow the user agent to get the TD Quote.
> > > >>>
> > > >>> Add a kernel selftest module to verify the Quote generation feature.
> > > >>>
> > > >>> TD Quote generation involves following steps:
> > > >>>
> > > >>> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> > > >>> * Embed the TDREPORT data in quote buffer and request for quote
> > > >>>   generation via TDX_CMD_GET_QUOTE IOCTL request.
> > > >>> * Upon completion of the GetQuote request, check for non zero value
> > > >>>   in the status field of Quote header to make sure the generated
> > > >>>   quote is valid.
> > > >>
> > > >> What this cover letter does not say is that this is adding another
> > > >> instance of the similar pattern as SNP_GET_REPORT.
> > > >>
> > > >> Linux is best served when multiple vendors trying to do similar
> > > >> operations are brought together behind a common ABI. We see this in the
> > > >> history of wrangling SCSI vendors behind common interfaces. Now multiple
> > > >> confidential computing vendors trying to develop similar flows with
> > > >> differentiated formats where that differentiation need not leak over the
> > > >> ABI boundary.
> > > > [..]
> > > >
> > > > Below is a rough mock up of this approach to demonstrate the direction.
> > > > Again, the goal is to define an ABI that can support any vendor's
> > > > arch-specific attestation method and key provisioning flows without
> > > > leaking vendor-specific details, or confidential material over the
> > > > user/kernel ABI.
> > >
> > > Thanks for working on this mock code and helping out. It gives me the
> > > general idea about your proposal.
> > >
> > > >
> > > > The observation is that there are a sufficient number of attestation
> > > > flows available to review where Linux can define a superset ABI to
> > > > contain them all. The other observation is that the implementations have
> > > > features that may cross-pollinate over time. For example the SEV
> > > > privilege level consideration ("vmpl"), and the TDX RTMR (think TPM
> > > > PCRs) mechanisms address generic Confidential Computing use cases.
> > >
> > >
> > > I agree with your point about VMPL and RTMR feature cases. This observation
> > > is valid for AMD SEV and TDX attestation flows. But I am not sure whether
> > > it will hold true for other vendor implementations. Our sample set is not
> > > good enough to make this conclusion. The reason for my concern is, if you
> > > check the ABI interface used in the S390 arch attestation driver
> > > (drivers/s390/char/uvdevice.c), you would notice that there is a significant
> > > difference between the ABI used in that driver and SEV/TDX drivers. The S390
> > > driver attestation request appears to accept two data blobs as input, as well
> > > as a variety of vendor-specific header configurations.
> > >
> > > Maybe the s390 attestation model is a special case, but, I think we consider
> > > this issue. Since we don't have a common spec, there is chance that any
> > > superset ABI we define now may not meet future vendor requirements. One way to
> > > handle it to leave enough space in the generic ABI to handle future vendor
> > > requirements.
> > >
> > > I think it would be better if other vendors (like ARM or RISC) can comment and
> > > confirm whether this proposal meets their demands.
> > >
> > 
> > The VMPL-based separation that will house the supervisor module known
> > as SVSM can have protocols that implement a TPM command interface, or
> > an RTMR-extension interface, and will also need to have an
> > SVSM-specific protocol attestation report format to keep the secure
> > chain of custody apparent. We'd have different formats and protocols
> > in the kernel, at least, to speak to each technology. 
> 
> That's where I hope the line can be drawn, i.e. that all of this vendor
> differentiation really only matters inside the kernel in the end.

Looking at your keys subsystem based PoC (thanks for putting it
together), I understand that the intention is to pass an attestation
evidence request as a payload to the kernel, in an abstract way.
i.e. the void *data in:
static int guest_attest_request_key(struct key *authkey, void *data)

And then passing that down to vendor specific handlers
(tdx_request_attest in your PoC) for it to behave as a key server for
that attestation evidence request. The vendor magic of transforming an
attestation request into an actual attestation evidence (typically
signed with platform derived keys) is stuffed into that handler. The
format, content and protection of both the attestation evidence request
and the evidence itself is left for the guest kernel handler (e.g.
tdx_request_attest) to handle.

Is that a fair description of your proposal?

If it is, then it makes sense to me, and could serve as a generic
abstraction for confidential computing guest attestation evidence
requests. I think it could support the TDX, SEV and also the RISC-V
(aka CoVE) guest attestation request evidence flow.
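
To make the pattern described above concrete, here is a rough userspace-only
sketch of it. It is not the PoC code: every name in it (attest_ops,
register_attest_ops, attest_request, fake_vendor_ops) is hypothetical, and the
"vendor" handler is a stand-in that only tags and copies the request. The point
is just that the generic layer treats both the request and the evidence as
opaque bytes, and that only one vendor backend is registered at a time.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a vendor handler table; not a kernel API. */
struct attest_ops {
	const char *name;
	/* Turn an opaque request into opaque evidence.
	 * Returns the number of evidence bytes written, or -errno. */
	int (*request_attest)(const void *req, size_t req_len,
			      void *evidence, size_t evidence_cap);
};

/* A single backend at a time, mirroring the one-vendor-per-guest case. */
static const struct attest_ops *registered_ops;

int register_attest_ops(const struct attest_ops *ops)
{
	if (registered_ops)
		return -EEXIST;
	registered_ops = ops;
	return 0;
}

void unregister_attest_ops(const struct attest_ops *ops)
{
	if (registered_ops == ops)
		registered_ops = NULL;
}

/* Generic entry point: never looks inside req or evidence. */
int attest_request(const void *req, size_t req_len,
		   void *evidence, size_t evidence_cap)
{
	if (!registered_ops)
		return -ENOENT;
	return registered_ops->request_attest(req, req_len,
					      evidence, evidence_cap);
}

/* Stand-in vendor handler: a real one would talk to the TSM/ASP/etc. */
static int fake_vendor_attest(const void *req, size_t req_len,
			      void *evidence, size_t evidence_cap)
{
	static const char tag[] = "FAKE";
	size_t tag_len = sizeof(tag) - 1;

	if (tag_len + req_len > evidence_cap)
		return -ENOSPC;
	memcpy(evidence, tag, tag_len);
	memcpy((char *)evidence + tag_len, req, req_len);
	return (int)(tag_len + req_len);
}

const struct attest_ops fake_vendor_ops = {
	.name = "fake",
	.request_attest = fake_vendor_attest,
};
```

Any decoding of the request into a vendor specific structure would live inside
the handler, so the generic entry point would not need to change per vendor.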

> > I'm not sure it's worth the trouble of papering over all the... 3-4
> > technologies with similar but still weirdly different formats and ways
> > of doing things with an abstracted attestation ABI, especially since
> > the output all has to be interpreted in an architecture-specific way
> > anyway.
> 
> This is where I need help. Can you identify where the following
> assertion falls over:
> 
> "The minimum viable key-server is one that can generically validate a
> blob with an ECDSA signature".
> 
> I.e. the fact that SEV and TDX send different length blobs is less
> important than validating that signature.

I'm not sure which signature we're talking about here.
The final attestation evidence (the blob the guest workload will send to
a remote attestation service) is signed with a platform derived key,
typically rooted in a manufacturer's CA. It is then up to the *remote*
attestation service to authenticate and validate the evidence signature.
Then it can go through the actual attestation verification flow
(comparison against reference values, policy checks, etc). The latter
should be none of the guest kernel's business, which is what your
proposal seems to be heading to.


> If it is always the case that specific fields in the blob need to be
> decoded then yes, that weakens the assertion. 

Your vendor specific key handler may have to decode the passed void
pointer into a vendor specific structure before sending it down to the
TSM/ASP/etc, and that's fine imho.

Cheers,
Samuel.

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  1:36             ` Dionna Amalie Glaze
  2023-06-28  2:16               ` Huang, Kai
  2023-06-28  2:52               ` Dan Williams
@ 2023-06-28 15:31               ` Samuel Ortiz
  2 siblings, 0 replies; 41+ messages in thread
From: Samuel Ortiz @ 2023-06-28 15:31 UTC (permalink / raw
  To: Dionna Amalie Glaze
  Cc: Dan Williams, Sathyanarayanan Kuppuswamy, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, Shuah Khan,
	Jonathan Corbet, H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Chong Cai, Qinkun Bao,
	Guorui Yu, Du Fan, linux-kernel, linux-kselftest, linux-doc,
	dhowells, brijesh.singh, atishp, gregkh, linux-coco, joey.gouly

On Tue, Jun 27, 2023 at 06:36:07PM -0700, Dionna Amalie Glaze wrote:
> On Tue, Jun 27, 2023 at 5:13 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > [..]
> > >
> > > The VMPL-based separation that will house the supervisor module known
> > > as SVSM can have protocols that implement a TPM command interface, or
> > > an RTMR-extension interface, and will also need to have an
> > > SVSM-specific protocol attestation report format to keep the secure
> > > chain of custody apparent. We'd have different formats and protocols
> > > in the kernel, at least, to speak to each technology.
> >
> > That's where I hope the line can be drawn, i.e. that all of this vendor
> > differentiation really only matters inside the kernel in the end.
> >
> > > I'm not sure it's worth the trouble of papering over all the... 3-4
> > > technologies with similar but still weirdly different formats and ways
> > > of doing things with an abstracted attestation ABI, especially since
> > > the output all has to be interpreted in an architecture-specific way
> > > anyway.
> >
> > This is where I need help. Can you identify where the following
> > assertion falls over:
> >
> > "The minimum viable key-server is one that can generically validate a
> > blob with an ECDSA signature".
> >
> > I.e. the fact that SEV and TDX send different length blobs is less
> > important than validating that signature.
> >
> > If it is always the case that specific fields in the blob need to be
> > decoded then yes, that weakens the assertion. However, maybe that means
> > that kernel code parses the blob and conveys that parsed info along with
> > vendor attestation payload all signed by a Linux key. I.e. still allow
> > for a unified output format + signed vendor blob and provide a path to
> > keep all the vendor specific handling internal to the kernel.
> >
> 
> All the specific fields of the blob have to be decoded and subjected
> to an acceptance policy. That policy will most always be different
> across different platforms and VM owners. I wrote all of
> github.com/google/go-sev-guest, including the verification and
> validation logic, and it's going to get more complicated, and the
> sources of the data that provide validators with notions of what
> values can be trusted will be varied. The formats are not
> standardized. The Confidential Computing Consortium should be working
> toward that, but it's a slow process. There's IETF RATS. There's
> in-toto.io attestations. There's Azure's JWT thing. There's a signed
> serialized protocol buffer that I've decided is what Google is going
> to produce while we figure out all the "right" formats to use. There
> will be factions and absolute gridlock for multiple years if we
> require solidifying an abstraction for the kernel to manage all this
> logic before passing a report on to user space.

I agree with most of the above, but all that nightmare^Wcomplexity is
handled on the remote attestation side. If I understand the current
discussion, it's about how to abstract a guest attestation evidence
generation request in a vendor agnostic way. And I think what's proposed
here is simply to pass a binary payload (the evidence request from the
guest userspace) to the kernel key subsystem, hook it into a vendor
specific handler, and return the attestation evidence (a blob signed
with a platform key) to the guest app. The guest app can then give that
to an attestation service, and that's when all the above described
complexity takes place. Am I missing something?
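
For illustration, the guest userspace side of that flow could prepare its
request roughly as below. This is only a sketch under assumptions: the
"guest_attest:<level>:<id>" description format and the base64 encoded callout
are taken from the mockup quoted in this thread, while the helper names
(b64_encode, build_desc, vet_desc) are made up here. The actual request_key(2)
call is only mentioned in a comment, since it needs a kernel carrying the
mockup.

```c
#include <stddef.h>
#include <stdio.h>

/* RFC 4648 base64 encoder. Writes a NUL-terminated string into out.
 * Returns the encoded length, or -1 if out is too small. */
int b64_encode(const unsigned char *in, size_t len, char *out, size_t cap)
{
	static const char tbl[] =
		"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
	size_t need = 4 * ((len + 2) / 3);
	size_t i, o = 0;
	unsigned int v;

	if (need + 1 > cap)
		return -1;
	/* Full 3-byte groups become 4 output characters. */
	for (i = 0; i + 2 < len; i += 3) {
		v = ((unsigned int)in[i] << 16) | ((unsigned int)in[i + 1] << 8) | in[i + 2];
		out[o++] = tbl[(v >> 18) & 63];
		out[o++] = tbl[(v >> 12) & 63];
		out[o++] = tbl[(v >> 6) & 63];
		out[o++] = tbl[v & 63];
	}
	/* One or two trailing bytes are padded with '='. */
	if (len - i == 1) {
		v = (unsigned int)in[i] << 16;
		out[o++] = tbl[(v >> 18) & 63];
		out[o++] = tbl[(v >> 12) & 63];
		out[o++] = '=';
		out[o++] = '=';
	} else if (len - i == 2) {
		v = ((unsigned int)in[i] << 16) | ((unsigned int)in[i + 1] << 8);
		out[o++] = tbl[(v >> 18) & 63];
		out[o++] = tbl[(v >> 12) & 63];
		out[o++] = tbl[(v >> 6) & 63];
		out[o++] = '=';
	}
	out[o] = '\0';
	return (int)o;
}

/* Build the key description in the format the mockup parses. */
int build_desc(int level, unsigned long long id, char *out, size_t cap)
{
	int n = snprintf(out, cap, "guest_attest:%d:%llu", level, id);

	return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Same check as the mockup's vet_description: 0 if well formed, else -1. */
int vet_desc(const char *desc)
{
	unsigned long long id;
	int level;

	return sscanf(desc, "guest_attest:%d:%llu", &level, &id) == 2 ? 0 : -1;
}

/*
 * With the mockup applied, the request itself would then be something like
 *   request_key("guest_attest", desc, b64_payload, dest_keyring);
 * and the evidence would be read back from the resulting key.
 */
```

Everything vendor specific stays behind that single call; the guest app only
sees the opaque payload it sent and the opaque evidence it gets back.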

> Now, not only are the field contents important, the certificates of
> the keys that signed the report are important. Each platform has its
> own special x509v3 extensions and key hierarchy to express what parts
> of the report should be what value if signed by this key, and in TDX's
> case there are extra endpoints that you need to query to determine if
> there's an active CVE on the associated TCB version. This is how they
> avoid adding every cpu's key to the leaf certificate's CRL.
> 
> You really shouldn't be putting attestation validation logic in the
> kernel. 

AFAIU, that's not part of the proposal/PoC/mockup. It's all about
funneling an attestation evidence request down to the TSM/PSP/firmware
for it to generate an actually verifiable attestation evidence.

Cheers,
Samuel.

* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-27  0:39           ` Sathyanarayanan Kuppuswamy
@ 2023-06-28 15:41             ` Samuel Ortiz
  2023-06-28 15:55               ` Sathyanarayanan Kuppuswamy
  0 siblings, 1 reply; 41+ messages in thread
From: Samuel Ortiz @ 2023-06-28 15:41 UTC (permalink / raw
  To: Sathyanarayanan Kuppuswamy
  Cc: Dionna Amalie Glaze, Dan Williams, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet,
	H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Chong Cai, Qinkun Bao,
	Guorui Yu, Du Fan, linux-kernel, linux-kselftest, linux-doc,
	dhowells, brijesh.singh, atishp, gregkh, linux-coco, joey.gouly

On Mon, Jun 26, 2023 at 05:39:12PM -0700, Sathyanarayanan Kuppuswamy wrote:
> +Atish
> 
> Atish, any comments on this topic from RISC-v?

The CoVE (RISC-V confidential computing specification) would benefit
from the proposed abstract API. Similar to both TDX and SEV, the
proposed CoVE attestation evidence generation interface [1] (the spec is
not ratified yet) basically takes some binary blobs in (a TVM public key
and an attestation challenge blob) and requests the TSM to generate
attestation evidence for the caller. The TSM will generate such evidence
from all static and runtime measurements and sign it with its DICE
derived key. Attestation lingo set aside, the pattern here is similar to
SEV and TDX. Having a common API for a generic attestation evidence
generation interface would avoid having to add yet another ioctl based
interface specific to CoVE.

Another interface we could think about commonizing is the measurement
extension one. I think both TDX and CoVE allow for a guest to
dynamically extend its measurements (to dedicated, runtime PCRs).

Cheers,
Samuel.

[1] https://github.com/riscv-non-isa/riscv-ap-tee/blob/main/specification/sbi_cove.adoc#function-cove-guest-get-evidence-fid-8

> On 6/26/23 11:57 AM, Dionna Amalie Glaze wrote:
> > On Sun, Jun 25, 2023 at 8:06 PM Sathyanarayanan Kuppuswamy
> > <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
> >>
> >> Hi Dan,
> >>
> >> On 6/23/23 3:27 PM, Dan Williams wrote:
> >>> Dan Williams wrote:
> >>>> [ add David, Brijesh, and Atish]
> >>>>
> >>>> Kuppuswamy Sathyanarayanan wrote:
> >>>>> In TDX guest, the second stage of the attestation process is Quote
> >>>>> generation. This process is required to convert the locally generated
> >>>>> TDREPORT into a remotely verifiable Quote. It involves sending the
> >>>>> TDREPORT data to a Quoting Enclave (QE) which will verify the
> >>>>> integrity of the TDREPORT and sign it with an attestation key.
> >>>>>
> >>>>> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> >>>>> allow the user agent to get the TD Quote.
> >>>>>
> >>>>> Add a kernel selftest module to verify the Quote generation feature.
> >>>>>
> >>>>> TD Quote generation involves following steps:
> >>>>>
> >>>>> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> >>>>> * Embed the TDREPORT data in quote buffer and request for quote
> >>>>>   generation via TDX_CMD_GET_QUOTE IOCTL request.
> >>>>> * Upon completion of the GetQuote request, check for non zero value
> >>>>>   in the status field of Quote header to make sure the generated
> >>>>>   quote is valid.
> >>>>
> >>>> What this cover letter does not say is that this is adding another
> >>>> instance of the similar pattern as SNP_GET_REPORT.
> >>>>
> >>>> Linux is best served when multiple vendors trying to do similar
> >>>> operations are brought together behind a common ABI. We see this in the
> >>>> history of wrangling SCSI vendors behind common interfaces. Now multiple
> >>>> confidential computing vendors trying to develop similar flows with
> >>>> differentiated formats where that differentiation need not leak over the
> >>>> ABI boundary.
> >>> [..]
> >>>
> >>> Below is a rough mock up of this approach to demonstrate the direction.
> >>> Again, the goal is to define an ABI that can support any vendor's
> >>> arch-specific attestation method and key provisioning flows without
> >>> leaking vendor-specific details, or confidential material over the
> >>> user/kernel ABI.
> >>
> >> Thanks for working on this mock code and helping out. It gives me the
> >> general idea about your proposal.
> >>
> >>>
> >>> The observation is that there are a sufficient number of attestation
> >>> flows available to review where Linux can define a superset ABI to
> >>> contain them all. The other observation is that the implementations have
> >>> features that may cross-pollinate over time. For example the SEV
> >>> privilege level consideration ("vmpl"), and the TDX RTMR (think TPM
> >>> PCRs) mechanisms address generic Confidential Computing use cases.
> >>
> >>
> >> I agree with your point about VMPL and RTMR feature cases. This observation
> >> is valid for AMD SEV and TDX attestation flows. But I am not sure whether
> >> it will hold true for other vendor implementations. Our sample set is not
> >> good enough to make this conclusion. The reason for my concern is, if you
> >> check the ABI interface used in the S390 arch attestation driver
> >> (drivers/s390/char/uvdevice.c), you would notice that there is a significant
> >> difference between the ABI used in that driver and SEV/TDX drivers. The S390
> >> driver attestation request appears to accept two data blobs as input, as well
> >> as a variety of vendor-specific header configurations.
> >>
> >> Maybe the s390 attestation model is a special case, but, I think we consider
> >> this issue. Since we don't have a common spec, there is chance that any
> >> superset ABI we define now may not meet future vendor requirements. One way to
> >> handle it to leave enough space in the generic ABI to handle future vendor
> >> requirements.
> >>
> >> I think it would be better if other vendors (like ARM or RISC) can comment and
> >> confirm whether this proposal meets their demands.
> >>
> > 
> > The VMPL-based separation that will house the supervisor module known
> > as SVSM can have protocols that implement a TPM command interface, or
> > an RTMR-extension interface, and will also need to have an
> > SVSM-specific protocol attestation report format to keep the secure
> > chain of custody apparent. We'd have different formats and protocols
> > in the kernel, at least, to speak to each technology. I'm not sure
> > it's worth the trouble of papering over all the... 3-4 technologies
> > with similar but still weirdly different formats and ways of doing
> > things with an abstracted attestation ABI, especially since the output
> > all has to be interpreted in an architecture-specific way anyway.
> > 
> > ARM's Confidential Computing Realm Management Extensions (RME) seems
> > to be going along the lines of a runtime measurement register model
> > with their hardware enforced security. The number of registers isn't
> > prescribed in the spec.
> > 
> > +Joey Gouly +linux-coco@lists.linux.dev as far as RME is concerned, do
> > you know who would be best to weigh in on this discussion of a unified
> > attestation model?
> 
> 
> > 
> >>>
> >>> Vendor specific ioctls for all of this feels like surrender when Linux
> >>> already has the keys subsystem which has plenty of degrees of freedom
> >>> for tracking blobs with signatures and using those blobs to instantiate
> >>> other blobs. It already serves as the ABI wrapping various TPM
> >>> implementations and marshaling keys for storage encryption and other use
> >>> cases that intersect Confidential Computing.
> >>>
> >>> The benefit of deprecating vendor-specific abstraction layers in
> >>> userspace is secondary. The primary benefit is collaboration. It enables
> >>> kernel developers from various architectures to collaborate on common
> >>> infrastructure. If, referring back to my previous example, SEV adopts an
> >>> RTMR-like mechanism and TDX adopts a vmpl-like mechanism it would be
> >>> unfortunate if those efforts were siloed, duplicated, and needlessly
> >>> differentiated to userspace. So while there are arguably a manageable
> >>> number of basic arch attestation methods the planned expansion of those
> >>> to build incremental functionality is where I believe we, as a
> >>> community, will be glad that we invested in a "Linux format" for all of
> >>> this.
> >>>
> >>> An example, to show what the strawman patch below enables: (req_key is
> >>> the sample program from "man 2 request_key")
> >>>
> >>> # ./req_key guest_attest guest_attest:0:0-$desc $(cat user_data | base64)
> >>> Key ID is 10e2f3a7
> >>> # keyctl pipe 0x10e2f3a7 | hexdump -C
> >>> 00000000  54 44 58 20 47 65 6e 65  72 61 74 65 64 20 51 75  |TDX Generated Qu|
> >>> 00000010  6f 74 65 00 00 00 00 00  00 00 00 00 00 00 00 00  |ote.............|
> >>> 00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> >>> *
> >>> 00004000
> >>>
> >>> This is the kernel instantiating a TDX Quote without the TDREPORT
> >>> implementation detail ever leaving the kernel. Now, this is only the
> >>
> >> IIUC, the idea here is to cache the quote data and return it to the user whenever
> >> possible, right? If yes, I think such optimization may not be very useful for our
> >> case. AFAIK, the quote data will change whenever there is a change in the guest
> >> measurement data. Since the validity of the generated quote will not be long,
> >> and the frequency of quote generation requests is expected to be less, we may not
> >> get much benefit from caching the quote data. I think we can keep this logic simple
> >> by directly retrieving the quote data from the quoting enclave whenever there is a
> >> request from the user.
> >>
> >>> top-half of what is needed. The missing bottom half takes that material
> >>> and uses it to instantiate derived key material like the storage
> >>> decryption key internal to the kernel. See "The Process" in
> >>> Documentation/security/keys/request-key.rst for how the Keys subsystem
> >>> handles the "keys for keys" use case.
> >>
> >> This is only useful for key-server use case, right? Attestation can also be
> >> used for use cases like pattern matching or uploading some secure data, etc.
> >> Since key-server is not the only use case, does it make sense to support
> >> this derived key feature?
> >>
> >>>
> >>> ---
> >>> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> >>> index f79ab13a5c28..0f775847028e 100644
> >>> --- a/drivers/virt/Kconfig
> >>> +++ b/drivers/virt/Kconfig
> >>> @@ -54,4 +54,8 @@ source "drivers/virt/coco/sev-guest/Kconfig"
> >>>
> >>>  source "drivers/virt/coco/tdx-guest/Kconfig"
> >>>
> >>> +config GUEST_ATTEST
> >>> +     tristate
> >>> +     select KEYS
> >>> +
> >>>  endif
> >>> diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
> >>> index e9aa6fc96fab..66f6b838f8f4 100644
> >>> --- a/drivers/virt/Makefile
> >>> +++ b/drivers/virt/Makefile
> >>> @@ -12,3 +12,4 @@ obj-$(CONFIG_ACRN_HSM)              += acrn/
> >>>  obj-$(CONFIG_EFI_SECRET)     += coco/efi_secret/
> >>>  obj-$(CONFIG_SEV_GUEST)              += coco/sev-guest/
> >>>  obj-$(CONFIG_INTEL_TDX_GUEST)        += coco/tdx-guest/
> >>> +obj-$(CONFIG_GUEST_ATTEST)   += coco/guest-attest/
> >>> diff --git a/drivers/virt/coco/guest-attest/Makefile b/drivers/virt/coco/guest-attest/Makefile
> >>> new file mode 100644
> >>> index 000000000000..5581c5a27588
> >>> --- /dev/null
> >>> +++ b/drivers/virt/coco/guest-attest/Makefile
> >>> @@ -0,0 +1,2 @@
> >>> +obj-$(CONFIG_GUEST_ATTEST) += guest_attest.o
> >>> +guest_attest-y := key.o
> >>> diff --git a/drivers/virt/coco/guest-attest/key.c b/drivers/virt/coco/guest-attest/key.c
> >>> new file mode 100644
> >>> index 000000000000..2a494b6dd7a7
> >>> --- /dev/null
> >>> +++ b/drivers/virt/coco/guest-attest/key.c
> >>> @@ -0,0 +1,159 @@
> >>> +// SPDX-License-Identifier: GPL-2.0-only
> >>> +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
> >>> +
> >>> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >>> +#include <linux/seq_file.h>
> >>> +#include <linux/key-type.h>
> >>> +#include <linux/module.h>
> >>> +#include <linux/base64.h>
> >>> +
> >>> +#include <keys/request_key_auth-type.h>
> >>> +#include <keys/user-type.h>
> >>> +
> >>> +#include "guest-attest.h"
> >>
> >> Can you share you guest-attest.h?
> >>
> >>> +
> >>> +static LIST_HEAD(guest_attest_list);
> >>> +static DECLARE_RWSEM(guest_attest_rwsem);
> >>> +
> >>> +static struct guest_attest_ops *fetch_ops(void)
> >>> +{
> >>> +     return list_first_entry_or_null(&guest_attest_list,
> >>> +                                     struct guest_attest_ops, list);
> >>> +}
> >>> +
> >>> +static struct guest_attest_ops *get_ops(void)
> >>> +{
> >>> +     down_read(&guest_attest_rwsem);
> >>> +     return fetch_ops();
> >>> +}
> >>> +
> >>> +static void put_ops(void)
> >>> +{
> >>> +     up_read(&guest_attest_rwsem);
> >>> +}
> >>> +
> >>> +int register_guest_attest_ops(struct guest_attest_ops *ops)
> >>> +{
> >>> +     struct guest_attest_ops *conflict;
> >>> +     int rc;
> >>> +
> >>> +     down_write(&guest_attest_rwsem);
> >>> +     conflict = fetch_ops();
> >>> +     if (conflict) {
> >>> +             pr_err("\"%s\" ops already registered\n", conflict->name);
> >>> +             rc = -EEXIST;
> >>> +             goto out;
> >>> +     }
> >>> +     list_add(&ops->list, &guest_attest_list);
> >>> +     try_module_get(ops->module);
> >>> +     rc = 0;
> >>> +out:
> >>> +     up_write(&guest_attest_rwsem);
> >>> +     return rc;
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(register_guest_attest_ops);
> >>> +
> >>> +void unregister_guest_attest_ops(struct guest_attest_ops *ops)
> >>> +{
> >>> +     down_write(&guest_attest_rwsem);
> >>> +     list_del(&ops->list);
> >>> +     up_write(&guest_attest_rwsem);
> >>> +     module_put(ops->module);
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(unregister_guest_attest_ops);
> >>> +
> >>> +static int __guest_attest_request_key(struct key *key, int level,
> >>> +                                   struct key *dest_keyring,
> >>> +                                   const char *callout_info, int callout_len,
> >>> +                                   struct key *authkey)
> >>> +{
> >>> +     struct guest_attest_ops *ops;
> >>> +     void *payload = NULL;
> >>> +     int rc, payload_len;
> >>> +
> >>> +     ops = get_ops();
> >>> +     if (!ops)
> >>> +             return -ENOKEY;
> >>> +
> >>> +     payload = kzalloc(max(GUEST_ATTEST_DATALEN, callout_len), GFP_KERNEL);
> >>> +     if (!payload) {
> >>> +             rc = -ENOMEM;
> >>> +             goto out;
> >>> +     }
> >>
> >> Is the idea to get the values like vmpl part of the payload?
> >>
> >>> +
> >>> +     payload_len = base64_decode(callout_info, callout_len, payload);
> >>> +     if (payload_len < 0 || payload_len > GUEST_ATTEST_DATALEN) {
> >>> +             rc = -EINVAL;
> >>> +             goto out;
> >>> +     }
> >>> +
> >>> +     rc = ops->request_attest(key, level, dest_keyring, payload, payload_len,
> >>> +                              authkey);
> >>> +out:
> >>> +     kfree(payload);
> >>> +     put_ops();
> >>> +     return rc;
> >>> +}
> >>> +
> >>> +static int guest_attest_request_key(struct key *authkey, void *data)
> >>> +{
> >>> +     struct request_key_auth *rka = get_request_key_auth(authkey);
> >>> +     struct key *key = rka->target_key;
> >>> +     unsigned long long id;
> >>> +     int rc, level;
> >>> +
> >>> +     pr_debug("desc: %s op: %s callout: %s\n", key->description, rka->op,
> >>> +              rka->callout_info ? (char *)rka->callout_info : "\"none\"");
> >>> +
> >>> +     if (sscanf(key->description, "guest_attest:%d:%llu", &level, &id) != 2)
> >>> +             return -EINVAL;
> >>> +
> >>
> >> Can you explain some details about the id and level? It is not very clear why
> >> we need it.
> >>
> >>> +     if (!rka->callout_info) {
> >>> +             rc = -EINVAL;
> >>> +             goto out;
> >>> +     }
> >>> +
> >>> +     rc = __guest_attest_request_key(key, level, rka->dest_keyring,
> >>> +                                     rka->callout_info, rka->callout_len,
> >>> +                                     authkey);
> >>> +out:
> >>> +     complete_request_key(authkey, rc);
> >>> +     return rc;
> >>> +}
> >>> +
> >>> +static int guest_attest_vet_description(const char *desc)
> >>> +{
> >>> +     unsigned long long id;
> >>> +     int level;
> >>> +
> >>> +     if (sscanf(desc, "guest_attest:%d:%llu", &level, &id) != 2)
> >>> +             return -EINVAL;
> >>> +     return 0;
> >>> +}
> >>> +
> >>> +static struct key_type key_type_guest_attest = {
> >>> +     .name = "guest_attest",
> >>> +     .preparse = user_preparse,
> >>> +     .free_preparse = user_free_preparse,
> >>> +     .instantiate = generic_key_instantiate,
> >>> +     .revoke = user_revoke,
> >>> +     .destroy = user_destroy,
> >>> +     .describe = user_describe,
> >>> +     .read = user_read,
> >>> +     .vet_description = guest_attest_vet_description,
> >>> +     .request_key = guest_attest_request_key,
> >>> +};
> >>> +
> >>> +static int __init guest_attest_init(void)
> >>> +{
> >>> +     return register_key_type(&key_type_guest_attest);
> >>> +}
> >>> +
> >>> +static void __exit guest_attest_exit(void)
> >>> +{
> >>> +     unregister_key_type(&key_type_guest_attest);
> >>> +}
> >>> +
> >>> +module_init(guest_attest_init);
> >>> +module_exit(guest_attest_exit);
> >>> +MODULE_LICENSE("GPL v2");
> >>> diff --git a/drivers/virt/coco/tdx-guest/Kconfig b/drivers/virt/coco/tdx-guest/Kconfig
> >>> index 14246fc2fb02..9a1ec85369fe 100644
> >>> --- a/drivers/virt/coco/tdx-guest/Kconfig
> >>> +++ b/drivers/virt/coco/tdx-guest/Kconfig
> >>> @@ -1,6 +1,7 @@
> >>>  config TDX_GUEST_DRIVER
> >>>       tristate "TDX Guest driver"
> >>>       depends on INTEL_TDX_GUEST
> >>> +     select GUEST_ATTEST
> >>>       help
> >>>         The driver provides userspace interface to communicate with
> >>>         the TDX module to request the TDX guest details like attestation
> >>> diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
> >>> index 388491fa63a1..65b5aab284d9 100644
> >>> --- a/drivers/virt/coco/tdx-guest/tdx-guest.c
> >>> +++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
> >>> @@ -13,11 +13,13 @@
> >>>  #include <linux/string.h>
> >>>  #include <linux/uaccess.h>
> >>>  #include <linux/set_memory.h>
> >>> +#include <linux/key-type.h>
> >>>
> >>>  #include <uapi/linux/tdx-guest.h>
> >>>
> >>>  #include <asm/cpu_device_id.h>
> >>>  #include <asm/tdx.h>
> >>> +#include "../guest-attest/guest-attest.h"
> >>>
> >>>  /*
> >>>   * Intel's SGX QE implementation generally uses Quote size less
> >>> @@ -229,6 +231,62 @@ static const struct x86_cpu_id tdx_guest_ids[] = {
> >>>  };
> >>>  MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
> >>>
> >>> +static int tdx_request_attest(struct key *key, int level,
> >>> +                           struct key *dest_keyring, void *payload,
> >>> +                           int payload_len, struct key *authkey)
> >>> +{
> >>> +     u8 *tdreport;
> >>> +     long ret;
> >>> +
> >>> +     tdreport = kzalloc(TDX_REPORT_LEN, GFP_KERNEL);
> >>> +     if (!tdreport)
> >>> +             return -ENOMEM;
> >>> +
> >>> +     /* Generate TDREPORT0 using "TDG.MR.REPORT" TDCALL */
> >>> +     ret = tdx_mcall_get_report0(payload, tdreport);
> >>> +     if (ret)
> >>> +             goto out;
> >>> +
> >>> +     mutex_lock(&quote_lock);
> >>> +
> >>> +     memset(qentry->buf, 0, qentry->buf_len);
> >>> +     reinit_completion(&qentry->compl);
> >>> +     qentry->valid = true;
> >>> +
> >>> +     /* Submit GetQuote Request using GetQuote hypercall */
> >>> +     ret = tdx_hcall_get_quote(qentry->buf, qentry->buf_len);
> >>> +     if (ret) {
> >>> +             pr_err("GetQuote hypercall failed, status:%lx\n", ret);
> >>> +             ret = -EIO;
> >>> +             goto quote_failed;
> >>> +     }
> >>> +
> >>> +     /*
> >>> +      * Although the GHCI specification does not state explicitly that
> >>> +      * the VMM must not wait indefinitely for the Quote request to be
> >>> +      * completed, a sane VMM should always notify the guest after a
> >>> +      * certain time, regardless of whether the Quote generation is
> >>> +      * successful or not.  For now just assume the VMM will do so.
> >>> +      */
> >>> +     wait_for_completion(&qentry->compl);
> >>> +
> >>> +     ret = key_instantiate_and_link(key, qentry->buf, qentry->buf_len,
> >>> +                                    dest_keyring, authkey);
> >>> +
> >>> +quote_failed:
> >>> +     qentry->valid = false;
> >>> +     mutex_unlock(&quote_lock);
> >>> +out:
> >>> +     kfree(tdreport);
> >>> +     return ret;
> >>> +}
> >>> +
> >>> +static struct guest_attest_ops tdx_attest_ops = {
> >>> +     .name = KBUILD_MODNAME,
> >>> +     .module = THIS_MODULE,
> >>> +     .request_attest = tdx_request_attest,
> >>> +};
> >>> +
> >>>  static int __init tdx_guest_init(void)
> >>>  {
> >>>       int ret;
> >>> @@ -251,8 +309,14 @@ static int __init tdx_guest_init(void)
> >>>       if (ret)
> >>>               goto free_quote;
> >>>
> >>> +     ret = register_guest_attest_ops(&tdx_attest_ops);
> >>> +     if (ret)
> >>> +             goto free_irq;
> >>> +
> >>>       return 0;
> >>>
> >>> +free_irq:
> >>> +     tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
> >>>  free_quote:
> >>>       free_quote_entry(qentry);
> >>>  free_misc:
> >>> @@ -264,6 +328,7 @@ module_init(tdx_guest_init);
> >>>
> >>>  static void __exit tdx_guest_exit(void)
> >>>  {
> >>> +     unregister_guest_attest_ops(&tdx_attest_ops);
> >>>       tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
> >>>       free_quote_entry(qentry);
> >>>       misc_deregister(&tdx_misc_dev);
> >>
> >> --
> >> Sathyanarayanan Kuppuswamy
> >> Linux Kernel Developer
> > 
> > 
> > 
> > --
> > -Dionna Glaze, PhD (she/her)
> 
> -- 
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
> 


* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28 15:41             ` Samuel Ortiz
@ 2023-06-28 15:55               ` Sathyanarayanan Kuppuswamy
  0 siblings, 0 replies; 41+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-06-28 15:55 UTC (permalink / raw
  To: Samuel Ortiz
  Cc: Dionna Amalie Glaze, Dan Williams, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet,
	H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Chong Cai, Qinkun Bao,
	Guorui Yu, Du Fan, linux-kernel, linux-kselftest, linux-doc,
	dhowells, brijesh.singh, atishp, gregkh, linux-coco, joey.gouly

Hi Samuel,

On 6/28/23 8:41 AM, Samuel Ortiz wrote:
> On Mon, Jun 26, 2023 at 05:39:12PM -0700, Sathyanarayanan Kuppuswamy wrote:
>> +Atish
>>
>> Atish, any comments on this topic from RISC-V?
> 
> The CoVE (RISC-V confidential computing specification) would benefit
> from the proposed abstract API. Similar to at least TDX and SEV, the
> proposed CoVE attestation evidence generation interface [1] (the spec
> is not ratified yet) basically takes some binary blobs in (a TVM
> public key and an attestation challenge blob) and requests the TSM to
> generate attestation evidence for the caller. The TSM will generate
> such evidence from all static and runtime measurements and sign it with
> its DICE derived key. Attestation lingo set aside, the pattern here is
> similar to SEV and TDX. Having a common API for a generic attestation
> evidence generation interface would avoid having to add yet another
> ioctl based interface specific to CoVE.

Great, this gives us more confidence about the generic ABI design.
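
As a rough illustration of the pattern described above (opaque blobs in,
evidence out, one backend per guest), a vendor-neutral hook could look
something like the sketch below. It is plain userspace C for readability,
not kernel code. struct attest_backend, attest_register() and the "demo"
backend are hypothetical names used only for this example; they are not
part of the strawman patch or of any existing kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical vendor-neutral backend: opaque request in, opaque evidence out. */
struct attest_backend {
	const char *name;
	/* Returns 0 and updates *out_len on success, negative errno on failure. */
	int (*get_evidence)(const void *req, size_t req_len,
			    void *out, size_t *out_len);
};

/* Only one backend at a time, mirroring the single-ops rule in the strawman. */
static const struct attest_backend *active_backend;

static int attest_register(const struct attest_backend *be)
{
	assert(be && be->get_evidence);
	if (active_backend)
		return -EEXIST;
	active_backend = be;
	return 0;
}

static void attest_unregister(const struct attest_backend *be)
{
	if (active_backend == be)
		active_backend = NULL;
}

/* Common entry point: callers never see which vendor produced the evidence. */
static int attest_get_evidence(const void *req, size_t req_len,
			       void *out, size_t *out_len)
{
	if (!active_backend)
		return -ENOKEY;
	return active_backend->get_evidence(req, req_len, out, out_len);
}

/* Stand-in backend for illustration: the "evidence" is an echo of the request. */
static int demo_get_evidence(const void *req, size_t req_len,
			     void *out, size_t *out_len)
{
	if (req_len > *out_len)
		return -ENOSPC;
	memcpy(out, req, req_len);
	*out_len = req_len;
	return 0;
}

static const struct attest_backend demo_backend = {
	.name = "demo",
	.get_evidence = demo_get_evidence,
};
```

A TDX, SEV or CoVE driver would each supply its own get_evidence()
implementation behind the same entry point; the evidence format itself
stays vendor specific.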

> 
> Another interface we could think about commonizing is the measurement
> extension one. I think both TDX and CoVE allow for a guest to
> dynamically extend its measurements (to dedicated, runtime PCRs).
> 

Yes. I think most vendors will need similar support. We are planning
to add a generic ABI for this as well.
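
For readers less familiar with the "extend" operation behind runtime
measurement registers (TPM PCR style), here is a minimal userspace sketch
of the semantics: the new register value is a digest of the old value
concatenated with the new data. The register width, reg_extend() and
toy_digest() are assumptions made for this example only. In particular
toy_digest() is 64-bit FNV-1a, which is NOT a cryptographic hash; real
registers use a real digest such as SHA-384.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REG_LEN 8	/* toy register width; real registers hold a full digest */
#define MAX_DATA 64	/* arbitrary input limit for this sketch */

/* Digest callback so the sketch does not depend on any crypto library. */
typedef void (*digest_fn)(const uint8_t *in, size_t len, uint8_t out[REG_LEN]);

/*
 * Extend: new_value = digest(old_value || data). The register is only ever
 * folded forward, never set directly, so its value depends on the whole
 * ordered sequence of extensions.
 */
static void reg_extend(uint8_t reg[REG_LEN], const uint8_t *data, size_t len,
		       digest_fn digest)
{
	uint8_t buf[REG_LEN + MAX_DATA];

	assert(len <= MAX_DATA);
	memcpy(buf, reg, REG_LEN);
	memcpy(buf + REG_LEN, data, len);
	digest(buf, REG_LEN + len, reg);
}

/* NOT cryptographic: 64-bit FNV-1a, used only to make the sketch runnable. */
static void toy_digest(const uint8_t *in, size_t len, uint8_t out[REG_LEN])
{
	uint64_t h = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= in[i];
		h *= 0x100000001b3ULL;
	}
	for (i = 0; i < REG_LEN; i++)
		out[i] = (uint8_t)(h >> (8 * i));
}
```

Because each extension folds in the previous value, replaying the same
events in the same order reproduces the register, while reordering them
gives a different result.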


> Cheers,
> Samuel.
> 
> [1] https://github.com/riscv-non-isa/riscv-ap-tee/blob/main/specification/sbi_cove.adoc#function-cove-guest-get-evidence-fid-8
> 
>> On 6/26/23 11:57 AM, Dionna Amalie Glaze wrote:
>>> On Sun, Jun 25, 2023 at 8:06 PM Sathyanarayanan Kuppuswamy
>>> <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
>>>>
>>>> Hi Dan,
>>>>
>>>> On 6/23/23 3:27 PM, Dan Williams wrote:
>>>>> Dan Williams wrote:
>>>>>> [ add David, Brijesh, and Atish]
>>>>>>
>>>>>> Kuppuswamy Sathyanarayanan wrote:
>>>>>>> In TDX guest, the second stage of the attestation process is Quote
>>>>>>> generation. This process is required to convert the locally generated
>>>>>>> TDREPORT into a remotely verifiable Quote. It involves sending the
>>>>>>> TDREPORT data to a Quoting Enclave (QE) which will verify the
>>>>>>> integrity of the TDREPORT and sign it with an attestation key.
>>>>>>>
>>>>>>> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
>>>>>>> allow the user agent to get the TD Quote.
>>>>>>>
>>>>>>> Add a kernel selftest module to verify the Quote generation feature.
>>>>>>>
>>>>>>> TD Quote generation involves following steps:
>>>>>>>
>>>>>>> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
>>>>>>> * Embed the TDREPORT data in quote buffer and request for quote
>>>>>>>   generation via TDX_CMD_GET_QUOTE IOCTL request.
>>>>>>> * Upon completion of the GetQuote request, check for non zero value
>>>>>>>   in the status field of Quote header to make sure the generated
>>>>>>>   quote is valid.
>>>>>>
>>>>>> What this cover letter does not say is that this is adding another
>>>>>> instance of a pattern similar to SNP_GET_REPORT.
>>>>>>
>>>>>> Linux is best served when multiple vendors trying to do similar
>>>>>> operations are brought together behind a common ABI. We see this in the
>>>>>> history of wrangling SCSI vendors behind common interfaces. Now multiple
>>>>>> confidential computing vendors are trying to develop similar flows with
>>>>>> differentiated formats where that differentiation need not leak over the
>>>>>> ABI boundary.
>>>>> [..]
>>>>>
>>>>> Below is a rough mock up of this approach to demonstrate the direction.
>>>>> Again, the goal is to define an ABI that can support any vendor's
>>>>> arch-specific attestation method and key provisioning flows without
>>>>> leaking vendor-specific details, or confidential material over the
>>>>> user/kernel ABI.
>>>>
>>>> Thanks for working on this mock code and helping out. It gives me the
>>>> general idea about your proposal.
>>>>
>>>>>
>>>>> The observation is that there are a sufficient number of attestation
>>>>> flows available to review where Linux can define a superset ABI to
>>>>> contain them all. The other observation is that the implementations have
>>>>> features that may cross-pollinate over time. For example the SEV
>>>>> privilege level consideration ("vmpl"), and the TDX RTMR (think TPM
>>>>> PCRs) mechanisms address generic Confidential Computing use cases.
>>>>
>>>>
>>>> I agree with your point about VMPL and RTMR feature cases. This observation
>>>> is valid for AMD SEV and TDX attestation flows. But I am not sure whether
>>>> it will hold true for other vendor implementations. Our sample set is not
>>>> good enough to make this conclusion. The reason for my concern is, if you
>>>> check the ABI interface used in the S390 arch attestation driver
>>>> (drivers/s390/char/uvdevice.c), you would notice that there is a significant
>>>> difference between the ABI used in that driver and SEV/TDX drivers. The S390
>>>> driver attestation request appears to accept two data blobs as input, as well
>>>> as a variety of vendor-specific header configurations.
>>>>
>>>> Maybe the s390 attestation model is a special case, but I think we should
>>>> consider this issue. Since we don't have a common spec, there is a chance that
>>>> any superset ABI we define now may not meet future vendor requirements. One way
>>>> to handle it is to leave enough space in the generic ABI to handle future vendor
>>>> requirements.
>>>>
>>>> I think it would be better if other vendors (like ARM or RISC-V) can comment and
>>>> confirm whether this proposal meets their demands.
>>>>
>>>
>>> The VMPL-based separation that will house the supervisor module known
>>> as SVSM can have protocols that implement a TPM command interface, or
>>> an RTMR-extension interface, and will also need to have an
>>> SVSM-specific protocol attestation report format to keep the secure
>>> chain of custody apparent. We'd have different formats and protocols
>>> in the kernel, at least, to speak to each technology. I'm not sure
>>> it's worth the trouble of papering over all the... 3-4 technologies
>>> with similar but still weirdly different formats and ways of doing
>>> things with an abstracted attestation ABI, especially since the output
>>> all has to be interpreted in an architecture-specific way anyway.
>>>
>>> ARM's Confidential Computing Realm Management Extensions (RME) seems
>>> to be going along the lines of a runtime measurement register model
>>> with their hardware enforced security. The number of registers isn't
>>> prescribed in the spec.
>>>
>>> +Joey Gouly +linux-coco@lists.linux.dev as far as RME is concerned, do
>>> you know who would be best to weigh in on this discussion of a unified
>>> attestation model?
>>
>>
>>>
>>>>>
>>>>> Vendor specific ioctls for all of this feels like surrender when Linux
>>>>> already has the keys subsystem which has plenty of degrees of freedom
>>>>> for tracking blobs with signatures and using those blobs to instantiate
>>>>> other blobs. It already serves as the ABI wrapping various TPM
>>>>> implementations and marshaling keys for storage encryption and other use
>>>>> cases that intersect Confidential Computing.
>>>>>
>>>>> The benefit of deprecating vendor-specific abstraction layers in
>>>>> userspace is secondary. The primary benefit is collaboration. It enables
>>>>> kernel developers from various architectures to collaborate on common
>>>>> infrastructure. If, referring back to my previous example, SEV adopts an
>>>>> RTMR-like mechanism and TDX adopts a vmpl-like mechanism it would be
>>>>> unfortunate if those efforts were siloed, duplicated, and needlessly
>>>>> differentiated to userspace. So while there are arguably a manageable
>>>>> number of basic arch attestation methods the planned expansion of those
>>>>> to build incremental functionality is where I believe we, as a
>>>>> community, will be glad that we invested in a "Linux format" for all of
>>>>> this.
>>>>>
>>>>> An example, to show what the strawman patch below enables: (req_key is
>>>>> the sample program from "man 2 request_key")
>>>>>
>>>>> # ./req_key guest_attest guest_attest:0:0-$desc $(cat user_data | base64)
>>>>> Key ID is 10e2f3a7
>>>>> # keyctl pipe 0x10e2f3a7 | hexdump -C
>>>>> 00000000  54 44 58 20 47 65 6e 65  72 61 74 65 64 20 51 75  |TDX Generated Qu|
>>>>> 00000010  6f 74 65 00 00 00 00 00  00 00 00 00 00 00 00 00  |ote.............|
>>>>> 00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>>>>> *
>>>>> 00004000
>>>>>
>>>>> This is the kernel instantiating a TDX Quote without the TDREPORT
>>>>> implementation detail ever leaving the kernel. Now, this is only the
>>>>
>>>> IIUC, the idea here is to cache the quote data and return it to the user whenever
>>>> possible, right? If yes, I think such optimization may not be very useful for our
>>>> case. AFAIK, the quote data will change whenever there is a change in the guest
>>>> measurement data. Since a generated quote does not stay valid for long,
>>>> and quote generation requests are expected to be infrequent, we may not
>>>> get much benefit from caching the quote data. I think we can keep this logic simple
>>>> by directly retrieving the quote data from the quoting enclave whenever there is a
>>>> request from the user.
>>>>
>>>>> top-half of what is needed. The missing bottom half takes that material
>>>>> and uses it to instantiate derived key material like the storage
>>>>> decryption key internal to the kernel. See "The Process" in
>>>>> Documentation/security/keys/request-key.rst for how the Keys subsystem
>>>>> handles the "keys for keys" use case.
>>>>
>>>> This is only useful for the key-server use case, right? Attestation can also be
>>>> used for use cases like pattern matching or uploading some secure data, etc.
>>>> Since key-server is not the only use case, does it make sense to support
>>>> this derived key feature?
>>>>
>>>>>
>>>>> ---
>>>>> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
>>>>> index f79ab13a5c28..0f775847028e 100644
>>>>> --- a/drivers/virt/Kconfig
>>>>> +++ b/drivers/virt/Kconfig
>>>>> @@ -54,4 +54,8 @@ source "drivers/virt/coco/sev-guest/Kconfig"
>>>>>
>>>>>  source "drivers/virt/coco/tdx-guest/Kconfig"
>>>>>
>>>>> +config GUEST_ATTEST
>>>>> +     tristate
>>>>> +     select KEYS
>>>>> +
>>>>>  endif
>>>>> diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
>>>>> index e9aa6fc96fab..66f6b838f8f4 100644
>>>>> --- a/drivers/virt/Makefile
>>>>> +++ b/drivers/virt/Makefile
>>>>> @@ -12,3 +12,4 @@ obj-$(CONFIG_ACRN_HSM)              += acrn/
>>>>>  obj-$(CONFIG_EFI_SECRET)     += coco/efi_secret/
>>>>>  obj-$(CONFIG_SEV_GUEST)              += coco/sev-guest/
>>>>>  obj-$(CONFIG_INTEL_TDX_GUEST)        += coco/tdx-guest/
>>>>> +obj-$(CONFIG_GUEST_ATTEST)   += coco/guest-attest/
>>>>> diff --git a/drivers/virt/coco/guest-attest/Makefile b/drivers/virt/coco/guest-attest/Makefile
>>>>> new file mode 100644
>>>>> index 000000000000..5581c5a27588
>>>>> --- /dev/null
>>>>> +++ b/drivers/virt/coco/guest-attest/Makefile
>>>>> @@ -0,0 +1,2 @@
>>>>> +obj-$(CONFIG_GUEST_ATTEST) += guest_attest.o
>>>>> +guest_attest-y := key.o
>>>>> diff --git a/drivers/virt/coco/guest-attest/key.c b/drivers/virt/coco/guest-attest/key.c
>>>>> new file mode 100644
>>>>> index 000000000000..2a494b6dd7a7
>>>>> --- /dev/null
>>>>> +++ b/drivers/virt/coco/guest-attest/key.c
>>>>> @@ -0,0 +1,159 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>>> +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
>>>>> +
>>>>> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>>>> +#include <linux/seq_file.h>
>>>>> +#include <linux/key-type.h>
>>>>> +#include <linux/module.h>
>>>>> +#include <linux/base64.h>
>>>>> +
>>>>> +#include <keys/request_key_auth-type.h>
>>>>> +#include <keys/user-type.h>
>>>>> +
>>>>> +#include "guest-attest.h"
>>>>
>>>> Can you share your guest-attest.h?
>>>>
>>>>> +
>>>>> +static LIST_HEAD(guest_attest_list);
>>>>> +static DECLARE_RWSEM(guest_attest_rwsem);
>>>>> +
>>>>> +static struct guest_attest_ops *fetch_ops(void)
>>>>> +{
>>>>> +     return list_first_entry_or_null(&guest_attest_list,
>>>>> +                                     struct guest_attest_ops, list);
>>>>> +}
>>>>> +
>>>>> +static struct guest_attest_ops *get_ops(void)
>>>>> +{
>>>>> +     down_read(&guest_attest_rwsem);
>>>>> +     return fetch_ops();
>>>>> +}
>>>>> +
>>>>> +static void put_ops(void)
>>>>> +{
>>>>> +     up_read(&guest_attest_rwsem);
>>>>> +}
>>>>> +
>>>>> +int register_guest_attest_ops(struct guest_attest_ops *ops)
>>>>> +{
>>>>> +     struct guest_attest_ops *conflict;
>>>>> +     int rc;
>>>>> +
>>>>> +     down_write(&guest_attest_rwsem);
>>>>> +     conflict = fetch_ops();
>>>>> +     if (conflict) {
>>>>> +             pr_err("\"%s\" ops already registered\n", conflict->name);
>>>>> +             rc = -EEXIST;
>>>>> +             goto out;
>>>>> +     }
>>>>> +     list_add(&ops->list, &guest_attest_list);
>>>>> +     try_module_get(ops->module);
>>>>> +     rc = 0;
>>>>> +out:
>>>>> +     up_write(&guest_attest_rwsem);
>>>>> +     return rc;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(register_guest_attest_ops);
>>>>> +
>>>>> +void unregister_guest_attest_ops(struct guest_attest_ops *ops)
>>>>> +{
>>>>> +     down_write(&guest_attest_rwsem);
>>>>> +     list_del(&ops->list);
>>>>> +     up_write(&guest_attest_rwsem);
>>>>> +     module_put(ops->module);
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(unregister_guest_attest_ops);
>>>>> +
>>>>> +static int __guest_attest_request_key(struct key *key, int level,
>>>>> +                                   struct key *dest_keyring,
>>>>> +                                   const char *callout_info, int callout_len,
>>>>> +                                   struct key *authkey)
>>>>> +{
>>>>> +     struct guest_attest_ops *ops;
>>>>> +     void *payload = NULL;
>>>>> +     int rc, payload_len;
>>>>> +
>>>>> +     ops = get_ops();
>>>>> +     if (!ops)
>>>>> +             return -ENOKEY;
>>>>> +
>>>>> +     payload = kzalloc(max(GUEST_ATTEST_DATALEN, callout_len), GFP_KERNEL);
>>>>> +     if (!payload) {
>>>>> +             rc = -ENOMEM;
>>>>> +             goto out;
>>>>> +     }
>>>>
>>>> Is the idea to get values like vmpl as part of the payload?
>>>>
>>>>> +
>>>>> +     payload_len = base64_decode(callout_info, callout_len, payload);
>>>>> +     if (payload_len < 0 || payload_len > GUEST_ATTEST_DATALEN) {
>>>>> +             rc = -EINVAL;
>>>>> +             goto out;
>>>>> +     }
>>>>> +
>>>>> +     rc = ops->request_attest(key, level, dest_keyring, payload, payload_len,
>>>>> +                              authkey);
>>>>> +out:
>>>>> +     kfree(payload);
>>>>> +     put_ops();
>>>>> +     return rc;
>>>>> +}
>>>>> +
>>>>> +static int guest_attest_request_key(struct key *authkey, void *data)
>>>>> +{
>>>>> +     struct request_key_auth *rka = get_request_key_auth(authkey);
>>>>> +     struct key *key = rka->target_key;
>>>>> +     unsigned long long id;
>>>>> +     int rc, level;
>>>>> +
>>>>> +     pr_debug("desc: %s op: %s callout: %s\n", key->description, rka->op,
>>>>> +              rka->callout_info ? (char *)rka->callout_info : "\"none\"");
>>>>> +
>>>>> +     if (sscanf(key->description, "guest_attest:%d:%llu", &level, &id) != 2)
>>>>> +             return -EINVAL;
>>>>> +
>>>>
>>>> Can you explain some details about the id and level? It is not very clear why
>>>> we need it.
>>>>
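
On the description format itself, a quick userspace check of what the
sscanf() pattern used in the strawman accepts may help. parse_desc() below
is a hypothetical helper that only mirrors that pattern; it says nothing
about what "level" and "id" are meant to represent.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace mirror of the strawman's "guest_attest:<level>:<id>" check. */
static int parse_desc(const char *desc, int *level, unsigned long long *id)
{
	assert(desc && level && id);
	/* sscanf() stops after the second conversion; trailing text is ignored. */
	if (sscanf(desc, "guest_attest:%d:%llu", level, id) != 2)
		return -1;
	return 0;
}
```

Note that a description such as "guest_attest:0:0-mydesc" passes, because
anything after the second number is not examined by this pattern.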
>>>>> +     if (!rka->callout_info) {
>>>>> +             rc = -EINVAL;
>>>>> +             goto out;
>>>>> +     }
>>>>> +
>>>>> +     rc = __guest_attest_request_key(key, level, rka->dest_keyring,
>>>>> +                                     rka->callout_info, rka->callout_len,
>>>>> +                                     authkey);
>>>>> +out:
>>>>> +     complete_request_key(authkey, rc);
>>>>> +     return rc;
>>>>> +}
>>>>> +
>>>>> +static int guest_attest_vet_description(const char *desc)
>>>>> +{
>>>>> +     unsigned long long id;
>>>>> +     int level;
>>>>> +
>>>>> +     if (sscanf(desc, "guest_attest:%d:%llu", &level, &id) != 2)
>>>>> +             return -EINVAL;
>>>>> +     return 0;
>>>>> +}
>>>>> +
>>>>> +static struct key_type key_type_guest_attest = {
>>>>> +     .name = "guest_attest",
>>>>> +     .preparse = user_preparse,
>>>>> +     .free_preparse = user_free_preparse,
>>>>> +     .instantiate = generic_key_instantiate,
>>>>> +     .revoke = user_revoke,
>>>>> +     .destroy = user_destroy,
>>>>> +     .describe = user_describe,
>>>>> +     .read = user_read,
>>>>> +     .vet_description = guest_attest_vet_description,
>>>>> +     .request_key = guest_attest_request_key,
>>>>> +};
>>>>> +
>>>>> +static int __init guest_attest_init(void)
>>>>> +{
>>>>> +     return register_key_type(&key_type_guest_attest);
>>>>> +}
>>>>> +
>>>>> +static void __exit guest_attest_exit(void)
>>>>> +{
>>>>> +     unregister_key_type(&key_type_guest_attest);
>>>>> +}
>>>>> +
>>>>> +module_init(guest_attest_init);
>>>>> +module_exit(guest_attest_exit);
>>>>> +MODULE_LICENSE("GPL v2");
>>>>> diff --git a/drivers/virt/coco/tdx-guest/Kconfig b/drivers/virt/coco/tdx-guest/Kconfig
>>>>> index 14246fc2fb02..9a1ec85369fe 100644
>>>>> --- a/drivers/virt/coco/tdx-guest/Kconfig
>>>>> +++ b/drivers/virt/coco/tdx-guest/Kconfig
>>>>> @@ -1,6 +1,7 @@
>>>>>  config TDX_GUEST_DRIVER
>>>>>       tristate "TDX Guest driver"
>>>>>       depends on INTEL_TDX_GUEST
>>>>> +     select GUEST_ATTEST
>>>>>       help
>>>>>         The driver provides userspace interface to communicate with
>>>>>         the TDX module to request the TDX guest details like attestation
>>>>> diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
>>>>> index 388491fa63a1..65b5aab284d9 100644
>>>>> --- a/drivers/virt/coco/tdx-guest/tdx-guest.c
>>>>> +++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
>>>>> @@ -13,11 +13,13 @@
>>>>>  #include <linux/string.h>
>>>>>  #include <linux/uaccess.h>
>>>>>  #include <linux/set_memory.h>
>>>>> +#include <linux/key-type.h>
>>>>>
>>>>>  #include <uapi/linux/tdx-guest.h>
>>>>>
>>>>>  #include <asm/cpu_device_id.h>
>>>>>  #include <asm/tdx.h>
>>>>> +#include "../guest-attest/guest-attest.h"
>>>>>
>>>>>  /*
>>>>>   * Intel's SGX QE implementation generally uses Quote size less
>>>>> @@ -229,6 +231,62 @@ static const struct x86_cpu_id tdx_guest_ids[] = {
>>>>>  };
>>>>>  MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
>>>>>
>>>>> +static int tdx_request_attest(struct key *key, int level,
>>>>> +                           struct key *dest_keyring, void *payload,
>>>>> +                           int payload_len, struct key *authkey)
>>>>> +{
>>>>> +     u8 *tdreport;
>>>>> +     long ret;
>>>>> +
>>>>> +     tdreport = kzalloc(TDX_REPORT_LEN, GFP_KERNEL);
>>>>> +     if (!tdreport)
>>>>> +             return -ENOMEM;
>>>>> +
>>>>> +     /* Generate TDREPORT0 using "TDG.MR.REPORT" TDCALL */
>>>>> +     ret = tdx_mcall_get_report0(payload, tdreport);
>>>>> +     if (ret)
>>>>> +             goto out;
>>>>> +
>>>>> +     mutex_lock(&quote_lock);
>>>>> +
>>>>> +     memset(qentry->buf, 0, qentry->buf_len);
>>>>> +     reinit_completion(&qentry->compl);
>>>>> +     qentry->valid = true;
>>>>> +
>>>>> +     /* Submit GetQuote Request using GetQuote hypercall */
>>>>> +     ret = tdx_hcall_get_quote(qentry->buf, qentry->buf_len);
>>>>> +     if (ret) {
>>>>> +             pr_err("GetQuote hypercall failed, status:%lx\n", ret);
>>>>> +             ret = -EIO;
>>>>> +             goto quote_failed;
>>>>> +     }
>>>>> +
>>>>> +     /*
>>>>> +      * Although the GHCI specification does not state explicitly that
>>>>> +      * the VMM must not wait indefinitely for the Quote request to be
>>>>> +      * completed, a sane VMM should always notify the guest after a
>>>>> +      * certain time, regardless of whether the Quote generation is
>>>>> +      * successful or not.  For now just assume the VMM will do so.
>>>>> +      */
>>>>> +     wait_for_completion(&qentry->compl);
>>>>> +
>>>>> +     ret = key_instantiate_and_link(key, qentry->buf, qentry->buf_len,
>>>>> +                                    dest_keyring, authkey);
>>>>> +
>>>>> +quote_failed:
>>>>> +     qentry->valid = false;
>>>>> +     mutex_unlock(&quote_lock);
>>>>> +out:
>>>>> +     kfree(tdreport);
>>>>> +     return ret;
>>>>> +}
>>>>> +
>>>>> +static struct guest_attest_ops tdx_attest_ops = {
>>>>> +     .name = KBUILD_MODNAME,
>>>>> +     .module = THIS_MODULE,
>>>>> +     .request_attest = tdx_request_attest,
>>>>> +};
>>>>> +
>>>>>  static int __init tdx_guest_init(void)
>>>>>  {
>>>>>       int ret;
>>>>> @@ -251,8 +309,14 @@ static int __init tdx_guest_init(void)
>>>>>       if (ret)
>>>>>               goto free_quote;
>>>>>
>>>>> +     ret = register_guest_attest_ops(&tdx_attest_ops);
>>>>> +     if (ret)
>>>>> +             goto free_irq;
>>>>> +
>>>>>       return 0;
>>>>>
>>>>> +free_irq:
>>>>> +     tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
>>>>>  free_quote:
>>>>>       free_quote_entry(qentry);
>>>>>  free_misc:
>>>>> @@ -264,6 +328,7 @@ module_init(tdx_guest_init);
>>>>>
>>>>>  static void __exit tdx_guest_exit(void)
>>>>>  {
>>>>> +     unregister_guest_attest_ops(&tdx_attest_ops);
>>>>>       tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
>>>>>       free_quote_entry(qentry);
>>>>>       misc_deregister(&tdx_misc_dev);
>>>>
>>>> --
>>>> Sathyanarayanan Kuppuswamy
>>>> Linux Kernel Developer
>>>
>>>
>>>
>>> --
>>> -Dionna Glaze, PhD (she/her)
>>
>> -- 
>> Sathyanarayanan Kuppuswamy
>> Linux Kernel Developer
>>

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-06-28  2:52               ` Dan Williams
@ 2023-06-29 16:25                 ` Dionna Amalie Glaze
  0 siblings, 0 replies; 41+ messages in thread
From: Dionna Amalie Glaze @ 2023-06-29 16:25 UTC (permalink / raw
  To: Dan Williams
  Cc: Sathyanarayanan Kuppuswamy, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet,
	H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Chong Cai, Qinkun Bao,
	Guorui Yu, Du Fan, linux-kernel, linux-kselftest, linux-doc,
	dhowells, brijesh.singh, atishp, gregkh, linux-coco, joey.gouly

>
> First, thank you for engaging, it speeds up the iteration. This
> confirmed my worry that the secondary goal of this proposal, a common
> verification implementation, is indeed unachievable in the near term. A
> few clarifying questions below, but I will let this go.
>
> The primary goal, achievable on a short runway, is more for kernel
> developers. It is to have a common infrastructure for marshaling vendor
> payloads, provide a mechanism to facilitate kernel initiated requests to
> a key-server, and to deploy a common frontend for concepts like runtime
> measurement (likely as another backend to what Keys already understands
> for various TPM PCR implementations).
>

That sounds good, though the devil is in the details. The TPM
situation will be exacerbated by a lower root of trust. The TPM itself
doesn't have a specification for its own attested firmware.

> > All the specific fields of the blob have to be decoded and subjected
> > to an acceptance policy. That policy will almost always be different
> > across different platforms and VM owners. I wrote all of
> > github.com/google/go-sev-guest, including the verification and
> > validation logic, and it's going to get more complicated, and the
> > sources of the data that provide validators with notions of what
> > values can be trusted will be varied.
>
> Can you provide an example? I ask only to include it in the kernel
> commit log for a crisp explanation why this proposed Keys format will
> continue to convey a raw vendor blob with no kernel abstraction as part
> of its payload for the foreseeable future.
>

An example is that while there is a common notion that each report
will have some attestation key whose certificate needs to be verified,
there is additional collateral that must be downloaded to

* verify a TDX key certificate against updates to known weaknesses of
the key's details
* verify the measurement in the report against a vendor's signed
golden measurement
* [usually offline and signed by the analyzing principal that the
analysis was done] fully verify the measurement given a build
provenance document like SLSA. The complexity of this analysis could
even engage in static analysis of every commit since a certain date,
or from a developer of low repute... whatever the verifier wants to
do.

These are all in the realm of interpreting the blob for acceptance, so
it's best to keep it uninterpreted.

> > The formats are not standardized. The Confidential Computing
> > Consortium should be working toward that, but it's a slow process.
> > There's IETF RATS. There's in-toto.io attestations. There's Azure's
> > JWT thing. There's a signed serialized protocol buffer that I've
> > decided is what Google is going to produce while we figure out all the
> > "right" formats to use. There will be factions and absolute gridlock
> > for multiple years if we require solidifying an abstraction for the
> > kernel to manage all this logic before passing a report on to user
> > space.
>
> Understood. When that standardization process completes my expectation
> is that it slots into the common conveyance method and no need to go
> rewrite code that already knows how to interface with Keys to get
> attestation evidence.
>

I can get on board with that. I don't think there will be much cause
for more than a handful of attestation requests with different report
data, so it shouldn't overwhelm the key subsystem.

> >
> > You really shouldn't be putting attestation validation logic in the
> > kernel.
>
> It was less putting validation logic in the kernel, and more hoping for
> a way to abstract some common parsing in advance of a true standard
> attestation format, but point taken.
>

I think we'll have hardware-provided blobs and host-provided cached
collateral. The caching could be achieved with a hosted proxy server,
but given SEV-SNP already has GET_EXT_GUEST_REQUEST to simplify
delivery, I think it's fair to offer other technologies the chance at
supporting a similar simple solution.

Everything else will have to come from the network or the workload itself.

> > It belongs outside of the VM entirely with the party that will
> > only release access keys to the VM if it can prove it's running the
> > software it claims, on the platform it claims. I think Windows puts a
> > remote procedure call in their guest attestation driver to the Azure
> > attestation service, and that is an anti-pattern in my mind.
>
> I can not speak to the Windows implementation, but the Linux Keys
> subsystem is there to handle Key construction that may be requested by
> userspace or the kernel and may be serviced by built-in keys,
> device/platform instantiated keys, or keys retrieved via an upcall to
> userspace.
>
> The observation is that existing calls to request_key() in the kernel
> likely have reason to be serviced by a confidential computing key server
> somewhere in the chain. So, might as well enlighten the Keys subsystem
> to retrieve this information and skip round trips to userspace to run
> vendor-specific ioctls. Make the kernel as self-sufficient as possible,
> and make SEV, TDX, etc. developers talk more to each other about their
> needs.

That sounds reasonable. I think one wrinkle in the current design is
that SGX and SEV-SNP provide derived keys as a thing separate from
attestation but still based on firmware measurement, and TDX doesn't
yet. It may in the future come with a TDX module update that gets
derived keys through an SGX enclave–who knows. The MSG_KEY_REQ guest
request for a SEV-SNP derived key has some bits and bobs to select
different VM material to mix into the key derivation, so that would
need to be in the API as well. It makes request_key a little weird to
use for both. I don't even think there's a sufficient abstraction for
the guest-attest device to provide, since there isn't a common
REPORT_DATA + attestation level pair of inputs that drive it. If we're
fine with a technology-tagged uninterpreted input blob for key
derivation, and the device returns an error if the real hardware
doesn't match the technology tag, then that could be an okay enough
interface.
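
A minimal sketch of what such a technology-tagged request could look
like, purely as an illustration. Every name here (cc_tech, cc_key_req,
cc_derive_key, platform_tech) is hypothetical and not an existing kernel
or firmware interface, and the "derivation" is a placeholder rather than
real cryptography. The only point shown is that the generic layer never
interprets the blob and rejects a tag that does not match the hardware.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical technology tags; not an existing kernel or firmware enum. */
enum cc_tech { CC_TECH_SEV_SNP = 1, CC_TECH_TDX = 2 };

/* Hypothetical request: the tag says which backend may interpret the blob. */
struct cc_key_req {
	enum cc_tech tech;
	const uint8_t *blob;	/* opaque to the generic layer */
	size_t blob_len;
};

/* Stand-in for "which technology is this guest really running on". */
static enum cc_tech platform_tech = CC_TECH_SEV_SNP;

/*
 * Returns 0 and fills key[] on success, -EINVAL on bad arguments or when
 * the request's tag does not match the platform.
 */
int cc_derive_key(const struct cc_key_req *req, uint8_t key[32])
{
	if (!req || !key || (!req->blob && req->blob_len))
		return -EINVAL;

	/* The tag must match the real hardware, otherwise refuse. */
	if (req->tech != platform_tech)
		return -EINVAL;

	/* Placeholder only: XOR-fold the blob. This is NOT a key derivation. */
	memset(key, 0, 32);
	for (size_t i = 0; i < req->blob_len; i++)
		key[i % 32] ^= req->blob[i];

	return 0;
}
```

In a real design the matching backend would hand the blob to firmware
(for SEV-SNP, the MSG_KEY_REQ guest request) instead of touching it.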

I could be convinced to leave MSG_KEY_REQ out of Linux entirely, but
only for selfish reasons. The alternative is to set up a sealing key
escrow service that releases sealing keys when a VM's attestation
matches a pre-registered policy, which is extremely heavy-handed when
you can enforce workload identity at VM launch time and have a safe
derived key with this technology. I think there's a Decentriq blog
post about setting that whole supply chain up ("swiss cheese to
cheddar"), so they'd likely have some words about that.

-- 
-Dionna Glaze, PhD (she/her)

* Re: [PATCH v3 0/3] TDX Guest Quote generation support
  2023-06-27  7:50       ` Chong Cai
@ 2023-08-23  7:33         ` Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: Thomas Gleixner @ 2023-08-23  7:33 UTC (permalink / raw
  To: Chong Cai, Dan Williams
  Cc: Sathyanarayanan Kuppuswamy, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, Shuah Khan, Jonathan Corbet, H . Peter Anvin,
	Kirill A . Shutemov, Tony Luck, Wander Lairson Costa, Erdem Aktas,
	Dionna Amalie Glaze, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc

On Tue, Jun 27 2023 at 00:50, Chong Cai wrote:
> On Sun, Jun 25, 2023 at 9:32 PM Dan Williams <dan.j.williams@intel.com> wrote:
>> What I would ask of those who absolutely cannot support the TDVMCALL
>> method is to contribute a solution that intercepts the "upcall" to the
>> platform "guest_attest_ops" and turn it into a typical keys upcall to
>> userspace that can use the report data with a vsock tunnel.
>>
>> That way the end result is still the same, a key established with the
>> TDX Quote evidence contained within a Linux-defined envelope.
>
> I agree a unified ABI across vendors would be ideal in the long term.
> However, it sounds like a non-trivial task and could take quite some
> time to achieve.
> Given there's already an AMD equivalent approach upstreamed, can we
> also allow this TDVMCALL patch as an intermediate step to unblock
> various TDX attestation use cases while targeting a unified ABI? The
> TDVMCALL here is quite isolated and serves a very specific purpose, it
> should be very low risk to other kernel features and easy to be
> reverted in the future.

No way. This is exactly how the kernel ends up with an unmaintainable
mess, simply because this creates a user space ABI which can never be
reverted.

It's bad enough that nobody paid attention when the AMD muck was merged,
but that does not make an argument or any form of justification to add
more of this.

Dan's proposal makes a lot of sense and allows this to be implemented in
a mostly vendor-agnostic way. While the AMD interface is not going away
because of that, I'm 100% confident (pun intended) that such a unified
interface is going to be utilized and supported by AMD (or any other
vendor) sooner rather than later, simply because the user space people
who have to implement vendor-agnostic orchestration tools will go for it
as it makes their life easier too.

The time wasted arguing about this TDX ioctl mess could have been spent
actually migrating TDX over to this scheme. But sure, it's way simpler
to flog a dead horse instead of actually sitting down and getting useful
work done.

Thanks,

        tglx


* Re: [PATCH v3 3/3] selftests/tdx: Test GetQuote TDX attestation feature
       [not found]     ` <CAAYXXYyK4g9k7a78CU9w6Sn9KTBdoNLOu9gcgrSHJfp+3-tO=w@mail.gmail.com>
  2023-06-23 22:49       ` Dan Williams
@ 2023-08-23  8:25       ` Thomas Gleixner
  1 sibling, 0 replies; 41+ messages in thread
From: Thomas Gleixner @ 2023-08-23  8:25 UTC (permalink / raw
  To: Erdem Aktas, Dan Williams
  Cc: Kuppuswamy Sathyanarayanan, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, Shuah Khan, Jonathan Corbet, H . Peter Anvin,
	Kirill A . Shutemov, Tony Luck, Wander Lairson Costa,
	Dionna Amalie Glaze, Chong Cai, Qinkun Bao, Guorui Yu, Du Fan,
	linux-kernel, linux-kselftest, linux-doc, dhowells, brijesh.singh,
	atishp

On Thu, Jun 22 2023 at 14:01, Erdem Aktas wrote:
> On Mon, Jun 12, 2023 at 12:03 PM Dan Williams <dan.j.williams@intel.com>
> wrote:
>> Now multiple
>> confidential computing vendors trying to develop similar flows with
>> differentiated formats where that differentiation need not leak over the
>> ABI boundary.
>>
>
> <Just my personal opinion below>
> I agree with this statement at a high level, but it is also somewhat
> surprising to me after all the discussion around this topic.
> Honestly, I feel like there are multiple versions of "Intel"  working in
> different directions.
>
> If we want multiple vendors trying to do the similar things behind a common
> ABI, it should start with the spec. Since this comment is coming from
> Intel, I wonder if there is any plan to combine the GHCB and GHCI
> interfaces under common ABI in the future or why it did not even happen in
> the first place.

You are conflating things here.

The GETQUOTE TDVMCALL interface is part of the Guest-Hypervisor
Communication Interface (GHCI), which is a firmware interface.

Firmware (likewise hardware) interfaces have the unfortunate property
that they are mostly cast in stone.

But that has absolutely nothing to do with how the kernel implements
support for them. If we followed your reasoning, we'd have a gazillion
vendor-specific SCSI stacks in the kernel.
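
To make the distinction concrete, here is a rough userspace sketch of
one common entry point with vendor backends plugged in underneath. The
name guest_attest_ops comes from the discussion quoted earlier in this
thread; the structure layout, the register/get_evidence functions and
the stub backends are assumptions made up for illustration and are not
a proposed kernel API. The stubs do not talk to any firmware.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical common interface; the layout is illustrative only. */
struct guest_attest_ops {
	const char *name;
	/* Write evidence bound to report_data; return bytes written or -errno. */
	int (*get_evidence)(const uint8_t *report_data, size_t rd_len,
			    uint8_t *out, size_t out_len);
};

/* Stub: emit a one-byte vendor marker followed by the report data. */
static int stub_evidence(uint8_t marker, const uint8_t *rd, size_t rd_len,
			 uint8_t *out, size_t out_len)
{
	if (!out || (!rd && rd_len))
		return -EINVAL;
	if (out_len < rd_len + 1)
		return -ENOSPC;
	out[0] = marker;
	if (rd_len)
		memcpy(out + 1, rd, rd_len);
	return (int)(rd_len + 1);
}

/* A real TDX backend would issue the GETQUOTE TDVMCALL here. */
static int tdx_get_evidence(const uint8_t *rd, size_t rd_len,
			    uint8_t *out, size_t out_len)
{
	return stub_evidence('T', rd, rd_len, out, out_len);
}

/* A real SEV-SNP backend would issue its own guest request here. */
static int sev_get_evidence(const uint8_t *rd, size_t rd_len,
			    uint8_t *out, size_t out_len)
{
	return stub_evidence('S', rd, rd_len, out, out_len);
}

static const struct guest_attest_ops tdx_ops = { "tdx", tdx_get_evidence };
static const struct guest_attest_ops sev_ops = { "sev-snp", sev_get_evidence };

/* Exactly one backend is active per guest. */
static const struct guest_attest_ops *registered_ops;

int guest_attest_register(const struct guest_attest_ops *ops)
{
	if (!ops || !ops->get_evidence)
		return -EINVAL;
	if (registered_ops)
		return -EBUSY;
	registered_ops = ops;
	return 0;
}

/* The single vendor-neutral entry point that callers would use. */
int guest_attest_get_evidence(const uint8_t *rd, size_t rd_len,
			      uint8_t *out, size_t out_len)
{
	if (!registered_ops)
		return -ENODEV;
	return registered_ops->get_evidence(rd, rd_len, out, out_len);
}
```

The firmware call stays fixed inside each backend, while everything
above guest_attest_get_evidence() sees one interface.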

> What I see is that Intel has GETQUOTE TDVMCALL interface in its spec and
> again Intel does not really want to provide support for it in linux. It
> feels really frustrating.

Intel definitely wants to provide support for this interface and this
very thread is about that support. But Intel is not in a position to
define what the kernel community has to accept or not, neither is
Google.

Sure, it would have been more efficient to come up with a better
interface earlier, but that's neither an Intel nor a TDX specific
problem.

It's just how kernel development works. Some ideas look good at first
sight, some stuff slips through, and at some point the maintainers
realize that this is not the way to go and request a proper generalized
and maintainable implementation.

If you can provide compelling technical reasons why the IOCTL is the
better and more maintainable approach for the kernel, then we are all
ears and happy to debate that on the technical level.

Feel free to be frustrated, but I can assure you that the only way to
resolve this dilemma is to sit down and actually get work done in a way
which is acceptable to the kernel community at the technical level.

Everything else is frustrating for everyone involved, not only you.

Thanks,

        tglx


* Re: [PATCH v3 1/3] x86/tdx: Add TDX Guest event notify interrupt support
  2023-05-14  7:23 ` [PATCH v3 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
  2023-06-12 12:49   ` Huang, Kai
@ 2023-08-23 20:47   ` Thomas Gleixner
  1 sibling, 0 replies; 41+ messages in thread
From: Thomas Gleixner @ 2023-08-23 20:47 UTC (permalink / raw
  To: Kuppuswamy Sathyanarayanan, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Dionna Amalie Glaze,
	Chong Cai, Qinkun Bao, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc

On Sun, May 14 2023 at 00:23, Kuppuswamy Sathyanarayanan wrote:
> +static int __init tdx_event_irq_init(void)
> +{
> +	struct irq_affinity_desc desc;
> +	struct irq_alloc_info info;
> +	struct irq_cfg *cfg;
> +	int irq;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
> +		return 0;
> +
> +	init_irq_alloc_info(&info, NULL);
> +
> +	cpumask_set_cpu(smp_processor_id(), &desc.mask);

desc is completely uninitialized here and therefore desc.mask and
desc.is_managed contain random stack content.

The only reason why this "works" is that tdx_event_irq_init() is an
early initcall which is invoked before SMP init, so the only choice of
the allocator is to pick an interrupt on CPU0.
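
A minimal userspace sketch of the usual remedy, which is to
zero-initialize the on-stack descriptor before setting the one bit of
interest. The struct below is a simplified stand-in for the kernel's
struct irq_affinity_desc (a plain 64-bit word replaces struct cpumask),
and single_cpu_affinity() is a hypothetical helper, not code from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 64

/* Simplified stand-in for the kernel's struct irq_affinity_desc. */
struct affinity_desc_sketch {
	uint64_t mask;		/* one bit per CPU */
	bool is_managed;
};

/* Build a descriptor that targets exactly one CPU. */
struct affinity_desc_sketch single_cpu_affinity(unsigned int cpu)
{
	/*
	 * Zero-initialize first. Without "= {0}" both mask and is_managed
	 * would start out as whatever happened to be on the stack.
	 */
	struct affinity_desc_sketch desc = {0};

	assert(cpu < NR_CPUS_SKETCH);
	desc.mask |= UINT64_C(1) << cpu;
	return desc;
}
```

In the kernel function itself the equivalent would be to initialize
desc at its declaration or clear it before the cpumask_set_cpu() call.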

Thanks,

        tglx


