All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] xen: Hardware domain support
@ 2014-03-04 22:51 Daniel De Graaf
  2014-03-04 22:51 ` [PATCH 1/6] xen: use domid check in is_hardware_domain Daniel De Graaf
                   ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-04 22:51 UTC (permalink / raw
  To: xen-devel

This adds support to the hypervisor for the creation of a hardware
domain distinct from domain 0, allowing further disaggregation of the
duties of domain 0.  The commit message for patch 1 contains a more
complete description of the distinction between the hardware domain and
control domain(s).  Making the hardware domain distinct from domain 0
allows it to be further de-privileged using an XSM policy: the hardware
domain does not need to be permitted access to create or modify other
domains in order to act as a device backend for them.

A domain builder suitable for use as domain 0 in this disaggregated
setup will be posted in a separate mail.  This domain builder has two
modes of operation determined at compile time: the initial domain
builder will build a pre-selected set of domains taken from its ramdisk,
and relies on one of the booted domains to continue the boot process and
handle other actions such as Xenstore introductions.  The domain builder
service relies on an inter-domain communications mechanism to retrieve
kernels from an image service which currently runs as a process in the
hardware domain.  Because this requires additional patches to the
hypervisor, I am planning to post only the initial domain builder at
this time.  In the future, when V4V support is present in the hypervisor
and the control and domain builder servers have been modified to use
V4V, the complete version will be posted.

An earlier version of the first patch was Acked by Jan Beulich, but the
rebase for 4.5 added additional changes to the patch.

[PATCH 1/6] xen: use domid check in is_hardware_domain
[PATCH 2/6] xen/iommu: Move dom0 setup code out of __init
[PATCH 3/6] xen: prevent 0 from being used as a dynamic domid
[PATCH 4/6] xen: Allow hardare domain != dom0
[PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
[PATCH 6/6] xenstored: add --master-domid to support domain builder

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/6] xen: use domid check in is_hardware_domain
  2014-03-04 22:51 [PATCH 0/6] xen: Hardware domain support Daniel De Graaf
@ 2014-03-04 22:51 ` Daniel De Graaf
  2014-03-05  3:44   ` Julien Grall
  2014-03-05  9:23   ` Jan Beulich
  2014-03-04 22:51 ` [PATCH 2/6] xen/iommu: Move dom0 setup code out of __init Daniel De Graaf
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-04 22:51 UTC (permalink / raw
  To: xen-devel
  Cc: Keir Fraser, Ian Campbell, Tim Deegan, Stefano Stabellini,
	Jan Beulich, Daniel De Graaf, Suravee Suthikulpanit,
	Xiantao Zhang

Instead of checking is_privileged to determine if a domain should
control the hardware, check that the domain_id is equal to zero (which
is currently the only domain for which is_privileged is true).  This
allows other places where domain_id is checked for zero to be replaced
with is_hardware_domain.

The distinction between is_hardware_domain, is_control_domain, and
domain 0 is based on the following disaggregation model:

Domain 0 bootstraps the system.  It may remain to perform requested
builds of domains that need a minimal trust chain (i.e. vTPM domains).
Other than being built by the hypervisor, nothing is special about this
domain - although it may be useful to have is_control_domain() return
true depending on the toolstack it uses to build other domains.

The hardware domain manages devices for PCI pass-through to driver
domains or can act as a driver domain itself, depending on the desired
degree of disaggregation.  It is also the domain managing devices that
do not support pass-through: PCI configuration space access, parsing the
hardware ACPI tables and system power or machine check events.  This is
the only domain where is_hardware_domain() is true.  The return of
is_control_domain() is false for this domain.

The control domain manages other domains, controls guest launch and
shutdown, and manages resource constraints; is_control_domain() returns
true.  The functionality guarded by is_control_domain may in the future
be adapted to use explicit hypercalls, eliminating the special treatment
of this domain.  It may be reasonable to have multiple control domains
on a multi-tenant system.

Guest domains and other service or driver domains are all treated
identically by the hypervisor; the security policy may further constrain
administrative actions on or communication between these domains.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Stefano Stabellini <stefano.stabellini@citrix.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>
---
 xen/arch/arm/domain.c                       |  2 +-
 xen/arch/arm/gic.c                          |  2 +-
 xen/arch/arm/vgic.c                         |  2 +-
 xen/arch/arm/vuart.c                        |  2 +-
 xen/arch/x86/domain.c                       |  2 +-
 xen/arch/x86/hvm/i8254.c                    |  2 +-
 xen/arch/x86/time.c                         |  4 ++--
 xen/arch/x86/traps.c                        |  4 ++--
 xen/common/domain.c                         | 10 +++++-----
 xen/common/xenoprof.c                       |  2 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/iommu.c             |  2 +-
 xen/drivers/passthrough/vtd/iommu.c         |  8 ++++----
 xen/drivers/passthrough/vtd/x86/vtd.c       |  2 +-
 xen/include/xen/sched.h                     |  4 ++--
 15 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 8f20fdf..4b9afb2 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -547,7 +547,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
      * Only use it for dom0 because the linux kernel may not support
      * multi-platform.
      */
-    if ( (d->domain_id == 0) && (rc = domain_vuart_init(d)) )
+    if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
         goto fail;
 
     return 0;
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 074624e..5d7ae3d 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -862,7 +862,7 @@ int gicv_setup(struct domain *d)
      * Domain 0 gets the hardware address.
      * Guests get the virtual platform layout.
      */
-    if ( d->domain_id == 0 )
+    if ( is_hardware_domain(d) )
     {
         d->arch.vgic.dbase = gic.dbase;
         d->arch.vgic.cbase = gic.cbase;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 553411d..65f48f2 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -82,7 +82,7 @@ int domain_vgic_init(struct domain *d)
     /* Currently nr_lines in vgic and gic doesn't have the same meanings
      * Here nr_lines = number of SPIs
      */
-    if ( d->domain_id == 0 )
+    if ( is_hardware_domain(d) )
         d->arch.vgic.nr_lines = gic_number_lines() - 32;
     else
         d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
diff --git a/xen/arch/arm/vuart.c b/xen/arch/arm/vuart.c
index b9d3ced..c02a8a9 100644
--- a/xen/arch/arm/vuart.c
+++ b/xen/arch/arm/vuart.c
@@ -46,7 +46,7 @@
 
 int domain_vuart_init(struct domain *d)
 {
-    ASSERT( !d->domain_id );
+    ASSERT( is_hardware_domain(d) );
 
     d->arch.vuart.info = serial_vuart_info(SERHND_DTUART);
     if ( !d->arch.vuart.info )
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 6618ae6..7bcdd40 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1505,7 +1505,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
         }
 
         set_cpuid_faulting(is_pv_vcpu(next) &&
-                           (next->domain->domain_id != 0));
+                           !is_control_domain(next->domain));
     }
 
     if (is_hvm_vcpu(next) && (prev != next) )
diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c
index c0d6bc2..add61e3 100644
--- a/xen/arch/x86/hvm/i8254.c
+++ b/xen/arch/x86/hvm/i8254.c
@@ -541,7 +541,7 @@ int pv_pit_handler(int port, int data, int write)
         .data = data
     };
 
-    if ( (current->domain->domain_id == 0) && dom0_pit_access(&ioreq) )
+    if ( is_hardware_domain(current->domain) && dom0_pit_access(&ioreq) )
     {
         /* nothing to do */;
     }
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 6e31e1f..6798cd6 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1833,7 +1833,7 @@ void tsc_set_info(struct domain *d,
                   uint32_t tsc_mode, uint64_t elapsed_nsec,
                   uint32_t gtsc_khz, uint32_t incarnation)
 {
-    if ( is_idle_domain(d) || (d->domain_id == 0) )
+    if ( is_idle_domain(d) || is_hardware_domain(d) )
     {
         d->arch.vtsc = 0;
         return;
@@ -1937,7 +1937,7 @@ static void dump_softtsc(unsigned char key)
                "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count);
     for_each_domain ( d )
     {
-        if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT )
+        if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT )
             continue;
         printk("dom%u%s: mode=%d",d->domain_id,
                 is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 0bd43b9..b8da135 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -738,7 +738,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
     c = regs->ecx;
     d = regs->edx;
 
-    if ( current->domain->domain_id != 0 )
+    if ( !is_control_domain(current->domain) )
     {
         unsigned int cpuid_leaf = a, sub_leaf = c;
 
@@ -1859,7 +1859,7 @@ static inline uint64_t guest_misc_enable(uint64_t val)
 static int is_cpufreq_controller(struct domain *d)
 {
     return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            (d->domain_id == 0));
+            is_hardware_domain(d));
 }
 
 #include "x86_64/mmconfig.h"
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 2636fc9..6e0cb0f 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -242,7 +242,7 @@ struct domain *domain_create(
     else if ( domcr_flags & DOMCRF_pvh )
         d->guest_type = guest_type_pvh;
 
-    if ( domid == 0 )
+    if ( is_hardware_domain(d) )
     {
         d->is_pinned = opt_dom0_vcpus_pin;
         d->disable_migrate = 1;
@@ -267,10 +267,10 @@ struct domain *domain_create(
         d->is_paused_by_controller = 1;
         atomic_inc(&d->pause_count);
 
-        if ( domid )
-            d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
-        else
+        if ( is_hardware_domain(d) )
             d->nr_pirqs = nr_static_irqs + extra_dom0_irqs;
+        else
+            d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
         if ( d->nr_pirqs > nr_irqs )
             d->nr_pirqs = nr_irqs;
 
@@ -599,7 +599,7 @@ void domain_shutdown(struct domain *d, u8 reason)
         d->shutdown_code = reason;
     reason = d->shutdown_code;
 
-    if ( d->domain_id == 0 )
+    if ( is_hardware_domain(d) )
         dom0_shutdown(reason);
 
     if ( d->is_shutting_down )
diff --git a/xen/common/xenoprof.c b/xen/common/xenoprof.c
index 52ab00d..b148233 100644
--- a/xen/common/xenoprof.c
+++ b/xen/common/xenoprof.c
@@ -603,7 +603,7 @@ static int xenoprof_op_init(XEN_GUEST_HANDLE_PARAM(void) arg)
 
     xenoprof_init.is_primary = 
         ((xenoprof_primary_profiler == d) ||
-         ((xenoprof_primary_profiler == NULL) && (d->domain_id == 0)));
+         ((xenoprof_primary_profiler == NULL) && is_control_domain(d)));
     if ( xenoprof_init.is_primary )
         xenoprof_primary_profiler = current->domain;
 
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index c26aabc..cf67494 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -122,7 +122,7 @@ static void amd_iommu_setup_domain_device(
 
     BUG_ON( !hd->root_table || !hd->paging_mode || !iommu->dev_table.buffer );
 
-    if ( iommu_passthrough && (domain->domain_id == 0) )
+    if ( iommu_passthrough && is_hardware_domain(domain) )
         valid = 0;
 
     if ( ats_enabled )
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 19b0e23..16c99db 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -825,7 +825,7 @@ static void iommu_dump_p2m_table(unsigned char key)
     ops = iommu_get_ops();
     for_each_domain(d)
     {
-        if ( !d->domain_id )
+        if ( is_hardware_domain(d) )
             continue;
 
         if ( iommu_use_hap_pt(d) )
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 5f10034..cea70a1 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1338,7 +1338,7 @@ int domain_context_mapping_one(
         return res;
     }
 
-    if ( iommu_passthrough && (domain->domain_id == 0) )
+    if ( iommu_passthrough && is_hardware_domain(domain) )
     {
         context_set_translation_type(*context, CONTEXT_TT_PASS_THRU);
         agaw = level_to_agaw(iommu->nr_pt_levels);
@@ -1730,7 +1730,7 @@ static int intel_iommu_map_page(
         return 0;
 
     /* do nothing if dom0 and iommu supports pass thru */
-    if ( iommu_passthrough && (d->domain_id == 0) )
+    if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
     spin_lock(&hd->mapping_lock);
@@ -1774,7 +1774,7 @@ static int intel_iommu_map_page(
 static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
 {
     /* Do nothing if dom0 and iommu supports pass thru. */
-    if ( iommu_passthrough && (d->domain_id == 0) )
+    if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
     dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
@@ -1947,7 +1947,7 @@ static int intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
     /* If the device belongs to dom0, and it has RMRR, don't remove it
      * from dom0, because BIOS may use RMRR at booting time.
      */
-    if ( pdev->domain->domain_id == 0 )
+    if ( is_hardware_domain(pdev->domain) )
     {
         for_each_rmrr_device ( rmrr, bdf, i )
         {
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index ca17cb1..f271a42 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -111,7 +111,7 @@ void __init iommu_set_dom0_mapping(struct domain *d)
 {
     unsigned long i, j, tmp, top;
 
-    BUG_ON(d->domain_id != 0);
+    BUG_ON(!is_hardware_domain(d));
 
     top = max(max_pdx, pfn_to_pdx(0xffffffffUL >> PAGE_SHIFT) + 1);
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index fb8bd36..3be26ec 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -775,10 +775,10 @@ void watchdog_domain_destroy(struct domain *d);
 /* 
  * Use this check when the following are both true:
  *  - Using this feature or interface requires full access to the hardware
- *    (that is, this is would not be suitable for a driver domain)
+ *    (that is, this would not be suitable for a driver domain)
  *  - There is never a reason to deny dom0 access to this
  */
-#define is_hardware_domain(_d) ((_d)->is_privileged)
+#define is_hardware_domain(_d) ((_d)->domain_id == 0)
 
 /* This check is for functionality specific to a control domain */
 #define is_control_domain(_d) ((_d)->is_privileged)
-- 
1.8.5.3

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/6] xen/iommu: Move dom0 setup code out of __init
  2014-03-04 22:51 [PATCH 0/6] xen: Hardware domain support Daniel De Graaf
  2014-03-04 22:51 ` [PATCH 1/6] xen: use domid check in is_hardware_domain Daniel De Graaf
@ 2014-03-04 22:51 ` Daniel De Graaf
  2014-03-05  9:56   ` Jan Beulich
  2014-03-04 22:51 ` [PATCH 3/6] xen: prevent 0 from being used as a dynamic domid Daniel De Graaf
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-04 22:51 UTC (permalink / raw
  To: xen-devel
  Cc: Daniel De Graaf, Xiantao Zhang, Keir Fraser,
	Suravee Suthikulpanit, Jan Beulich

This is required to allow the hardware domain to be built by dom0 after
the __init sections have been discarded.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>
---
 xen/arch/x86/setup.c                        | 4 ++--
 xen/drivers/passthrough/amd/pci_amd_iommu.c | 4 ++--
 xen/drivers/passthrough/iommu.c             | 6 +++---
 xen/drivers/passthrough/pci.c               | 4 ++--
 xen/drivers/passthrough/vtd/iommu.c         | 6 +++---
 xen/drivers/passthrough/vtd/x86/vtd.c       | 4 ++--
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index b49256d..3a4f69c 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1445,7 +1445,7 @@ void arch_get_xen_caps(xen_capabilities_info_t *info)
     }
 }
 
-int __init xen_in_range(unsigned long mfn)
+int xen_in_range(unsigned long mfn)
 {
     paddr_t start, end;
     int i;
@@ -1453,7 +1453,7 @@ int __init xen_in_range(unsigned long mfn)
     enum { region_s3, region_text, region_bss, nr_regions };
     static struct {
         paddr_t s, e;
-    } xen_regions[nr_regions] __initdata;
+    } xen_regions[nr_regions];
 
     /* initialize first time */
     if ( !xen_regions[0].s )
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index cf67494..64d4ec4 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -169,7 +169,7 @@ static void amd_iommu_setup_domain_device(
     }
 }
 
-static int __init amd_iommu_setup_dom0_device(u8 devfn, struct pci_dev *pdev)
+static int amd_iommu_setup_dom0_device(u8 devfn, struct pci_dev *pdev)
 {
     int bdf = PCI_BDF2(pdev->bus, pdev->devfn);
     struct amd_iommu *iommu = find_iommu_for_device(pdev->seg, bdf);
@@ -280,7 +280,7 @@ static int amd_iommu_domain_init(struct domain *d)
     return 0;
 }
 
-static void __init amd_iommu_dom0_init(struct domain *d)
+static void amd_iommu_dom0_init(struct domain *d)
 {
     unsigned long i; 
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 16c99db..6bd8abe 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -45,7 +45,7 @@ custom_param("iommu", parse_iommu_param);
 bool_t __initdata iommu_enable = 1;
 bool_t __read_mostly iommu_enabled;
 bool_t __read_mostly force_iommu;
-bool_t __initdata iommu_dom0_strict;
+bool_t __read_mostly iommu_dom0_strict;
 bool_t __read_mostly iommu_verbose;
 bool_t __read_mostly iommu_workaround_bios_bug;
 bool_t __read_mostly iommu_passthrough;
@@ -130,7 +130,7 @@ int iommu_domain_init(struct domain *d)
     return hd->platform_ops->init(d);
 }
 
-static __init void check_dom0_pvh_reqs(struct domain *d)
+static void check_dom0_pvh_reqs(struct domain *d)
 {
     if ( !iommu_enabled )
         panic("Presently, iommu must be enabled for pvh dom0\n");
@@ -141,7 +141,7 @@ static __init void check_dom0_pvh_reqs(struct domain *d)
     iommu_dom0_strict = 1;
 }
 
-void __init iommu_dom0_init(struct domain *d)
+void iommu_dom0_init(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index c5c8344..fdc7ee2 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -892,7 +892,7 @@ static void setup_one_dom0_device(const struct setup_dom0 *ctxt,
               PCI_SLOT(devfn) == PCI_SLOT(pdev->devfn) );
 }
 
-static int __init _setup_dom0_pci_devices(struct pci_seg *pseg, void *arg)
+static int _setup_dom0_pci_devices(struct pci_seg *pseg, void *arg)
 {
     struct setup_dom0 *ctxt = arg;
     int bus, devfn;
@@ -928,7 +928,7 @@ static int __init _setup_dom0_pci_devices(struct pci_seg *pseg, void *arg)
     return 0;
 }
 
-void __init setup_dom0_pci_devices(
+void setup_dom0_pci_devices(
     struct domain *d, int (*handler)(u8 devfn, struct pci_dev *))
 {
     struct setup_dom0 ctxt = { .d = d, .handler = handler };
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index cea70a1..e2bf5cb 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1253,7 +1253,7 @@ static int intel_iommu_domain_init(struct domain *d)
     return 0;
 }
 
-static void __init intel_iommu_dom0_init(struct domain *d)
+static void intel_iommu_dom0_init(struct domain *d)
 {
     struct acpi_drhd_unit *drhd;
 
@@ -1961,7 +1961,7 @@ static int intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
     return domain_context_unmap(pdev->domain, devfn, pdev);
 }
 
-static int __init setup_dom0_device(u8 devfn, struct pci_dev *pdev)
+static int setup_dom0_device(u8 devfn, struct pci_dev *pdev)
 {
     int err;
 
@@ -2109,7 +2109,7 @@ static int init_vtd_hw(void)
     return 0;
 }
 
-static void __init setup_dom0_rmrr(struct domain *d)
+static void setup_dom0_rmrr(struct domain *d)
 {
     struct acpi_rmrr_unit *rmrr;
     u16 bdf;
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index f271a42..cbbf278 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -36,7 +36,7 @@
  * iommu_inclusive_mapping: when set, all memory below 4GB is included in dom0
  * 1:1 iommu mappings except xen and unusable regions.
  */
-static bool_t __initdata iommu_inclusive_mapping = 1;
+static bool_t iommu_inclusive_mapping = 1;
 boolean_param("iommu_inclusive_mapping", iommu_inclusive_mapping);
 
 void *map_vtd_domain_page(u64 maddr)
@@ -107,7 +107,7 @@ void hvm_dpci_isairq_eoi(struct domain *d, unsigned int isairq)
     spin_unlock(&d->event_lock);
 }
 
-void __init iommu_set_dom0_mapping(struct domain *d)
+void iommu_set_dom0_mapping(struct domain *d)
 {
     unsigned long i, j, tmp, top;
 
-- 
1.8.5.3

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/6] xen: prevent 0 from being used as a dynamic domid
  2014-03-04 22:51 [PATCH 0/6] xen: Hardware domain support Daniel De Graaf
  2014-03-04 22:51 ` [PATCH 1/6] xen: use domid check in is_hardware_domain Daniel De Graaf
  2014-03-04 22:51 ` [PATCH 2/6] xen/iommu: Move dom0 setup code out of __init Daniel De Graaf
@ 2014-03-04 22:51 ` Daniel De Graaf
  2014-03-04 22:51 ` [PATCH 4/6] xen: Allow hardare domain != dom0 Daniel De Graaf
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-04 22:51 UTC (permalink / raw
  To: xen-devel; +Cc: Daniel De Graaf, Keir Fraser

When the hardware domain is made distinct from dom0, it becomes possible
to shut down and destroy domain 0 while leaving the hypervisor running.
If this happens, prevent this domain ID from being considered for
allocation to a new guest.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Keir Fraser <keir@xen.org>
---
 xen/common/domctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 7cf610a..eebeee7 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -436,7 +436,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             for ( dom = rover + 1; dom != rover; dom++ )
             {
                 if ( dom == DOMID_FIRST_RESERVED )
-                    dom = 0;
+                    dom = 1;
                 if ( is_free_domid(dom) )
                     break;
             }
-- 
1.8.5.3

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/6] xen: Allow hardare domain != dom0
  2014-03-04 22:51 [PATCH 0/6] xen: Hardware domain support Daniel De Graaf
                   ` (2 preceding siblings ...)
  2014-03-04 22:51 ` [PATCH 3/6] xen: prevent 0 from being used as a dynamic domid Daniel De Graaf
@ 2014-03-04 22:51 ` Daniel De Graaf
  2014-03-05  3:50   ` Julien Grall
  2014-03-05 10:04   ` Jan Beulich
  2014-03-04 22:51 ` [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed Daniel De Graaf
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-04 22:51 UTC (permalink / raw
  To: xen-devel; +Cc: Daniel De Graaf, Keir Fraser, Jan Beulich

This adds a hypervisor command line option "hardware_dom=" which takes a
domain ID.  When the domain with this ID is created, it will be used as
the hardware domain.

This is intended to be used when dom0 is a dedicated stub domain for
domain building, allowing the hardware domain to be de-privileged and
act only as a driver domain.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/domain_build.c |  4 +++-
 xen/arch/x86/setup.c        |  3 +++
 xen/common/domctl.c         |  8 ++++++++
 xen/common/rangeset.c       | 26 ++++++++++++++++++++++++++
 xen/include/xen/rangeset.h  |  3 +++
 xen/include/xen/sched.h     |  3 ++-
 6 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 84ce392..e9de496 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -1149,7 +1149,9 @@ int __init construct_dom0(
         printk(" Xen warning: dom0 kernel broken ELF: %s\n",
                elf_check_broken(&elf));
 
-    iommu_dom0_init(dom0);
+    if( is_hardware_domain(dom0) )
+        iommu_dom0_init(dom0);
+
     return 0;
 
 out:
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 3a4f69c..3480854 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -88,6 +88,9 @@ unsigned long __initdata highmem_start;
 size_param("highmem-start", highmem_start);
 #endif
 
+unsigned int __read_mostly hardware_dom;
+integer_param("hardware_dom", hardware_dom);
+
 cpumask_t __read_mostly cpu_present_map;
 
 unsigned long __read_mostly xen_phys_start;
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index eebeee7..11e6b94 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -472,6 +472,14 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             break;
         }
 
+        if (d->domain_id == hardware_dom) {
+            printk("Initialising hardware domain %d\n", hardware_dom);
+            rangeset_swap(d->irq_caps, dom0->irq_caps);
+
+            dom0 = d;
+            iommu_dom0_init(dom0);
+        }
+
         ret = 0;
 
         memcpy(d->handle, op->u.createdomain.handle,
diff --git a/xen/common/rangeset.c b/xen/common/rangeset.c
index f09c0c4..cd44634 100644
--- a/xen/common/rangeset.c
+++ b/xen/common/rangeset.c
@@ -438,3 +438,29 @@ void rangeset_domain_printk(
 
     spin_unlock(&d->rangesets_lock);
 }
+
+void rangeset_swap(struct rangeset *a, struct rangeset *b)
+{
+    struct list_head tmp;
+    spin_lock(&a->lock);
+    spin_lock(&b->lock);
+    memcpy(&tmp, &a->range_list, sizeof(tmp));
+    memcpy(&a->range_list, &b->range_list, sizeof(tmp));
+    memcpy(&b->range_list, &tmp, sizeof(tmp));
+    if (a->range_list.next == &b->range_list) {
+        a->range_list.next = &a->range_list;
+        a->range_list.prev = &a->range_list;
+    } else {
+        a->range_list.next->prev = &a->range_list;
+        a->range_list.prev->next = &a->range_list;
+    }
+    if (b->range_list.next == &a->range_list) {
+        b->range_list.next = &b->range_list;
+        b->range_list.prev = &b->range_list;
+    } else {
+        b->range_list.next->prev = &b->range_list;
+        b->range_list.prev->next = &b->range_list;
+    }
+    spin_unlock(&a->lock);
+    spin_unlock(&b->lock);
+}
diff --git a/xen/include/xen/rangeset.h b/xen/include/xen/rangeset.h
index 1e16a6b..805ebde 100644
--- a/xen/include/xen/rangeset.h
+++ b/xen/include/xen/rangeset.h
@@ -73,4 +73,7 @@ void rangeset_printk(
 void rangeset_domain_printk(
     struct domain *d);
 
+/* swap contents */
+void rangeset_swap(struct rangeset *a, struct rangeset *b);
+
 #endif /* __XEN_RANGESET_H__ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 3be26ec..6c4237b 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -778,7 +778,8 @@ void watchdog_domain_destroy(struct domain *d);
  *    (that is, this would not be suitable for a driver domain)
  *  - There is never a reason to deny dom0 access to this
  */
-#define is_hardware_domain(_d) ((_d)->domain_id == 0)
+extern unsigned int hardware_dom;
+#define is_hardware_domain(d)  ((d)->domain_id == hardware_dom)
 
 /* This check is for functionality specific to a control domain */
 #define is_control_domain(_d) ((_d)->is_privileged)
-- 
1.8.5.3

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
  2014-03-04 22:51 [PATCH 0/6] xen: Hardware domain support Daniel De Graaf
                   ` (3 preceding siblings ...)
  2014-03-04 22:51 ` [PATCH 4/6] xen: Allow hardare domain != dom0 Daniel De Graaf
@ 2014-03-04 22:51 ` Daniel De Graaf
  2014-03-05 10:07   ` Jan Beulich
  2014-03-05 12:02   ` Ian Jackson
  2014-03-04 22:51 ` [PATCH 6/6] xenstored: add --master-domid to support domain builder Daniel De Graaf
  2014-03-04 23:32 ` Domain Builder Daniel De Graaf
  6 siblings, 2 replies; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-04 22:51 UTC (permalink / raw
  To: xen-devel; +Cc: Daniel De Graaf, Ian Jackson, Ian Campbell, Stefano Stabellini

When dom0 is not the hardware domain, it can be destroyed in the same
way as any other service domain. Since the hypervisor already prevents a
domain from destroying itself, this patch only changes behavior when
used in a disaggregated environment.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/xl_cmdimpl.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 4fc46eb..70cd652 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -3059,11 +3059,6 @@ static void unpause_domain(uint32_t domid)
 static void destroy_domain(uint32_t domid)
 {
     int rc;
-
-    if (domid == 0) {
-        fprintf(stderr, "Cannot destroy privileged domain 0.\n\n");
-        exit(-1);
-    }
     rc = libxl_domain_destroy(ctx, domid, 0);
     if (rc) { fprintf(stderr,"destroy failed (rc=%d)\n",rc); exit(-1); }
 }
-- 
1.8.5.3

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 6/6] xenstored: add --master-domid to support domain builder
  2014-03-04 22:51 [PATCH 0/6] xen: Hardware domain support Daniel De Graaf
                   ` (4 preceding siblings ...)
  2014-03-04 22:51 ` [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed Daniel De Graaf
@ 2014-03-04 22:51 ` Daniel De Graaf
  2014-03-10 12:14   ` Ian Jackson
  2014-03-04 23:32 ` Domain Builder Daniel De Graaf
  6 siblings, 1 reply; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-04 22:51 UTC (permalink / raw
  To: xen-devel; +Cc: Daniel De Graaf, Ian Jackson, Ian Campbell, Stefano Stabellini

When a domain builder stub domain is used, the initial xenstore
connection to domain 0 may use a different domain ID as the endpoint;
allow this domain ID to be specified on the command line.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 tools/xenstore/xenstored_core.c   | 5 +++++
 tools/xenstore/xenstored_core.h   | 3 +++
 tools/xenstore/xenstored_domain.c | 2 +-
 tools/xenstore/xenstored_minios.c | 2 +-
 4 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 2324e53..47f0722 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1795,6 +1795,7 @@ static struct option options[] = {
 	{ "entry-nb", 1, NULL, 'E' },
 	{ "pid-file", 1, NULL, 'F' },
 	{ "event", 1, NULL, 'e' },
+	{ "master-domid", 1, NULL, 'm' },
 	{ "help", 0, NULL, 'H' },
 	{ "no-fork", 0, NULL, 'N' },
 	{ "priv-domid", 1, NULL, 'p' },
@@ -1810,6 +1811,7 @@ static struct option options[] = {
 	{ NULL, 0, NULL, 0 } };
 
 extern void dump_conn(struct connection *conn); 
+int dom0_domid = 0;
 int dom0_event = 0;
 int priv_domid = 0;
 
@@ -1871,6 +1873,9 @@ int main(int argc, char *argv[])
 		case 'e':
 			dom0_event = strtol(optarg, NULL, 10);
 			break;
+		case 'm':
+			dom0_domid = strtol(optarg, NULL, 10);
+			break;
 		case 'p':
 			priv_domid = strtol(optarg, NULL, 10);
 			break;
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index cfbcf6f..dcf95b5 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -170,6 +170,7 @@ void trace(const char *fmt, ...);
 void dtrace_io(const struct connection *conn, const struct buffered_data *data, int out);
 
 extern int event_fd;
+extern int dom0_domid;
 extern int dom0_event;
 extern int priv_domid;
 
@@ -177,6 +178,8 @@ extern int priv_domid;
 void *xenbus_map(void);
 void unmap_xenbus(void *interface);
 
+static inline int xenbus_master_domid(void) { return dom0_domid; }
+
 /* Return the event channel used by xenbus. */
 evtchn_port_t xenbus_evtchn(void);
 
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index f24bd6b..f7bbb03 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -606,7 +606,7 @@ static int dom0_init(void)
 	if (port == -1)
 		return -1;
 
-	dom0 = new_domain(NULL, 0, port); 
+	dom0 = new_domain(NULL, xenbus_master_domid(), port);
 	if (dom0 == NULL)
 		return -1;
 
diff --git a/tools/xenstore/xenstored_minios.c b/tools/xenstore/xenstored_minios.c
index 1c6f794..f9c921e 100644
--- a/tools/xenstore/xenstored_minios.c
+++ b/tools/xenstore/xenstored_minios.c
@@ -51,7 +51,7 @@ evtchn_port_t xenbus_evtchn(void)
 
 void *xenbus_map(void)
 {
-	return xc_gnttab_map_grant_ref(*xcg_handle, 0,
+	return xc_gnttab_map_grant_ref(*xcg_handle, xenbus_master_domid(),
 			GNTTAB_RESERVED_XENSTORE, PROT_READ|PROT_WRITE);
 }
 
-- 
1.8.5.3

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Domain Builder
  2014-03-04 22:51 [PATCH 0/6] xen: Hardware domain support Daniel De Graaf
                   ` (5 preceding siblings ...)
  2014-03-04 22:51 ` [PATCH 6/6] xenstored: add --master-domid to support domain builder Daniel De Graaf
@ 2014-03-04 23:32 ` Daniel De Graaf
  6 siblings, 0 replies; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-04 23:32 UTC (permalink / raw
  To: xen-devel

[-- Attachment #1: Type: text/plain, Size: 2791 bytes --]

A domain builder suitable for creating a disaggregated Xen system is
attached.  Using this domain builder requires the patch series that
this email is in reply to, and requires that the hypervisor be built
with XSM enabled (with the XSM policy loaded from the bootloader).

The executable passed to Xen from GRUB is called db-boot. It takes a
CPIO archive containing specification files (with extension .cfg.db)
and their dependencies (kernels and initrds).  The specification
files are built using compile-db-spec.

Example contents of the ramdisk:
     control             - control domain kernel (in tarball)
     control.cpio        - configuration for control domain
     control.cfg.db      - spec file for control domain
     xenstore            - mini-os version of the C xenstore
     xenstore.cfg.db     - spec file for Xenstore
     vtpmmgr             - mini-os TPM manager
     vtpmmgr.cfg.db      - spec file for TPM Manager
     vmlinuz             - Kernel for both Linux domains
     initramfs.img       - Common initrd for Linux domains
     hardware.cfg.db     - spec file for hardware domain
     toolstack.cfg.db    - spec file for toolstack domain

control.cfg:
vcpus=1
memory=4
kernel="control"
ramdisk="control.cpio"
extra="SVP xenstore=2 hs=3 drivers=3 tpm=4 self=1 schema=platform.ctl"
domid=1
access_control="system_u:system_r:control_t"

xenstore.cfg:
vcpus=1
memory=20
kernel="xenstore"
extra="--priv-domid 5 --internal-db"
domid=2
access_control="system_u:system_r:xenstore_t"

hardware.cfg:
memory=2000
vcpus=1
domid=3
dom_flags=['INITDOMAIN']
access_control="system_u:system_r:hardware_t"
kernel="vmlinuz"
ramdisk="initramfs.img"
extra="ro root=/dev/lvm_foo/hardware_root console=hvc0 earlyprintk=xen"
iomem=['enable 0(0xfed40)', 'enable 0xfed45(0xff012bb)']
ioports=[
	# This is derived from "xl debug-key q" output for dom0
         'enable 0(0x20)',
         'enable 0x22(0x1e)',
         'enable 0x44(0x1c)',
         'enable 0x68(0x38)',
         'enable 0xA2(0x356)',
         'enable 0x400(8)',
         'enable 0x40C(0x8EC)',
         'enable 0xD00(0xF300)',
]

vtpmmgr.cfg:
vcpus=1
memory=5
kernel="boot/vtpmmgr"
extra="tpmlocality=2"
domid=4
access_control="system_u:system_r:vtpm_mgr_t"
iomem=['enable 0xfed42(1)']

toolstack.cfg:
memory=2000
vcpus=1
domid=5
access_control="system_u:system_r:toolstack_t"
kernel="vmlinuz"
ramdisk="initramfs.img"
extra="ro root=/dev/lvm_foo/toolstack_root console=hvc0 earlyprintk=xen"

control.cpio:
	platform.ctl - compiled with compile-control-schema from svp.schema

svp.schema should be an empty file.  Its contents describe additional
domains to build from disk, but this support requires using the domain
builder server which was removed due to its dependency on IVC which is
not available in upstream Xen.

[-- Attachment #2: domain_builder.tgz --]
[-- Type: application/x-compressed-tar, Size: 365921 bytes --]

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] xen: use domid check in is_hardware_domain
  2014-03-04 22:51 ` [PATCH 1/6] xen: use domid check in is_hardware_domain Daniel De Graaf
@ 2014-03-05  3:44   ` Julien Grall
  2014-03-05  9:23   ` Jan Beulich
  1 sibling, 0 replies; 31+ messages in thread
From: Julien Grall @ 2014-03-05  3:44 UTC (permalink / raw
  To: Daniel De Graaf, xen-devel
  Cc: Keir Fraser, Ian Campbell, Tim Deegan, Stefano Stabellini,
	Jan Beulich, Suravee Suthikulpanit, Xiantao Zhang

Hello Daniel,

On 05/03/14 06:51, Daniel De Graaf wrote:
> Instead of checking is_privileged to determine if a domain should
> control the hardware, check that the domain_id is equal to zero (which
> is currently the only domain for which is_privileged is true).  This
> allows other places where domain_id is checked for zero to be replaced
> with is_hardware_domain.
>
> The distinction between is_hardware_domain, is_control_domain, and
> domain 0 is based on the following disaggregation model:
>
> Domain 0 bootstraps the system.  It may remain to perform requested
> builds of domains that need a minimal trust chain (i.e. vTPM domains).
> Other than being built by the hypervisor, nothing is special about this
> domain - although it may be useful to have is_control_domain() return
> true depending on the toolstack it uses to build other domains.
>
> The hardware domain manages devices for PCI pass-through to driver
> domains or can act as a driver domain itself, depending on the desired
> degree of disaggregation.  It is also the domain managing devices that
> do not support pass-through: PCI configuration space access, parsing the
> hardware ACPI tables and system power or machine check events.  This is
> the only domain where is_hardware_domain() is true.  The return of
> is_control_domain() is false for this domain.
>
> The control domain manages other domains, controls guest launch and
> shutdown, and manages resource constraints; is_control_domain() returns
> true.  The functionality guarded by is_control_domain may in the future
> be adapted to use explicit hypercalls, eliminating the special treatment
> of this domain.  It may be reasonable to have multiple control domains
> on a multi-tenant system.
>
> Guest domains and other service or driver domains are all treated
> identically by the hypervisor; the security policy may further constrain
> administrative actions on or communication between these domains.
>
> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Stefano Stabellini <stefano.stabellini@citrix.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Xiantao Zhang <xiantao.zhang@intel.com>
> ---
>   xen/arch/arm/domain.c                       |  2 +-
>   xen/arch/arm/gic.c                          |  2 +-
>   xen/arch/arm/vgic.c                         |  2 +-
>   xen/arch/arm/vuart.c                        |  2 +-
>   xen/arch/x86/domain.c                       |  2 +-
>   xen/arch/x86/hvm/i8254.c                    |  2 +-
>   xen/arch/x86/time.c                         |  4 ++--
>   xen/arch/x86/traps.c                        |  4 ++--
>   xen/common/domain.c                         | 10 +++++-----
>   xen/common/xenoprof.c                       |  2 +-
>   xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
>   xen/drivers/passthrough/iommu.c             |  2 +-
>   xen/drivers/passthrough/vtd/iommu.c         |  8 ++++----
>   xen/drivers/passthrough/vtd/x86/vtd.c       |  2 +-
>   xen/include/xen/sched.h                     |  4 ++--
>   15 files changed, 25 insertions(+), 25 deletions(-)
>
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 8f20fdf..4b9afb2 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -547,7 +547,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
>        * Only use it for dom0 because the linux kernel may not support
>        * multi-platform.
>        */
> -    if ( (d->domain_id == 0) && (rc = domain_vuart_init(d)) )
> +    if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )

Can you update the comment above the check?

>           goto fail;
>
>       return 0;
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 074624e..5d7ae3d 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -862,7 +862,7 @@ int gicv_setup(struct domain *d)
>        * Domain 0 gets the hardware address.
>        * Guests get the virtual platform layout.
>        */
> -    if ( d->domain_id == 0 )
> +    if ( is_hardware_domain(d) )

Same here.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] xen: Allow hardare domain != dom0
  2014-03-04 22:51 ` [PATCH 4/6] xen: Allow hardare domain != dom0 Daniel De Graaf
@ 2014-03-05  3:50   ` Julien Grall
  2014-03-05 23:04     ` Daniel De Graaf
  2014-03-05 10:04   ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: Julien Grall @ 2014-03-05  3:50 UTC (permalink / raw
  To: Daniel De Graaf, xen-devel
  Cc: Keir Fraser, Ian Campbell, Jan Beulich, Stefano Stabellini

Hello Daniel,

On 05/03/14 06:51, Daniel De Graaf wrote:
> This adds a hypervisor command line option "hardware_dom=" which takes a
> domain ID.  When the domain with this ID is created, it will be used as
> the hardware domain.
>
> This is intended to be used when dom0 is a dedicated stub domain for
> domain building, allowing the hardware domain to be de-privileged and
> act only as a driver domain.
>
> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> ---
>   xen/arch/x86/domain_build.c |  4 +++-
>   xen/arch/x86/setup.c        |  3 +++
>   xen/common/domctl.c         |  8 ++++++++
>   xen/common/rangeset.c       | 26 ++++++++++++++++++++++++++
>   xen/include/xen/rangeset.h  |  3 +++
>   xen/include/xen/sched.h     |  3 ++-
>   6 files changed, 45 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
> index 84ce392..e9de496 100644
> --- a/xen/arch/x86/domain_build.c
> +++ b/xen/arch/x86/domain_build.c
> @@ -1149,7 +1149,9 @@ int __init construct_dom0(
>           printk(" Xen warning: dom0 kernel broken ELF: %s\n",
>                  elf_check_broken(&elf));
>
> -    iommu_dom0_init(dom0);
> +    if( is_hardware_domain(dom0) )
> +        iommu_dom0_init(dom0);
> +
>       return 0;
>
>   out:
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 3a4f69c..3480854 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -88,6 +88,9 @@ unsigned long __initdata highmem_start;
>   size_param("highmem-start", highmem_start);
>   #endif
>
> +unsigned int __read_mostly hardware_dom;
> +integer_param("hardware_dom", hardware_dom);
> +
>   cpumask_t __read_mostly cpu_present_map;
>
>   unsigned long __read_mostly xen_phys_start;
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index eebeee7..11e6b94 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -472,6 +472,14 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>               break;
>           }
>
> +        if (d->domain_id == hardware_dom) {
> +            printk("Initialising hardware domain %d\n", hardware_dom);
> +            rangeset_swap(d->irq_caps, dom0->irq_caps);
> +
> +            dom0 = d;
> +            iommu_dom0_init(dom0);
> +        }
> +

This patch will break compilation on ARM. You are using hardware_dom 
which is defined in xen/arch/x86/setup.c.

I'm not sure if the best solution is to move the defined for 
hardware_dom in common code ... because settings this variable to a 
value other than 0 will break ARM boot with your changes in patch #1.

Supporting hardware domain on ARM will need rework on dom0 builder.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] xen: use domid check in is_hardware_domain
  2014-03-04 22:51 ` [PATCH 1/6] xen: use domid check in is_hardware_domain Daniel De Graaf
  2014-03-05  3:44   ` Julien Grall
@ 2014-03-05  9:23   ` Jan Beulich
  2014-03-05 15:25     ` Daniel De Graaf
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2014-03-05  9:23 UTC (permalink / raw
  To: Daniel De Graaf
  Cc: Keir Fraser, Ian Campbell, Tim Deegan, xen-devel,
	Stefano Stabellini, Suravee Suthikulpanit, Xiantao Zhang

>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
> The hardware domain manages devices for PCI pass-through to driver
> domains or can act as a driver domain itself, depending on the desired
> degree of disaggregation.  It is also the domain managing devices that
> do not support pass-through: PCI configuration space access, parsing the
> hardware ACPI tables and system power or machine check events.  This is
> the only domain where is_hardware_domain() is true.  The return of
> is_control_domain() is false for this domain.

"s/is/may be/" ?

> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1505,7 +1505,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>          }
>  
>          set_cpuid_faulting(is_pv_vcpu(next) &&
> -                           (next->domain->domain_id != 0));
> +                           !is_control_domain(next->domain));

I can't see why the hardware domain (which can't be migrated)
should be prevented from seeing the real CPUID values. It's
less obvious whether a control domain should always see real
values. Hence if you think the latter should be the case (as the
change above shows), I think you should check for both cases
here.

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -738,7 +738,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
>      c = regs->ecx;
>      d = regs->edx;
>  
> -    if ( current->domain->domain_id != 0 )
> +    if ( !is_control_domain(current->domain) )

The same consideration applies here then, obviously.

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -242,7 +242,7 @@ struct domain *domain_create(
>      else if ( domcr_flags & DOMCRF_pvh )
>          d->guest_type = guest_type_pvh;
>  
> -    if ( domid == 0 )
> +    if ( is_hardware_domain(d) )
>      {
>          d->is_pinned = opt_dom0_vcpus_pin;
>          d->disable_migrate = 1;
> @@ -267,10 +267,10 @@ struct domain *domain_create(
>          d->is_paused_by_controller = 1;
>          atomic_inc(&d->pause_count);
>  
> -        if ( domid )
> -            d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
> -        else
> +        if ( is_hardware_domain(d) )
>              d->nr_pirqs = nr_static_irqs + extra_dom0_irqs;
> +        else
> +            d->nr_pirqs = nr_static_irqs + extra_domU_irqs;

I'd prefer the if/else cases to remain as they are - makes the patch
smaller, and fits better with the (weak) model of using the if branch
for the common case and the else one for the special one (outside
of error handling of course).

> --- a/xen/common/xenoprof.c
> +++ b/xen/common/xenoprof.c
> @@ -603,7 +603,7 @@ static int xenoprof_op_init(XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>      xenoprof_init.is_primary = 
>          ((xenoprof_primary_profiler == d) ||
> -         ((xenoprof_primary_profiler == NULL) && (d->domain_id == 0)));
> +         ((xenoprof_primary_profiler == NULL) && is_control_domain(d)));

Do you really consider profiling a control domain property? This is
even more so questionable without knowing whether you checked
that there are no issues with all of the sudden there perhaps
being more than one domain eligible for becoming the primary
profiler - the oprofile code isn't in that good a shape to be certain
without explicit checking.

Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] xen/iommu: Move dom0 setup code out of __init
  2014-03-04 22:51 ` [PATCH 2/6] xen/iommu: Move dom0 setup code out of __init Daniel De Graaf
@ 2014-03-05  9:56   ` Jan Beulich
  2014-03-05 22:25     ` Daniel De Graaf
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2014-03-05  9:56 UTC (permalink / raw
  To: Daniel De Graaf
  Cc: Keir Fraser, Xiantao Zhang, Suravee Suthikulpanit, xen-devel

>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
> This is required to allow the hardware domain to be built by dom0 after
> the __init sections have been discarded.

Can you quantify the amount of code/data getting moved?

Overall I wonder what the purpose is of allowing the hardware
domain to be created later, rather than keeping Dom0 for this
purpose. After all booting the system is very much hardware
control. Once a control domain was spawned, Dom0 could then
be revoked the right to create further domains.

> --- a/xen/drivers/passthrough/vtd/x86/vtd.c
> +++ b/xen/drivers/passthrough/vtd/x86/vtd.c
> @@ -36,7 +36,7 @@
>   * iommu_inclusive_mapping: when set, all memory below 4GB is included in dom0
>   * 1:1 iommu mappings except xen and unusable regions.
>   */
> -static bool_t __initdata iommu_inclusive_mapping = 1;
> +static bool_t iommu_inclusive_mapping = 1;

__read_mostly?

Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] xen: Allow hardare domain != dom0
  2014-03-04 22:51 ` [PATCH 4/6] xen: Allow hardare domain != dom0 Daniel De Graaf
  2014-03-05  3:50   ` Julien Grall
@ 2014-03-05 10:04   ` Jan Beulich
  2014-03-05 23:04     ` Daniel De Graaf
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2014-03-05 10:04 UTC (permalink / raw
  To: Daniel De Graaf; +Cc: Keir Fraser, xen-devel

>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
> This adds a hypervisor command line option "hardware_dom=" which takes a
> domain ID.  When the domain with this ID is created, it will be used as
> the hardware domain.
> 
> This is intended to be used when dom0 is a dedicated stub domain for
> domain building, allowing the hardware domain to be de-privileged and
> act only as a driver domain.

Apart from the abstract question regarding the purpose of this, a
couple of comments on the patch a such:

> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -88,6 +88,9 @@ unsigned long __initdata highmem_start;
>  size_param("highmem-start", highmem_start);
>  #endif
>  
> +unsigned int __read_mostly hardware_dom;
> +integer_param("hardware_dom", hardware_dom);

This ought to be domid_t, and live in common code.

> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -472,6 +472,14 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>              break;
>          }
>  
> +        if (d->domain_id == hardware_dom) {

Coding style.

> +            printk("Initialising hardware domain %d\n", hardware_dom);
> +            rangeset_swap(d->irq_caps, dom0->irq_caps);

Why interrupts, but not I/O ports and MMIO?

> +
> +            dom0 = d;

This, I think, is the point where the variable name becomes
intolerable: ASSERT(!dom0 || !dom0->domain_id) should be
valid at all times as long as the variable name is dom0.

> --- a/xen/common/rangeset.c
> +++ b/xen/common/rangeset.c
> @@ -438,3 +438,29 @@ void rangeset_domain_printk(
>  
>      spin_unlock(&d->rangesets_lock);
>  }
> +
> +void rangeset_swap(struct rangeset *a, struct rangeset *b)
> +{
> +    struct list_head tmp;
> +    spin_lock(&a->lock);
> +    spin_lock(&b->lock);
> +    memcpy(&tmp, &a->range_list, sizeof(tmp));
> +    memcpy(&a->range_list, &b->range_list, sizeof(tmp));
> +    memcpy(&b->range_list, &tmp, sizeof(tmp));
> +    if (a->range_list.next == &b->range_list) {
> +        a->range_list.next = &a->range_list;
> +        a->range_list.prev = &a->range_list;
> +    } else {
> +        a->range_list.next->prev = &a->range_list;
> +        a->range_list.prev->next = &a->range_list;
> +    }
> +    if (b->range_list.next == &a->range_list) {
> +        b->range_list.next = &b->range_list;
> +        b->range_list.prev = &b->range_list;
> +    } else {
> +        b->range_list.next->prev = &b->range_list;
> +        b->range_list.prev->next = &b->range_list;
> +    }

Coding style again.

Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
  2014-03-04 22:51 ` [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed Daniel De Graaf
@ 2014-03-05 10:07   ` Jan Beulich
  2014-03-05 12:02   ` Ian Jackson
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2014-03-05 10:07 UTC (permalink / raw
  To: Daniel De Graaf; +Cc: xen-devel, Ian Jackson, Ian Campbell, Stefano Stabellini

>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -3059,11 +3059,6 @@ static void unpause_domain(uint32_t domid)
>  static void destroy_domain(uint32_t domid)
>  {
>      int rc;
> -

Don't remove the blank line here.

> -    if (domid == 0) {
> -        fprintf(stderr, "Cannot destroy privileged domain 0.\n\n");
> -        exit(-1);
> -    }

So this gets deleted without replacement? How is the hardware
domain being protected from (accidental or malicious) deletion
then? Even if this is being dealt with in the hypervisor, I'd be
afraid of the failure resulting in a cryptic error message instead
of the very clear one above.

Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
  2014-03-04 22:51 ` [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed Daniel De Graaf
  2014-03-05 10:07   ` Jan Beulich
@ 2014-03-05 12:02   ` Ian Jackson
  2014-03-05 22:36     ` Daniel De Graaf
  1 sibling, 1 reply; 31+ messages in thread
From: Ian Jackson @ 2014-03-05 12:02 UTC (permalink / raw
  To: Daniel De Graaf; +Cc: Stefano Stabellini, Ian Campbell, xen-devel

Daniel De Graaf writes ("[PATCH 5/6] tools/libxl: Allow dom0 to be destroyed"):
> When dom0 is not the hardware domain, it can be destroyed in the same
> way as any other service domain. Since the hypervisor already prevents a
> domain from destroying itself, this patch only changes behavior when
> used in a disaggregated environment.

Thanks for this patch.  You must be doing something very interesting
to want this :-).

I have a concern though: I wonder what the error behaviour is in the
case where you try to destroy dom0 from itself.  Does it get halfway
through, breaking the system, and then fail ?

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] xen: use domid check in is_hardware_domain
  2014-03-05  9:23   ` Jan Beulich
@ 2014-03-05 15:25     ` Daniel De Graaf
  2014-03-05 15:45       ` Jan Beulich
  2014-03-11 13:10       ` Ian Campbell
  0 siblings, 2 replies; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-05 15:25 UTC (permalink / raw
  To: Jan Beulich
  Cc: Keir Fraser, Ian Campbell, Tim Deegan, xen-devel,
	Stefano Stabellini, Suravee Suthikulpanit, Xiantao Zhang

On 03/05/2014 04:23 AM, Jan Beulich wrote:
>>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
>> The hardware domain manages devices for PCI pass-through to driver
>> domains or can act as a driver domain itself, depending on the desired
>> degree of disaggregation.  It is also the domain managing devices that
>> do not support pass-through: PCI configuration space access, parsing the
>> hardware ACPI tables and system power or machine check events.  This is
>> the only domain where is_hardware_domain() is true.  The return of
>> is_control_domain() is false for this domain.
>
> "s/is/may be/" ?

I had intended this sentence to describe the model, where the return is
always false. However, I agree with the change to avoid confusion that
the two is_*_domain() functions are exclusive.

>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1505,7 +1505,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>           }
>>
>>           set_cpuid_faulting(is_pv_vcpu(next) &&
>> -                           (next->domain->domain_id != 0));
>> +                           !is_control_domain(next->domain));
>
> I can't see why the hardware domain (which can't be migrated)
> should be prevented from seeing the real CPUID values. It's
> less obvious whether a control domain should always see real
> values. Hence if you think the latter should be the case (as the
> change above shows), I think you should check for both cases
> here.

Agreed, the hardware domain also needs to be checked for here. The
reason the control domain is present is that it needs to see real
CPUID values in order to set CPUID policy for guests based on the
real hardware values.

>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -738,7 +738,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
>>       c = regs->ecx;
>>       d = regs->edx;
>>
>> -    if ( current->domain->domain_id != 0 )
>> +    if ( !is_control_domain(current->domain) )
>
> The same consideration applies here then, obviously.
>
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -242,7 +242,7 @@ struct domain *domain_create(
>>       else if ( domcr_flags & DOMCRF_pvh )
>>           d->guest_type = guest_type_pvh;
>>
>> -    if ( domid == 0 )
>> +    if ( is_hardware_domain(d) )
>>       {
>>           d->is_pinned = opt_dom0_vcpus_pin;
>>           d->disable_migrate = 1;
>> @@ -267,10 +267,10 @@ struct domain *domain_create(
>>           d->is_paused_by_controller = 1;
>>           atomic_inc(&d->pause_count);
>>
>> -        if ( domid )
>> -            d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
>> -        else
>> +        if ( is_hardware_domain(d) )
>>               d->nr_pirqs = nr_static_irqs + extra_dom0_irqs;
>> +        else
>> +            d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
>
> I'd prefer the if/else cases to remain as they are - makes the patch
> smaller, and fits better with the (weak) model of using the if branch
> for the common case and the else one for the special one (outside
> of error handling of course).

OK. I prefer to avoid the if (!foo) bar; else baz; construct where
possible, but common case first is a good reason to use it.

>> --- a/xen/common/xenoprof.c
>> +++ b/xen/common/xenoprof.c
>> @@ -603,7 +603,7 @@ static int xenoprof_op_init(XEN_GUEST_HANDLE_PARAM(void) arg)
>>
>>       xenoprof_init.is_primary =
>>           ((xenoprof_primary_profiler == d) ||
>> -         ((xenoprof_primary_profiler == NULL) && (d->domain_id == 0)));
>> +         ((xenoprof_primary_profiler == NULL) && is_control_domain(d)));
>
> Do you really consider profiling a control domain property? This is
> even more so questionable without knowing whether you checked
> that there are no issues with all of the sudden there perhaps
> being more than one domain eligible for becoming the primary
> profiler - the oprofile code isn't in that good a shape to be certain
> without explicit checking.
>
> Jan

I don't directly consider profiling to be a control domain property, but
I also don't consider it a hardware domain property, and it does need to
be restricted in some way. I could make a separate patch changing the
condition to use an XSM hook, only setting xenoprof_primary_profiler if
the domain is allowed the XEN__PRIVPROFILE permission, but this still
could cause multiple domains to be eligible.

 From my cursory examination when I made this change, the first domain to
try becoming the primary profiler will succeed and be assigned to
xenoprof_primary_profiler. Later domains will not be considered since the
primary will already be set.

One thing I had not considered that may be a problem: if the profiling
domain is shut down without calling XENOPROF_shutdown, it will not be
possible to use profiling this boot unless the struct domain* is reused.
This may then become a security issue because an arbitrary domain is
now the primary profiler, although the XSM policy should prevent any
actions by a domain not permitted to use the profiling hypercalls.
Using is_hardware_domain here avoids that problem (as the hardware domain
may never shut down or be destroyed), so that may be the simplest
solution until a better model of who is responsible for profiling is
presented.

-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] xen: use domid check in is_hardware_domain
  2014-03-05 15:25     ` Daniel De Graaf
@ 2014-03-05 15:45       ` Jan Beulich
  2014-03-05 21:23         ` Daniel De Graaf
  2014-03-11 13:10       ` Ian Campbell
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2014-03-05 15:45 UTC (permalink / raw
  To: Daniel De Graaf
  Cc: Keir Fraser, Ian Campbell, Tim Deegan, xen-devel,
	Stefano Stabellini, Suravee Suthikulpanit, Xiantao Zhang

>>> On 05.03.14 at 16:25, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
> On 03/05/2014 04:23 AM, Jan Beulich wrote:
>>>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
>>> --- a/xen/arch/x86/domain.c
>>> +++ b/xen/arch/x86/domain.c
>>> @@ -1505,7 +1505,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>>           }
>>>
>>>           set_cpuid_faulting(is_pv_vcpu(next) &&
>>> -                           (next->domain->domain_id != 0));
>>> +                           !is_control_domain(next->domain));
>>
>> I can't see why the hardware domain (which can't be migrated)
>> should be prevented from seeing the real CPUID values. It's
>> less obvious whether a control domain should always see real
>> values. Hence if you think the latter should be the case (as the
>> change above shows), I think you should check for both cases
>> here.
> 
> Agreed, the hardware domain also needs to be checked for here. The
> reason the control domain is present is that it needs to see real
> CPUID values in order to set CPUID policy for guests based on the
> real hardware values.

So don't envision a staged system where the root domain hides
some features from creation-capable ones, and those may hide
further features from their descendants? Up to where even the
controlling domains might become migratable?

> Using is_hardware_domain here avoids that problem (as the hardware domain
> may never shut down or be destroyed), so that may be the simplest
> solution until a better model of who is responsible for profiling is
> presented.

Then please do so, with a short note to that effect in either the
description or a code comment.

Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] xen: use domid check in is_hardware_domain
  2014-03-05 15:45       ` Jan Beulich
@ 2014-03-05 21:23         ` Daniel De Graaf
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-05 21:23 UTC (permalink / raw
  To: Jan Beulich
  Cc: Keir Fraser, Ian Campbell, Tim Deegan, xen-devel,
	Stefano Stabellini, Suravee Suthikulpanit, Xiantao Zhang

On 03/05/2014 10:45 AM, Jan Beulich wrote:
>>>> On 05.03.14 at 16:25, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
>> On 03/05/2014 04:23 AM, Jan Beulich wrote:
>>>>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
>>>> --- a/xen/arch/x86/domain.c
>>>> +++ b/xen/arch/x86/domain.c
>>>> @@ -1505,7 +1505,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>>>            }
>>>>
>>>>            set_cpuid_faulting(is_pv_vcpu(next) &&
>>>> -                           (next->domain->domain_id != 0));
>>>> +                           !is_control_domain(next->domain));
>>>
>>> I can't see why the hardware domain (which can't be migrated)
>>> should be prevented from seeing the real CPUID values. It's
>>> less obvious whether a control domain should always see real
>>> values. Hence if you think the latter should be the case (as the
>>> change above shows), I think you should check for both cases
>>> here.
>>
>> Agreed, the hardware domain also needs to be checked for here. The
>> reason the control domain is present is that it needs to see real
>> CPUID values in order to set CPUID policy for guests based on the
>> real hardware values.
>
> So don't envision a staged system where the root domain hides
> some features from creation-capable ones, and those may hide
> further features from their descendants? Up to where even the
> controlling domains might become migratable?

Perhaps is_control_domain should be renamed to is_root_control_domain;
it is not necessary for every control domain to have is_control_domain
return true.  In fact, the domain builder I posted does not set the
is_privileged field on any guest it creates, and so once it shuts down,
there are no domains that the hypervisor considers control domains. The
XSM policy governs which domains are permitted to create, pause, and do
all the usual "control" operations.

A quick grep actually seems to point out that is_control_domain could
easily be removed, as the only references that remain are these CPUID
fields. This would end up simplifying the disaggregation model a bit.

>> Using is_hardware_domain here avoids that problem (as the hardware domain
>> may never shut down or be destroyed), so that may be the simplest
>> solution until a better model of who is responsible for profiling is
>> presented.
>
> Then please do so, with a short note to that effect in either the
> description or a code comment.
>
> Jan

Right, this change and a comment will be in the v2 when I post it.


-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] xen/iommu: Move dom0 setup code out of __init
  2014-03-05  9:56   ` Jan Beulich
@ 2014-03-05 22:25     ` Daniel De Graaf
  2014-03-06  9:53       ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-05 22:25 UTC (permalink / raw
  To: Jan Beulich; +Cc: Keir Fraser, Xiantao Zhang, Suravee Suthikulpanit, xen-devel

On 03/05/2014 04:56 AM, Jan Beulich wrote:
>>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
>> This is required to allow the hardware domain to be built by dom0 after
>> the __init sections have been discarded.
>
> Can you quantify the amount of code/data getting moved?

It looks like about one page of code and very little data (fits in slack space):

objdump -x old-xen-syms:
Idx Name          Size      VMA               LMA               File off  Algn
   0 .text         00138312  ffff82d080100000  ffff82d080100000  00001000  2**12
                   CONTENTS, ALLOC, LOAD, CODE
   1 .rodata       0004a830  ffff82d080238320  ffff82d080238320  00139320  2**5
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   2 .data.read_mostly 0000b1d8  ffff82d080282b80  ffff82d080282b80  00183b80  2**7
                   CONTENTS, ALLOC, LOAD, DATA
   3 .data         0000e828  ffff82d08028e000  ffff82d08028e000  0018f000  2**12
                   CONTENTS, ALLOC, LOAD, DATA
   4 .init.text    0002e96b  ffff82d08029d000  ffff82d08029d000  0019e000  2**4
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
   5 .init.data    0000f638  ffff82d0802cb980  ffff82d0802cb980  001cc980  2**5
                   CONTENTS, ALLOC, LOAD, DATA
   6 .init.setup   000010c0  ffff82d0802dafc0  ffff82d0802dafc0  001dbfc0  2**5
                   CONTENTS, ALLOC, LOAD, DATA
   7 .initcall.init 00000210  ffff82d0802dc080  ffff82d0802dc080  001dd080  2**3
                   CONTENTS, ALLOC, LOAD, DATA
   8 .xsm_initcall.init 00000008  ffff82d0802dc290  ffff82d0802dc290  001dd290  2**3
                   CONTENTS, ALLOC, LOAD, DATA
   9 .bss          0004c900  ffff82d0802e0000  ffff82d0802e0000  001dd298  2**7
                   ALLOC

objdump -x new-xen-syms:
   0 .text         00139312  ffff82d080100000  ffff82d080100000  00001000  2**12
                   CONTENTS, ALLOC, LOAD, CODE
   1 .rodata       0004a830  ffff82d080239320  ffff82d080239320  0013a320  2**5
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   2 .data.read_mostly 0000b1d8  ffff82d080283b80  ffff82d080283b80  00184b80  2**7
                   CONTENTS, ALLOC, LOAD, DATA
   3 .data         0000e828  ffff82d08028f000  ffff82d08028f000  00190000  2**12
                   CONTENTS, ALLOC, LOAD, DATA
   4 .init.text    0002dfcb  ffff82d08029e000  ffff82d08029e000  0019f000  2**4
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
   5 .init.data    0000f5f8  ffff82d0802cbfe0  ffff82d0802cbfe0  001ccfe0  2**5
                   CONTENTS, ALLOC, LOAD, DATA
   6 .init.setup   000010c0  ffff82d0802db5e0  ffff82d0802db5e0  001dc5e0  2**5
                   CONTENTS, ALLOC, LOAD, DATA
   7 .initcall.init 00000210  ffff82d0802dc6a0  ffff82d0802dc6a0  001dd6a0  2**3
                   CONTENTS, ALLOC, LOAD, DATA
   8 .xsm_initcall.init 00000008  ffff82d0802dc8b0  ffff82d0802dc8b0  001dd8b0  2**3
                   CONTENTS, ALLOC, LOAD, DATA
   9 .bss          0004c980  ffff82d0802e0000  ffff82d0802e0000  001dd8b8  2**7
                   ALLOC

> Overall I wonder what the purpose is of allowing the hardware
> domain to be created later, rather than keeping Dom0 for this
> purpose. After all booting the system is very much hardware
> control. Once a control domain was spawned, Dom0 could then
> be revoked the right to create further domains.

One reason is to maintain a small trusted computing base for critical
components of the platform, such as the control domains and the vTPM
infrastructure.  We would prefer to avoid as much parsing of unmeasured
(and therefore untrusted) data as possible in the early boot process
because any vulnerability in these parsers becomes an attack vector for
components depending on them.

When using TXT/TBOOT as the basis for trust, the Xen hypervisor and
domain 0 are measured and may be considered more trusted than the state
of hardware and any structures supplied by the BIOS that could be
modified by a bootloader or other unmeasured "gap code".  When a domain
0 running Linux boots, it parses the ACPI tables and enumerates hardware
based on them prior to executing any user-space code.  Exploits using
malicious ACPI tables can be used for privilege escalation (see
CVE-2010-4347 for an example), so if the domain building process were
run from the initrd, all domains on the system would be vulnerable to an
attack using this.  ACPI is not the only attack vector here; malicious
hardware is very difficult to defend the hardware domain from once the
IOMMU begins allowing DMA access to the domain's memory.

A possible alternative that does not involve hypervisor modification
would be to have the domain builder use kexec() after building domains
and de-privileging itself.  This requires a more complex domain builder,
and makes a number of security assertions more difficult to evaluate
from the security policy.  With a separate domain builder and hardware
domain, certain ranges of memory (such as the TPM MMIO pages) can be
restricted from ever being accessible to the hardware domain in the
security policy.  Using separate domains also avoids the possibility
that the domain builder neglects to clean up a mapping to a just-created
domain prior to kexec which could be used to later change that domain's
behavior; while a given version of the domain builder may not have such
bugs, new features (such as the ability to change the Xenstore start
info page after event channels are added) can introduce bugs that
silently violate the stated security policy.

>> --- a/xen/drivers/passthrough/vtd/x86/vtd.c
>> +++ b/xen/drivers/passthrough/vtd/x86/vtd.c
>> @@ -36,7 +36,7 @@
>>    * iommu_inclusive_mapping: when set, all memory below 4GB is included in dom0
>>    * 1:1 iommu mappings except xen and unusable regions.
>>    */
>> -static bool_t __initdata iommu_inclusive_mapping = 1;
>> +static bool_t iommu_inclusive_mapping = 1;
>
> __read_mostly?

Yes. I used __read_mostly in the size tests above, and will in v2.

> Jan

-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
  2014-03-05 12:02   ` Ian Jackson
@ 2014-03-05 22:36     ` Daniel De Graaf
  2014-03-10 16:45       ` Ian Jackson
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-05 22:36 UTC (permalink / raw
  To: Ian Jackson, Jan Beulich; +Cc: Stefano Stabellini, Ian Campbell, xen-devel

On 03/05/2014 07:02 AM, Ian Jackson wrote:
> Daniel De Graaf writes ("[PATCH 5/6] tools/libxl: Allow dom0 to be destroyed"):
>> When dom0 is not the hardware domain, it can be destroyed in the same
>> way as any other service domain. Since the hypervisor already prevents a
>> domain from destroying itself, this patch only changes behavior when
>> used in a disaggregated environment.
>
> Thanks for this patch.  You must be doing something very interesting
> to want this :-).
>
> I have a concern though: I wonder what the error behaviour is in the
> case where you try to destroy dom0 from itself.  Does it get halfway
> through, breaking the system, and then fail ?
>
> Ian.

It gets partway though, with the most important (bad) action being the
removal of /local/domain/0 from xenstore succeeds. Testing on my
development system to destroy a hardware domain (domid 3) from itself:

# xl destroy 3
libxl: error: libxl.c:1411:libxl__destroy_domid: xc_domain_pause failed for 3
libxl: error: libxl_device.c:894:device_backend_callback: unable to remove device with path /local/domain/8/backend/vtpm/3/0
libxl: error: libxl.c:1451:devices_destroy_cb: libxl__devices_destroy failed for 3
libxl: error: libxl.c:1456:devices_destroy_cb: xs_rm failed for /vm/3: No such file or directory
libxl: error: libxl.c:1471:devices_destroy_cb: xc_domain_destroy failed for 3
libxl: error: libxl.c:1342:domain_destroy_callback: unable to destroy guest with domid 3
libxl: error: libxl.c:1269:domain_destroy_cb: destruction of domain 3 failed
destroy failed (rc=-3)
# xl li
Name                                        ID   Mem VCPUs      State   Time(s)
xenstore                                     1    20     1     -b----       2.9
vtpm-mgr                                     2     5     1     -b----       3.2
(null)                                       3   489     1     r-----    9308.7
db-server                                    4    16     1     -b----       1.7
vtpm-os                                      5     5     1     -b----       2.4
ctl                                          6     4     1     -b----       0.3
os-ctl                                       7  5000     1     -b----   72923.1
vtpm-hw                                      8     5     1     -b----       2.3
(The domain with ID 0 was already destroyed during bootup).

The system still works after this, although I would expect possible
issues to come up if I were to do if{up,down} on a guest or take other
actions that cause Linux to read xenstore keys.  If a domain were being
started up during the destroy, it would become broken.

In reply to both this and Jan's earlier email:
> So this gets deleted without replacement? How is the hardware
> domain being protected from (accidental or malicious) deletion
> then? Even if this is being dealt with in the hypervisor, I'd be
> afraid of the failure resulting in a cryptic error message instead
> of the very clear one above.

The existing check seems to be a useful guard against accidentally
breaking parts of a running system.  Would requiring a -f flag on the
destroy operation to work on domain 0 be preferable?


-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] xen: Allow hardare domain != dom0
  2014-03-05  3:50   ` Julien Grall
@ 2014-03-05 23:04     ` Daniel De Graaf
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-05 23:04 UTC (permalink / raw
  To: Julien Grall, xen-devel
  Cc: Keir Fraser, Ian Campbell, Jan Beulich, Stefano Stabellini

On 03/04/2014 10:50 PM, Julien Grall wrote:
> Hello Daniel,
>
> On 05/03/14 06:51, Daniel De Graaf wrote:
>> This adds a hypervisor command line option "hardware_dom=" which takes a
>> domain ID.  When the domain with this ID is created, it will be used as
>> the hardware domain.
>>
>> This is intended to be used when dom0 is a dedicated stub domain for
>> domain building, allowing the hardware domain to be de-privileged and
>> act only as a driver domain.
>>
>> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
>> Cc: Keir Fraser <keir@xen.org>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> ---
>>   xen/arch/x86/domain_build.c |  4 +++-
>>   xen/arch/x86/setup.c        |  3 +++
>>   xen/common/domctl.c         |  8 ++++++++
>>   xen/common/rangeset.c       | 26 ++++++++++++++++++++++++++
>>   xen/include/xen/rangeset.h  |  3 +++
>>   xen/include/xen/sched.h     |  3 ++-
>>   6 files changed, 45 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
>> index 84ce392..e9de496 100644
>> --- a/xen/arch/x86/domain_build.c
>> +++ b/xen/arch/x86/domain_build.c
>> @@ -1149,7 +1149,9 @@ int __init construct_dom0(
>>           printk(" Xen warning: dom0 kernel broken ELF: %s\n",
>>                  elf_check_broken(&elf));
>>
>> -    iommu_dom0_init(dom0);
>> +    if( is_hardware_domain(dom0) )
>> +        iommu_dom0_init(dom0);
>> +
>>       return 0;
>>
>>   out:
>> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
>> index 3a4f69c..3480854 100644
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -88,6 +88,9 @@ unsigned long __initdata highmem_start;
>>   size_param("highmem-start", highmem_start);
>>   #endif
>>
>> +unsigned int __read_mostly hardware_dom;
>> +integer_param("hardware_dom", hardware_dom);
>> +
>>   cpumask_t __read_mostly cpu_present_map;
>>
>>   unsigned long __read_mostly xen_phys_start;
>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
>> index eebeee7..11e6b94 100644
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -472,6 +472,14 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>               break;
>>           }
>>
>> +        if (d->domain_id == hardware_dom) {
>> +            printk("Initialising hardware domain %d\n", hardware_dom);
>> +            rangeset_swap(d->irq_caps, dom0->irq_caps);
>> +
>> +            dom0 = d;
>> +            iommu_dom0_init(dom0);
>> +        }
>> +
>
> This patch will break compilation on ARM. You are using hardware_dom which is defined in xen/arch/x86/setup.c.
>
> I'm not sure if the best solution is to move the defined for hardware_dom in common code ... because settings this variable to a value other than 0 will break ARM boot with your changes in patch #1.
>
> Supporting hardware domain on ARM will need rework on dom0 builder.

For the moment, I think adding hardware_dom as a #define to 0 for ARM
may be the best solution.  Some of the arguments for needing a
separate hardware domain and domain builder are less relevant on an
ARM system (at least based on my limited understanding of how ARM
hardware usually speaks with other devices), and the domain builder
that I posted currently will not compile for ARM.

-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] xen: Allow hardare domain != dom0
  2014-03-05 10:04   ` Jan Beulich
@ 2014-03-05 23:04     ` Daniel De Graaf
  2014-03-06  9:54       ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-05 23:04 UTC (permalink / raw
  To: Jan Beulich; +Cc: Keir Fraser, xen-devel

On 03/05/2014 05:04 AM, Jan Beulich wrote:
>>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
>> This adds a hypervisor command line option "hardware_dom=" which takes a
>> domain ID.  When the domain with this ID is created, it will be used as
>> the hardware domain.
>>
>> This is intended to be used when dom0 is a dedicated stub domain for
>> domain building, allowing the hardware domain to be de-privileged and
>> act only as a driver domain.
>
> Apart from the abstract question regarding the purpose of this, a
> couple of comments on the patch a such:
>
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -88,6 +88,9 @@ unsigned long __initdata highmem_start;
>>   size_param("highmem-start", highmem_start);
>>   #endif
>>
>> +unsigned int __read_mostly hardware_dom;
>> +integer_param("hardware_dom", hardware_dom);
>
> This ought to be domid_t, and live in common code.
>
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -472,6 +472,14 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>               break;
>>           }
>>
>> +        if (d->domain_id == hardware_dom) {
>
> Coding style.
>
>> +            printk("Initialising hardware domain %d\n", hardware_dom);
>> +            rangeset_swap(d->irq_caps, dom0->irq_caps);
>
> Why interrupts, but not I/O ports and MMIO?

I/O ports and MMIO are currently set up by the domain builder to
mirror those of dom0 with some exceptions.  I could swap all of
the ranges and then de-privilege the newly created domain if that
seems more symmetric (it may also simplify the creation of the
domain builder's specifications); I'll try testing that to ensure
it works as expected.

>> +
>> +            dom0 = d;
>
> This, I think, is the point where the variable name becomes
> intolerable: ASSERT(!dom0 || !dom0->domain_id) should be
> valid at all times as long as the variable name is dom0.

I will prepare a pure rename patch prior to this one which
renames the dom0 variable (and perhaps some functions with
dom0 in their name) to another name: hwdom or hardware_domain
would be my preference, with the hardware_dom global renamed
to hardware_domid to avoid confusion.

>> --- a/xen/common/rangeset.c
>> +++ b/xen/common/rangeset.c
>> @@ -438,3 +438,29 @@ void rangeset_domain_printk(
>>
>>       spin_unlock(&d->rangesets_lock);
>>   }
>> +
>> +void rangeset_swap(struct rangeset *a, struct rangeset *b)
>> +{
>> +    struct list_head tmp;
>> +    spin_lock(&a->lock);
>> +    spin_lock(&b->lock);
>> +    memcpy(&tmp, &a->range_list, sizeof(tmp));
>> +    memcpy(&a->range_list, &b->range_list, sizeof(tmp));
>> +    memcpy(&b->range_list, &tmp, sizeof(tmp));
>> +    if (a->range_list.next == &b->range_list) {
>> +        a->range_list.next = &a->range_list;
>> +        a->range_list.prev = &a->range_list;
>> +    } else {
>> +        a->range_list.next->prev = &a->range_list;
>> +        a->range_list.prev->next = &a->range_list;
>> +    }
>> +    if (b->range_list.next == &a->range_list) {
>> +        b->range_list.next = &b->range_list;
>> +        b->range_list.prev = &b->range_list;
>> +    } else {
>> +        b->range_list.next->prev = &b->range_list;
>> +        b->range_list.prev->next = &b->range_list;
>> +    }
>
> Coding style again.
>
> Jan

Will clean up this patch for style when I resubmit.

-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] xen/iommu: Move dom0 setup code out of __init
  2014-03-05 22:25     ` Daniel De Graaf
@ 2014-03-06  9:53       ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2014-03-06  9:53 UTC (permalink / raw
  To: Daniel De Graaf
  Cc: Keir Fraser, Xiantao Zhang, Suravee Suthikulpanit, xen-devel

>>> On 05.03.14 at 23:25, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
> On 03/05/2014 04:56 AM, Jan Beulich wrote:
>>>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
>>> This is required to allow the hardware domain to be built by dom0 after
>>> the __init sections have been discarded.
>>
>> Can you quantify the amount of code/data getting moved?
> 
> It looks like about one page of code and very little data (fits in slack 
> space):

Anyway (and taking into consideration the rest of what you said)
I would lean towards asking for a special construct to accommodate
both your needs and the fact that likely most downstreams would
continue to build without XSM for a while - something along the
lines of Linux'es __init_or_module (i.e. in your case evaluating to
__init/__initdata when !XSM_ENABLE and to nothing in the opposite
case). Of course that's under the assumption that the rooting call
sites can be done reasonably cleanly wrt to making sure .init.*
references don't occur from normal code/data without being
properly guarded (mistakes here are rather difficult to track with
us not having a utility similar to Linux'es modpost checking section
references).

And all that of course only unless others feel the movement out
of .init.* isn't that big a deal to invent special constructs for.

Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] xen: Allow hardare domain != dom0
  2014-03-05 23:04     ` Daniel De Graaf
@ 2014-03-06  9:54       ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2014-03-06  9:54 UTC (permalink / raw
  To: Daniel De Graaf; +Cc: Keir Fraser, xen-devel

>>> On 06.03.14 at 00:04, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
> On 03/05/2014 05:04 AM, Jan Beulich wrote:
>>>>> On 04.03.14 at 23:51, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
>>> +
>>> +            dom0 = d;
>>
>> This, I think, is the point where the variable name becomes
>> intolerable: ASSERT(!dom0 || !dom0->domain_id) should be
>> valid at all times as long as the variable name is dom0.
> 
> I will prepare a pure rename patch prior to this one which
> renames the dom0 variable (and perhaps some functions with
> dom0 in their name) to another name: hwdom or hardware_domain
> would be my preference, with the hardware_dom global renamed
> to hardware_domid to avoid confusion.

Sounds reasonable.

Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/6] xenstored: add --master-domid to support domain builder
  2014-03-04 22:51 ` [PATCH 6/6] xenstored: add --master-domid to support domain builder Daniel De Graaf
@ 2014-03-10 12:14   ` Ian Jackson
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Jackson @ 2014-03-10 12:14 UTC (permalink / raw
  To: Daniel De Graaf; +Cc: Stefano Stabellini, Ian Campbell, xen-devel

Daniel De Graaf writes ("[PATCH 6/6] xenstored: add --master-domid to support domain builder"):
> When a domain builder stub domain is used, the initial xenstore
> connection to domain 0 may use a different domain ID as the endpoint;
> allow this domain ID to be specified on the command line.
> 
> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
  2014-03-05 22:36     ` Daniel De Graaf
@ 2014-03-10 16:45       ` Ian Jackson
  2014-03-12 14:27         ` Daniel De Graaf
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Jackson @ 2014-03-10 16:45 UTC (permalink / raw
  To: Daniel De Graaf; +Cc: Stefano Stabellini, Ian Campbell, Jan Beulich, xen-devel

Daniel De Graaf writes ("Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed"):
> In reply to both this and Jan's earlier email:
> > So this gets deleted without replacement? How is the hardware
> > domain being protected from (accidental or malicious) deletion
> > then? Even if this is being dealt with in the hypervisor, I'd be
> > afraid of the failure resulting in a cryptic error message instead
> > of the very clear one above.
> 
> The existing check seems to be a useful guard against accidentally
> breaking parts of a running system.  Would requiring a -f flag on the
> destroy operation to work on domain 0 be preferable?

That would be tolerable if we can't find a better way to tell whether
it's safe or not.

I guess you don't want dom0 to be able to destroy itself - or do you ?
Perhaps the right answer is to require -f for a domain to destroy
itself.

ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] xen: use domid check in is_hardware_domain
  2014-03-05 15:25     ` Daniel De Graaf
  2014-03-05 15:45       ` Jan Beulich
@ 2014-03-11 13:10       ` Ian Campbell
  1 sibling, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-11 13:10 UTC (permalink / raw
  To: Daniel De Graaf
  Cc: Keir Fraser, Jan Beulich, Tim Deegan, xen-devel,
	Stefano Stabellini, Suravee Suthikulpanit, Xiantao Zhang

On Wed, 2014-03-05 at 10:25 -0500, Daniel De Graaf wrote:
> 
> One thing I had not considered that may be a problem: if the profiling
> domain is shut down without calling XENOPROF_shutdown, it will not be
> possible to use profiling this boot unless the struct domain* is
> reused. 

Shouldn't this therefore be cleaned up in domain_relinquish_resources or
some other suitable domain destroy path?

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
  2014-03-10 16:45       ` Ian Jackson
@ 2014-03-12 14:27         ` Daniel De Graaf
  2014-03-13 17:17           ` Ian Jackson
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-12 14:27 UTC (permalink / raw
  To: Ian Jackson; +Cc: Stefano Stabellini, Ian Campbell, Jan Beulich, xen-devel

On 03/10/2014 12:45 PM, Ian Jackson wrote:
> Daniel De Graaf writes ("Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed"):
>> In reply to both this and Jan's earlier email:
>>> So this gets deleted without replacement? How is the hardware
>>> domain being protected from (accidental or malicious) deletion
>>> then? Even if this is being dealt with in the hypervisor, I'd be
>>> afraid of the failure resulting in a cryptic error message instead
>>> of the very clear one above.
>>
>> The existing check seems to be a useful guard against accidentally
>> breaking parts of a running system.  Would requiring a -f flag on the
>> destroy operation to work on domain 0 be preferable?
>
> That would be tolerable if we can't find a better way to tell whether
> it's safe or not.
>
> I guess you don't want dom0 to be able to destroy itself - or do you ?
> Perhaps the right answer is to require -f for a domain to destroy
> itself.
>
> ian.

A domain can't destroy itself anyway (the hypervisor prevents this), so
if there was a simple way for xl to check if the domain ID was its own
ID, this would work.  I am not aware of a good, simple way to make this
check, so leaving it at preventing dom0's destruction will at least not
regress in usability.

-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
  2014-03-12 14:27         ` Daniel De Graaf
@ 2014-03-13 17:17           ` Ian Jackson
  2014-03-13 17:41             ` Daniel De Graaf
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Jackson @ 2014-03-13 17:17 UTC (permalink / raw
  To: Daniel De Graaf; +Cc: Stefano Stabellini, Ian Campbell, Jan Beulich, xen-devel

Daniel De Graaf writes ("Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed"):
> A domain can't destroy itself anyway (the hypervisor prevents this), so
> if there was a simple way for xl to check if the domain ID was its own
> ID, this would work.  I am not aware of a good, simple way to make this
> check, so leaving it at preventing dom0's destruction will at least not
> regress in usability.

What does XEN_DOMCTL_getdomaininfo do when passed DOMID_SELF ?

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
  2014-03-13 17:17           ` Ian Jackson
@ 2014-03-13 17:41             ` Daniel De Graaf
  2014-03-14 14:32               ` Ian Jackson
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel De Graaf @ 2014-03-13 17:41 UTC (permalink / raw
  To: Ian Jackson; +Cc: Stefano Stabellini, Ian Campbell, Jan Beulich, xen-devel

On 03/13/2014 01:17 PM, Ian Jackson wrote:
> Daniel De Graaf writes ("Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed"):
>> A domain can't destroy itself anyway (the hypervisor prevents this), so
>> if there was a simple way for xl to check if the domain ID was its own
>> ID, this would work.  I am not aware of a good, simple way to make this
>> check, so leaving it at preventing dom0's destruction will at least not
>> regress in usability.
>
> What does XEN_DOMCTL_getdomaininfo do when passed DOMID_SELF ?
>
> Ian.
>
>

It returns -ESRCH. The domain passed in is a lower bound for the domain
ID, and is not resolved using the usual domid resolution function.

One possible method for a domain to determine its own domain ID is to
create an unbound inter-domain event channel to DOMID_SELF and then use
EVTCHNOP_status to read the now-resolved endpoint.  This seems rather
cumbersome to implement: the toolstack only exposes this as far as
xc_evtchn_status, so libxl would need modification in addition to xl.

Reading the "domid" key from xenstore is another alternative, but it
also requires adding a function to libxl and needs some fall-back if
the key does not exist.

The constant LIBXL_TOOLSTACK_DOMID is defined inlibxl_internal.h, but
I'm not sure it is best to expose this: I would prefer to eventually
remove the uses of this #define to support toolstack domains with a
dynamic domain ID.  For now, that #define is not used in any location
that impedes functionality if it is incorrect (it only makes errors
a bit less user-friendly).

-- 
Daniel De Graaf
National Security Agency

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed
  2014-03-13 17:41             ` Daniel De Graaf
@ 2014-03-14 14:32               ` Ian Jackson
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Jackson @ 2014-03-14 14:32 UTC (permalink / raw
  To: Daniel De Graaf; +Cc: Stefano Stabellini, Ian Campbell, Jan Beulich, xen-devel

Daniel De Graaf writes ("Re: [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed"):
> On 03/13/2014 01:17 PM, Ian Jackson wrote:
> > What does XEN_DOMCTL_getdomaininfo do when passed DOMID_SELF ?
...
> It returns -ESRCH. The domain passed in is a lower bound for the domain
> ID, and is not resolved using the usual domid resolution function.

How annoying.

> One possible method for a domain to determine its own domain ID is to
> create an unbound inter-domain event channel to DOMID_SELF and then use
> EVTCHNOP_status to read the now-resolved endpoint.  This seems rather
> cumbersome to implement: the toolstack only exposes this as far as
> xc_evtchn_status, so libxl would need modification in addition to xl.

Glrghk.

OK, I'm persuaded that your original proposal, to require -f to
destroy dom0, is the best approach for now at least.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2014-03-14 14:32 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-04 22:51 [PATCH 0/6] xen: Hardware domain support Daniel De Graaf
2014-03-04 22:51 ` [PATCH 1/6] xen: use domid check in is_hardware_domain Daniel De Graaf
2014-03-05  3:44   ` Julien Grall
2014-03-05  9:23   ` Jan Beulich
2014-03-05 15:25     ` Daniel De Graaf
2014-03-05 15:45       ` Jan Beulich
2014-03-05 21:23         ` Daniel De Graaf
2014-03-11 13:10       ` Ian Campbell
2014-03-04 22:51 ` [PATCH 2/6] xen/iommu: Move dom0 setup code out of __init Daniel De Graaf
2014-03-05  9:56   ` Jan Beulich
2014-03-05 22:25     ` Daniel De Graaf
2014-03-06  9:53       ` Jan Beulich
2014-03-04 22:51 ` [PATCH 3/6] xen: prevent 0 from being used as a dynamic domid Daniel De Graaf
2014-03-04 22:51 ` [PATCH 4/6] xen: Allow hardare domain != dom0 Daniel De Graaf
2014-03-05  3:50   ` Julien Grall
2014-03-05 23:04     ` Daniel De Graaf
2014-03-05 10:04   ` Jan Beulich
2014-03-05 23:04     ` Daniel De Graaf
2014-03-06  9:54       ` Jan Beulich
2014-03-04 22:51 ` [PATCH 5/6] tools/libxl: Allow dom0 to be destroyed Daniel De Graaf
2014-03-05 10:07   ` Jan Beulich
2014-03-05 12:02   ` Ian Jackson
2014-03-05 22:36     ` Daniel De Graaf
2014-03-10 16:45       ` Ian Jackson
2014-03-12 14:27         ` Daniel De Graaf
2014-03-13 17:17           ` Ian Jackson
2014-03-13 17:41             ` Daniel De Graaf
2014-03-14 14:32               ` Ian Jackson
2014-03-04 22:51 ` [PATCH 6/6] xenstored: add --master-domid to support domain builder Daniel De Graaf
2014-03-10 12:14   ` Ian Jackson
2014-03-04 23:32 ` Domain Builder Daniel De Graaf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.