From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Jan Beulich" Subject: Re: [RFC] Xen PV IOMMU interface draft B Date: Tue, 16 Jun 2015 14:19:42 +0100 Message-ID: <55803E8E020000780008583F@mail.emea.novell.com> References: <557B0C35.4080907@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1Z4qmK-0002th-MN for xen-devel@lists.xenproject.org; Tue, 16 Jun 2015 13:19:48 +0000 In-Reply-To: <557B0C35.4080907@citrix.com> Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Malcolm Crossley Cc: Kevin Tian , Yu C Zhang , AndrewCooper , Paul Durrant , David Vrabel , xen-devel , Zhiyuan Lv List-Id: xen-devel@lists.xenproject.org >>> On 12.06.15 at 18:43, wrote: > IOMMUOP_query_caps > ------------------ > > This subop queries the runtime capabilities of the PV-IOMMU interface for > the > specific called domain. This subop uses `struct pv_iommu_op` directly. "calling domain" perhaps? > ---------------------------------------------------------------------------- > -- > Field Purpose > ----- > --------------------------------------------------------------- > `flags` [out] This field details the IOMMUOP capabilities. > > `status` [out] Status of this op, op specific values listed below > ---------------------------------------------------------------------------- > -- > > Defined bits for flags field: > > ---------------------------------------------------------------------------- > -- > Name Bit Definition > ---- ------ ---------------------------------- > IOMMU_QUERY_map_cap 0 IOMMUOP_map_page or IOMMUOP_map_foreign > can be used for this domain "this" (see also above) perhaps being the calling domain? In which case I wonder how the "for" and IOMMUOP_map_foreign are meant to fit together: I assume the flag to indicate that mapping into the (calling) domain is possible. Which then makes me wonder - what use if the new hypercall when this flag isn't set? > IOMMU_QUERY_map_all_gfns 1 IOMMUOP_map_page subop can map any MFN > not used by Xen "gfns" or "MFN"? > Defined values for map_page subop status field: > > Value Reason > ------ > ---------------------------------------------------------------------- > 0 subop successfully returned > -EIO IOMMU unit returned error when attempting to map BFN to GFN. > -EPERM GFN could not be mapped because the GFN belongs to Xen. > -EPERM Domain is not a domain and GFN does not belong to domain "is not a hardware domain"? Also, I think we're pretty determined for there to ever only be one, so perhaps it should be "the hardware domain" here and elsewhere. > IOMMUOP_unmap_page > ------------------ > This subop uses `struct unmap_page` part of the `struct pv_iommu_op`. > > The subop usage of the "struct pv_iommu_op" and ``struct unmap_page` fields > are detailed below: > > -------------------------------------------------------------------- > Field Purpose > ----- ----------------------------------------------------- > `bfn` [in] Bus address frame number to be unmapped in DOMID_SELF Has it been determined that unmapping based on GFN is never going to be needed, and that unmapping by BFN is the more practical solution? The map counterpart doesn't seem to exclude establishing multiple mappings for the same BFN, and hence the inverse here would become kind of fuzzy in that case. > IOMMUOP_map_foreign_page > ---------------- > This subop uses `struct map_foreign_page` part of the `struct pv_iommu_op`. > > It is not valid to use domid representing the calling domain. And what's the point of that? Was it considered to have only one map/unmap pair, capable of mapping both local and foreign pages? If so, what speaks against that? > The M2B mechanism is a MFN to (BFN,domid,ioserver) tuple. I didn't see anything explaining the significance of this (namely the ioserver part, I think I can see the need for the domid), can you explain the backgound here please? > Every new M2B entry will take a reference to the MFN backing the GFN. What happens when that GFN->MFN mapping changes? > All the following conditions are required to be true for PV IOMMU > map_foreign > subop to succeed: > > 1. IOMMU detected and supported by Xen > 2. The domain has IOMMU controlled hardware allocated to it > 3. The domain is a hardware_domain and the following Xen IOMMU options are > NOT enabled: dom0-passthrough Is there a way for the hardware domain to know it is running in pass-through mode? Also, "the domain" is ambiguous here; I'm sure you mean the invoking domain, not the one owning the page. > This subop usage of the "struct pv_iommu_op" and ``struct map_foreign_page` > fields are detailed below: > > -------------------------------------------------------------------- > Field Purpose > ----- ----------------------------------------------------- > `domid` [in] The domain ID for which the gfn field applies > > `ioserver` [in] IOREQ server id associated with mapping > > `bfn` [in] Bus address frame number for gfn address In the description above you speak of returning data in this field. Is [in] really correct? > Defined bits for flags field: > > Name Bit Definition > ---- ----- ---------------------------------- > IOMMUOP_readable 0 BFN IOMMU mapping is readable > IOMMUOP_writeable 1 BFN IOMMU mapping is writeable > IOMMUOP_swap_mfn 2 BFN IOMMU mapping can be safely > swapped to scratch page Scratch page? This needs some explanation. > Reserved for future use 3-9 Reserved flag bits should be 0 > IOMMU_page_order 10-15 Returns maximum possible page order for > all other IOMMUOP subops "Returns"? "other"? > IOMMU_lookup_foreign_page > ---------------- > This subop uses `struct lookup_foreign_page` part of the `struct > pv_iommu_op`. > > If the BFN is specified as an input and parameter and there is no IOMMU > support > for the calling domain then an error will be returned. "input and parameter"? The field is marked [out] below, and I also can't see a flag allowing this to be optional (the more that flags is also [out]). > It is the calling domain responsibility to ensure there are no conflicts No races you mean? > Each successful subop will add to the M2B if there was not an existing > identical > M2B entry. > > Every new M2B entry will take a reference to the MFN backing the GFN. This is a lookup - why would it add something somewhere? Perhaps this is just a copy-and-paste mistake? Or is the use of "lookup" here misleading (as the last section of the document seems to suggest)? > Defined bits for flags field: > > Name Bit Definition > ---- ----- ---------------------------------- > IOMMUOP_readable 0 Returned BFN IOMMU mapping is > readable > IOMMUOP_writeable 1 Returned BFN IOMMU mapping is > writeable > Reserved for future use 2-9 Reserved flag bits should be 0 Reserved flags will be returned as 0. > Defined values for lookup_foreign_page subop status field: > > Error code Reason > ---------- ------------------------------------------------------------ > 0 subop successfully returned > -EPERM Calling domain does not have sufficient privilege over domid > -ENOENT There is no available BFN for provided GFN + domid combination Throughout here there seems to be a mixture of references to (GFN, domid) pairs anmd (GFN,domid,ioserver) triplets. Along with there being some explanation missing, these need to be made consistent to avoid confusion. > IOMMUOP_unmap_foreign_page > ---------------- > This subop uses `struct unmap_foreign_page` part of the `struct > pv_iommu_op`. > > If there is no IOMMU support then the MFN is returned in the BFN field (that > is > the only valid bus address for the GFN + domid combination). Copy-and-paste mistake again (doesn't seem to apply to unmap)? > If there is IOMMU support then the specified BFN is returned for the GFN + > domid > combination > > Each successful subop will add to the M2B if there was not an existing > identical > M2B entry. The > > Every new M2B entry will take a reference to the MFN backing the GFN. And again? > This subop usage of the "struct pv_iommu_op" and ``struct > unmap_foreign_page` fields > are detailed below: > > ----------------------------------------------------------------------- > Field Purpose > ----- -------------------------------------------------------- > `ioserver` [in] IOREQ server id associated with mapping > > `bfn` [in] Bus address frame number for gfn address > > `flags` [out] Flags for signalling page order of unmap operation [out]? > Error code Reason > ---------- ------------------------------------------------------------ > 0 subop successfully returned > -ENOENT There is no mapped BFN + ioserver id combination to unmap Yet another pair, the unique mapping properties of which don't follow from anything said earlier. > PV IOMMU interactions with self ballooning > ========================================== > > The guest should clear any IOMMU mappings it has of it's own pages before > releasing a page back to Xen. It will need to add IOMMU mappings after > repopulating a page with the populate_physmap hypercall. > > This requires that IOMMU mappings get a writeable page type reference count > and > that guests clear any IOMMU mappings before pinning page table pages. I suppose this is only for aware PV guests. If so, perhaps this should be made explicit. > PV IOMMU interactions with grant map/unmap operations > ===================================================== > > Grant map operations return a Physical device accessible address (BFN) if > the > GNTMAP_device_map flag is set. This operation currently returns the MFN for > PV > guests which may conflict with the BFN address space the guest uses if PV > IOMMU > map support is available to the guest. > > This design proposes to allow the calling domain to control the BFN address > that > a grant map operation uses. > > This can be achieved by specifying that the dev_bus_addr in the > gnttab_map_grant_ref structure is used an input parameter instead of the > output parameter it is currently. > > Only PAGE_SIZE aligned addresses are allowed for dev_bus_addr input > parameter. > > The revised structure is shown below for convenience. > > struct gnttab_map_grant_ref { > /* IN parameters. */ > uint64_t host_addr; > uint32_t flags; /* GNTMAP_* */ > grant_ref_t ref; > domid_t dom; > /* OUT parameters. */ > int16_t status; /* => enum grant_status */ > grant_handle_t handle; > /* IN/OUT parameters */ > uint64_t dev_bus_addr; > }; > > > The grant map operation would then behave similarly to the IOMMUOP_map_page > subop for the creation of the IOMMU mapping. > > The grant unmap operation would then behave similarly to the > IOMMUOP_unmap_page > subop for the removal of the IOMMU mapping. We're talking about mappings of foreign pages here - aren't these the wrong IOMMUOPs then? And if so, where would the ioserver id come from? Jan