From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53985) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z5csS-0006FR-Qh for qemu-devel@nongnu.org; Thu, 18 Jun 2015 12:41:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z5csR-0000Fj-Ol for qemu-devel@nongnu.org; Thu, 18 Jun 2015 12:41:20 -0400 Received: from mx1.redhat.com ([209.132.183.28]:34186) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z5csR-0000F7-HO for qemu-devel@nongnu.org; Thu, 18 Jun 2015 12:41:19 -0400 Message-ID: <1434645677.3700.60.camel@redhat.com> From: Alex Williamson Date: Thu, 18 Jun 2015 10:41:17 -0600 In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [RFC v10 10/19] vfio: improve vfio_get_group to support adding as is NULL. List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Chen Fan Cc: izumi.taku@jp.fujitsu.com, qemu-devel@nongnu.org On Tue, 2015-06-16 at 16:10 +0800, Chen Fan wrote: > Signed-off-by: Chen Fan > --- > hw/vfio/common.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c > index df3171d..15f19a2 100644 > --- a/hw/vfio/common.c > +++ b/hw/vfio/common.c > @@ -808,11 +808,18 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as) > VFIOGroup *group; > char path[32]; > struct vfio_group_status status = { .argsz = sizeof(status) }; > + int ret; > > QLIST_FOREACH(group, &vfio_group_list, next) { > if (group->groupid == groupid) { > /* Found it. Now is it already in the right context? */ > - if (group->container->space->as == as) { > + if (as && !group->container->space->as) { > + ret = vfio_register_container_listener(group->container, as); > + if (ret) { > + return NULL; > + } > + } > + if (!as || group->container->space->as == as) { > group->ref++; > return group; > } else { No, this is broken. Look further down where we call vfio_connect_container(group, as). With as=NULL, that gives us a new VFIOAddressSpace with space.as=NULL. Then we walk the list of containers attached to that as, which is of course empty, so we create a new container for the device. If we have more than one group attached this way, they share this container. What happens latter if one or more of those devices are attached to the VM? In the best case we setup a memory listener for this container, but that breaks vfio locked memory accounting because previously they would have shared a container and now we have two separate containers for them. But to make things worse, any other groups that were attached to this space.as=NULL container are forced to share this new as, so it's impossible to add them to a separate as. So, this won't work. I also don't like the side-effect of taking implicit ownership of groups. IOMMU groups are challenging enough and changing the rules to care about not only groups but shared bus dependencies seems like a support nightmare. I keep trying to think how we could reduce the scope to make AER support more manageable. Should we simply require that all device affected by a bus reset are assigned to the VM? To do that we'd need to revert to your machine-init-done check for compliance. But then what becomes of hotplug? As soon as we hot-remove one of those devices, AER is broken and we can't hot-add device individually and maintain support. If we limit support to single function endpoints, our problems go away, but the number of devices we could enable AER for is nearly zero. If we include multi-function, then hotplug is a significant issue since each function is individually hot-pluggable. But we can solve that by requiring AER devices to have a 1:1 function mapping to the VM and requiring all the functions of the slot to be assigned to the VM. Hot-remove then works automatically since that's done on a slot basis. Hot-add is not possible since QEMU doesn't support hot-add of multi-function devices, but that's a problem that needs to be solved anyway. Thanks, Alex