From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:44385)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Z5X4b-000059-SX for qemu-devel@nongnu.org;
	Thu, 18 Jun 2015 06:29:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Z5X4Y-0003DV-Kt for qemu-devel@nongnu.org;
	Thu, 18 Jun 2015 06:29:29 -0400
Received: from [59.151.112.132] (port=44972 helo=heian.cn.fujitsu.com)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Z5X4T-00032u-2T for qemu-devel@nongnu.org;
	Thu, 18 Jun 2015 06:29:26 -0400
Message-ID: <55829D0C.3050008@cn.fujitsu.com>
Date: Thu, 18 Jun 2015 18:27:24 +0800
From: Chen Fan
MIME-Version: 1.0
References: <2949053b9dfe0e8fd0df9d0c5e32fecddce7f156.1433812962.git.chen.fan.fnst@cn.fujitsu.com>
	<1433885045.4927.148.camel@redhat.com>
	<557FDA05.8070604@cn.fujitsu.com>
	<1434463719.4927.448.camel@redhat.com>
	<55811391.1050908@cn.fujitsu.com>
	<1434554607.5628.26.camel@redhat.com>
In-Reply-To: <1434554607.5628.26.camel@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC v9 14/18] vfio: improve vfio_pci_hot_reset to support more case
To: Alex Williamson
Cc: izumi.taku@jp.fujitsu.com, qemu-devel@nongnu.org

On 06/17/2015 11:23 PM, Alex Williamson wrote:
> On Wed, 2015-06-17 at 14:28 +0800, Chen Fan wrote:
>> On 06/16/2015 10:08 PM, Alex Williamson wrote:
>>> On Tue, 2015-06-16 at 16:10 +0800, Chen Fan wrote:
>>>> On 06/10/2015 05:24 AM, Alex Williamson wrote:
>>>>> On Tue, 2015-06-09 at 11:37 +0800, Chen Fan wrote:
>>>>>> the vfio_pci_hot_reset differentiate the single and multi in-used
>>>>>> devices for reset. but sometimes we own the group without any devices,
>>>>>> that also should support hot reset.
>>>>> Nope, did you try it?
>>>>> It can be done, but the group still needs to be
>>>>> connected to a container for isolation.
>>>> I'm sorry for that. because I have no such host in hand. but I think if
>>>> we can keep connect container for each affected group, we also able
>>>> to use this method to do host bus reset.
>>> All you need is a dual-port card with isolation, which includes all
>>> Intel 1G NICs (igb & e1000e) as of the quirks that are currently in
>>> linux-next to be pushed for v4.2. Intel 10G NICs are already quirked
>>> upstream. There are certainly ways to fake isolation for testing as
>>> well. Thanks,
>> I just have a Intel Corporation 82576 dual-port card, but how can I fake
>> isolation group for this card in linux-next kernel? can you tell me the
>> document link?
> If you're running linux-next and have the card installed under a root
> port that provides isolation then each port should be in a separate
> iommu group. Nearly all Intel PCH root ports should have quirks to
> enable isolation. If you're installing it in a processor root port
> slot, you need to use a Xeon E5 or better CPU or else the lack of
> isolation at the root port will negate the isolation capabilities of the
> endpoint. Thanks,

Hi Alex,

I have tested the case with isolated groups on the latest linux-next.
On the host, the two ports of the dual-port card are in separate iommu
groups:

# readlink /sys/bus/pci/devices/0000\:06\:00.0/iommu_group
../../../../kernel/iommu_groups/30
# readlink /sys/bus/pci/devices/0000\:06\:00.1/iommu_group
../../../../kernel/iommu_groups/31

I used my v10 qemu code, which adds the affected groups to the VM, and
tested with this qemu command:

qemu-system-x86_64 -M q35 -device ioh3420,bus=pcie.0,addr=1c.0,port=1,id=bridge1 -device vfio-pci,host=06:00.0,bus=bridge1,aer=true --enable-kvm

Then I emulated an AER error with aer-inject on the host. AER recovery
succeeded in the guest, and the 06:00.0 NIC could be reused by
network-manager. But when I rebound both devices to the host,
the host showed this error:

Jun 18 19:13:36 TX300I kernel: [] dump_stack+0x45/0x57
Jun 18 19:13:36 TX300I kernel: [] warn_slowpath_common+0x8a/0xc0
Jun 18 19:13:36 TX300I kernel: [] warn_slowpath_fmt+0x55/0x70
Jun 18 19:13:36 TX300I kernel: [] bad_io_access+0x38/0x40
Jun 18 19:13:36 TX300I kernel: [] pci_iounmap+0x27/0x40
Jun 18 19:13:36 TX300I kernel: [] igb_probe+0xafd/0x1280 [igb]
Jun 18 19:13:36 TX300I kernel: [] local_pci_probe+0x45/0xa0
Jun 18 19:13:36 TX300I kernel: [] ? pci_match_device+0xf4/0x120
Jun 18 19:13:36 TX300I kernel: [] pci_device_probe+0xe9/0x130
Jun 18 19:13:36 TX300I kernel: [] driver_probe_device+0x14f/0x420
Jun 18 19:13:36 TX300I kernel: [] bind_store+0xdc/0x120
Jun 18 19:13:36 TX300I kernel: [] drv_attr_store+0x24/0x30
Jun 18 19:13:36 TX300I kernel: [] sysfs_kf_write+0x3a/0x50
Jun 18 19:13:36 TX300I kernel: [] kernfs_fop_write+0x120/0x170
Jun 18 19:13:36 TX300I kernel: [] __vfs_write+0x37/0x100
Jun 18 19:13:36 TX300I kernel: [] ? __sb_start_write+0x58/0x110
Jun 18 19:13:36 TX300I kernel: [] vfs_write+0xa9/0x190
Jun 18 19:13:36 TX300I kernel: [] ? do_audit_syscall_entry+0x66/0x70
Jun 18 19:13:36 TX300I kernel: [] SyS_write+0x55/0xc0
Jun 18 19:13:36 TX300I kernel: [] entry_SYSCALL_64_fastpath+0x12/0x71
Jun 18 19:13:36 TX300I kernel: ---[ end trace b312cb051751fac4 ]---
Jun 18 19:13:36 TX300I kernel: igb: probe of 0000:06:00.1 failed with error -5

The 06:00.0 device can be initialized by the igb driver, but the
affected 06:00.1 device cannot. Shouldn't the state of the 06:00.1
device have been initialized by the reset?

Thanks,
Chen

>
> Alex
>
>>>>>> Signed-off-by: Chen Fan
>>>>>> ---
>>>>>>  hw/vfio/pci.c | 11 +++++++++++
>>>>>>  1 file changed, 11 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>>>> index a4e8658..6507f39 100644
>>>>>> --- a/hw/vfio/pci.c
>>>>>> +++ b/hw/vfio/pci.c
>>>>>> @@ -3398,6 +3398,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
>>>>>>          PCIHostDeviceAddress host;
>>>>>>          VFIOPCIDevice *tmp;
>>>>>>          VFIODevice *vbasedev_iter;
>>>>>> +        bool found;
>>>>>>
>>>>>>          host.domain = devices[i].segment;
>>>>>>          host.bus = devices[i].bus;
>>>>>> @@ -3427,6 +3428,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
>>>>>>              goto out;
>>>>>>          }
>>>>>>
>>>>>> +        found = false;
>>>>>>          /* Prep dependent devices for reset and clear our marker. */
>>>>>>          QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
>>>>>>              if (vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) {
>>>>>> @@ -3438,12 +3440,21 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
>>>>>>                      ret = -EINVAL;
>>>>>>                      goto out_single;
>>>>>>                  }
>>>>>> +                found = true;
>>>>>>                  vfio_pci_pre_reset(tmp);
>>>>>>                  tmp->vbasedev.needs_reset = false;
>>>>>>                  multi = true;
>>>>>>                  break;
>>>>>>              }
>>>>>>          }
>>>>>> +
>>>>>> +        /*
>>>>>> +         * If we own the group but does not own the device, we also
>>>>>> +         * should call hot reset with multi.
>>>>>> +         */
>>>>>> +        if (!single && !found) {
>>>>>> +            multi = true;
>>>>>> +        }
>>>>>>      }
>>>>>>
>>>>>>      if (!single && !multi) {
>>>>>
>>>>>
>>>
>>>
>>> .
>>>
>
>
> .
>