From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934194AbcBDWGb (ORCPT ); Thu, 4 Feb 2016 17:06:31 -0500 Received: from mail-wm0-f49.google.com ([74.125.82.49]:35080 "EHLO mail-wm0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934114AbcBDWG0 (ORCPT ); Thu, 4 Feb 2016 17:06:26 -0500 Date: Thu, 4 Feb 2016 22:06:21 +0000 From: Chris Bainbridge To: Alan Stern Cc: mathias.nyman@linux.intel.com, johan@kernel.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] usb: host: xhci: Replace bus lock with host controller lock Message-ID: <20160204220621.GA5612@localhost> References: <1454615189-6862-1-git-send-email-chris.bainbridge@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Feb 04, 2016 at 04:00:51PM -0500, Alan Stern wrote: > On Thu, 4 Feb 2016, Chris Bainbridge wrote: > > > The XHCI controller presents two USB buses to the system - one for USB 2 > > and one for USB 3. When only one bus is locked there is a race condition > > during hub init that results in errors like: > > > > [ 13.183701] usb 3-3: device descriptor read/all, error -110 > > What exactly is the race condition? Why does locking both buses fix > it? [ 2.692571] xhci_hcd 0000:00:14.0: xHCI Host Controller [ 2.693279] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 3 [ 2.694867] xhci_hcd 0000:00:14.0: hcc params 0x20007181 hci version 0x100 quirks 0x0000b930 [ 2.694880] xhci_hcd 0000:00:14.0: cache line size of 256 is not supported [ 2.695995] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002 [ 2.696005] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [ 2.696016] usb usb3: Product: xHCI Host Controller [ 2.696024] usb usb3: Manufacturer: Linux 4.4.0 xhci-hcd [ 2.696031] usb usb3: SerialNumber: 0000:00:14.0 [ 2.698414] hub 3-0:1.0: USB hub found [ 2.704723] xhci_hcd 0000:00:14.0: xHCI Host Controller [ 2.705502] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 4 [ 2.706136] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003 [ 2.706138] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [ 2.706139] usb usb4: Product: xHCI Host Controller [ 2.706140] usb usb4: Manufacturer: Linux 4.4.0 xhci-hcd [ 2.706141] usb usb4: SerialNumber: 0000:00:14.0 [ 2.708467] hub 4-0:1.0: USB hub found [ 3.021779] usb 3-3: new high-speed USB device number 2 using xhci_hcd [ 8.034843] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command ...bus4 enumeration... [ 13.297835] usb 3-3: device descriptor read/all, error -110 ...bus3 enumeration... Note there seem to be two timeouts of 5 seconds/10 seconds after the last message at 3.021779 seconds. hub_port_init is called in parallel for both buses. The first thread is in usb_get_device_descriptor when the second one enters the function and calls the code to get an address. I don't know precisely how it fails - it looks like the functions for doing the initialisation are synchronous and sleeping waiting for a response and that gets disrupted when the second thread tries to initialise the hub. What was the basis for using a lock on the bus rather than the controller? Does the spec say that buses of the same controller can be initialised in parallel? Mathias previously said: > Just found an additional note in the xhci specs section 4.5.3 saying that: > "Note: Software shall not transition more than one Device Slot to the Default State at a time" > which is what xhci_setup_device() does in addition to moving slots to the addressed state But I don't know if that means you can do the reset/set address/read descriptors in parallel? > > @@ -4312,7 +4312,7 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1, > > if (oldspeed == USB_SPEED_LOW) > > delay = HUB_LONG_RESET_TIME; > > > > - mutex_lock(&hdev->bus->usb_address0_mutex); > > + mutex_lock(&hdev->bus->controller->mutex); > > > > /* Reset the device; full speed may morph to high speed */ > > /* FIXME a USB 2.0 device may morph into SuperSpeed on reset. */ > > @@ -4588,7 +4588,7 @@ fail: > > hub_port_disable(hub, port1, 0); > > update_devnum(udev, devnum); /* for disconnect processing */ > > } > > - mutex_unlock(&hdev->bus->usb_address0_mutex); > > + mutex_unlock(&hdev->bus->controller->mutex); > > return retval; > > } > > I don't think this is a good idea. The driver core needs to be able to > access the controller while this function is running. You can > introduce a new mutex if you want, perhaps in the primary hcd > structure, but don't use bus->controller->mutex. An explicit lock might be a good idea. I was trying to avoid adding another lock so used the one in struct device as it appeared unused. The XHCI code seems to only use the lock in struct xhci_hcd and ehci uses struct ehci->lock. btw I think this bug may be the same as reported at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1437492