linux-hotplug.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Myron Stowe <mstowe@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Bjørn Mork" <bjorn@mork.no>,
	"Greg KH" <gregkh@linuxfoundation.org>,
	"Kay Sievers" <kay@vrfy.org>,
	"Myron Stowe" <myron.stowe@redhat.com>,
	linux-hotplug@vger.kernel.org, linux-pci@vger.kernel.org,
	yuxiangl@marvell.com, yxlraid@gmail.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] udevadm-info: Don't access sysfs 'resource<N>' files
Date: Tue, 19 Mar 2013 17:06:54 +0000	[thread overview]
Message-ID: <1363712814.2399.64.camel@zim.stowe> (raw)
In-Reply-To: <1363712270.2399.62.camel@zim.stowe>

On Tue, 2013-03-19 at 10:57 -0600, Myron Stowe wrote:
> On Mon, 2013-03-18 at 12:59 -0600, Alex Williamson wrote:
> > On Mon, 2013-03-18 at 19:25 +0100, Bjørn Mork wrote:
> > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > 
> > > >On Mon, 2013-03-18 at 18:20 +0100, Bjørn Mork wrote:
> > > >> Alex Williamson <alex.williamson@redhat.com> writes:
> > > >> 
> > > >> > At least for KVM the kernel fix is the addition of the vfio driver
> > > >which
> > > >> > gives us a non-sysfs way to do this.  If this problem was found a
> > > >few
> > > >> > years later and we were ready to make the switch I'd support just
> > > >> > removing these resource files.  In the meantime we have userspace
> > > >that
> > > >> > depends on this interface, so I'm open to suggestions how to fix
> > > >it.
> > > >> 
> > > >> I am puzzled by a couple of things in this discussion:
> > > >> 
> > > >> 1) do you seriously mean that a userspace application (any, not just
> > > >>    udevadm or qemu or whatever) should be able to read and write
> > > >these
> > > >>    registers while the device is owned by a driver?  How is that ever
> > > >>    going to work?
> > > >
> > > >The expectation is that the user doesn't mess with the device through
> > > >pci-sysfs while it's running.  This is really no different than config
> > > >space or MMIO space in that respect. 
> > > 
> > > But it is.  That's the problem. As a user I expect to be able to run
> > > e.g "grep . /sys/devices/whatever/*" with no ill effects. This holds
> > > for config space or MMIO space. It does not for any reset-on-read
> > > register.
> > 
> > As a non-admin user you can
> > 
> > > > You can use setpci to break your
> > > >PCI card while it's used by the driver today.  The difference is that
> > > >MMIO spaces side-step the issue by only allowing mmap and config space
> > > >is known not to have read side-effects.
> > > 
> > > Yes. And that is why there is no problem exporting those. This
> > > difference is fundamental. 
> > 
> > So how do we side-step the problem with I/O port registers?  If we
> > remove them then KVM needs to run with iopl which is a pretty serious
> > security hole should QEMU be exploited.  We could activate the resource
> > files only when the device is bound to pci-assign, but that only limits
> > the scope and might break UIO drivers.  We could modify the file to have
> > an enable sequence, but we can't do this without breaking current
> > userspace.  As I mentioned, the VFIO driver is intended to replace KVM's
> > use of these files, but we're not ready to rip it out, perhaps not even
> > ready to declare it deprecated.
> > 
> > > >> 2) is it really so that a device can be so fundamentally screwed up
> > > >by
> > > >>    reading some registers, that a later driver probe cannot properly
> > > >>    reinitialize it?
> > > >
> > > >Never underestimate how broken hardware can be, 
> > > 
> > > True :)
> > > 
> > > > though in this case
> > > >reading a device register seems to be causing a system hang/reset.
> > > 
> > > I understand that it does so if the ahci driver is bound to the device
> > > while reading the registers, but does it also hang the system with no
> > > bound driver? How does it do that? By killing the bus?
> > 
> > I don't know, Myron?
> 
> Yes - the system hangs when BAR1's (and likely BAR3's) I/O port space is
> read.
Sorry - that wasn't very explicit.  Just accessing BAR1's region as
udevadm does is enough to hang the system - even when no driver is
bound.
> 
> Here are the details that I've been able to put together from the two
> linux-pci threads and various online sources -
> 
> 
> From Robert Hancock - "... BAR5 is the MMIO region used by the AHCI
> driver. BARs 0-4 are the legacy SFF-compatible ATA ports.  Nothing
> should be messing with those IO ports while AHCI is enabled. ..."  This
> likely explains why the system boots and runs fine as long as the
> 'udevadm ...' command is *not* ran (i.e. the driver never accesses the
> I/O port BARs).
> 
> Using a SATA controller I have access to as an example for the details
> (Note: I do not have access to a system with the Marvell 9125 device):
>   00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port  SATA AHCI Controller (rev 06) (prog-if 01 [AHCI 1.0])
>   Subsystem: Lenovo Device 2168
>   Region 0: I/O ports at 1860 [size=8]
>   Region 1: I/O ports at 1814 [size=4]
>   Region 2: I/O ports at 1818 [size=8]
>   Region 3: I/O ports at 1810 [size=4]
>   Region 4: I/O ports at 1840 [size2]
>   Region 5: Memory at f2827000 (32-bit, non-prefetchable) [size=2K]
> 
> I/O port registers [1][2]:
>   Primary IDE controller  [0x1860-0x1867; 0x1814-0x1817]
>   BAR0  Base address for the command block registers for ATA Channel X
>     0x1860  (Read/Write): Data Register
>     0x1861  (Read): Error Register
>     0x1861  (Write): Features Register
>     0x1862  (Read/Write): Sector Count Register
>     0x1863  (Read/Write): LBA Low Register
>     0x1864  (Read/Write): LBA Mid Register
>     0x1865  (Read/Write): LBA High Register
>     0x1866  (Read/Write): Drive/Head Register
>     0x1867  (Read): Status Register
>     0x1867  (Write): Command Register
>   BAR1* Base address for the control register for ATA Channel X
>     0x1814  Reserved
>     0x1815  Reserved
>     0x1816  (Read): Alternate Status Register
>     0x1816  (Write): Device Control Register
>     0x1817  Reserved
> 
> * The base must be Dword aligned; a PCI requirement.  The Device Control
>   and Alternate Status Registers are at ofset 0x2 from this base.
> 
> [1] www.t13.org/documents/UploadedDocuments/project/d1510r1-Host-Adapter.pdf
> [2] lateblt.tripod.com/atapi.htm
> 
> From Xiangliang - executing 'udevadm ...' causes a 32-bit I/O port read
> to BAR1's region.  This is shown by the BE (Byte Enable) value of
> 0x1111.  So apparently reads to this region that include any of reserved
> Bytes causes "the chip will go bad."
> 
> So, only a Byte access at offset 2 is successful.  I have not been able
> to get any more details as to the exact cause of the hang.  I would have
> thought that the PCI transaction would have just timed out, or errored
> out, or something but apparently the platform ends up hanging.
> 
> It appears that this device did not implement the reserved registers
> such that they would return 0 on reads or something more similarly sane.
> 
> Since BARs 2 and 3 are not 0, indicating the device only supports one
> channel, I expect the same issue will occur when accessing BAR3.  Again,
> I do not have access to a system with this device to test with.
> 
> > 
> > > >> I would have thought that the solution to all this was to return
> > > >-EINVAL
> > > >> on any attemt to read or write these files while a driver is bound to
> > > >> the device.  If userspace is going to use the API, then the
> > > >application
> > > >> better unbind any driver first.
> > > >> 
> > > >> Or? Am I missing something here?
> > > >
> > > >That doesn't really solve anything though.  Let's pretend the resource
> > > >files only work while the device is bound to pci-stub.  Now what
> > > >happens
> > > >when you run this udevadm command as admin while it's in use by the
> > > >userspace driver?  All we've done is limit the scope of the problem.
> > > 
> > > Assuming that the system hangs without driver help and that this
> > > brokenness is widespread. I don't think any of those assumptions hold.
> > > Do they?
> > 
> > I thought it was true that for this device a system hang happened
> > regardless of the host driver, but haven't seen the original bug report.
> > As for widespread, this is the first I've heard of problems in the 2.5+
> > years that we've supported these I/O port resource files.  The rest is
> > probably just FUD about random userspace apps trolling through device
> > registers.
> > 
> > > >> > If we want to blacklist this specific device, that's fine, but as
> > > >others
> > > >> > have pointed out it's really a class problem.  Perhaps we report 1
> > > >byte
> > > >> > extra for the file length where EOF-1 is an enable byte?  Is there
> > > >> > anything else in file ops that we could use to make it slightly
> > > >more
> > > >> > complicated than open(), read() to access the device?  Thanks,
> > > >> 
> > > >> If there really are devices which cannot handle reading at all, and
> > > >> cannot be reset to a sane state by later driver initialization, then
> > > >a
> > > >> blacklist could be added for those devices.  This should not be a
> > > >common
> > > >> problem.
> > > >
> > > >Yes, if these are dead registers, let's blacklist and move along.  I
> > > >suspect though that these registers probably work fine if you access
> > > >them according to the device programming model, so blacklisting just
> > > >prevents full use through something like KVM device assignment. 
> > > 
> > > Well, if the device is that broken then I think it will require the
> > > kernel to police the device programming. I don't see how you can leave
> > > a bomb like that because it might be useful in a rare and very
> > > theoretical case.
> > > 
> > > Easier to just blacklist it...
> > 
> > Easier, yes.  But it likely just kicks the problem down the road until
> > the next device.  Thanks,
> > 
> > Alex
> > 
> > 
> 
> 



  reply	other threads:[~2013-03-19 17:06 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-16 21:35 [PATCH] udevadm-info: Don't access sysfs entries backing device I/O port space Myron Stowe
2013-03-16 21:35 ` [PATCH] udevadm-info: Don't access sysfs 'resource<N>' files Myron Stowe
2013-03-16 22:11   ` Greg KH
2013-03-16 22:55     ` Bjorn Helgaas
2013-03-16 23:50     ` Myron Stowe
2013-03-17  1:03       ` Greg KH
2013-03-17  4:11         ` Alex Williamson
2013-03-17  5:36           ` Greg KH
2013-03-17 13:38             ` Alex Williamson
2013-03-17 14:00               ` Kay Sievers
2013-03-17 14:20                 ` Myron Stowe
2013-03-17 14:29                   ` Kay Sievers
2013-03-17 14:36                     ` Myron Stowe
2013-03-17 14:43                       ` Kay Sievers
2013-03-18 16:24                 ` Alex Williamson
2013-03-18 16:41                   ` Greg KH
2013-03-18 16:51                     ` Alex Williamson
2013-03-18 17:20                       ` Bjørn Mork
2013-03-18 17:54                         ` Alex Williamson
2013-03-18 18:02                           ` Robert Brown
2013-03-18 18:25                           ` Bjørn Mork
2013-03-18 18:59                             ` Alex Williamson
2013-03-19 16:57                               ` Myron Stowe
2013-03-19 17:06                                 ` Myron Stowe [this message]
2013-03-17 14:33               ` Myron Stowe
2013-03-17 22:28                 ` Alex Williamson
2013-03-18 14:50                   ` Don Dutile
2013-03-18 16:34                     ` Alex Williamson
2013-03-17 14:12         ` Myron Stowe
2013-03-19  1:54         ` Robert Hancock
2013-03-19  2:03           ` Greg KH
2013-03-19  2:09             ` Robert Hancock
2013-03-19  2:35               ` Greg KH
2013-03-19  3:08                 ` Robert Hancock

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1363712814.2399.64.camel@zim.stowe \
    --to=mstowe@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=bjorn@mork.no \
    --cc=gregkh@linuxfoundation.org \
    --cc=kay@vrfy.org \
    --cc=linux-hotplug@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=myron.stowe@redhat.com \
    --cc=yuxiangl@marvell.com \
    --cc=yxlraid@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).