On 2015-06-21 19:25, Jiang Liu wrote:
> On 2015/6/21 22:19, Boszormenyi Zoltan wrote:
>> On 2015-06-21 16:03, Bjorn Helgaas wrote:
>>> [+cc linux-pci]
>>>
>>> Hi Boszormenyi,
>>>
>>> On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan wrote:
>>>> Hi,
>>>>
>>>> please, cc me, I am not subscribed to lkml.
>>>>
>>>>> Hi,
>>>>>
>>>>> [lkml.org still broken --> no accurate mail header info possible...]
>>>>>
>>>>> Just to ask the obvious:
>>>>> I assume using /sys/bus/pci/rescan does not help once it's broken?
>>>>> (since the machine comes up empty at initial-boot scan, too)
>>>> I will try it, too, but I am not sure it would work.
>>>>
>>>> Currently I can't test it because the last time I completely discharged
>>>> the battery. I also disconnected it to be able to get the realtek chip back
>>>> immediately for faster testing. Now, that I have reconnected the battery,
>>>> I need to wait for it to be charged somewhat to be able to reproduce
>>>> losing the network chip.
>>>>
>>>>> Also, you could try diffing lspci -vvxxx -s.... output
>>>>> of working vs. "distorting" kernel version - perhaps some register setup
>>>>> has been changed (e.g. due to power management improvements or some such),
>>>>> which may encourage the card
>>>>> to get a problematic/corrupt state.
>>>> I attached a tarball that contains lspci -vvxxx for
>>>> - all devices / only the network chip
>>>> - before / after "modprobe r8169"
>>>> - for all 3 kernel versions tested.
>>>>
>>>> I figured out that if I type the modprobe and lspci in the same command line,
>>>> I can get diagnostics out of the machine, after all.
>>>>
>>>> It's not just the Realtek chip that has changed parameters.
>>>>
>>>> (Vague idea) I noticed that some devices have changed like this:
>>>>
>>>> - Memory behind bridge: 80000000-801fffff
>>>> - Prefetchable memory behind bridge: 0000000080200000-00000000803fffff
>>>> + Memory behind bridge: ff000000-ff1fffff
>>>> + Prefetchable memory behind bridge: 00000000ff200000-00000000ff3fffff
>>>>
>>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>>> that the bridge doesn't actually support?
>>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>>> v3.18.16 dmesg log, so we can compare them?
>> I collected all 3 for you to compare them, compressed, attached.
>>
>> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
>> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>>
>>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>>> the code to see what might be going on:
>>>
>>> acpi PNP0A08:00: host bridge window expanded to [mem
>>> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window]
>>> ignored
>>> pci 0000:00:1c.1: can't claim BAR 15 [mem 0xfdf00000-0xfdffffff
>>> 64bit pref]: address conflict with PCI Bus 0000:00 [mem
>>> 0xf0000000-0xfed8ffff window]
>>>
>>> Bjorn
> Hi Bjorn and Boszormenyi,
> From the 3.18 kernel, we got a message:
> [ 0.126248] acpi PNP0A08:00: host bridge window
> [0x400000000-0xfffffffff] (ignored, not CPU addressable)
> And from 4.1.-rc8, we got another message:
> [ 0.127051] acpi PNP0A08:00: host bridge window expanded to [mem
> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window] ignored
>
> That smells like a 32bit overflow or 64bit cut-off issue.
>
> Hi Boszormenyi, could you please help to provide acpidump from the
> machine?

I already did in a previous mail which was only sent to LKML, but here it is again.

Thanks,
Zoltán

> Thanks!
> Gerry
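
The "32bit overflow or 64bit cut-off" suspicion can be sanity-checked with plain arithmetic. This is only a sketch, assuming the window that 3.18 reported as [0x400000000-0xfffffffff] somehow loses its bits above bit 31 on 4.1; the helper name truncate_to_32bit is made up for illustration and is not kernel code.

```python
# Hypothetical helper (not from the kernel): keep only the low 32 bits,
# which is what a 64-bit -> 32-bit cut-off would do to an address.
def truncate_to_32bit(addr: int) -> int:
    return addr & 0xFFFFFFFF

# Host bridge window as printed by the 3.18 kernel in the dmesg above.
start, end = 0x400000000, 0xFFFFFFFFF

# Apply the assumed truncation to both ends of the window.
lo, hi = truncate_to_32bit(start), truncate_to_32bit(end)
print(f"[mem {lo:#010x}-{hi:#010x}]")
```

Under that assumption the result is the same 0x00000000-0xffffffff range that the 4.1.0-rc8 dmesg shows, which is at least consistent with a cut-off, though it does not prove where in the code it happens.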