From: Paul Menzel <pmenzel@molgen.mpg.de>
To: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: "Michał Pecio" <michal.pecio@gmail.com>,
	"Mathias Nyman" <mathias.nyman@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-usb@vger.kernel.org,
	"Niklas Neronin" <niklas.neronin@linux.intel.com>
Subject: Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
Date: Thu, 11 Apr 2024 21:55:04 +0200
Message-ID: <bf5e1667-6052-4207-a2b3-e784f9b49d44@molgen.mpg.de>
In-Reply-To: <049c4850-fdb5-78fb-1d5e-0850dcd062aa@linux.intel.com>

Dear Mathias,


Thank you for your reply.

On 11.04.24 at 09:18, Mathias Nyman wrote:
> On 10.4.2024 10.59, Paul Menzel wrote:

>> On 09.04.24 at 13:22, Mathias Nyman wrote:
>>> On 8.4.2024 22.05, Michał Pecio wrote:
>>>>> It's also possible this TD/TRB was cancelled due to the disconnect.
>>>>> Could be that even if driver removes the TD from the list and cleans
>>>>> out the TRB from the ring buffer (turns TRB to no-op) hardware may
>>>>> have read ahead and cached the TRB, and process it anyway.
>>>>
>>>> I thought about it, but my debug patch says that the missing TD was
>>>> freed by finish_td(), which is called on TDs considered completed by
>>>> hardware. A cancelled TD would show giveback_invalidated_tds().
>>>>
>>>> Anyway, we now have new information from the reporter. My v2 patch
>>>> keeps a log of the last five events processed on each transfer ring
>>>> and dumps the log on TRB mismatch errors.
>>>>
>>>> Unfortunately, it looks like the host controller is broken and signals
>>>> completion of those transfers twice. The log below shows two distinct
>>>> events for TRB 32959a1c0 and that the corresponding TD has just been
>>>> freed by finish_td().
>>>
>>> The trace confirms this: we get double completion events for several
>>> Isoc TRBs. These double completions are seen after a transaction
>>> error on the same device (different endpoint).
>>>
>>> Transfer events for TRB ..a1c0 twice, with a transaction error in
>>> between:
>>>   <idle>-0       [000] d.h2. 33819.709897: xhci_handle_event: EVENT: TRB 000000032959a1c0 status 'Success' len 0 slot 6 ep 2 type 'Transfer Event' flags e:c
>>>   <idle>-0       [000] d.h2. 33819.709904: xhci_handle_event: EVENT: TRB 000000041547d010 status 'USB Transaction Error' len 4 slot 6 ep 15 type 'Transfer Event' flags e:c
>>>   systemd-journal-395     [000] d.H1. 33819.711886: xhci_handle_event: EVENT: TRB 000000032959a1c0 status 'Success' len 0 slot 6 ep 2 type 'Transfer Event' flags e:c
>>>
>>> Transfer events for TRB ..a1d0 twice (the next TRB)
>>>   systemd-journal-395     [000] d.H1. 33819.712001: xhci_handle_event: EVENT: TRB 000000032959a1d0 status 'Success' len 0 slot 6 ep 2 type 'Transfer Event' flags e:c
>>>   systemd-journal-395     [000] d.H1. 33819.712059: xhci_handle_event: EVENT: TRB 000000032959a1d0 status 'Success' len 0 slot 6 ep 2 type 'Transfer Event' flags e:c
>>>
>>> Transfer events for TRB ..a1e0 twice
>>>   systemd-journal-395     [000] d.H1. 33819.712139: xhci_handle_event: EVENT: TRB 000000032959a1e0 status 'Success' len 0 slot 6 ep 2 type 'Transfer Event' flags e:c
>>>   systemd-journal-395     [000] d.h1. 33819.712871: xhci_handle_event: EVENT: TRB 000000032959a1e0 status 'Success' len 0 slot 6 ep 2 type 'Transfer Event' flags e:c
>>>
>>> etc..
>>>
>>> The driver can cope with these extra events, but if this is common we
>>> should probably handle it silently and not concern users with that ERROR
>>> message.
>>
>> Thank you for the detailed analysis. Excuse my ignorance, but do you
>> have an idea what this Sennheiser USB headset does differently from
>> other USB devices? Additionally, is this a known problem with this
>> Intel xHCI controller, meaning, is there an erratum about this problem?
> 
> There are a few related errata in older 9 series chipsets that possibly
> could explain this, but those issues are no longer listed for newer
> chipsets.
>
> The Sennheiser headset is a full-speed (FS) device that uses 192-byte
> Isoch transfers.
> 
> Series 9 chipset xHC has issues with exactly those FS Isoch transfers
> over 189 bytes, see
> "1. USB Isoch In Transfer Error Issue"
>
> There are some issues related to FS device removal:
> "13. USB Full-/low-speed Device Removal Issue"
>
> And some are related to resending transfer events for "cached" TRBs
> after FS device disconnect/reconnect:
> "25. USB xHCI may Execute a Stale Transfer Request Block (TRB)"
> 
> https://www.intel.co.jp/content/dam/www/public/us/en/documents/specification-updates/9-series-chipset-pch-spec-update.pdf

Thank you for digging this up. Judging from the document, these errata
were not addressed in any firmware update.

As another data point, I was able to reproduce this issue with the 
Sennheiser USB headset and a Dell XPS 15 7590.

     $ lspci -nn | grep USB
     00:14.0 USB controller [0c03]: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller [8086:a36d] (rev 10)
     3a:00.0 USB controller [0c03]: Intel Corporation JHL6340 Thunderbolt 3 USB 3.1 Controller (C step) [Alpine Ridge 2C 2016] [8086:15db] (rev 02)

I uploaded the logs to *Hardware for Linux* [1].
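
For anyone who wants to check their own trace for the double completions
quoted above, here is a minimal Python sketch. It is not part of any
patch in this thread. The regular expression is written against the
xhci_handle_event lines as they appear in the quoted trace, the function
name is made up for illustration, and the 50 ms window is an arbitrary
assumption meant to avoid flagging a TRB address that legitimately
reappears once the ring wraps around.

```python
import re

# Pattern written against the xhci_handle_event lines quoted in this thread.
EVENT_RE = re.compile(
    r"(?P<time>\d+\.\d+): xhci_handle_event: EVENT: "
    r"TRB (?P<trb>[0-9a-f]+) status '(?P<status>[^']*)' "
    r"len (?P<len>\d+) slot (?P<slot>\d+) ep (?P<ep>\d+) "
    r"type '(?P<type>[^']*)'"
)


def find_duplicate_completions(lines, window=0.05):
    """Return [((slot, ep, trb), first_time, second_time), ...].

    A transfer event counts as a duplicate when the same (slot, ep, TRB)
    was already seen at most `window` seconds earlier. The window is a
    hypothetical guard against ring wrap-around, not a value from the
    xHCI specification.
    """
    last_seen = {}
    duplicates = []
    for line in lines:
        m = EVENT_RE.search(line)
        if not m or m["type"] != "Transfer Event":
            continue
        key = (int(m["slot"]), int(m["ep"]), m["trb"])
        t = float(m["time"])
        if key in last_seen and t - last_seen[key] <= window:
            duplicates.append((key, last_seen[key], t))
        last_seen[key] = t
    return duplicates


# Three lines copied from the trace quoted above.
SAMPLE = """\
  <idle>-0       [000] d.h2. 33819.709897: xhci_handle_event: EVENT: TRB 000000032959a1c0 status 'Success' len 0 slot 6 ep 2 type 'Transfer Event' flags e:c
  <idle>-0       [000] d.h2. 33819.709904: xhci_handle_event: EVENT: TRB 000000041547d010 status 'USB Transaction Error' len 4 slot 6 ep 15 type 'Transfer Event' flags e:c
  systemd-journal-395     [000] d.H1. 33819.711886: xhci_handle_event: EVENT: TRB 000000032959a1c0 status 'Success' len 0 slot 6 ep 2 type 'Transfer Event' flags e:c
"""

dups = find_duplicate_completions(SAMPLE.splitlines())
for (slot, ep, trb), first, second in dups:
    print(f"slot {slot} ep {ep} TRB {trb}: events at {first} and {second}")
```

Keying on slot, endpoint and TRB address together keeps events from
different endpoints apart, so the transaction error on ep 15 is not
mistaken for a repeat.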


Kind regards,

Paul


[1]: https://linux-hardware.org/?probe=904c918345

