From: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
To: Wesley Cheng <wcheng@codeaurora.org>,
	"balbi@kernel.org" <balbi@kernel.org>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	John Stultz <john.stultz@linaro.org>
Cc: "linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jackp@codeaurora.org" <jackp@codeaurora.org>
Subject: Re: [PATCH] usb: dwc3: gadget: Use list_replace_init() before traversing lists
Date: Mon, 9 Aug 2021 22:07:23 +0000
Message-ID: <22969fbd-c16b-9443-7673-1e0ae72c873f@synopsys.com>
In-Reply-To: <1627543994-20327-1-git-send-email-wcheng@codeaurora.org>

+ John Stultz

Wesley Cheng wrote:
> The list_for_each_entry_safe() macro saves the current item (n) and
> the item after (n+1), so that n can be safely removed without
> corrupting the list.  However, when traversing the list and removing
> items using gadget giveback, the DWC3 lock is briefly released,
> allowing other routines to execute.  There is a situation where, while
> items are being removed from the cancelled_list using
> dwc3_gadget_ep_cleanup_cancelled_requests(), the pullup disable
> routine is running in parallel (due to UDC unbind).  As the cleanup
> routine removes n, and the pullup disable removes n+1, once the
> cleanup retakes the DWC3 lock, it references a request that was already
> removed/handled.  With list debug enabled, this leads to a panic.
> Ensure all instances of the macro are replaced where gadget giveback
> is used.
> 
> Example call stack:
> 
> Thread#1:
> __dwc3_gadget_ep_set_halt() - CLEAR HALT
>   -> dwc3_gadget_ep_cleanup_cancelled_requests()
>     ->list_for_each_entry_safe()
>     ->dwc3_gadget_giveback(n)
>       ->dwc3_gadget_del_and_unmap_request()- n deleted[cancelled_list]
>       ->spin_unlock
>       ->Thread#2 executes
>       ...
>     ->dwc3_gadget_giveback(n+1)
>       ->Already removed!
> 
> Thread#2:
> dwc3_gadget_pullup()
>   ->waiting for dwc3 spin_lock
>   ...
>   ->Thread#1 released lock
>   ->dwc3_stop_active_transfers()
>     ->dwc3_remove_requests()
>       ->fetches n+1 item from cancelled_list (n removed by Thread#1)
>       ->dwc3_gadget_giveback()
>         ->dwc3_gadget_del_and_unmap_request()- n+1 deleted[cancelled_list]
>         ->spin_unlock
> 
> Fix this condition by utilizing list_replace_init(), and traversing
> through a local copy of the current elements in the endpoint lists.
> This will also set the parent list as empty, so if another thread is
> also looping through the list, it will be empty on the next iteration.
> 
> Fixes: d4f1afe5e896 ("usb: dwc3: gadget: move requests to cancelled_list")
> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
> 
> ---
> Previous patchset:
> https://lore.kernel.org/linux-usb/1620716636-12422-1-git-send-email-wcheng@codeaurora.org/
> ---
>  drivers/usb/dwc3/gadget.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index a29a4ca..3ce6ed9 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1926,9 +1926,13 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
>  {
>  	struct dwc3_request		*req;
>  	struct dwc3_request		*tmp;
> +	struct list_head		local;
>  	struct dwc3			*dwc = dep->dwc;
>  
> -	list_for_each_entry_safe(req, tmp, &dep->cancelled_list, list) {
> +restart:
> +	list_replace_init(&dep->cancelled_list, &local);
> +
> +	list_for_each_entry_safe(req, tmp, &local, list) {
>  		dwc3_gadget_ep_skip_trbs(dep, req);
>  		switch (req->status) {
>  		case DWC3_REQUEST_STATUS_DISCONNECTED:
> @@ -1946,6 +1950,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
>  			break;
>  		}
>  	}
> +
> +	if (!list_empty(&dep->cancelled_list))
> +		goto restart;
>  }
>  
>  static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
> @@ -3190,8 +3197,12 @@ static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
>  {
>  	struct dwc3_request	*req;
>  	struct dwc3_request	*tmp;
> +	struct list_head	local;
>  
> -	list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
> +restart:
> +	list_replace_init(&dep->started_list, &local);
> +
> +	list_for_each_entry_safe(req, tmp, &local, list) {
>  		int ret;
>  
>  		ret = dwc3_gadget_ep_cleanup_completed_request(dep, event,
> @@ -3199,6 +3210,9 @@ static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
>  		if (ret)
>  			break;
>  	}
> +
> +	if (!list_empty(&dep->started_list))
> +		goto restart;

This is not right. We don't clean up the entire started list here.
Sometimes we end early because some, but not all, of the TRBs have completed.
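
To make the concern concrete, here is a minimal userspace sketch, not
driver code. The list helpers are simplified stand-ins for the ones in
<linux/list.h>, and struct request, cleanup_completed() and
demo_stranded_requests() are hypothetical names used only for
illustration. It models the early break: anything still on the on-stack
local list when the loop stops is no longer reachable from started_list.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel list helpers (simplified, hypothetical). */
struct list_head { struct list_head *next, *prev; };
struct request { int completed; struct list_head list; };

#define req_of(p) ((struct request *)((char *)(p) - offsetof(struct request, list)))

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
}

/* repl takes over all entries of old; old is left empty. */
static void list_replace_init(struct list_head *old, struct list_head *repl)
{
	repl->next = old->next;
	repl->next->prev = repl;
	repl->prev = old->prev;
	repl->prev->next = repl;
	list_init(old);
}

static int list_count(const struct list_head *h)
{
	int n = 0;

	for (const struct list_head *p = h->next; p != h; p = p->next)
		n++;
	return n;
}

/* Models the patched loop: move everything to a local list, stop at the
 * first request that has not completed. Returns how many requests are
 * left behind on the local list. */
static int cleanup_completed(struct list_head *started)
{
	struct list_head local, *pos, *tmp;

	list_replace_init(started, &local);
	assert(list_empty(started));	/* shared list is now empty */

	for (pos = local.next; pos != &local; pos = tmp) {
		tmp = pos->next;
		if (!req_of(pos)->completed)
			break;		/* end early: not all TRBs completed */
		list_del(pos);		/* "give back" the completed request */
	}
	return list_count(&local);
}

/* Three started requests, only the first one completed. */
static int demo_stranded_requests(int *started_left)
{
	struct list_head started;
	struct request reqs[3] = { { 1 }, { 0 }, { 0 } };

	list_init(&started);
	for (int i = 0; i < 3; i++)
		list_add_tail(&reqs[i].list, &started);

	int stranded = cleanup_completed(&started);

	*started_left = list_count(&started);
	return stranded;
}
```

In this model demo_stranded_requests() returns 2 and reports 0 entries
left on the started list: the two pending requests sit on the local list
only. The restart check in the patch does not help, because it tests the
(now empty) started_list rather than the local list.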

BR,
Thinh

WARNING: multiple messages have this Message-ID
From: Ray Chi <raychi@google.com>
To: thinh.nguyen@synopsys.com, Wesley Cheng <wcheng@codeaurora.org>,
	"balbi@kernel.org" <balbi@kernel.org>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	John Stultz <john.stultz@linaro.org>
Cc: jackp@codeaurora.org, linux-kernel@vger.kernel.org,
	linux-usb@vger.kernel.org, albertccwang@google.com,
	Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Subject: Re: [PATCH] usb: dwc3: gadget: Use list_replace_init() before traversing lists
Date: Tue, 10 Aug 2021 11:12:16 +0800
Message-ID: <22969fbd-c16b-9443-7673-1e0ae72c873f@synopsys.com>
Message-ID: <20210810031216.3gE-_14viMrZoxWZ-FudUx5edrr9i9aqwPuGHb4WYi0@z>
In-Reply-To: <1627543994-20327-1-git-send-email-wcheng@codeaurora.org>

From: Thinh Nguyen <Thinh.Nguyen@synopsys.com>

> + John Stultz
>
> Wesley Cheng wrote:
> > The list_for_each_entry_safe() macro saves the current item (n) and
> > the item after (n+1), so that n can be safely removed without
> > corrupting the list.  However, when traversing the list and removing
> > items using gadget giveback, the DWC3 lock is briefly released,
> > allowing other routines to execute.  There is a situation where, while
> > items are being removed from the cancelled_list using
> > dwc3_gadget_ep_cleanup_cancelled_requests(), the pullup disable
> > routine is running in parallel (due to UDC unbind).  As the cleanup
> > routine removes n, and the pullup disable removes n+1, once the
> > cleanup retakes the DWC3 lock, it references a request that was already
> > removed/handled.  With list debug enabled, this leads to a panic.
> > Ensure all instances of the macro are replaced where gadget giveback
> > is used.
> > 
> > Example call stack:
> > 
> > Thread#1:
> > __dwc3_gadget_ep_set_halt() - CLEAR HALT
> >   -> dwc3_gadget_ep_cleanup_cancelled_requests()
> >     ->list_for_each_entry_safe()
> >     ->dwc3_gadget_giveback(n)
> >       ->dwc3_gadget_del_and_unmap_request()- n deleted[cancelled_list]
> >       ->spin_unlock
> >       ->Thread#2 executes
> >       ...
> >     ->dwc3_gadget_giveback(n+1)
> >       ->Already removed!
> > 
> > Thread#2:
> > dwc3_gadget_pullup()
> >   ->waiting for dwc3 spin_lock
> >   ...
> >   ->Thread#1 released lock
> >   ->dwc3_stop_active_transfers()
> >     ->dwc3_remove_requests()
> >       ->fetches n+1 item from cancelled_list (n removed by Thread#1)
> >       ->dwc3_gadget_giveback()
> >         ->dwc3_gadget_del_and_unmap_request()- n+1 deleted[cancelled_list]
> >         ->spin_unlock
> > 
> > Fix this condition by utilizing list_replace_init(), and traversing
> > through a local copy of the current elements in the endpoint lists.
> > This will also set the parent list as empty, so if another thread is
> > also looping through the list, it will be empty on the next iteration.
> > 
> > Fixes: d4f1afe5e896 ("usb: dwc3: gadget: move requests to cancelled_list")
> > Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
> > 
> > ---
> > Previous patchset:
> > https://lore.kernel.org/linux-usb/1620716636-12422-1-git-send-email-wcheng@codeaurora.org/
> > ---
> >  drivers/usb/dwc3/gadget.c | 18 ++++++++++++++++--
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index a29a4ca..3ce6ed9 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -1926,9 +1926,13 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> >  {
> >  	struct dwc3_request		*req;
> >  	struct dwc3_request		*tmp;
> > +	struct list_head		local;
> >  	struct dwc3			*dwc = dep->dwc;
> >  
> > -	list_for_each_entry_safe(req, tmp, &dep->cancelled_list, list) {
> > +restart:
> > +	list_replace_init(&dep->cancelled_list, &local);
> > +
> > +	list_for_each_entry_safe(req, tmp, &local, list) {
> >  		dwc3_gadget_ep_skip_trbs(dep, req);
> >  		switch (req->status) {
> >  		case DWC3_REQUEST_STATUS_DISCONNECTED:
> > @@ -1946,6 +1950,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> >  			break;
> >  		}
> >  	}
> > +
> > +	if (!list_empty(&dep->cancelled_list))
> > +		goto restart;
> >  }
> >  
> >  static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
> > @@ -3190,8 +3197,12 @@ static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
> >  {
> >  	struct dwc3_request	*req;
> >  	struct dwc3_request	*tmp;
> > +	struct list_head	local;
> >  
> > -	list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
> > +restart:
> > +	list_replace_init(&dep->started_list, &local);
> > +
> > +	list_for_each_entry_safe(req, tmp, &local, list) {
> >  		int ret;
> >  
> >  		ret = dwc3_gadget_ep_cleanup_completed_request(dep, event,
> > @@ -3199,6 +3210,9 @@ static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
> >  		if (ret)
> >  			break;

I also hit the connection issue. The problem is that the dwc3 requests
still on the local list are dropped when the loop breaks early.

> >  	}
> > +
> > +	if (!list_empty(&dep->started_list))
> > +		goto restart;
>
> This is not right. We don't clean up the entire started list here.
> Sometimes we end early because some, but not all, of the TRBs have completed.

Yes. I think the restart check can be replaced by checking the local list
and restoring any unhandled requests to started_list directly.
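
A rough userspace sketch of that idea, under stated assumptions: the
list helpers below are simplified stand-ins for <linux/list.h>, and
struct request, cleanup_and_restore() and demo_restore() are
hypothetical names, not code from gadget.c. After the loop, whatever is
left on the local list is spliced back to the front of started_list, so
the pending requests stay ahead of anything queued in the meantime.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel list helpers (simplified, hypothetical). */
struct list_head { struct list_head *next, *prev; };
struct request { int id; int completed; struct list_head list; };

#define req_of(p) ((struct request *)((char *)(p) - offsetof(struct request, list)))

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
}

/* repl takes over all entries of old; old is left empty. */
static void list_replace_init(struct list_head *old, struct list_head *repl)
{
	repl->next = old->next;
	repl->next->prev = repl;
	repl->prev = old->prev;
	repl->prev->next = repl;
	list_init(old);
}

/* Insert every entry of list at the front of head (like list_splice()). */
static void list_splice(struct list_head *list, struct list_head *head)
{
	if (list_empty(list))
		return;

	struct list_head *first = list->next, *last = list->prev;
	struct list_head *at = head->next;

	first->prev = head;
	head->next = first;
	last->next = at;
	at->prev = last;
}

static void cleanup_and_restore(struct list_head *started)
{
	struct list_head local, *pos, *tmp;

	list_replace_init(started, &local);
	assert(list_empty(started));	/* shared list is now empty */

	for (pos = local.next; pos != &local; pos = tmp) {
		tmp = pos->next;
		if (!req_of(pos)->completed)
			break;		/* end early: not all TRBs completed */
		list_del(pos);		/* "give back" the completed request */
	}

	/* Restore unhandled requests instead of dropping them. */
	list_splice(&local, started);
}

/* Requests 1..3 are started; only request 1 has completed. Returns the id
 * at the head of started_list afterwards and the number of entries left. */
static int demo_restore(int *remaining)
{
	struct list_head started;
	struct request reqs[3] = { { 1, 1 }, { 2, 0 }, { 3, 0 } };
	int n = 0;

	list_init(&started);
	for (int i = 0; i < 3; i++)
		list_add_tail(&reqs[i].list, &started);

	cleanup_and_restore(&started);

	for (struct list_head *p = started.next; p != &started; p = p->next)
		n++;
	*remaining = n;
	return list_empty(&started) ? -1 : req_of(started.next)->id;
}
```

In this model demo_restore() returns 2 with two entries remaining, i.e.
requests 2 and 3 are back on started_list in their original order. The
real function would still need to do this under the dwc3 lock.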

> BR,
> Thinh
>

Best regards,
Ray

Thread overview: 19+ messages
2021-07-29  7:33 [PATCH] usb: dwc3: gadget: Use list_replace_init() before traversing lists Wesley Cheng
2021-07-29  8:09 ` Felipe Balbi
2021-07-29  8:45   ` Wesley Cheng
2021-07-29  9:31     ` Felipe Balbi
2021-07-29 14:20   ` Alan Stern
2021-08-09 21:04 ` John Stultz
2021-08-09 22:31   ` [RFC][PATCH] dwc3: gadget: Fix losing list items in dwc3_gadget_ep_cleanup_completed_requests() John Stultz
2021-08-09 22:44     ` Thinh Nguyen
2021-08-09 22:53       ` John Stultz
2021-08-09 22:57         ` Thinh Nguyen
2021-08-10  6:05           ` Greg Kroah-Hartman
2021-08-10  7:11             ` Greg Kroah-Hartman
2021-08-10 17:11           ` Wesley Cheng
2021-08-10 20:14             ` Thinh Nguyen
2021-08-10 20:17               ` Thinh Nguyen
2021-08-10 23:40                 ` Thinh Nguyen
2021-08-09 21:26 ` [PATCH] usb: dwc3: gadget: Use list_replace_init() before traversing lists John Stultz
2021-08-09 22:07 ` Thinh Nguyen [this message]
2021-08-10  3:12   ` Ray Chi
