All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Jack Pham <jackp@codeaurora.org>
To: Sandeep Maheswaram <sanm@codeaurora.org>
Cc: Felipe Balbi <balbi@kernel.org>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	p.zabel@pengutronix.de, linux-usb@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	arm-mail-list <linux-arm-kernel@lists.infradead.org>,
	linux-arm-msm@vger.kernel.org
Subject: Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove()
Date: Tue, 15 Jun 2021 17:13:41 -0700	[thread overview]
Message-ID: <20210616001341.GC25299@jackp-linux.qualcomm.com> (raw)
In-Reply-To: <d9ab95a1-f901-6bfe-899b-e4577d14cb52@codeaurora.org>

Hi Sandeep,

On Mon, Jun 14, 2021 at 01:07:11PM +0530, Sandeep Maheswaram wrote:
> 
> On 6/10/2021 9:03 PM, Jack Pham wrote:
> > On Thu, Jun 10, 2021 at 01:11:42PM +0300, Felipe Balbi wrote:
> > > Jack Pham <jackp@codeaurora.org> writes:
> > > > On Fri, Jun 04, 2021 at 11:20:12AM +0300, Felipe Balbi wrote:
> > > > > Jack Pham <jackp@codeaurora.org> writes:
> > > > > > > > > > > Alexandru Elisei <alexandru.elisei@arm.com> writes:
> > > > > > > > > > > > I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb:
> > > > > > > > > > > > dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic
> > > > > > > > > > > > information if needed and I can help test the fix.
> > > > > > > > > > > if you simply revert that commit in HEAD, does the problem really go
> > > > > > > > > > > away?
> > > > > > > > > > Kernel built from commit 324c92e5e0ee, which is the kernel tip today, the panic is
> > > > > > > > > > there. Reverting the offending commit, 568262bf5492, makes the panic disappear.
> > > > > > > > > Want to send a revert so I can take it now?
> > > > > > > > I can send a revert, but Felipe was asking Sandeep (the commit author) for a fix,
> > > > > > > > so I'll leave it up to Felipe to decide how to proceed.
> > > > > > > I'm okay with a revert. Feel free to add my Acked-by: Felipe Balbi
> > > > > > > <balbi@kernel.org> or it.
> > > > > > > 
> > > > > > > Sandeep, please send a new version that doesn't encounter the same
> > > > > > > issue. Make sure to test by reloading the driver in a tight loop for
> > > > > > > several iterations.
> > > > > > This would probably be tricky to test on other "glue" drivers as the
> > > > > > problem appears to be specific only to dwc3_of_simple.  It looks like
> > > > > > both dwc3_of_simple and the dwc3 core now (due to 568262bf5492) each
> > > > > > implement respective .shutdown callbacks. The latter is simply a wrapper
> > > > > > around dwc3_remove(). And from the panic call stack above we see that
> > > > > > dwc3_of_simple_shutdown() calls of_platform_depopulate() which will
> > > > > > again call dwc3_remove() resulting in the double remove.
> > > > > > 
> > > > > > So would an alternative approach be to protect against dwc3_remove()
> > > > > > getting called multiple times? IMO it'd be a bit messy to have to add
> > > > > no, I  don't think so. That sounds like a workaround. We should be able
> > > > > to guarantee that ->remove() doesn't get called twice using the driver
> > > > > model properly.
> > > > Completely fair.  So then having a .shutdown callback that directly calls
> > > > dwc3_remove() is probably not the right thing to do as it completely
> > > > bypasses the driver model so if and when the driver core does later
> > > > release the device from the driver that's how we end up with the double
> > > > remove.
> > > yeah, I would agree with that.
> > > 
> > > > > > additional checks there to know if it had already been called. So maybe
> > > > > > avoid it altogether--should dwc3_of_simple_shutdown() just skip calling
> > > > > > of_platform_depopulate()?
> > > > > I don't know what the idiomatic is nowadays, but at least early on, we
> > > > > had to call depopulate.
> > > > So any suggestions on how to fix the original issue Sandeep was trying
> > > > to fix with 568262bf5492? Maybe implement .shutdown in dwc3_qcom and have
> > > > it follow what dwc3_of_simple does with of_platform_depopulate()? But
> > > > then wouldn't other "glues" want/need to follow suit?
> > > I think we can implement shutdown in core, but we need to careful with
> > > it. Instead of just blindly calling remove, let's extract the common
> > > parts to another internal function that both remove and shutdown
> > > call. debugfs removal should not be part of that generic method :-)
> > Hi Sandeep,
> > 
> > Upon re-reading your description in 568262bf5492 it sounds like the
> > original intention of your patch is basically to quiesce the HW so that
> > it doesn't continue to run after SMMU/IOMMU is disabled right?
> > 
> > If that is the case, couldn't we simply call only dwc3_core_exit_mode()
> > assuming there is no other requirement to do any other cleanup/teardown
> > (PHYs, clocks, resets, runtime PM, et al)? This function should do the
> > bare minimum of stopping the controller in whatever mode (host or
> > peripheral) it is currently operating in.
> 
> Yes that was the intention. I will call only dwc3_core_exit_mode()
> and check. Is there any way we can do from dwc3 qcom glue driver to
> avoid problems for other glue drivers?

As I mentioned above maybe you could just implement a dwc3_qcom specific
.shutdown callback which mimics what dwc3_of_simple() does by calling
of_platform_depopulate(). This will allow the kernel driver core to
invoke dwc3_remove() rather than calling it directly yourself.

The downside is that if other glue drivers want to follow this they'd
have to duplicate the same logic. But maybe this is a more cautious
approach until we start seeing other drivers needing this generically
within core.c.

> > > Anything in that generic method should, probably, be idempotent.
> > Yes we'll need to ensure that dwc3_core_exit_mode() can be called
> > multiple times without additional side effects. At first glance this
> > probably means setting dwc->xhci and dwc->gadget to NULL from
> > dwc3_host_exit() and dwc3_gadget_exit(), respectively.
> 
> Ok. Is there any way to test this ?

You could implement both the dwc3_qcom_shutdown() as above as well as
adding back dwc3_shutdown() which only does dwc3_core_exit_mode(). Make
sure that even though dwc3_core_exit_mode() gets called twice nothing
bad happens.

Jack
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

      reply	other threads:[~2021-06-16  0:14 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-01 11:02 [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove() Alexandru Elisei
2021-06-01 11:02 ` Alexandru Elisei
2021-06-03  3:07 ` Peter Chen
2021-06-03  3:07   ` Peter Chen
2021-06-03  6:30 ` Felipe Balbi
2021-06-03  6:30   ` Felipe Balbi
2021-06-03 10:41   ` Alexandru Elisei
2021-06-03 10:41     ` Alexandru Elisei
2021-06-03 11:40     ` Greg Kroah-Hartman
2021-06-03 11:40       ` Greg Kroah-Hartman
2021-06-03 12:05       ` Alexandru Elisei
2021-06-03 12:05         ` Alexandru Elisei
2021-06-03 14:35         ` Felipe Balbi
2021-06-03 14:35           ` Felipe Balbi
2021-06-03 17:36           ` Jack Pham
2021-06-04  8:20             ` Felipe Balbi
2021-06-04  8:20               ` Felipe Balbi
2021-06-07 18:00               ` Jack Pham
2021-06-10 10:11                 ` Felipe Balbi
2021-06-10 10:11                   ` Felipe Balbi
2021-06-10 15:33                   ` Jack Pham
2021-06-14  7:37                     ` Sandeep Maheswaram
2021-06-16  0:13                       ` Jack Pham [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210616001341.GC25299@jackp-linux.qualcomm.com \
    --to=jackp@codeaurora.org \
    --cc=alexandru.elisei@arm.com \
    --cc=balbi@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=p.zabel@pengutronix.de \
    --cc=sanm@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.