From: Martin Wilck <mwilck@suse.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>,
Mikulas Patocka <mpatocka@redhat.com>,
Alasdair G Kergon <agk@redhat.com>,
dm-devel@lists.linux.dev, Hannes Reinecke <hare@suse.de>,
Vasilis Liaskovitis <vliaskovitis@suse.com>
Subject: Re: [RFC Patch] dm: make sure to wait for all dispatched requests in __dm_suspend()
Date: Wed, 20 Mar 2024 10:51:50 +0100
Message-ID: <9c19b0e67ffc172121590a4b71a0ce5e9367b8a9.camel@suse.com>
In-Reply-To: <ZfpSFA9NlLKZyBDg@fedora>
On Wed, 2024-03-20 at 11:03 +0800, Ming Lei wrote:
> On Tue, Mar 19, 2024 at 04:41:26PM +0100, Martin Wilck wrote:
> >
> > What we know for sure is that there was a bad dm_target reference in
> > (struct dm_rq_target_io *tio)->ti:
> >
> > crash> struct -x dm_rq_target_io c00000245ca90128
> > struct dm_rq_target_io {
> > md = 0xc0000031c66a4000,
> > ti = 0xc0080000020d0080 <fscache_object_list_lock+665632>,
> >
> > crash> struct -x dm_target 0xc0080000020d0080
> > struct dm_target struct: invalid kernel virtual address:
> > c0080000020d0080 type: "gdb_readmem_callback"
> >
> > The question is how this could have come to pass. It can only happen
> > if tio->ti had been set before the map was reloaded.
> > My theory is that the IO had been dispatched before the queue had been
> > quiesced, like this:
> >
> > Task A                         Task B
> > (dispatching IO)               (executing a DM_SUSPEND ioctl to
> >                                 resume after DM_TABLE_LOAD)
> >                                do_resume()
> >                                  dm_suspend()
> >                                    __dm_suspend()
> > dm_mq_queue_rq()
> >   struct dm_target *ti =
> >       md->immutable_target;
> >                                      dm_stop_queue()
> >
> >                                        blk_mq_quiesce_queue()
> >   /*
> >    * At this point, the queue is quiesced, but task A
> >    * has already entered dm_mq_queue_rq()
> >    */
>
> That shouldn't happen, blk_mq_quiesce_queue() drains all pending
> dm_mq_queue_rq() and prevents new dm_mq_queue_rq() from being
> called.
Thanks for pointing this out. I'd been missing the fact that the
synchronization is achieved by the rcu_read_lock() in
__blk_mq_run_dispatch_ops(), which guards invocations of the
request dispatching code against the synchronize_rcu() in
blk_mq_wait_quiesce_done(). In our older kernel, the rcu_read_lock() was
still taken in hctx_lock(), but with the same effect.
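
To illustrate why this ordering closes the window, here is a minimal
userspace sketch in C11. It is a model only, not kernel code: a reader
counter stands in for the RCU read-side critical section, a busy-wait on
that counter stands in for synchronize_rcu(), and every model_* name is
hypothetical. The point is that a dispatcher which did not see the
quiesced flag is still counted as a reader, so the quiescing side cannot
return until it has left the critical section.

```c
/* Userspace model only. All model_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_THREADS 4
#define MODEL_ITERATIONS 100000

struct model_queue {
    atomic_bool quiesced;    /* stands in for the queue's quiesced state */
    atomic_int readers;      /* stands in for the RCU read-side section */
    atomic_bool table_valid; /* false once the old table is torn down */
    atomic_int violations;   /* dispatches that saw a torn-down table */
};

/* Models the dispatch path: a read-side section around the queue_rq call. */
static void model_dispatch(struct model_queue *q)
{
    atomic_fetch_add(&q->readers, 1);       /* roughly rcu_read_lock() */
    if (!atomic_load(&q->quiesced)) {
        /* Roughly dm_mq_queue_rq(): the target would be dereferenced here. */
        if (!atomic_load(&q->table_valid))
            atomic_fetch_add(&q->violations, 1);
    }
    atomic_fetch_sub(&q->readers, 1);       /* roughly rcu_read_unlock() */
}

/* Models quiescing: set the flag, then wait for in-flight dispatchers. */
static void model_quiesce(struct model_queue *q)
{
    atomic_store(&q->quiesced, true);
    while (atomic_load(&q->readers) != 0)   /* roughly synchronize_rcu() */
        sched_yield();
}

static void *model_dispatcher(void *arg)
{
    struct model_queue *q = arg;

    for (int i = 0; i < MODEL_ITERATIONS; i++)
        model_dispatch(q);
    return NULL;
}

/* Returns the number of dispatches that observed a torn-down table. */
int model_run(void)
{
    struct model_queue q;
    pthread_t tid[MODEL_THREADS];

    atomic_init(&q.quiesced, false);
    atomic_init(&q.readers, 0);
    atomic_init(&q.table_valid, true);
    atomic_init(&q.violations, 0);

    for (int i = 0; i < MODEL_THREADS; i++)
        pthread_create(&tid[i], NULL, model_dispatcher, &q);

    model_quiesce(&q);
    /* No dispatcher can be inside the guarded branch any more. */
    atomic_store(&q.table_valid, false);

    for (int i = 0; i < MODEL_THREADS; i++)
        pthread_join(tid[i], NULL);

    return atomic_load(&q.violations);
}
```

In this model, calling model_run() from a small main() (built with
-pthread) should report zero violations. If the wait loop in
model_quiesce() were removed, a dispatcher could observe the torn-down
table, which is the race my original theory assumed.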
This means that I no longer see how our dm_target reference could
have pointed to freed memory. For now, we'll follow Mike's advice.
Thanks a lot,
Martin