From: Keith Busch <kbusch@kernel.org>
To: Max Gurtovoy <mgurtovoy@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
	axboe@kernel.dk, sagi@grimberg.me
Subject: Re: [PATCHv2 1/3] block: introduce rq_list_for_each_safe macro
Date: Wed, 5 Jan 2022 09:26:25 -0800
Message-ID: <20220105172625.GA3181467@dhcp-10-100-145-180.wdc.com>
In-Reply-To: <ac74ac4c-15f3-997e-ecd2-5e704a5b4573@nvidia.com>

On Tue, Jan 04, 2022 at 02:15:58PM +0200, Max Gurtovoy wrote:
> 
> This patch worked for me with 2 namespaces for NVMe PCI.
> 
> I'll check it later on with my RDMA queue_rqs patches as well. There we also
> have tagset sharing with the connect_q (and not only with multiple
> namespaces).
> 
> But the connect_q is using reserved tags only (for the connect commands).
> 
> I saw some strange things that I couldn't understand:
> 
> 1. running randread fio with libaio ioengine didn't call nvme_queue_rqs -
> expected
> 
> 2. running randwrite fio with libaio ioengine did call nvme_queue_rqs - not
> expected!
> 
> 3. running randread fio with io_uring ioengine (and --iodepth_batch=32)
> didn't call nvme_queue_rqs - not expected!
> 
> 4. running randwrite fio with io_uring ioengine (and --iodepth_batch=32) did
> call nvme_queue_rqs - expected
> 
> 5. running randread fio with io_uring ioengine (and --iodepth_batch=32
> --runtime=30) didn't finish after 30 seconds and was stuck for 300 seconds
> (the fio jobs required "kill -9 fio" to drop the references held on
> nvme_core) - not expected!
> 
> debug print: fio: job 'task_nvme0n1' (state=5) hasn't exited in 300
> seconds, it appears to be stuck. Doing forceful exit of this job.
> 
> 6. running randwrite fio with io_uring ioengine (and --iodepth_batch=32
> --runtime=30) didn't finish after 30 seconds and was stuck for 300 seconds
> (the fio jobs required "kill -9 fio" to drop the references held on
> nvme_core) - not expected!
> 
> debug print: fio: job 'task_nvme0n1' (state=5) hasn't exited in 300
> seconds, it appears to be stuck. Doing forceful exit of this job.
> 
> 
> Any idea what could cause these unexpected scenarios? At least they are
> unexpected to me :)

Not sure about all the scenarios. I believe it should call queue_rqs
any time we finish a plugged list of requests, as long as the requests
all come from the same request_queue and the plug is not being flushed
from io_schedule().
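
For what it's worth, here is a rough sketch of that flush-time decision.
This is not the exact kernel code; the blk_plug fields and rq_list
helpers are from my reading of blk_mq_flush_plug_list() around this
series, so treat the details as approximate:

	/* Sketch only: a simplified view of the plug flush path. */
	static void flush_plug_sketch(struct blk_plug *plug, bool from_schedule)
	{
		if (rq_list_empty(plug->mq_list))
			return;

		/*
		 * ->queue_rqs() is only attempted when every request in
		 * the plug targets the same request_queue, no I/O
		 * scheduler is involved, and the flush wasn't triggered
		 * from io_schedule() (the from_schedule == true case).
		 */
		if (!plug->multiple_queues && !plug->has_elevator &&
		    !from_schedule) {
			struct request_queue *q =
				rq_list_peek(&plug->mq_list)->q;

			if (q->mq_ops->queue_rqs)
				q->mq_ops->queue_rqs(&plug->mq_list);
		}

		/* Whatever is still on plug->mq_list is issued singly. */
	}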

The stuck fio job might be due to a lost request, which is what this
series should address. It would be unusual to see such an error in
normal operation, though; I had to synthesize errors to verify the bug
and the fix.
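
For context, the safe iterator from patch 1/3 is what makes that fix
possible: it samples the next pointer before the loop body runs, so the
body can unlink the current request (say, moving it to a requeue list)
without losing its place. From memory, the macro is roughly the
following (the exact NULL handling may differ from what was posted):

	#define rq_list_for_each_safe(listptr, pos, nxt)		\
		for (pos = rq_list_peek((listptr)),			\
		     nxt = pos ? rq_list_next(pos) : NULL;		\
		     pos; pos = nxt, nxt = pos ? rq_list_next(pos) : NULL)

With the plain rq_list_for_each, unlinking pos mid-walk leaves the next
step reading a stale (or rewired) rq_next, which is one way a request
can quietly go missing from the list.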

In any case, I'll run more multi-namespace tests to see if I can find
any other issues with shared tags.


Thread overview: 22+ messages
2021-12-27 16:41 [PATCHv2 1/3] block: introduce rq_list_for_each_safe macro Keith Busch
2021-12-27 16:41 ` [PATCHv2 2/3] block: introduce rq_list_move Keith Busch
2021-12-27 18:49   ` kernel test robot
2021-12-29 17:41   ` Christoph Hellwig
2021-12-29 20:59     ` Keith Busch
2021-12-27 16:41 ` [PATCHv2 3/3] nvme-pci: fix queue_rqs list splitting Keith Busch
2021-12-29 17:46   ` Christoph Hellwig
2021-12-29 21:04     ` Keith Busch
2021-12-30  7:53       ` Christoph Hellwig
2022-01-04 19:38     ` Keith Busch
2022-01-05  7:35       ` Christoph Hellwig
2021-12-29 17:39 ` [PATCHv2 1/3] block: introduce rq_list_for_each_safe macro Christoph Hellwig
2021-12-29 20:57   ` Keith Busch
2021-12-30 14:38     ` Max Gurtovoy
2021-12-30 15:30       ` Keith Busch
2022-01-03 15:23         ` Max Gurtovoy
2022-01-03 18:15           ` Keith Busch
2022-01-04 12:15             ` Max Gurtovoy
2022-01-05 17:26               ` Keith Busch [this message]
2022-01-06 11:54                 ` Max Gurtovoy
2022-01-06 13:41                   ` Jens Axboe
