All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvmet: fail outstanding host posted AEN req
@ 2020-06-09 22:43 Chaitanya Kulkarni
  2020-06-09 22:50 ` Sagi Grimberg
  0 siblings, 1 reply; 2+ messages in thread
From: Chaitanya Kulkarni @ 2020-06-09 22:43 UTC (permalink / raw
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

In function nvmet_async_event_process() we only process AENs iff
there is an open slot on the ctrl->async_event_cmds[] && aen
event list posted by the target is not empty. This keeps host
posted AEN outstanding if target generated AEN list is empty.
We do cleanup the target generated entries from the aen list in
nvmet_ctrl_free()-> nvmet_async_events_free() but we don't
process AEN posted by the host. This leads to following problem :-

When processing admin sq at the time of nvmet_sq_destroy() holds
an extra percpu reference(atomic value = 1), so in the following code
path after switching to atomic rcu, release function (nvmet_sq_free())
is not getting called which blocks the sq->free_done in
nvmet_sq_destroy() :-

nvmet_sq_destroy()
 percpu_ref_kill_and_confirm()
 - __percpu_ref_switch_mode()
 --  __percpu_ref_switch_to_atomic()
 ---   call_rcu() -> percpu_ref_switch_to_atomic_rcu()
 ----     /* calls switch callback */
 - percpu_ref_put()
 -- percpu_ref_put_many(ref, 1)
 --- else if (unlikely(atomic_long_sub_and_test(nr, &ref->count)))
 ----   ref->release(ref); <---- Not called.

This results in indefinite hang:-

  void nvmet_sq_destroy(struct nvmet_sq *sq)
...
          if (ctrl && ctrl->sqs && ctrl->sqs[0] == sq) {
                  nvmet_async_events_process(ctrl, status);
                  percpu_ref_put(&sq->ref);
          }
          percpu_ref_kill_and_confirm(&sq->ref, nvmet_confirm_sq);
          wait_for_completion(&sq->confirm_done);
          wait_for_completion(&sq->free_done); <-- Hang here

Which breaks the further disconnect sequence. This problem seems to be
introduced after commit 64f5e9cdd711b ("nvmet: fix memory leak when
removing namespaces and controllers concurrently").

This patch processes the outstanding ctrl->async_event_cmd[] until
there are no cmds available in array irrespective of aen list if
empty or not.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/target/core.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 6392bcd30bd7..fd7095ef7bcd 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -150,6 +150,19 @@ static void nvmet_async_events_process(struct nvmet_ctrl *ctrl, u16 status)
 		nvmet_req_complete(req, status);
 		mutex_lock(&ctrl->lock);
 	}
+	/*
+	 * Target controller's host posted events needs to be explicitly
+	 * checked and cleared since there is no 1 : 1 mapping between
+	 * host posted AEN requests and target generated AENs on the
+	 * target controller's aen_list to the async_event_cmds array.
+	 */
+	while (status != 0 && ctrl->nr_async_event_cmds) {
+		req = ctrl->async_event_cmds[--ctrl->nr_async_event_cmds];
+		mutex_unlock(&ctrl->lock);
+		trace_nvmet_async_event(ctrl, req->cqe->result.u32);
+		nvmet_req_complete(req, status);
+		mutex_lock(&ctrl->lock);
+	}
 	mutex_unlock(&ctrl->lock);
 }
 
-- 
2.22.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] nvmet: fail outstanding host posted AEN req
  2020-06-09 22:43 [PATCH] nvmet: fail outstanding host posted AEN req Chaitanya Kulkarni
@ 2020-06-09 22:50 ` Sagi Grimberg
  0 siblings, 0 replies; 2+ messages in thread
From: Sagi Grimberg @ 2020-06-09 22:50 UTC (permalink / raw
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch


> +	/*
> +	 * Target controller's host posted events needs to be explicitly
> +	 * checked and cleared since there is no 1 : 1 mapping between
> +	 * host posted AEN requests and target generated AENs on the
> +	 * target controller's aen_list to the async_event_cmds array.
> +	 */
> +	while (status != 0 && ctrl->nr_async_event_cmds) {
> +		req = ctrl->async_event_cmds[--ctrl->nr_async_event_cmds];
> +		mutex_unlock(&ctrl->lock);
> +		trace_nvmet_async_event(ctrl, req->cqe->result.u32);
> +		nvmet_req_complete(req, status);
> +		mutex_lock(&ctrl->lock);
> +	}
>   	mutex_unlock(&ctrl->lock);

Its out of place in my mind, I'd prefer to have it on a dedicated
routine if we go down that route...

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-06-09 22:50 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-06-09 22:43 [PATCH] nvmet: fail outstanding host posted AEN req Chaitanya Kulkarni
2020-06-09 22:50 ` Sagi Grimberg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.