* [PATCH rdma-next] RDMA/mlx5: print wc status on CQE error and dump needed
@ 2021-12-27 12:38 Dust Li
2022-01-03 11:27 ` Leon Romanovsky
2022-01-05 19:06 ` Jason Gunthorpe
0 siblings, 2 replies; 3+ messages in thread
From: Dust Li @ 2021-12-27 12:38 UTC
To: linux-rdma, linux-kernel; +Cc: Leon Romanovsky, Jason Gunthorpe
mlx5_handle_error_cqe() only dumps the content of the CQE,
which is raw hex data and not straightforward to debug.
Print the WC status message when we get a CQE error and a dump
is needed.
Here is an example of what the dmesg log looks like with this patch:
[166755.330649] infiniband mlx5_0: mlx5_handle_error_cqe:333:(pid 0): WC error: 10, message: remote access error
[166755.332323] infiniband mlx5_0: dump_cqe:272:(pid 0): dump error cqe
[166755.332944] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[166755.333574] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[166755.334202] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[166755.334837] 00000030: 00 00 00 00 00 00 88 13 08 03 61 b3 1e a1 42 d3
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
---
drivers/infiniband/hw/mlx5/cq.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index a190fb581591..66dfadb96c66 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -328,8 +328,11 @@ static void mlx5_handle_error_cqe(struct mlx5_ib_dev *dev,
 	}
 
 	wc->vendor_err = cqe->vendor_err_synd;
-	if (dump)
+	if (dump) {
+		mlx5_ib_warn(dev, "WC error: %d, Message: %s\n",
+			     wc->status, ib_wc_status_msg(wc->status));
 		dump_cqe(dev, cqe);
+	}
 }
 
 static void handle_atomics(struct mlx5_ib_qp *qp, struct mlx5_cqe64 *cqe64,
--
2.19.1.3.ge56e4f7
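
For readers without a kernel tree at hand, the idea of the patch (turn the numeric WC status into a readable string before the raw hex dump) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `demo_wc_status`, `demo_wc_status_msg()` and `demo_format_wc_error()` are hypothetical names, not kernel APIs. The value 10 and its text "remote access error" come from the example log above; the other table entries are placeholders. The kernel itself uses `ib_wc_status_msg()` and `mlx5_ib_warn()`, as shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of work-completion status codes, for illustration only.
 * Only the value 10 ("remote access error") is taken from the example log. */
enum demo_wc_status {
	DEMO_WC_SUCCESS = 0,
	DEMO_WC_REM_ACCESS_ERR = 10,
};

/* Stand-in for a status-to-string helper: map a status code to a message. */
static const char *demo_wc_status_msg(int status)
{
	switch (status) {
	case DEMO_WC_SUCCESS:
		return "success";
	case DEMO_WC_REM_ACCESS_ERR:
		return "remote access error";
	default:
		return "unknown";
	}
}

/* Build the human-readable line that would precede the hex dump.
 * Returns the value returned by snprintf(). */
static int demo_format_wc_error(char *buf, size_t len, int status)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "WC error: %d, Message: %s",
			status, demo_wc_status_msg(status));
}
```

Calling `demo_format_wc_error(buf, sizeof(buf), 10)` fills `buf` with the same kind of line the patch adds, so an operator can read the failure reason without decoding the CQE bytes by hand.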
* Re: [PATCH rdma-next] RDMA/mlx5: print wc status on CQE error and dump needed
From: Leon Romanovsky @ 2022-01-03 11:27 UTC
To: Dust Li; +Cc: linux-rdma, linux-kernel, Jason Gunthorpe
On Mon, Dec 27, 2021 at 08:38:06PM +0800, Dust Li wrote:
> mlx5_handle_error_cqe() only dumps the content of the CQE,
> which is raw hex data and not straightforward to debug.
> Print the WC status message when we get a CQE error and a dump
> is needed.
>
> Here is an example of what the dmesg log looks like with this patch:
> [166755.330649] infiniband mlx5_0: mlx5_handle_error_cqe:333:(pid 0): WC error: 10, message: remote access error
> [166755.332323] infiniband mlx5_0: dump_cqe:272:(pid 0): dump error cqe
> [166755.332944] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [166755.333574] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [166755.334202] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [166755.334837] 00000030: 00 00 00 00 00 00 88 13 08 03 61 b3 1e a1 42 d3
>
> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
> ---
> drivers/infiniband/hw/mlx5/cq.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
Thanks,
Acked-by: Leon Romanovsky <leonro@nvidia.com>
* Re: [PATCH rdma-next] RDMA/mlx5: print wc status on CQE error and dump needed
From: Jason Gunthorpe @ 2022-01-05 19:06 UTC
To: Dust Li; +Cc: linux-rdma, linux-kernel, Leon Romanovsky
On Mon, Dec 27, 2021 at 08:38:06PM +0800, Dust Li wrote:
> mlx5_handle_error_cqe() only dumps the content of the CQE,
> which is raw hex data and not straightforward to debug.
> Print the WC status message when we get a CQE error and a dump
> is needed.
>
> Here is an example of what the dmesg log looks like with this patch:
> [166755.330649] infiniband mlx5_0: mlx5_handle_error_cqe:333:(pid 0): WC error: 10, message: remote access error
> [166755.332323] infiniband mlx5_0: dump_cqe:272:(pid 0): dump error cqe
> [166755.332944] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [166755.333574] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [166755.334202] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [166755.334837] 00000030: 00 00 00 00 00 00 88 13 08 03 61 b3 1e a1 42 d3
>
> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
> Acked-by: Leon Romanovsky <leonro@nvidia.com>
> ---
> drivers/infiniband/hw/mlx5/cq.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
Applied to for-next, thanks
Jason