* [PATCH for-next 00/11] IB/hfi1, rdmavt, qib: Driver updates for 12/18/2017
@ 2017-12-19  3:56 Dennis Dalessandro
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:56 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Mike Marciniszyn, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Patel Jay P,
	Kaike Wan, Michael J. Ruhl, Sebastian Sanchez

Hi Jason and Doug,

Here is another set of patches to land for 4.16. Just driver changes and a
small change for the MAINTAINERS file. We are getting rid of that ipath email
and ensuring Mike and I both receive direct mails for all our drivers instead.

https://github.com/ddalessa/kernel/tree/for-4.16

---

Dennis Dalessandro (2):
      IB/hfi1: Check return value of strchr before using it
      rdma: Update maintainer contact for Intel RDMA drivers

Kaike Wan (2):
      IB/rdmavt: No need to cancel RNRNAK retry timer when it is running
      IB/rdmavt: Add trace for RNRNAK timer

Michael J. Ruhl (3):
      IB/{rdmavt,hfi1,qib}: Self determine driver name
      IB/{rdmavt,hfi1,qib}: Remove get_card_name() downcall
      IB/{hfi1,qib}: Fix a concurrency issue with device name in logging

Mike Marciniszyn (2):
      IB/rdmavt: Use correct numa node for SRQ allocation
      IB/rdmavt: Allocate CQ memory on the correct node

Patel Jay P (1):
      IB/hfi1: Destroy link_wq workqueue after free_irq()

Sebastian Sanchez (1):
      IB/hfi1: Fix infinite loop in 8051 command error path


 MAINTAINERS                             |    4 +
 drivers/infiniband/hw/hfi1/chip.c       |   98 +++++++++++++------------------
 drivers/infiniband/hw/hfi1/chip.h       |    2 -
 drivers/infiniband/hw/hfi1/driver.c     |   16 -----
 drivers/infiniband/hw/hfi1/firmware.c   |   64 +++++---------------
 drivers/infiniband/hw/hfi1/hfi.h        |   27 +++++----
 drivers/infiniband/hw/hfi1/init.c       |   33 ++++++++--
 drivers/infiniband/hw/hfi1/verbs.c      |   10 ++-
 drivers/infiniband/hw/qib/qib.h         |    8 +--
 drivers/infiniband/hw/qib/qib_driver.c  |   16 -----
 drivers/infiniband/hw/qib/qib_eeprom.c  |    3 -
 drivers/infiniband/hw/qib/qib_init.c    |    2 +
 drivers/infiniband/hw/qib/qib_verbs.c   |    2 -
 drivers/infiniband/sw/rdmavt/cq.c       |   10 ++-
 drivers/infiniband/sw/rdmavt/qp.c       |    9 +--
 drivers/infiniband/sw/rdmavt/srq.c      |   16 +++--
 drivers/infiniband/sw/rdmavt/trace.h    |    4 +
 drivers/infiniband/sw/rdmavt/trace_qp.h |   42 +++++++++++++
 drivers/infiniband/sw/rdmavt/vt.c       |    1 
 drivers/infiniband/sw/rdmavt/vt.h       |    6 +-
 include/rdma/rdma_vt.h                  |   31 ++++++++--
 21 files changed, 204 insertions(+), 200 deletions(-)

--
-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after free_irq()
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-12-19  3:56   ` Dennis Dalessandro
       [not found]     ` <20171219035612.2126.10447.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-12-19  3:56   ` [PATCH for-next 02/11] IB/hfi1: Check return value of strchr before using it Dennis Dalessandro
                     ` (10 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:56 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl, Patel Jay P,
	Sebastian Sanchez

From: Patel Jay P <jay.p.patel-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

A sporadic crash occurs when the handle_8051_interrupt handler is invoked
during rmmod. The handler runs after all workqueue-related resources have
been freed, which results in a crash.

Call Trace:
 queue_work_on+0x27/0x40
 handle_8051_interrupt+0x417/0x710 [hfi1]
 ? handle_dcc_err+0x212/0x660 [hfi1]
 ? check_preempt_wakeup+0x119/0x250
 ? tracing_is_on+0x15/0x30
 ? tracing_record_taskinfo_skip+0x1e/0x40
 ? radix_tree_next_chunk+0x10b/0x2e0
 ? __slab_free+0x9b/0x2c0
 interrupt_clear_down+0x43/0x120 [hfi1]
 is_dc_int+0x2f/0xa0 [hfi1]
 general_interrupt+0x18c/0x1f0 [hfi1]
 __free_irq+0x1b3/0x2d0
 free_irq+0x35/0x70
 pci_free_irq+0x1c/0x30
 clean_up_interrupts+0x53/0xf0 [hfi1]
 hfi1_start_cleanup+0x122/0x190 [hfi1]
 postinit_cleanup+0x1d/0x280 [hfi1]
 remove_one+0x233/0x250 [hfi1]
 pci_device_remove+0x39/0xc0

When the kernel is built with the CONFIG_DEBUG_SHIRQ config flag, an extra
call to the IRQ handler is made from the __free_irq() function. The driver
should be prepared for this fake call.

Add a mechanism that detects whether the handler is invoked after
interrupts have been disabled. An hfi_intr_mask field is added to the
hfi1_devdata structure as a replica of the interrupt mask register of the
hfi device. The field is updated whenever a value is written to the register.

Destroy the link_wq workqueue after calling free_irq. This ensures that if
the interrupt handler is invoked before or during free_irq, the workqueue
is destroyed only after the interrupt has been handled.
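
The shadow-mask idea can be illustrated outside the kernel. What follows is
only a userspace sketch under stated assumptions, not the driver code:
struct toy_dev, NUM_INT_CSRS, toy_set_intr_state() and toy_filter_status()
are hypothetical stand-ins for hfi1_devdata, CCE_NUM_INT_CSRS,
set_intr_state() and the filtering step in general_interrupt().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_INT_CSRS 4 /* hypothetical; stands in for CCE_NUM_INT_CSRS */

/* Toy device: a fake mask register bank plus its software shadow copy. */
struct toy_dev {
	uint64_t hw_mask[NUM_INT_CSRS];     /* stands in for the CCE_INT_MASK CSRs */
	uint64_t shadow_mask[NUM_INT_CSRS]; /* stands in for dd->hfi_intr_mask */
};

/* Write every mask "register" and keep the shadow copy in step with it. */
static void toy_set_intr_state(struct toy_dev *dev, int enable, uint64_t mask)
{
	size_t i;

	assert(dev);
	for (i = 0; i < NUM_INT_CSRS; i++) {
		uint64_t v = enable ? mask : 0;

		dev->hw_mask[i] = v;
		dev->shadow_mask[i] = v;
	}
}

/*
 * Drop status bits whose source is masked off. Returns 1 if any bit is
 * left to dispatch, 0 if the call was spurious (for example, the extra
 * handler invocation made while the IRQ is being freed).
 */
static int toy_filter_status(const struct toy_dev *dev,
			     uint64_t regs[NUM_INT_CSRS])
{
	size_t i;
	int pending = 0;

	assert(dev);
	for (i = 0; i < NUM_INT_CSRS; i++) {
		regs[i] &= dev->shadow_mask[i];
		if (regs[i])
			pending = 1;
	}
	return pending;
}
```

Once toy_set_intr_state(dev, 0, 0) has run, toy_filter_status() clears every
status bit, so a late handler call finds nothing to dispatch and never
reaches code that would queue work.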

Fixes: 05cb18fda926 ("IB/hfi1: Update HFI to use the latest PCI API")
Reviewed-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Patel Jay P <jay.p.patel-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/chip.c |    8 +++++++-
 drivers/infiniband/hw/hfi1/hfi.h  |    4 ++++
 drivers/infiniband/hw/hfi1/init.c |   31 ++++++++++++++++++++++---------
 3 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 4f057e8..87748a6 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -8224,6 +8224,8 @@ static irqreturn_t general_interrupt(int irq, void *data)
 		/* only clear if anything is set */
 		if (regs[i])
 			write_csr(dd, CCE_INT_CLEAR + (8 * i), regs[i]);
+
+		regs[i] &= dd->hfi_intr_mask[i];
 	}
 
 	/* phase 2: call the appropriate handler */
@@ -12942,12 +12944,15 @@ void set_intr_state(struct hfi1_devdata *dd, u32 enable)
 			u64 mask = get_int_mask(dd, i);
 
 			write_csr(dd, CCE_INT_MASK + (8 * i), mask);
+			dd->hfi_intr_mask[i] = mask;
 		}
 
 		init_qsfp_int(dd);
 	} else {
-		for (i = 0; i < CCE_NUM_INT_CSRS; i++)
+		for (i = 0; i < CCE_NUM_INT_CSRS; i++) {
 			write_csr(dd, CCE_INT_MASK + (8 * i), 0ull);
+			dd->hfi_intr_mask[i] =  0ull;
+		}
 	}
 }
 
@@ -14773,6 +14778,7 @@ void hfi1_start_cleanup(struct hfi1_devdata *dd)
 	free_cntrs(dd);
 	free_rcverr(dd);
 	clean_up_interrupts(dd);
+	clean_up_workqueues(dd);
 	finish_chip_resources(dd);
 }
 
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 4a9b4d7..e12a80b 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1188,6 +1188,9 @@ struct hfi1_devdata {
 	/* INTx information */
 	u32 requested_intx_irq;		/* did we request one? */
 
+	/* copy of interrupt mask register */
+	u64 hfi_intr_mask[CCE_NUM_INT_CSRS];
+
 	/* general interrupt: mask of handled interrupts */
 	u64 gi_mask[CCE_NUM_INT_CSRS];
 
@@ -1993,6 +1996,7 @@ static inline void flush_wc(void)
 int kdeth_process_eager(struct hfi1_packet *packet);
 int process_receive_invalid(struct hfi1_packet *packet);
 void seqfile_dump_rcd(struct seq_file *s, struct hfi1_ctxtdata *rcd);
+void clean_up_workqueues(struct hfi1_devdata *dd);
 
 /* global module parameter variables */
 extern unsigned int hfi1_max_mtu;
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 8e3b3e7..c84af52 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -823,6 +823,28 @@ static int create_workqueues(struct hfi1_devdata *dd)
 }
 
 /**
+ * clean_up_workqueues - destroys hfi1_wq and link_wq workqueues
+ * @dd: the hfi1_ib device
+ */
+void clean_up_workqueues(struct hfi1_devdata *dd)
+{
+	int pidx;
+	struct hfi1_pportdata *ppd;
+
+	for (pidx = 0; pidx < dd->num_pports; ++pidx) {
+		ppd = dd->pport + pidx;
+		if (ppd->hfi1_wq) {
+			destroy_workqueue(ppd->hfi1_wq);
+			ppd->hfi1_wq = NULL;
+		}
+		if (ppd->link_wq) {
+			destroy_workqueue(ppd->link_wq);
+			ppd->link_wq = NULL;
+		}
+	}
+}
+
+/**
  * hfi1_init - do the actual initialization sequence on the chip
  * @dd: the hfi1_ib device
  * @reinit: re-initializing, so don't allocate new memory
@@ -1102,15 +1124,6 @@ static void shutdown_device(struct hfi1_devdata *dd)
 		 * We can't count on interrupts since we are stopping.
 		 */
 		hfi1_quiet_serdes(ppd);
-
-		if (ppd->hfi1_wq) {
-			destroy_workqueue(ppd->hfi1_wq);
-			ppd->hfi1_wq = NULL;
-		}
-		if (ppd->link_wq) {
-			destroy_workqueue(ppd->link_wq);
-			ppd->link_wq = NULL;
-		}
 	}
 	sdma_exit(dd);
 }

* [PATCH for-next 02/11] IB/hfi1: Check return value of strchr before using it
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-12-19  3:56   ` [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after free_irq() Dennis Dalessandro
@ 2017-12-19  3:56   ` Dennis Dalessandro
       [not found]     ` <20171219035621.2126.23093.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-12-19  3:56   ` [PATCH for-next 03/11] IB/rdmavt: No need to cancel RNRNAK retry timer when it is running Dennis Dalessandro
                     ` (9 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:56 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl

The call to strchr in our counter initialization does not check the return
value before attempting to use the pointer. In theory a NULL return should
not happen given the way the code is structured, but do the smart thing and
check the value anyway to harden the code.
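
The hardening pattern can be shown in isolation. This is a minimal userspace
sketch, assuming a buffer of newline-terminated names; split_names() is a
hypothetical helper and not the driver's init_cntr_names(), which also
resets its output parameters and frees its allocation on failure.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Split a buffer holding n newline-terminated names in place, storing a
 * pointer to each name in out[]. Returns 0 on success, or -EINVAL if the
 * buffer contains fewer than n newlines.
 */
static int split_names(char *buf, size_t n, char **out)
{
	char *p = buf;
	size_t i;

	assert(buf && out);
	for (i = 0; i < n; i++) {
		out[i] = p;
		p = strchr(p, '\n');
		if (!p)
			return -EINVAL; /* malformed input: stop before dereferencing */
		*p++ = '\0';
	}
	return 0;
}
```

With the input "rx\ntx\n" and n = 2 the call succeeds; with "rx\ntx" the
second strchr() returns NULL and the function reports -EINVAL instead of
writing through a NULL pointer.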

Reviewed-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/verbs.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 6d27c85..2487190 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1733,6 +1733,12 @@ static int init_cntr_names(const char *names_in,
 	for (i = 0; i < n; i++) {
 		q[i] = p;
 		p = strchr(p, '\n');
+		if (!p) {
+			*num_cntrs = 0;
+			*cntr_names = NULL;
+			kfree(names_out);
+			return -EINVAL;
+		}
 		*p++ = '\0';
 	}
 

* [PATCH for-next 03/11] IB/rdmavt: No need to cancel RNRNAK retry timer when it is running
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-12-19  3:56   ` [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after free_irq() Dennis Dalessandro
  2017-12-19  3:56   ` [PATCH for-next 02/11] IB/hfi1: Check return value of strchr before using it Dennis Dalessandro
@ 2017-12-19  3:56   ` Dennis Dalessandro
  2017-12-19  3:56   ` [PATCH for-next 04/11] IB/{rdmavt, hfi1, qib}: Self determine driver name Dennis Dalessandro
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:56 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn, Kaike Wan

From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

When rdmavt's RNRNAK timer fires, its handler tries to cancel the timer by
calling hrtimer_try_to_cancel(), which always returns -1 because the timer
callback is currently running. This patch removes this useless call.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/sw/rdmavt/qp.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index 831f7bd..f3c6d2c 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -2110,10 +2110,8 @@ static int rvt_stop_rnr_timer(struct rvt_qp *qp)
 
 	lockdep_assert_held(&qp->s_lock);
 	/* Remove QP from rnr timer */
-	if (qp->s_flags & RVT_S_WAIT_RNR) {
+	if (qp->s_flags & RVT_S_WAIT_RNR)
 		qp->s_flags &= ~RVT_S_WAIT_RNR;
-		rval = hrtimer_try_to_cancel(&qp->s_rnr_timer);
-	}
 	return rval;
 }
 

* [PATCH for-next 04/11] IB/{rdmavt, hfi1, qib}: Self determine driver name
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-12-19  3:56   ` [PATCH for-next 03/11] IB/rdmavt: No need to cancel RNRNAK retry timer when it is running Dennis Dalessandro
@ 2017-12-19  3:56   ` Dennis Dalessandro
       [not found]     ` <20171219035635.2126.59763.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-12-19  3:56   ` [PATCH for-next 05/11] IB/{rdmavt, hfi1, qib}: Remove get_card_name() downcall Dennis Dalessandro
                     ` (7 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:56 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl,
	Mike Marciniszyn

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Currently the HFI and QIB drivers allow the IB core to assign a unit
number to the driver name string.

If multiple devices exist in a system, there is a possibility that the
device unit number and the IB core number will be mismatched.

Fix by using the driver defined unit number to generate the device
name.
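
As a rough illustration of generating the name from the driver's own unit
number, here is a userspace sketch. toy_set_dev_name() and TOY_NAME_MAX are
hypothetical stand-ins for rvt_set_ibdev_name() and IB_DEVICE_NAME_MAX, and
the "hfi1" prefix is assumed purely for the example.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_NAME_MAX 64 /* hypothetical; stands in for IB_DEVICE_NAME_MAX */

/*
 * Format a device name from a driver-chosen prefix and unit number.
 * fmt must consume exactly one string and one int, e.g. "%s_%d".
 * Returns the snprintf() result: the length the full name needs,
 * which exceeds size - 1 if the name was truncated.
 */
static int toy_set_dev_name(char *dst, size_t size, const char *fmt,
			    const char *name, int unit)
{
	assert(dst && size > 0 && fmt && name);
	return snprintf(dst, size, fmt, name, unit);
}
```

Calling toy_set_dev_name(buf, sizeof(buf), "%s_%d", "hfi1", 0) fills buf with
the complete name up front, so the unit in the name is the one the driver
allocated rather than one substituted later for a "%d" placeholder.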

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/init.c     |    2 ++
 drivers/infiniband/hw/hfi1/verbs.c    |    3 ---
 drivers/infiniband/hw/qib/qib_init.c  |    2 ++
 drivers/infiniband/hw/qib/qib_verbs.c |    1 -
 include/rdma/rdma_vt.h                |   13 +++++++++++++
 5 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index c84af52..a86ad2d 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1285,6 +1285,8 @@ struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev, size_t extra)
 			       "Could not allocate unit ID: error %d\n", -ret);
 		goto bail;
 	}
+	rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit);
+
 	/*
 	 * Initialize all locks for the device. This needs to be as early as
 	 * possible so locks are usable.
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 2487190..81fcff3 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1850,7 +1850,6 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
 	struct hfi1_ibport *ibp = &ppd->ibport_data;
 	unsigned i;
 	int ret;
-	size_t lcpysz = IB_DEVICE_NAME_MAX;
 
 	for (i = 0; i < dd->num_pports; i++)
 		init_ibport(ppd + i);
@@ -1878,8 +1877,6 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
 	 */
 	if (!ib_hfi1_sys_image_guid)
 		ib_hfi1_sys_image_guid = ibdev->node_guid;
-	lcpysz = strlcpy(ibdev->name, class_name(), lcpysz);
-	strlcpy(ibdev->name + lcpysz, "_%d", IB_DEVICE_NAME_MAX - lcpysz);
 	ibdev->owner = THIS_MODULE;
 	ibdev->phys_port_cnt = dd->num_pports;
 	ibdev->dev.parent = &dd->pcidev->dev;
diff --git a/drivers/infiniband/hw/qib/qib_init.c b/drivers/infiniband/hw/qib/qib_init.c
index 5243ad3..4f9d38b 100644
--- a/drivers/infiniband/hw/qib/qib_init.c
+++ b/drivers/infiniband/hw/qib/qib_init.c
@@ -1119,6 +1119,8 @@ struct qib_devdata *qib_alloc_devdata(struct pci_dev *pdev, size_t extra)
 			      "Could not allocate unit ID: error %d\n", -ret);
 		goto bail;
 	}
+	rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s%d", "qib", dd->unit);
+
 	dd->int_counter = alloc_percpu(u64);
 	if (!dd->int_counter) {
 		ret = -ENOMEM;
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index c550005..373b80b 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1571,7 +1571,6 @@ int qib_register_ib_device(struct qib_devdata *dd)
 	if (!ib_qib_sys_image_guid)
 		ib_qib_sys_image_guid = ppd->guid;
 
-	strlcpy(ibdev->name, "qib%d", IB_DEVICE_NAME_MAX);
 	ibdev->owner = THIS_MODULE;
 	ibdev->node_guid = ppd->guid;
 	ibdev->phys_port_cnt = dd->num_pports;
diff --git a/include/rdma/rdma_vt.h b/include/rdma/rdma_vt.h
index 1ba84a7..b57784e 100644
--- a/include/rdma/rdma_vt.h
+++ b/include/rdma/rdma_vt.h
@@ -419,6 +419,19 @@ struct rvt_dev_info {
 
 };
 
+/**
+ * rvt_set_ibdev_name - Craft an IB device name from client info
+ * @rdi: pointer to the client rvt_dev_info structure
+ * @name: client specific name
+ * @unit: client specific unit number.
+ */
+static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi,
+				      const char *fmt, const char *name,
+				      const int unit)
+{
+	snprintf(rdi->ibdev.name, sizeof(rdi->ibdev.name), fmt, name, unit);
+}
+
 static inline struct rvt_pd *ibpd_to_rvtpd(struct ib_pd *ibpd)
 {
 	return container_of(ibpd, struct rvt_pd, ibpd);

* [PATCH for-next 05/11] IB/{rdmavt, hfi1, qib}: Remove get_card_name() downcall
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-12-19  3:56   ` [PATCH for-next 04/11] IB/{rdmavt, hfi1, qib}: Self determine driver name Dennis Dalessandro
@ 2017-12-19  3:56   ` Dennis Dalessandro
  2017-12-19  3:56   ` [PATCH for-next 06/11] IB/rdmavt: Use correct numa node for SRQ allocation Dennis Dalessandro
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:56 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl,
	Mike Marciniszyn

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

rdmavt has a down call to client drivers to retrieve a crafted card
name.

This name should be the IB defined name.

Rather than craft the name each time it is needed, simply retrieve
the IB allocated name from the IB device.

Update the function name to reflect its application.

Clean up driver code to match this change.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/driver.c    |    8 --------
 drivers/infiniband/hw/hfi1/hfi.h       |    1 -
 drivers/infiniband/hw/hfi1/verbs.c     |    1 -
 drivers/infiniband/hw/qib/qib.h        |    1 -
 drivers/infiniband/hw/qib/qib_driver.c |    8 --------
 drivers/infiniband/hw/qib/qib_verbs.c  |    1 -
 drivers/infiniband/sw/rdmavt/trace.h   |    4 ++--
 drivers/infiniband/sw/rdmavt/vt.c      |    1 -
 drivers/infiniband/sw/rdmavt/vt.h      |    6 +++---
 include/rdma/rdma_vt.h                 |   18 +++++++++++-------
 10 files changed, 16 insertions(+), 33 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 4f65ac6..4561719 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -167,14 +167,6 @@ static int hfi1_caps_get(char *buffer, const struct kernel_param *kp)
 	return iname;
 }
 
-const char *get_card_name(struct rvt_dev_info *rdi)
-{
-	struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev, rdi);
-	struct hfi1_devdata *dd = container_of(ibdev,
-					       struct hfi1_devdata, verbs_dev);
-	return get_unit_name(dd->unit);
-}
-
 struct pci_dev *get_pci_dev(struct rvt_dev_info *rdi)
 {
 	struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev, rdi);
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index e12a80b..88fa934 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1976,7 +1976,6 @@ int get_platform_config_field(struct hfi1_devdata *dd,
 			      u32 *data, u32 len);
 
 const char *get_unit_name(int unit);
-const char *get_card_name(struct rvt_dev_info *rdi);
 struct pci_dev *get_pci_dev(struct rvt_dev_info *rdi);
 
 /*
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 81fcff3..e05f843 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1896,7 +1896,6 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
 	 * Fill in rvt info object.
 	 */
 	dd->verbs_dev.rdi.driver_f.port_callback = hfi1_create_port_files;
-	dd->verbs_dev.rdi.driver_f.get_card_name = get_card_name;
 	dd->verbs_dev.rdi.driver_f.get_pci_dev = get_pci_dev;
 	dd->verbs_dev.rdi.driver_f.check_ah = hfi1_check_ah;
 	dd->verbs_dev.rdi.driver_f.notify_new_ah = hfi1_notify_new_ah;
diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h
index 092ed81..34c5254 100644
--- a/drivers/infiniband/hw/qib/qib.h
+++ b/drivers/infiniband/hw/qib/qib.h
@@ -1429,7 +1429,6 @@ int qib_pcie_ddinit(struct qib_devdata *, struct pci_dev *,
 dma_addr_t qib_map_page(struct pci_dev *, struct page *, unsigned long,
 			  size_t, int);
 const char *qib_get_unit_name(int unit);
-const char *qib_get_card_name(struct rvt_dev_info *rdi);
 struct pci_dev *qib_get_pci_dev(struct rvt_dev_info *rdi);
 
 /*
diff --git a/drivers/infiniband/hw/qib/qib_driver.c b/drivers/infiniband/hw/qib/qib_driver.c
index 33d3335..9b13680 100644
--- a/drivers/infiniband/hw/qib/qib_driver.c
+++ b/drivers/infiniband/hw/qib/qib_driver.c
@@ -89,14 +89,6 @@
 	return iname;
 }
 
-const char *qib_get_card_name(struct rvt_dev_info *rdi)
-{
-	struct qib_ibdev *ibdev = container_of(rdi, struct qib_ibdev, rdi);
-	struct qib_devdata *dd = container_of(ibdev,
-					      struct qib_devdata, verbs_dev);
-	return qib_get_unit_name(dd->unit);
-}
-
 struct pci_dev *qib_get_pci_dev(struct rvt_dev_info *rdi)
 {
 	struct qib_ibdev *ibdev = container_of(rdi, struct qib_ibdev, rdi);
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index 373b80b..fabee76 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1585,7 +1585,6 @@ int qib_register_ib_device(struct qib_devdata *dd)
 	 * Fill in rvt info object.
 	 */
 	dd->verbs_dev.rdi.driver_f.port_callback = qib_create_port_files;
-	dd->verbs_dev.rdi.driver_f.get_card_name = qib_get_card_name;
 	dd->verbs_dev.rdi.driver_f.get_pci_dev = qib_get_pci_dev;
 	dd->verbs_dev.rdi.driver_f.check_ah = qib_check_ah;
 	dd->verbs_dev.rdi.driver_f.check_send_wqe = qib_check_send_wqe;
diff --git a/drivers/infiniband/sw/rdmavt/trace.h b/drivers/infiniband/sw/rdmavt/trace.h
index bb4b1e7..36ddbd2 100644
--- a/drivers/infiniband/sw/rdmavt/trace.h
+++ b/drivers/infiniband/sw/rdmavt/trace.h
@@ -45,8 +45,8 @@
  *
  */
 
-#define RDI_DEV_ENTRY(rdi)   __string(dev, rdi->driver_f.get_card_name(rdi))
-#define RDI_DEV_ASSIGN(rdi)  __assign_str(dev, rdi->driver_f.get_card_name(rdi))
+#define RDI_DEV_ENTRY(rdi)   __string(dev, rvt_get_ibdev_name(rdi))
+#define RDI_DEV_ASSIGN(rdi)  __assign_str(dev, rvt_get_ibdev_name(rdi))
 
 #include "trace_rvt.h"
 #include "trace_qp.h"
diff --git a/drivers/infiniband/sw/rdmavt/vt.c b/drivers/infiniband/sw/rdmavt/vt.c
index 64bdd44..088fb2d 100644
--- a/drivers/infiniband/sw/rdmavt/vt.c
+++ b/drivers/infiniband/sw/rdmavt/vt.c
@@ -413,7 +413,6 @@ static noinline int check_support(struct rvt_dev_info *rdi, int verb)
 		 * required for rdmavt to function.
 		 */
 		if ((!rdi->driver_f.port_callback) ||
-		    (!rdi->driver_f.get_card_name) ||
 		    (!rdi->driver_f.get_pci_dev))
 			return -EINVAL;
 		break;
diff --git a/drivers/infiniband/sw/rdmavt/vt.h b/drivers/infiniband/sw/rdmavt/vt.h
index f363505..8823b2e 100644
--- a/drivers/infiniband/sw/rdmavt/vt.h
+++ b/drivers/infiniband/sw/rdmavt/vt.h
@@ -63,19 +63,19 @@
 
 #define rvt_pr_info(rdi, fmt, ...) \
 	__rvt_pr_info(rdi->driver_f.get_pci_dev(rdi), \
-		      rdi->driver_f.get_card_name(rdi), \
+		      rvt_get_ibdev_name(rdi), \
 		      fmt, \
 		      ##__VA_ARGS__)
 
 #define rvt_pr_warn(rdi, fmt, ...) \
 	__rvt_pr_warn(rdi->driver_f.get_pci_dev(rdi), \
-		      rdi->driver_f.get_card_name(rdi), \
+		      rvt_get_ibdev_name(rdi), \
 		      fmt, \
 		      ##__VA_ARGS__)
 
 #define rvt_pr_err(rdi, fmt, ...) \
 	__rvt_pr_err(rdi->driver_f.get_pci_dev(rdi), \
-		     rdi->driver_f.get_card_name(rdi), \
+		     rvt_get_ibdev_name(rdi), \
 		     fmt, \
 		     ##__VA_ARGS__)
 
diff --git a/include/rdma/rdma_vt.h b/include/rdma/rdma_vt.h
index b57784e..4118324 100644
--- a/include/rdma/rdma_vt.h
+++ b/include/rdma/rdma_vt.h
@@ -228,13 +228,6 @@ struct rvt_driver_provided {
 	int (*port_callback)(struct ib_device *, u8, struct kobject *);
 
 	/*
-	 * Returns a string to represent the device for which is being
-	 * registered. This is primarily used for error and debug messages on
-	 * the console.
-	 */
-	const char * (*get_card_name)(struct rvt_dev_info *rdi);
-
-	/*
 	 * Returns a pointer to the undelying hardware's PCI device. This is
 	 * used to display information as to what hardware is being referenced
 	 * in an output message
@@ -432,6 +425,17 @@ static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi,
 	snprintf(rdi->ibdev.name, sizeof(rdi->ibdev.name), fmt, name, unit);
 }
 
+/**
+ * rvt_get_ibdev_name - return the IB name
+ * @rdi: rdmavt device
+ *
+ * Return the registered name of the device.
+ */
+static inline const char *rvt_get_ibdev_name(const struct rvt_dev_info *rdi)
+{
+	return rdi->ibdev.name;
+}
+
 static inline struct rvt_pd *ibpd_to_rvtpd(struct ib_pd *ibpd)
 {
 	return container_of(ibpd, struct rvt_pd, ibpd);

* [PATCH for-next 06/11] IB/rdmavt: Use correct numa node for SRQ allocation
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-12-19  3:56   ` [PATCH for-next 05/11] IB/{rdmavt, hfi1, qib}: Remove get_card_name() downcall Dennis Dalessandro
@ 2017-12-19  3:56   ` Dennis Dalessandro
       [not found]     ` <20171219035649.2126.1625.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-12-19  3:56   ` [PATCH for-next 07/11] IB/hfi1: Fix infinite loop in 8051 command error path Dennis Dalessandro
                     ` (5 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:56 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn,
	Sebastian Sanchez

From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Normal receive queue allocation ensures that kernel receive queues
are allocated on the local numa node. Shared receive queues
do not behave the same way.

Ensure that kernel shared receive queues are allocated on the device
local node.

Reviewed-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/sw/rdmavt/srq.c |   16 +++++++++-------
 1 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/srq.c b/drivers/infiniband/sw/rdmavt/srq.c
index f7c48e9..3707952 100644
--- a/drivers/infiniband/sw/rdmavt/srq.c
+++ b/drivers/infiniband/sw/rdmavt/srq.c
@@ -90,7 +90,7 @@ struct ib_srq *rvt_create_srq(struct ib_pd *ibpd,
 	    srq_init_attr->attr.max_wr > dev->dparms.props.max_srq_wr)
 		return ERR_PTR(-EINVAL);
 
-	srq = kmalloc(sizeof(*srq), GFP_KERNEL);
+	srq = kzalloc_node(sizeof(*srq), GFP_KERNEL, dev->dparms.node);
 	if (!srq)
 		return ERR_PTR(-ENOMEM);
 
@@ -101,7 +101,10 @@ struct ib_srq *rvt_create_srq(struct ib_pd *ibpd,
 	srq->rq.max_sge = srq_init_attr->attr.max_sge;
 	sz = sizeof(struct ib_sge) * srq->rq.max_sge +
 		sizeof(struct rvt_rwqe);
-	srq->rq.wq = vmalloc_user(sizeof(struct rvt_rwq) + srq->rq.size * sz);
+	srq->rq.wq = udata ?
+		vmalloc_user(sizeof(struct rvt_rwq) + srq->rq.size * sz) :
+		vzalloc_node(sizeof(struct rvt_rwq) + srq->rq.size * sz,
+			     dev->dparms.node);
 	if (!srq->rq.wq) {
 		ret = ERR_PTR(-ENOMEM);
 		goto bail_srq;
@@ -129,16 +132,12 @@ struct ib_srq *rvt_create_srq(struct ib_pd *ibpd,
 			ret = ERR_PTR(err);
 			goto bail_ip;
 		}
-	} else {
-		srq->ip = NULL;
 	}
 
 	/*
 	 * ib_create_srq() will initialize srq->ibsrq.
 	 */
 	spin_lock_init(&srq->rq.lock);
-	srq->rq.wq->head = 0;
-	srq->rq.wq->tail = 0;
 	srq->limit = srq_init_attr->attr.srq_limit;
 
 	spin_lock(&dev->n_srqs_lock);
@@ -200,7 +199,10 @@ int rvt_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
 		sz = sizeof(struct rvt_rwqe) +
 			srq->rq.max_sge * sizeof(struct ib_sge);
 		size = attr->max_wr + 1;
-		wq = vmalloc_user(sizeof(struct rvt_rwq) + size * sz);
+		wq = udata ?
+			vmalloc_user(sizeof(struct rvt_rwq) + size * sz) :
+			vzalloc_node(sizeof(struct rvt_rwq) + size * sz,
+				     dev->dparms.node);
 		if (!wq)
 			return -ENOMEM;
 

* [PATCH for-next 07/11] IB/hfi1: Fix infinite loop in 8051 command error path
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-12-19  3:56   ` [PATCH for-next 06/11] IB/rdmavt: Use correct numa node for SRQ allocation Dennis Dalessandro
@ 2017-12-19  3:56   ` Dennis Dalessandro
       [not found]     ` <20171219035657.2126.88651.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-12-19  3:57   ` [PATCH for-next 08/11] IB/rdmavt: Allocate CQ memory on the correct node Dennis Dalessandro
                     ` (4 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:56 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl,
	Sebastian Sanchez

From: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

When an 8051 command times out, the entire DC block is restarted. During
the restart, the host interface version bit is set, which calls
do_8051_command() recursively. The host version bit needs to be set
before the link moves into polling, so it can be set in
set_local_link_attributes() instead. Thus, the 8051 command functions
can be simplified, as the non-locking versions (those that expect
dd->dc8051_lock to be held) are no longer needed.
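
The control flow described above can be sketched as a userspace toy model.
This is only an illustration under stated assumptions, not the driver code:
struct toy_dev, the toy_* functions, and TOY_HCMD_SUCCESS are hypothetical
names, and the lock is left out to keep the sketch short. The point it tries
to show is that the restart path issues no command of its own, so a timeout
cannot re-enter the command function, and the version bit is written from
the link bring-up path instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical success code; the driver's HCMD_SUCCESS value is not assumed. */
enum { TOY_HCMD_SUCCESS = 1 };

/* Toy stand-in for struct hfi1_devdata; every field here is illustrative. */
struct toy_dev {
	bool dc_shutdown;      /* true while the 8051 is held in reset */
	bool fw_responds;      /* false models firmware that times out */
	bool host_version_set; /* models the host interface version bit */
	int commands_sent;     /* how many commands reached the firmware */
};

/* Restart path: cycle reset, but never issue a command from here. */
static void toy_dc_restart(struct toy_dev *dd)
{
	dd->dc_shutdown = true;
	dd->host_version_set = false; /* the reset loses the version bit */
	dd->dc_shutdown = false;
}

/* Single command entry point: refuse while in reset, restart on timeout. */
static int toy_do_command(struct toy_dev *dd)
{
	if (dd->dc_shutdown)
		return -ENODEV;

	dd->commands_sent++;
	if (!dd->fw_responds) {
		toy_dc_restart(dd);
		return -ETIMEDOUT;
	}
	return TOY_HCMD_SUCCESS;
}

/* Link bring-up path: the version bit is written here, before polling. */
static int toy_set_local_link_attributes(struct toy_dev *dd)
{
	int ret = toy_do_command(dd);

	if (ret == TOY_HCMD_SUCCESS)
		dd->host_version_set = true;
	return ret;
}
```

In this model a firmware that never responds costs exactly one command per
bring-up attempt, because nothing in toy_dc_restart() calls back into
toy_do_command().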

Fixes: 9be6a5d788b0 ("IB/hfi1: Prevent LNI out of sync by resetting host interface version")
Reviewed-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/chip.c     |   85 ++++++++++++---------------------
 drivers/infiniband/hw/hfi1/chip.h     |    2 -
 drivers/infiniband/hw/hfi1/firmware.c |   64 ++++++-------------------
 3 files changed, 49 insertions(+), 102 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 87748a6..99c7347 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -6518,11 +6518,12 @@ static void _dc_start(struct hfi1_devdata *dd)
 	if (!dd->dc_shutdown)
 		return;
 
-	/*
-	 * Take the 8051 out of reset, wait until 8051 is ready, and set host
-	 * version bit.
-	 */
-	release_and_wait_ready_8051_firmware(dd);
+	/* Take the 8051 out of reset */
+	write_csr(dd, DC_DC8051_CFG_RST, 0ull);
+	/* Wait until 8051 is ready */
+	if (wait_fm_ready(dd, TIMEOUT_8051_START))
+		dd_dev_err(dd, "%s: timeout starting 8051 firmware\n",
+			   __func__);
 
 	/* Take away reset for LCB and RX FPE (set in lcb_shutdown). */
 	write_csr(dd, DCC_CFG_RESET, 0x10);
@@ -8566,23 +8567,30 @@ int write_lcb_csr(struct hfi1_devdata *dd, u32 addr, u64 data)
 }
 
 /*
- * If the 8051 is in reset mode (dd->dc_shutdown == 1), this function
- * will still continue executing.
- *
  * Returns:
  *	< 0 = Linux error, not able to get access
  *	> 0 = 8051 command RETURN_CODE
  */
-static int _do_8051_command(struct hfi1_devdata *dd, u32 type, u64 in_data,
-			    u64 *out_data)
+static int do_8051_command(
+	struct hfi1_devdata *dd,
+	u32 type,
+	u64 in_data,
+	u64 *out_data)
 {
 	u64 reg, completed;
 	int return_code;
 	unsigned long timeout;
 
-	lockdep_assert_held(&dd->dc8051_lock);
 	hfi1_cdbg(DC8051, "type %d, data 0x%012llx", type, in_data);
 
+	mutex_lock(&dd->dc8051_lock);
+
+	/* We can't send any commands to the 8051 if it's in reset */
+	if (dd->dc_shutdown) {
+		return_code = -ENODEV;
+		goto fail;
+	}
+
 	/*
 	 * If an 8051 host command timed out previously, then the 8051 is
 	 * stuck.
@@ -8683,29 +8691,6 @@ static int _do_8051_command(struct hfi1_devdata *dd, u32 type, u64 in_data,
 	write_csr(dd, DC_DC8051_CFG_HOST_CMD_0, 0);
 
 fail:
-	return return_code;
-}
-
-/*
- * Returns:
- *	< 0 = Linux error, not able to get access
- *	> 0 = 8051 command RETURN_CODE
- */
-static int do_8051_command(struct hfi1_devdata *dd, u32 type, u64 in_data,
-			   u64 *out_data)
-{
-	int return_code;
-
-	mutex_lock(&dd->dc8051_lock);
-	/* We can't send any commands to the 8051 if it's in reset */
-	if (dd->dc_shutdown) {
-		return_code = -ENODEV;
-		goto fail;
-	}
-
-	return_code = _do_8051_command(dd, type, in_data, out_data);
-
-fail:
 	mutex_unlock(&dd->dc8051_lock);
 	return return_code;
 }
@@ -8715,17 +8700,16 @@ static int set_physical_link_state(struct hfi1_devdata *dd, u64 state)
 	return do_8051_command(dd, HCMD_CHANGE_PHY_STATE, state, NULL);
 }
 
-static int _load_8051_config(struct hfi1_devdata *dd, u8 field_id,
-			     u8 lane_id, u32 config_data)
+int load_8051_config(struct hfi1_devdata *dd, u8 field_id,
+		     u8 lane_id, u32 config_data)
 {
 	u64 data;
 	int ret;
 
-	lockdep_assert_held(&dd->dc8051_lock);
 	data = (u64)field_id << LOAD_DATA_FIELD_ID_SHIFT
 		| (u64)lane_id << LOAD_DATA_LANE_ID_SHIFT
 		| (u64)config_data << LOAD_DATA_DATA_SHIFT;
-	ret = _do_8051_command(dd, HCMD_LOAD_CONFIG_DATA, data, NULL);
+	ret = do_8051_command(dd, HCMD_LOAD_CONFIG_DATA, data, NULL);
 	if (ret != HCMD_SUCCESS) {
 		dd_dev_err(dd,
 			   "load 8051 config: field id %d, lane %d, err %d\n",
@@ -8734,18 +8718,6 @@ static int _load_8051_config(struct hfi1_devdata *dd, u8 field_id,
 	return ret;
 }
 
-int load_8051_config(struct hfi1_devdata *dd, u8 field_id,
-		     u8 lane_id, u32 config_data)
-{
-	int return_code;
-
-	mutex_lock(&dd->dc8051_lock);
-	return_code = _load_8051_config(dd, field_id, lane_id, config_data);
-	mutex_unlock(&dd->dc8051_lock);
-
-	return return_code;
-}
-
 /*
  * Read the 8051 firmware "registers".  Use the RAM directly.  Always
  * set the result, even on error.
@@ -8861,14 +8833,13 @@ int write_host_interface_version(struct hfi1_devdata *dd, u8 version)
 	u32 frame;
 	u32 mask;
 
-	lockdep_assert_held(&dd->dc8051_lock);
 	mask = (HOST_INTERFACE_VERSION_MASK << HOST_INTERFACE_VERSION_SHIFT);
 	read_8051_config(dd, RESERVED_REGISTERS, GENERAL_CONFIG, &frame);
 	/* Clear, then set field */
 	frame &= ~mask;
 	frame |= ((u32)version << HOST_INTERFACE_VERSION_SHIFT);
-	return _load_8051_config(dd, RESERVED_REGISTERS, GENERAL_CONFIG,
-				 frame);
+	return load_8051_config(dd, RESERVED_REGISTERS, GENERAL_CONFIG,
+				frame);
 }
 
 void read_misc_status(struct hfi1_devdata *dd, u8 *ver_major, u8 *ver_minor,
@@ -9272,6 +9243,14 @@ static int set_local_link_attributes(struct hfi1_pportdata *ppd)
 	if (ret != HCMD_SUCCESS)
 		goto set_local_link_attributes_fail;
 
+	ret = write_host_interface_version(dd, HOST_INTERFACE_VERSION);
+	if (ret != HCMD_SUCCESS) {
+		dd_dev_err(dd,
+			   "Failed to set host interface version, return 0x%x\n",
+			   ret);
+		goto set_local_link_attributes_fail;
+	}
+
 	/*
 	 * DC supports continuous updates.
 	 */
diff --git a/drivers/infiniband/hw/hfi1/chip.h b/drivers/infiniband/hw/hfi1/chip.h
index 133e313..21fca8e 100644
--- a/drivers/infiniband/hw/hfi1/chip.h
+++ b/drivers/infiniband/hw/hfi1/chip.h
@@ -508,6 +508,7 @@
 #define DOWN_REMOTE_REASON_SHIFT 16
 #define DOWN_REMOTE_REASON_MASK  0xff
 
+#define HOST_INTERFACE_VERSION 1
 #define HOST_INTERFACE_VERSION_SHIFT 16
 #define HOST_INTERFACE_VERSION_MASK  0xff
 
@@ -713,7 +714,6 @@ void read_misc_status(struct hfi1_devdata *dd, u8 *ver_major, u8 *ver_minor,
 		      u8 *ver_patch);
 int write_host_interface_version(struct hfi1_devdata *dd, u8 version);
 void read_guid(struct hfi1_devdata *dd);
-int release_and_wait_ready_8051_firmware(struct hfi1_devdata *dd);
 int wait_fm_ready(struct hfi1_devdata *dd, u32 mstimeout);
 void set_link_down_reason(struct hfi1_pportdata *ppd, u8 lcl_reason,
 			  u8 neigh_reason, u8 rem_reason);
diff --git a/drivers/infiniband/hw/hfi1/firmware.c b/drivers/infiniband/hw/hfi1/firmware.c
index 98868df..2b57ba7 100644
--- a/drivers/infiniband/hw/hfi1/firmware.c
+++ b/drivers/infiniband/hw/hfi1/firmware.c
@@ -68,7 +68,6 @@
 #define ALT_FW_FABRIC_NAME "hfi1_fabric_d.fw"
 #define ALT_FW_SBUS_NAME "hfi1_sbus_d.fw"
 #define ALT_FW_PCIE_NAME "hfi1_pcie_d.fw"
-#define HOST_INTERFACE_VERSION 1
 
 MODULE_FIRMWARE(DEFAULT_FW_8051_NAME_ASIC);
 MODULE_FIRMWARE(DEFAULT_FW_FABRIC_NAME);
@@ -976,46 +975,6 @@ int wait_fm_ready(struct hfi1_devdata *dd, u32 mstimeout)
 }
 
 /*
- * Clear all reset bits, releasing the 8051.
- * Wait for firmware to be ready to accept host requests.
- * Then, set host version bit.
- *
- * This function executes even if the 8051 is in reset mode when
- * dd->dc_shutdown == 1.
- *
- * Expects dd->dc8051_lock to be held.
- */
-int release_and_wait_ready_8051_firmware(struct hfi1_devdata *dd)
-{
-	int ret;
-
-	lockdep_assert_held(&dd->dc8051_lock);
-	/* clear all reset bits, releasing the 8051 */
-	write_csr(dd, DC_DC8051_CFG_RST, 0ull);
-
-	/*
-	 * Wait for firmware to be ready to accept host
-	 * requests.
-	 */
-	ret = wait_fm_ready(dd, TIMEOUT_8051_START);
-	if (ret) {
-		dd_dev_err(dd, "8051 start timeout, current FW state 0x%x\n",
-			   get_firmware_state(dd));
-		return ret;
-	}
-
-	ret = write_host_interface_version(dd, HOST_INTERFACE_VERSION);
-	if (ret != HCMD_SUCCESS) {
-		dd_dev_err(dd,
-			   "Failed to set host interface version, return 0x%x\n",
-			   ret);
-		return -EIO;
-	}
-
-	return 0;
-}
-
-/*
  * Load the 8051 firmware.
  */
 static int load_8051_firmware(struct hfi1_devdata *dd,
@@ -1080,22 +1039,31 @@ static int load_8051_firmware(struct hfi1_devdata *dd,
 	if (ret)
 		return ret;
 
+	/* clear all reset bits, releasing the 8051 */
+	write_csr(dd, DC_DC8051_CFG_RST, 0ull);
+
 	/*
-	 * Clear all reset bits, releasing the 8051.
 	 * DC reset step 5. Wait for firmware to be ready to accept host
 	 * requests.
-	 * Then, set host version bit.
 	 */
-	mutex_lock(&dd->dc8051_lock);
-	ret = release_and_wait_ready_8051_firmware(dd);
-	mutex_unlock(&dd->dc8051_lock);
-	if (ret)
-		return ret;
+	ret = wait_fm_ready(dd, TIMEOUT_8051_START);
+	if (ret) { /* timed out */
+		dd_dev_err(dd, "8051 start timeout, current state 0x%x\n",
+			   get_firmware_state(dd));
+		return -ETIMEDOUT;
+	}
 
 	read_misc_status(dd, &ver_major, &ver_minor, &ver_patch);
 	dd_dev_info(dd, "8051 firmware version %d.%d.%d\n",
 		    (int)ver_major, (int)ver_minor, (int)ver_patch);
 	dd->dc8051_ver = dc8051_ver(ver_major, ver_minor, ver_patch);
+	ret = write_host_interface_version(dd, HOST_INTERFACE_VERSION);
+	if (ret != HCMD_SUCCESS) {
+		dd_dev_err(dd,
+			   "Failed to set host interface version, return 0x%x\n",
+			   ret);
+		return -EIO;
+	}
 
 	return 0;
 }


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH for-next 08/11] IB/rdmavt: Allocate CQ memory on the correct node
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (6 preceding siblings ...)
  2017-12-19  3:56   ` [PATCH for-next 07/11] IB/hfi1: Fix infinite loop in 8051 command error path Dennis Dalessandro
@ 2017-12-19  3:57   ` Dennis Dalessandro
  2017-12-19  3:57   ` [PATCH for-next 09/11] rdma: Update maintainer contact for Intel RDMA drivers Dennis Dalessandro
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:57 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn,
	Sebastian Sanchez

From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

CQ allocation does not ensure that completion queue entries
and the completion queue structure are allocated on the correct
NUMA node.

Fix by allocating the rvt_cq and kernel CQ entries on the device node,
leaving the user CQ entries on the default local node.  Also ensure
CQ resizes use the correct allocator when extending a CQ.

Reviewed-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/sw/rdmavt/cq.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index 97d71e4..88fa4d4 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -198,7 +198,7 @@ struct ib_cq *rvt_create_cq(struct ib_device *ibdev,
 		return ERR_PTR(-EINVAL);
 
 	/* Allocate the completion queue structure. */
-	cq = kzalloc(sizeof(*cq), GFP_KERNEL);
+	cq = kzalloc_node(sizeof(*cq), GFP_KERNEL, rdi->dparms.node);
 	if (!cq)
 		return ERR_PTR(-ENOMEM);
 
@@ -214,7 +214,9 @@ struct ib_cq *rvt_create_cq(struct ib_device *ibdev,
 		sz += sizeof(struct ib_uverbs_wc) * (entries + 1);
 	else
 		sz += sizeof(struct ib_wc) * (entries + 1);
-	wc = vmalloc_user(sz);
+	wc = udata ?
+		vmalloc_user(sz) :
+		vzalloc_node(sz, rdi->dparms.node);
 	if (!wc) {
 		ret = ERR_PTR(-ENOMEM);
 		goto bail_cq;
@@ -369,7 +371,9 @@ int rvt_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata)
 		sz += sizeof(struct ib_uverbs_wc) * (cqe + 1);
 	else
 		sz += sizeof(struct ib_wc) * (cqe + 1);
-	wc = vmalloc_user(sz);
+	wc = udata ?
+		vmalloc_user(sz) :
+		vzalloc_node(sz, rdi->dparms.node);
 	if (!wc)
 		return -ENOMEM;
 


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH for-next 09/11] rdma: Update maintainer contact for Intel RDMA drivers
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (7 preceding siblings ...)
  2017-12-19  3:57   ` [PATCH for-next 08/11] IB/rdmavt: Allocate CQ memory on the correct node Dennis Dalessandro
@ 2017-12-19  3:57   ` Dennis Dalessandro
       [not found]     ` <20171219035711.2126.47130.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-12-19  3:57   ` [PATCH for-next 10/11] IB/{hfi1, qib}: Fix a concurrency issue with device name in logging Dennis Dalessandro
                     ` (2 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:57 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn

Ensure both Mike and I are listed as maintainer contacts for Intel's qib,
hfi1, and rdmavt drivers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 MAINTAINERS |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index dafc9c3..d4b725e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11053,7 +11053,8 @@ S:	Maintained
 F:	drivers/firmware/qemu_fw_cfg.c
 
 QIB DRIVER
-M:	Mike Marciniszyn <infinipath-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
+M:	Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
+M:	Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
 L:	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
 S:	Supported
 F:	drivers/infiniband/hw/qib/
@@ -11351,6 +11352,7 @@ F:	drivers/net/ethernet/rdc/r6040.c
 
 RDMAVT - RDMA verbs software
 M:	Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
+M:	Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
 L:	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
 S:	Supported
 F:	drivers/infiniband/sw/rdmavt


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH for-next 10/11] IB/{hfi1, qib}: Fix a concurrency issue with device name in logging
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (8 preceding siblings ...)
  2017-12-19  3:57   ` [PATCH for-next 09/11] rdma: Update maintainer contact for Intel RDMA drivers Dennis Dalessandro
@ 2017-12-19  3:57   ` Dennis Dalessandro
  2017-12-19  3:57   ` [PATCH for-next 11/11] IB/rdmavt: Add trace for RNRNAK timer Dennis Dalessandro
  2018-01-05 18:36   ` [PATCH for-next 00/11] IB/hfi1, rdmavt, qib: Driver updates for 12/18/2017 Doug Ledford
  11 siblings, 0 replies; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:57 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl,
	Mike Marciniszyn

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The get_unit_name() function crafts a string based on the device name
and the device unit number.  It then stores this in a static variable.

This has concurrency issues as can be seen with this log:

hfi1 0000:02:00.0: hfi1_1: read_idle_message: read idle message 0x203
hfi1 0000:01:00.0: hfi1_1: read_idle_message: read idle message 0x203

The PCI device ID (0000:02:00.0 vs. 0000:01:00.0) is correct for the
message, but the device string hfi1_1 is incorrect (it should be
hfi1_0 for the second log message).

Remove get_unit_name() function.

Instead, use the rvt accessor rvt_get_ibdev_name() to get the IB name
string.

Clean up any hfi1_early_xx calls that can now use the new path.

QIB has the same issue with qib_get_unit_name(), so update it as well.

Remove qib_get_unit_name() function.

Update the log message that has a redundant device name.
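
The hazard with a shared static buffer can be reproduced outside the kernel.
The following is a minimal userspace sketch, not the driver code:
shared_unit_name() mirrors the shape of the removed helper, while struct
toy_dev, toy_dev_init() and toy_dev_name() are hypothetical stand-ins for a
name that is stored per device, which is the property the rvt accessor is
relied on for here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Racy pattern: every caller formats into, and returns, one static buffer. */
static const char *shared_unit_name(int unit)
{
	static char iname[32];

	snprintf(iname, sizeof(iname), "hfi1_%d", unit);
	return iname;
}

/* Hypothetical per-device record that owns its own name string. */
struct toy_dev {
	int unit;
	char name[32];
};

/* Format the name once, at init time, into storage owned by the device. */
static void toy_dev_init(struct toy_dev *dd, int unit)
{
	dd->unit = unit;
	snprintf(dd->name, sizeof(dd->name), "hfi1_%d", unit);
}

/* Accessor: returns the device's own string, so callers cannot collide. */
static const char *toy_dev_name(const struct toy_dev *dd)
{
	return dd->name;
}
```

Even without threads, two results from shared_unit_name() alias the same
storage, so the earlier one silently changes when the later call runs. With
concurrent log calls, the same overwrite can happen between formatting the
name and printing it, which would match the mislabeled line in the log above.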

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/chip.c      |    5 ++---
 drivers/infiniband/hw/hfi1/driver.c    |    8 --------
 drivers/infiniband/hw/hfi1/hfi.h       |   22 ++++++++++++----------
 drivers/infiniband/hw/qib/qib.h        |    7 +++----
 drivers/infiniband/hw/qib/qib_driver.c |    8 --------
 drivers/infiniband/hw/qib/qib_eeprom.c |    3 +--
 6 files changed, 18 insertions(+), 35 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 99c7347..e63988c 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -14929,9 +14929,8 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
 
 		if (num_vls < HFI1_MIN_VLS_SUPPORTED ||
 		    num_vls > HFI1_MAX_VLS_SUPPORTED) {
-			hfi1_early_err(&pdev->dev,
-				       "Invalid num_vls %u, using %u VLs\n",
-				    num_vls, HFI1_MAX_VLS_SUPPORTED);
+			dd_dev_err(dd, "Invalid num_vls %u, using %u VLs\n",
+				   num_vls, HFI1_MAX_VLS_SUPPORTED);
 			num_vls = HFI1_MAX_VLS_SUPPORTED;
 		}
 		ppd->vls_supported = num_vls;
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 4561719..067b29f 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -159,14 +159,6 @@ static int hfi1_caps_get(char *buffer, const struct kernel_param *kp)
 	return scnprintf(buffer, PAGE_SIZE, "0x%lx", cap_mask);
 }
 
-const char *get_unit_name(int unit)
-{
-	static char iname[16];
-
-	snprintf(iname, sizeof(iname), DRIVER_NAME "_%u", unit);
-	return iname;
-}
-
 struct pci_dev *get_pci_dev(struct rvt_dev_info *rdi)
 {
 	struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev, rdi);
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 88fa934..869c2bf 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1975,7 +1975,6 @@ int get_platform_config_field(struct hfi1_devdata *dd,
 			      table_type, int table_index, int field_index,
 			      u32 *data, u32 len);
 
-const char *get_unit_name(int unit);
 struct pci_dev *get_pci_dev(struct rvt_dev_info *rdi);
 
 /*
@@ -2126,39 +2125,42 @@ static inline u64 hfi1_pkt_base_sdma_integrity(struct hfi1_devdata *dd)
 
 #define dd_dev_emerg(dd, fmt, ...) \
 	dev_emerg(&(dd)->pcidev->dev, "%s: " fmt, \
-		  get_unit_name((dd)->unit), ##__VA_ARGS__)
+		  rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__)
 
 #define dd_dev_err(dd, fmt, ...) \
 	dev_err(&(dd)->pcidev->dev, "%s: " fmt, \
-			get_unit_name((dd)->unit), ##__VA_ARGS__)
+		rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__)
 
 #define dd_dev_err_ratelimited(dd, fmt, ...) \
 	dev_err_ratelimited(&(dd)->pcidev->dev, "%s: " fmt, \
-			get_unit_name((dd)->unit), ##__VA_ARGS__)
+			    rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), \
+			    ##__VA_ARGS__)
 
 #define dd_dev_warn(dd, fmt, ...) \
 	dev_warn(&(dd)->pcidev->dev, "%s: " fmt, \
-			get_unit_name((dd)->unit), ##__VA_ARGS__)
+		 rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__)
 
 #define dd_dev_warn_ratelimited(dd, fmt, ...) \
 	dev_warn_ratelimited(&(dd)->pcidev->dev, "%s: " fmt, \
-			get_unit_name((dd)->unit), ##__VA_ARGS__)
+			     rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), \
+			     ##__VA_ARGS__)
 
 #define dd_dev_info(dd, fmt, ...) \
 	dev_info(&(dd)->pcidev->dev, "%s: " fmt, \
-			get_unit_name((dd)->unit), ##__VA_ARGS__)
+		 rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__)
 
 #define dd_dev_info_ratelimited(dd, fmt, ...) \
 	dev_info_ratelimited(&(dd)->pcidev->dev, "%s: " fmt, \
-			get_unit_name((dd)->unit), ##__VA_ARGS__)
+			     rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), \
+			     ##__VA_ARGS__)
 
 #define dd_dev_dbg(dd, fmt, ...) \
 	dev_dbg(&(dd)->pcidev->dev, "%s: " fmt, \
-		get_unit_name((dd)->unit), ##__VA_ARGS__)
+		rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__)
 
 #define hfi1_dev_porterr(dd, port, fmt, ...) \
 	dev_err(&(dd)->pcidev->dev, "%s: port %u: " fmt, \
-			get_unit_name((dd)->unit), (port), ##__VA_ARGS__)
+		rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), (port), ##__VA_ARGS__)
 
 /*
  * this is used for formatting hw error messages...
diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h
index 34c5254..0235f76 100644
--- a/drivers/infiniband/hw/qib/qib.h
+++ b/drivers/infiniband/hw/qib/qib.h
@@ -1428,7 +1428,6 @@ int qib_pcie_ddinit(struct qib_devdata *, struct pci_dev *,
  */
 dma_addr_t qib_map_page(struct pci_dev *, struct page *, unsigned long,
 			  size_t, int);
-const char *qib_get_unit_name(int unit);
 struct pci_dev *qib_get_pci_dev(struct rvt_dev_info *rdi);
 
 /*
@@ -1487,15 +1486,15 @@ static inline void qib_flush_wc(void)
 
 #define qib_dev_err(dd, fmt, ...) \
 	dev_err(&(dd)->pcidev->dev, "%s: " fmt, \
-		qib_get_unit_name((dd)->unit), ##__VA_ARGS__)
+		rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__)
 
 #define qib_dev_warn(dd, fmt, ...) \
 	dev_warn(&(dd)->pcidev->dev, "%s: " fmt, \
-		qib_get_unit_name((dd)->unit), ##__VA_ARGS__)
+		 rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__)
 
 #define qib_dev_porterr(dd, port, fmt, ...) \
 	dev_err(&(dd)->pcidev->dev, "%s: IB%u:%u " fmt, \
-		qib_get_unit_name((dd)->unit), (dd)->unit, (port), \
+		rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), (dd)->unit, (port), \
 		##__VA_ARGS__)
 
 #define qib_devinfo(pcidev, fmt, ...) \
diff --git a/drivers/infiniband/hw/qib/qib_driver.c b/drivers/infiniband/hw/qib/qib_driver.c
index 9b13680..3117cc5 100644
--- a/drivers/infiniband/hw/qib/qib_driver.c
+++ b/drivers/infiniband/hw/qib/qib_driver.c
@@ -81,14 +81,6 @@
 
 struct qlogic_ib_stats qib_stats;
 
-const char *qib_get_unit_name(int unit)
-{
-	static char iname[16];
-
-	snprintf(iname, sizeof(iname), "infinipath%u", unit);
-	return iname;
-}
-
 struct pci_dev *qib_get_pci_dev(struct rvt_dev_info *rdi)
 {
 	struct qib_ibdev *ibdev = container_of(rdi, struct qib_ibdev, rdi);
diff --git a/drivers/infiniband/hw/qib/qib_eeprom.c b/drivers/infiniband/hw/qib/qib_eeprom.c
index 33a2e74..5838b3b 100644
--- a/drivers/infiniband/hw/qib/qib_eeprom.c
+++ b/drivers/infiniband/hw/qib/qib_eeprom.c
@@ -163,8 +163,7 @@ void qib_get_eeprom_info(struct qib_devdata *dd)
 			if (bguid[6] == 0xff) {
 				if (bguid[5] == 0xff) {
 					qib_dev_err(dd,
-						"Can't set %s GUID from base, wraps to OUI!\n",
-						qib_get_unit_name(t));
+						    "Can't set GUID from base, wraps to OUI!\n");
 					dd->base_guid = 0;
 					goto bail;
 				}


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH for-next 11/11] IB/rdmavt: Add trace for RNRNAK timer
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (9 preceding siblings ...)
  2017-12-19  3:57   ` [PATCH for-next 10/11] IB/{hfi1, qib}: Fix a concurrency issue with device name in logging Dennis Dalessandro
@ 2017-12-19  3:57   ` Dennis Dalessandro
  2018-01-05 18:36   ` [PATCH for-next 00/11] IB/hfi1, rdmavt, qib: Driver updates for 12/18/2017 Doug Ledford
  11 siblings, 0 replies; 31+ messages in thread
From: Dennis Dalessandro @ 2017-12-19  3:57 UTC (permalink / raw
  To: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn, Kaike Wan

From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

This patch adds static tracepoints for the RNRNAK timer. Currently the
output from the hrtimer static trace only shows the addresses of hrtimers
in the system, and there is no easy way to correlate an RNRNAK timer with
its entries in the hrtimer trace. This patch adds the correlation among a
QP, its RNRNAK timer, and its entries in the hrtimer trace. This
correlation will be enormously helpful when debugging RNRNAK-related
issues. In addition, this patch changes rvt_stop_rnr_timer() to return
void.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/sw/rdmavt/qp.c       |   11 ++++----
 drivers/infiniband/sw/rdmavt/trace_qp.h |   42 +++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index f3c6d2c..ed46801 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -2075,6 +2075,7 @@ void rvt_add_rnr_timer(struct rvt_qp *qp, u32 aeth)
 	lockdep_assert_held(&qp->s_lock);
 	qp->s_flags |= RVT_S_WAIT_RNR;
 	to = rvt_aeth_to_usec(aeth);
+	trace_rvt_rnrnak_add(qp, to);
 	hrtimer_start(&qp->s_rnr_timer,
 		      ns_to_ktime(1000 * to), HRTIMER_MODE_REL);
 }
@@ -2104,15 +2105,14 @@ void rvt_stop_rc_timers(struct rvt_qp *qp)
  * stop an rnr timer and return if the timer
  * had been pending.
  */
-static int rvt_stop_rnr_timer(struct rvt_qp *qp)
+static void rvt_stop_rnr_timer(struct rvt_qp *qp)
 {
-	int rval = 0;
-
 	lockdep_assert_held(&qp->s_lock);
 	/* Remove QP from rnr timer */
-	if (qp->s_flags & RVT_S_WAIT_RNR)
+	if (qp->s_flags & RVT_S_WAIT_RNR) {
 		qp->s_flags &= ~RVT_S_WAIT_RNR;
-	return rval;
+		trace_rvt_rnrnak_stop(qp, 0);
+	}
 }
 
 /**
@@ -2165,6 +2165,7 @@ enum hrtimer_restart rvt_rc_rnr_retry(struct hrtimer *t)
 
 	spin_lock_irqsave(&qp->s_lock, flags);
 	rvt_stop_rnr_timer(qp);
+	trace_rvt_rnrnak_timeout(qp, 0);
 	rdi->driver_f.schedule_send(qp);
 	spin_unlock_irqrestore(&qp->s_lock, flags);
 	return HRTIMER_NORESTART;
diff --git a/drivers/infiniband/sw/rdmavt/trace_qp.h b/drivers/infiniband/sw/rdmavt/trace_qp.h
index 4c77a31..efc9d81 100644
--- a/drivers/infiniband/sw/rdmavt/trace_qp.h
+++ b/drivers/infiniband/sw/rdmavt/trace_qp.h
@@ -85,6 +85,48 @@
 	TP_PROTO(struct rvt_qp *qp, u32 bucket),
 	TP_ARGS(qp, bucket));
 
+DECLARE_EVENT_CLASS(
+	rvt_rnrnak_template,
+	TP_PROTO(struct rvt_qp *qp, u32 to),
+	TP_ARGS(qp, to),
+	TP_STRUCT__entry(
+		RDI_DEV_ENTRY(ib_to_rvt(qp->ibqp.device))
+		__field(u32, qpn)
+		__field(void *, hrtimer)
+		__field(u32, s_flags)
+		__field(u32, to)
+	),
+	TP_fast_assign(
+		RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
+		__entry->qpn = qp->ibqp.qp_num;
+		__entry->hrtimer = &qp->s_rnr_timer;
+		__entry->s_flags = qp->s_flags;
+		__entry->to = to;
+	),
+	TP_printk(
+		"[%s] qpn 0x%x hrtimer 0x%p s_flags 0x%x timeout %u us",
+		__get_str(dev),
+		__entry->qpn,
+		__entry->hrtimer,
+		__entry->s_flags,
+		__entry->to
+	)
+);
+
+DEFINE_EVENT(
+	rvt_rnrnak_template, rvt_rnrnak_add,
+	TP_PROTO(struct rvt_qp *qp, u32 to),
+	TP_ARGS(qp, to));
+
+DEFINE_EVENT(
+	rvt_rnrnak_template, rvt_rnrnak_timeout,
+	TP_PROTO(struct rvt_qp *qp, u32 to),
+	TP_ARGS(qp, to));
+
+DEFINE_EVENT(
+	rvt_rnrnak_template, rvt_rnrnak_stop,
+	TP_PROTO(struct rvt_qp *qp, u32 to),
+	TP_ARGS(qp, to));
 
 #endif /* __RVT_TRACE_QP_H */
 


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 09/11] rdma: Update maintainer contact for Intel RDMA drivers
       [not found]     ` <20171219035711.2126.47130.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-12-19 20:51       ` Jason Gunthorpe
  2017-12-22 23:39       ` Jason Gunthorpe
  1 sibling, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2017-12-19 20:51 UTC (permalink / raw
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn

On Mon, Dec 18, 2017 at 07:57:13PM -0800, Dennis Dalessandro wrote:
> Ensure both Mike and I are listed as maintainer contacts for Intel's qib,
> hfi1, and rdmavt drivers.
> 
> Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>  MAINTAINERS |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index dafc9c3..d4b725e 100644
> +++ b/MAINTAINERS
> @@ -11053,7 +11053,8 @@ S:	Maintained
>  F:	drivers/firmware/qemu_fw_cfg.c
>  
>  QIB DRIVER
> -M:	Mike Marciniszyn <infinipath-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> +M:	Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> +M:	Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>  L:	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>  S:	Supported
>  F:	drivers/infiniband/hw/qib/

Please send an update for rdma-core as well

Jason

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after free_irq()
       [not found]     ` <20171219035612.2126.10447.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-12-19 20:57       ` Jason Gunthorpe
       [not found]         ` <20171219205754.GE14814-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2017-12-19 20:57 UTC (permalink / raw
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl, Patel Jay P,
	Sebastian Sanchez

On Mon, Dec 18, 2017 at 07:56:16PM -0800, Dennis Dalessandro wrote:
 
> When kernel is built with CONFIG_DEBUG_SHIRQ config flag, an extra call
> to IRQ handler is made from _free_irq() function. The driver should be
> prepared for this fake call.

It is not a 'fake call'; it is a call designed to test what happens
when an IRQ handler races with free.

> Adding a mechanism which detects whether handler is invoked after
> disabling interrupts. hfi_intr_mask field is added to hfi1_devdata
> structure which is replica of interrupt mask register of hfi device.
> The field is updated while writing a value to register.

And this is not the typical solution.

Explain why adding a non-atomic variable to concurrent code doesn't just
create races and a mess?

free_irq() is a synchronous fence. It returns once all the running IRQ
handlers have exited and guarantees they will not be called again.

You are supposed to call it before destroying anything that the IRQ
handler could be using.

Re-ordering destruction is the typical solution.

Jason
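
The ordering described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not hfi1 code:
struct model_dev, model_irq_handler(), model_free_irq() and
model_destroy_workqueue() are hypothetical stand-ins for the driver's device
data, its interrupt handler, free_irq() and destroy_workqueue(). The model's
free_irq makes one extra handler call, as the patch description says happens
with CONFIG_DEBUG_SHIRQ, so a teardown that destroys the workqueue first is
caught.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's per-device data. */
struct model_dev {
	bool irq_registered;    /* the handler may still be invoked */
	bool wq_alive;          /* workqueue the handler queues work on */
	int  use_after_destroy; /* handler runs that touched a dead workqueue */
};

/* Models an IRQ handler that queues work on the link workqueue. */
static void model_irq_handler(struct model_dev *dev)
{
	if (!dev->wq_alive)
		dev->use_after_destroy++; /* in a real driver this would crash */
}

/*
 * Models free_irq(): one extra handler invocation (the debug-time call),
 * after which the handler is guaranteed never to run again.
 */
static void model_free_irq(struct model_dev *dev)
{
	assert(dev != NULL);
	if (!dev->irq_registered)
		return;
	model_irq_handler(dev);
	dev->irq_registered = false;
}

/* Models destroy_workqueue(): the handler must not use it afterwards. */
static void model_destroy_workqueue(struct model_dev *dev)
{
	dev->wq_alive = false;
}

/* Racy order: the workqueue is gone while the handler can still run. */
static int teardown_wq_first(void)
{
	struct model_dev dev = { true, true, 0 };

	model_destroy_workqueue(&dev);
	model_free_irq(&dev);
	return dev.use_after_destroy;
}

/* Typical order: fence off the handler, then destroy what it uses. */
static int teardown_irq_first(void)
{
	struct model_dev dev = { true, true, 0 };

	model_free_irq(&dev);
	model_destroy_workqueue(&dev);
	return dev.use_after_destroy;
}
```

In this model teardown_wq_first() reports one handler run against a destroyed
workqueue, while teardown_irq_first() reports none. No flag or mask variable
is needed once the order is right.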


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 04/11] IB/{rdmavt, hfi1, qib}: Self determine driver name
       [not found]     ` <20171219035635.2126.59763.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-12-19 20:59       ` Jason Gunthorpe
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2017-12-19 20:59 UTC (permalink / raw
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl,
	Mike Marciniszyn

On Mon, Dec 18, 2017 at 07:56:37PM -0800, Dennis Dalessandro wrote:
> From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> 
> Currently the HFI and QIB drivers allow the IB core to assign a unit
> number to the driver name string.
> 
> If multiple devices exist in a system, there is a possibility that the
> device unit number and the IB core number will be mismatched.

This is much better than the last try at this. thanks

Jason

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 07/11] IB/hfi1: Fix infinite loop in 8051 command error path
       [not found]     ` <20171219035657.2126.88651.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-12-20  8:08       ` Leon Romanovsky
       [not found]         ` <20171220080854.GL2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Leon Romanovsky @ 2017-12-20  8:08 UTC (permalink / raw
  To: Dennis Dalessandro
  Cc: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl,
	Sebastian Sanchez

[-- Attachment #1: Type: text/plain, Size: 2680 bytes --]

On Mon, Dec 18, 2017 at 07:56:59PM -0800, Dennis Dalessandro wrote:
> From: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> When an 8051 command times out, the entire DC block is restarted. During
> the restart, the host interface version bit is set, which calls
> do_8051_command() recursively. The host version bit needs to be set
> before the link moves into polling, so the host version bit can be set
> in set_local_link_attributes() instead. Thus, the 8051 command functions
> can be simplified, as a non-locking version (dd->dc8051_lock) of those
> functions is no longer needed.
>
> Fixes: 9be6a5d788b0 ("IB/hfi1: Prevent LNI out of sync by resetting host interface version")
> Reviewed-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> ---
>  drivers/infiniband/hw/hfi1/chip.c     |   85 ++++++++++++---------------------
>  drivers/infiniband/hw/hfi1/chip.h     |    2 -
>  drivers/infiniband/hw/hfi1/firmware.c |   64 ++++++-------------------
>  3 files changed, 49 insertions(+), 102 deletions(-)
>
> diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
> index 87748a6..99c7347 100644
> --- a/drivers/infiniband/hw/hfi1/chip.c
> +++ b/drivers/infiniband/hw/hfi1/chip.c
> @@ -6518,11 +6518,12 @@ static void _dc_start(struct hfi1_devdata *dd)
>  	if (!dd->dc_shutdown)
>  		return;
>
> -	/*
> -	 * Take the 8051 out of reset, wait until 8051 is ready, and set host
> -	 * version bit.
> -	 */
> -	release_and_wait_ready_8051_firmware(dd);
> +	/* Take the 8051 out of reset */
> +	write_csr(dd, DC_DC8051_CFG_RST, 0ull);
> +	/* Wait until 8051 is ready */
> +	if (wait_fm_ready(dd, TIMEOUT_8051_START))
> +		dd_dev_err(dd, "%s: timeout starting 8051 firmware\n",
> +			   __func__);
>
>  	/* Take away reset for LCB and RX FPE (set in lcb_shutdown). */
>  	write_csr(dd, DCC_CFG_RESET, 0x10);
> @@ -8566,23 +8567,30 @@ int write_lcb_csr(struct hfi1_devdata *dd, u32 addr, u64 data)
>  }
>
>  /*
> - * If the 8051 is in reset mode (dd->dc_shutdown == 1), this function
> - * will still continue executing.
> - *
>   * Returns:
>   *	< 0 = Linux error, not able to get access
>   *	> 0 = 8051 command RETURN_CODE
>   */
> -static int _do_8051_command(struct hfi1_devdata *dd, u32 type, u64 in_data,
> -			    u64 *out_data)
> +static int do_8051_command(
> +	struct hfi1_devdata *dd,
> +	u32 type,
> +	u64 in_data,
> +	u64 *out_data)

What did you try to say by this change ? :)

Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 06/11] IB/rdmavt: Use correct numa node for SRQ allocation
       [not found]     ` <20171219035649.2126.1625.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-12-20  8:17       ` Leon Romanovsky
       [not found]         ` <20171220081720.GM2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Leon Romanovsky @ 2017-12-20  8:17 UTC (permalink / raw
  To: Dennis Dalessandro
  Cc: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn,
	Sebastian Sanchez

[-- Attachment #1: Type: text/plain, Size: 856 bytes --]

On Mon, Dec 18, 2017 at 07:56:52PM -0800, Dennis Dalessandro wrote:
> From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> Normal receive queue allocation ensures that kernel receive queues
> are allocated on the local numa node. Shared receive queues
> do not behave the same way.
>
> Ensure that kernel shared receive queues are allocated on the device
> local node.
>
> Reviewed-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> ---
>  drivers/infiniband/sw/rdmavt/srq.c |   16 +++++++++-------
>  1 files changed, 9 insertions(+), 7 deletions(-)

Are you aware of "__GFP_THISNODE" flag?

Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 02/11] IB/hfi1: Check return value of strchr before using it
       [not found]     ` <20171219035621.2126.23093.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-12-20  8:25       ` Leon Romanovsky
       [not found]         ` <20171220082555.GN2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Leon Romanovsky @ 2017-12-20  8:25 UTC (permalink / raw
  To: Dennis Dalessandro
  Cc: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl

[-- Attachment #1: Type: text/plain, Size: 508 bytes --]

On Mon, Dec 18, 2017 at 07:56:23PM -0800, Dennis Dalessandro wrote:
> The call to strchr in our counter initialization does not check the return
> value before attempting to use the pointer. In theory this should not
> happen given the way the code is structured but do the smart thing and
> check the value anyway to harden the code.

The smartest way is to get rid of the whole "\n"<->"\0" logic and
copy/paste mlx5 implementation which does the same thing but statically
and much safer than here.

Thanks
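
The two approaches being compared can be sketched in user-space C. This is a
minimal illustration, not the hfi1 or mlx5 code: split_names(), static_names
and the counter names are hypothetical. split_names() walks a
newline-separated buffer and checks the strchr() return value before using
it, which is the kind of hardening the patch describes. The static table
shows the alternative, where the names are fixed at build time and nothing
is parsed at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a newline-separated buffer in place: each '\n' becomes '\0' and
 * out[i] points at the start of name i. Returns the number of names
 * stored, or -1 if a name is not newline-terminated (strchr() returned
 * NULL) or the buffer holds more than max names.
 */
static int split_names(char *buf, const char **out, int max)
{
	int n = 0;
	char *p = buf;

	assert(buf != NULL && out != NULL);
	while (*p) {
		char *nl = strchr(p, '\n');

		if (!nl || n == max)
			return -1;
		*nl = '\0';
		out[n++] = p;
		p = nl + 1;
	}
	return n;
}

/* Static alternative: names fixed at build time (hypothetical names). */
static const char * const static_names[] = { "RxPkts", "TxPkts" };
```

With the static table there is no pointer that can come back NULL, which is
the safety argument made above. The run-time split only stays safe as long as
every caller honours the -1 return.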

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 06/11] IB/rdmavt: Use correct numa node for SRQ allocation
       [not found]         ` <20171220081720.GM2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-12-20  8:31           ` Leon Romanovsky
  0 siblings, 0 replies; 31+ messages in thread
From: Leon Romanovsky @ 2017-12-20  8:31 UTC (permalink / raw
  To: Dennis Dalessandro
  Cc: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn,
	Sebastian Sanchez

[-- Attachment #1: Type: text/plain, Size: 1004 bytes --]

On Wed, Dec 20, 2017 at 10:17:20AM +0200, Leon Romanovsky wrote:
> On Mon, Dec 18, 2017 at 07:56:52PM -0800, Dennis Dalessandro wrote:
> > From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >
> > Normal receive queue allocation ensures that kernel receive queues
> > are allocated on the local numa node. Shared receive queues
> > do not behave the same way.
> >
> > Ensure that kernel shared receive queues are allocated on the device
> > local node.
> >
> > Reviewed-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > ---
> >  drivers/infiniband/sw/rdmavt/srq.c |   16 +++++++++-------
> >  1 files changed, 9 insertions(+), 7 deletions(-)
>
> Are you aware of "__GFP_THISNODE" flag?

It is not relevant to this patch, sorry.

>
> Thanks



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH for-next 07/11] IB/hfi1: Fix infinite loop in 8051 command error path
       [not found]         ` <20171220080854.GL2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-12-20 18:02           ` Sanchez, Sebastian
       [not found]             ` <5CDA63463B33C94CA80846587415F0772829387D-8oqHQFITsIGkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Sanchez, Sebastian @ 2017-12-20 18:02 UTC (permalink / raw
  To: Leon Romanovsky, Dalessandro, Dennis
  Cc: jgg-uk2M96/98Pc@public.gmane.org,
	dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Ruhl, Michael J

>What did you try to say by this change ? :)

Are you referring to the parameters for do_8051_command() or something else?

Thank you,

Sebastian

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 07/11] IB/hfi1: Fix infinite loop in 8051 command error path
       [not found]             ` <5CDA63463B33C94CA80846587415F0772829387D-8oqHQFITsIGkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-12-20 18:12               ` Jason Gunthorpe
       [not found]                 ` <20171220181244.GD22908-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2017-12-20 18:12 UTC (permalink / raw
  To: Sanchez, Sebastian
  Cc: Leon Romanovsky, Dalessandro, Dennis,
	dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Ruhl, Michael J

On Wed, Dec 20, 2017 at 06:02:39PM +0000, Sanchez, Sebastian wrote:
> >What did you try to say by this change ? :)
> 
> Are you referring to the parameters for do_8051_command() or something else?

I guess he means why did you randomly reformat the whitespace around
_do_8051_command just for removing a '_', and why did you reformat it
to a non-kernel-standard formatting?

Have you discovered clang-format's editor integration yet? Makes these
issues go away..

Jason

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after free_irq()
       [not found]         ` <20171219205754.GE14814-uk2M96/98Pc@public.gmane.org>
@ 2017-12-20 21:01           ` Ruhl, Michael J
       [not found]             ` <14063C7AD467DE4B82DEDB5C278E86639F0E3917-AtyAts71sc88Ug9VwtkbtrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Ruhl, Michael J @ 2017-12-20 21:01 UTC (permalink / raw
  To: Jason Gunthorpe, Dalessandro, Dennis
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Patel, Jay P,
	Sanchez, Sebastian

> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg-uk2M96/98Pc@public.gmane.org]
> Sent: Tuesday, December 19, 2017 3:58 PM
> To: Dalessandro, Dennis <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Ruhl, Michael J
> <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>; Patel, Jay P <jay.p.patel-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>; Sanchez,
> Sebastian <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Subject: Re: [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after
> free_irq()
> 
> On Mon, Dec 18, 2017 at 07:56:16PM -0800, Dennis Dalessandro wrote:
> 
> > When kernel is built with CONFIG_DEBUG_SHIRQ config flag, an extra call
> > to IRQ handler is made from _free_irq() function. The driver should be
> > prepared for this fake call.
> 
> It is not a 'fake call'; it is a call designed to test what happens
> when an IRQ handler races with free.

Hi Jason,

After further review it appears that our patch:

commit 05cb18fda926ddce299280bd86cbc9d491306f28
IB/hfi1: Update HFI to use the latest PCI API

has exposed an architecture issue in how we do our driver
remove.

Our current path does not adequately deal with the race condition
of getting an IRQ during the removal.

To address this, we could submit a patch to revert the above commit.

After re-architecting the driver remove path, we would re-submit
the PCI API patch.

Note: reverting the PCI API patch will not make the crash addressed in:

IB/hfi1: Destroy link_wq workqueue after free_irq()

go away, but will make it more difficult for the issue to occur.

Thanks,

Mike
> 
> > Adding a mechanism which detects whether handler is invoked after
> > disabling interrupts. hfi_intr_mask field is added to hfi1_devdata
> > structure which is replica of interrupt mask register of hfi device.
> > The field is updated while writing a value to register.
> 
> And this is not the typical solution.
> 
> Explain why adding a non-atomic variable to concurrent code doesn't just
> create races and a mess?
> 
> free_irq() is a synchronous fence. It returns once all the running IRQ
> handlers have exited and guarantees they will not be called again.
> 
> You are supposed to call it before destroying anything that the IRQ
> handler could be using.
> 
> Re-ordering destruction is the typical solution.
> 
> Jason


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after free_irq()
       [not found]             ` <14063C7AD467DE4B82DEDB5C278E86639F0E3917-AtyAts71sc88Ug9VwtkbtrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-12-20 21:11               ` Jason Gunthorpe
       [not found]                 ` <20171220211112.GG22908-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2017-12-20 21:11 UTC (permalink / raw
  To: Ruhl, Michael J
  Cc: Dalessandro, Dennis,
	dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Patel, Jay P,
	Sanchez, Sebastian

On Wed, Dec 20, 2017 at 09:01:17PM +0000, Ruhl, Michael J wrote:

> Note: reverting the PCI API patch will not make the crash addressed in:
> 
> IB/hfi1: Destroy link_wq workqueue after free_irq()
> 
> go away, but will make it more difficult for the issue to occur.

It is up to you.

Device removal races should be fixed, but it is often unlikely they
would impact any user.

So, abstractly, reverting a racy situation to another racy situation
doesn't seem like a worthwhile win to me.

But I do not want to merge this patch, which just makes the race
harder to detect with our debug tools for finding races.

Jason

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH for-next 07/11] IB/hfi1: Fix infinite loop in 8051 command error path
       [not found]                 ` <20171220181244.GD22908-uk2M96/98Pc@public.gmane.org>
@ 2017-12-20 22:24                   ` Sanchez, Sebastian
  0 siblings, 0 replies; 31+ messages in thread
From: Sanchez, Sebastian @ 2017-12-20 22:24 UTC (permalink / raw
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Dalessandro, Dennis,
	dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Ruhl, Michael J

>I guess he means why did you randomly reformat the whitespace around
>_do_8051_command just for removing a '_', and why did you reformat it
>to a non-kernel-standard formatting?

An updated patch will be resubmitted.

Thank you,

Sebastian

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after free_irq()
       [not found]                 ` <20171220211112.GG22908-uk2M96/98Pc@public.gmane.org>
@ 2017-12-22 13:13                   ` Ruhl, Michael J
  0 siblings, 0 replies; 31+ messages in thread
From: Ruhl, Michael J @ 2017-12-22 13:13 UTC (permalink / raw
  To: Jason Gunthorpe
  Cc: Dalessandro, Dennis,
	dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Patel, Jay P,
	Sanchez, Sebastian

> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Jason Gunthorpe
> Sent: Wednesday, December 20, 2017 4:11 PM
> To: Ruhl, Michael J <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Cc: Dalessandro, Dennis <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>;
> dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Patel, Jay P
> <jay.p.patel-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>; Sanchez, Sebastian <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Subject: Re: [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after
> free_irq()
> 
> On Wed, Dec 20, 2017 at 09:01:17PM +0000, Ruhl, Michael J wrote:
> 
> > Note: reverting the PCI API patch will not make the crash addressed in:
> >
> > IB/hfi1: Destroy link_wq workqueue after free_irq()
> >
> > go away, but will make it more difficult for the issue to occur.
> 
> It is up to you.
> 
> Device removal races should be fixed, but it is often unlikely they
> would impact any user.
> 
> So, abstractly, reverting a racy situation to another racy situation
> doesn't seem like a worthwhile win to me.
> 
> But I do not want to merge this patch, which just makes the race
> harder to detect with our debug tools for finding races.

Ok.  Will work toward getting the race condition addressed more completely.

Mike

 
> Jason

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 09/11] rdma: Update maintainer contact for Intel RDMA drivers
       [not found]     ` <20171219035711.2126.47130.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-12-19 20:51       ` Jason Gunthorpe
@ 2017-12-22 23:39       ` Jason Gunthorpe
  1 sibling, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2017-12-22 23:39 UTC (permalink / raw
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn

On Mon, Dec 18, 2017 at 07:57:13PM -0800, Dennis Dalessandro wrote:
> Ensure both Mike and I are listed as maintainer contacts for Intel's qib,
> hfi1, and rdmavt drivers.
> 
> Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

I took this one to for-next, others in the series had comments.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 02/11] IB/hfi1: Check return value of strchr before using it
       [not found]         ` <20171220082555.GN2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2018-01-03 15:05           ` Dennis Dalessandro
       [not found]             ` <f5849e2b-c8cd-b93b-f32f-f423bff9ae31-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Dennis Dalessandro @ 2018-01-03 15:05 UTC (permalink / raw
  To: Leon Romanovsky
  Cc: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl

On 12/20/2017 3:25 AM, Leon Romanovsky wrote:
> On Mon, Dec 18, 2017 at 07:56:23PM -0800, Dennis Dalessandro wrote:
>> The call to strchr in our counter initialization does not check the return
>> value before attempting to use the pointer. In theory this should not
>> happen given the way the code is structured but do the smart thing and
>> check the value anyway to harden the code.
> 
> The smartest way is to get rid of the whole "\n"<->"\0" logic and
> copy/paste mlx5 implementation which does the same thing but statically
> and much safer than here.
> 
> Thanks
> 

Not sure I'd agree. Is there something unsafe about the code here? The 
hole is plugged. Changing the entire implementation for a copy/paste job 
doesn't seem like a good thing to me.

-Denny

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 02/11] IB/hfi1: Check return value of strchr before using it
       [not found]             ` <f5849e2b-c8cd-b93b-f32f-f423bff9ae31-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2018-01-03 15:27               ` Leon Romanovsky
       [not found]                 ` <20180103152721.GT10145-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Leon Romanovsky @ 2018-01-03 15:27 UTC (permalink / raw
  To: Dennis Dalessandro
  Cc: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl

[-- Attachment #1: Type: text/plain, Size: 968 bytes --]

On Wed, Jan 03, 2018 at 10:05:56AM -0500, Dennis Dalessandro wrote:
> On 12/20/2017 3:25 AM, Leon Romanovsky wrote:
> > On Mon, Dec 18, 2017 at 07:56:23PM -0800, Dennis Dalessandro wrote:
> > > The call to strchr in our counter initialization does not check the return
> > > value before attempting to use the pointer. In theory this should not
> > > happen given the way the code is structured but do the smart thing and
> > > check the value anyway to harden the code.
> >
> > The smartest way is to get rid of the whole "\n"<->"\0" logic and
> > copy/paste mlx5 implementation which does the same thing but statically
> > and much safer than here.
> >
> > Thanks
> >
>
> Not sure I'd agree. Is there something unsafe about the code here? The hole
> is plugged. Changing the entire implementation for a copy/paste job doesn't
> seem like a good thing to me.

The names are static and can't be changed. IMHO, the whole
implementation is overkill.

Thanks

>
> -Denny

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 02/11] IB/hfi1: Check return value of strchr before using it
       [not found]                 ` <20180103152721.GT10145-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2018-01-03 15:42                   ` Dennis Dalessandro
       [not found]                     ` <4555c08f-a568-48ea-e183-2d49ebd36c7c-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Dennis Dalessandro @ 2018-01-03 15:42 UTC (permalink / raw
  To: Leon Romanovsky
  Cc: jgg-uk2M96/98Pc, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl

On 1/3/2018 10:27 AM, Leon Romanovsky wrote:
> On Wed, Jan 03, 2018 at 10:05:56AM -0500, Dennis Dalessandro wrote:
>> On 12/20/2017 3:25 AM, Leon Romanovsky wrote:
>>> On Mon, Dec 18, 2017 at 07:56:23PM -0800, Dennis Dalessandro wrote:
>>>> The call to strchr in our counter initialization does not check the return
>>>> value before attempting to use the pointer. In theory this should not
>>>> happen given the way the code is structured but do the smart thing and
>>>> check the value anyway to harden the code.
>>>
>>> The smartest way is to get rid of the whole "\n"<->"\0" logic and
>>> copy/paste mlx5 implementation which does the same thing but statically
>>> and much safer than here.
>>>
>>> Thanks
>>>
>>
>> Not sure I'd agree. Is there something unsafe about the code here? The hole
>> is plugged. Changing the entire implementation for a copy/paste job doesn't
>> seem like a good thing to me.
> 
> The names are static and can't be changed. IMHO, the whole
> implementation is overkill.

Now that I do agree with. It is certainly an area that needs to be improved upon.

-Denny

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 02/11] IB/hfi1: Check return value of strchr before using it
       [not found]                     ` <4555c08f-a568-48ea-e183-2d49ebd36c7c-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2018-01-05 17:39                       ` Doug Ledford
  0 siblings, 0 replies; 31+ messages in thread
From: Doug Ledford @ 2018-01-05 17:39 UTC (permalink / raw
  To: Dennis Dalessandro, Leon Romanovsky
  Cc: jgg-uk2M96/98Pc, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Michael J. Ruhl

[-- Attachment #1: Type: text/plain, Size: 1621 bytes --]

On Wed, 2018-01-03 at 10:42 -0500, Dennis Dalessandro wrote:
> On 1/3/2018 10:27 AM, Leon Romanovsky wrote:
> > On Wed, Jan 03, 2018 at 10:05:56AM -0500, Dennis Dalessandro wrote:
> > > On 12/20/2017 3:25 AM, Leon Romanovsky wrote:
> > > > On Mon, Dec 18, 2017 at 07:56:23PM -0800, Dennis Dalessandro wrote:
> > > > > The call to strchr in our counter initialization does not check the return
> > > > > value before attempting to use the pointer. In theory this should not
> > > > > happen given the way the code is structured but do the smart thing and
> > > > > check the value anyway to harden the code.
> > > > 
> > > > The smartest way is to get rid of the whole "\n"<->"\0" logic and
> > > > copy/paste mlx5 implementation which does the same thing but statically
> > > > and much safer than here.
> > > > 
> > > > Thanks
> > > > 
> > > 
> > > Not sure I'd agree. Is there something unsafe about the code here? The hole
> > > is plugged. Changing the entire implementation for a copy/paste job doesn't
> > > seem like a good thing to me.
> > 
> > The names are static and can't be changed. IMHO, the whole
> > implementation is overkill.
> 
> Now that I do agree with. It is certainly an area that needs to be improved upon.

As there is a better solution on the table, and, as you point out in the
original patch, this shouldn't happen, I'm dropping this patch and will
wait for the better solution to come along.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 00/11] IB/hfi1, rdmavt, qib: Driver updates for 12/18/2017
       [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (10 preceding siblings ...)
  2017-12-19  3:57   ` [PATCH for-next 11/11] IB/rdmavt: Add trace for RNRNAK timer Dennis Dalessandro
@ 2018-01-05 18:36   ` Doug Ledford
  11 siblings, 0 replies; 31+ messages in thread
From: Doug Ledford @ 2018-01-05 18:36 UTC (permalink / raw
  To: Dennis Dalessandro, jgg-uk2M96/98Pc
  Cc: Mike Marciniszyn, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Patel Jay P,
	Kaike Wan, Michael J. Ruhl, Sebastian Sanchez

[-- Attachment #1: Type: text/plain, Size: 3094 bytes --]

On Mon, 2017-12-18 at 19:56 -0800, Dennis Dalessandro wrote:
> Hi Jason and Doug,
> 
> Here is another set of patches to land for 4.16. Just driver changes and a
> small change for the MAINTAINERS file. We are getting rid of that ipath email
> and ensuring Mike and I both receive direct mails for all our drivers instead.
> 
> https://github.com/ddalessa/kernel/tree/for-4.16
> 
> ---
> 
> Dennis Dalessandro (2):
>       IB/hfi1: Check return value of strchr before using it
>       rdma: Update maintainer contact for Intel RDMA drivers
> 
> Kaike Wan (2):
>       IB/rdmavt: No need to cancel RNRNAK retry timer when it is running
>       IB/rdmavt: Add trace for RNRNAK timer
> 
> Michael J. Ruhl (3):
>       IB/{rdmavt,hfi1,qib}: Self determine driver name
>       IB/{rdmavt,hfi1,qib}: Remove get_card_name() downcall
>       IB/{hfi1,qib}: Fix a concurrency issue with device name in logging
> 
> Mike Marciniszyn (2):
>       IB/rdmavt: Use correct numa node for SRQ allocation
>       IB/rdmavt: Allocate CQ memory on the correct node
> 
> Patel Jay P (1):
>       IB/hfi1: Destroy link_wq workqueue after free_irq()
> 
> Sebastian Sanchez (1):
>       IB/hfi1: Fix infinite loop in 8051 command error path
> 
> 
>  MAINTAINERS                             |    4 +
>  drivers/infiniband/hw/hfi1/chip.c       |   98 +++++++++++++------------------
>  drivers/infiniband/hw/hfi1/chip.h       |    2 -
>  drivers/infiniband/hw/hfi1/driver.c     |   16 -----
>  drivers/infiniband/hw/hfi1/firmware.c   |   64 +++++---------------
>  drivers/infiniband/hw/hfi1/hfi.h        |   27 +++++----
>  drivers/infiniband/hw/hfi1/init.c       |   33 ++++++++--
>  drivers/infiniband/hw/hfi1/verbs.c      |   10 ++-
>  drivers/infiniband/hw/qib/qib.h         |    8 +--
>  drivers/infiniband/hw/qib/qib_driver.c  |   16 -----
>  drivers/infiniband/hw/qib/qib_eeprom.c  |    3 -
>  drivers/infiniband/hw/qib/qib_init.c    |    2 +
>  drivers/infiniband/hw/qib/qib_verbs.c   |    2 -
>  drivers/infiniband/sw/rdmavt/cq.c       |   10 ++-
>  drivers/infiniband/sw/rdmavt/qp.c       |    9 +--
>  drivers/infiniband/sw/rdmavt/srq.c      |   16 +++--
>  drivers/infiniband/sw/rdmavt/trace.h    |    4 +
>  drivers/infiniband/sw/rdmavt/trace_qp.h |   42 +++++++++++++
>  drivers/infiniband/sw/rdmavt/vt.c       |    1 
>  drivers/infiniband/sw/rdmavt/vt.h       |    6 +-
>  include/rdma/rdma_vt.h                  |   31 ++++++++--
>  21 files changed, 204 insertions(+), 200 deletions(-)

Hi Denny,

I went through this series.  Patch 1 had already been dropped.  I
dropped patch 2, deferring it until a better patch arrives.  Jason had
already taken patch 9.  Patch 7 was the only one with outstanding
comments, and it was just a formatting issue.  So, I took patches 3-8
and 10-11, fixed up the formatting in patch 7 myself, and these are now
applied.  Thanks!

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


Thread overview: 31+ messages
2017-12-19  3:56 [PATCH for-next 00/11] IB/hfi1, rdmavt, qib: Driver updates for 12/18/2017 Dennis Dalessandro
     [not found] ` <20171219034753.2126.78386.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-12-19  3:56   ` [PATCH for-next 01/11] IB/hfi1: Destroy link_wq workqueue after free_irq() Dennis Dalessandro
     [not found]     ` <20171219035612.2126.10447.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-12-19 20:57       ` Jason Gunthorpe
     [not found]         ` <20171219205754.GE14814-uk2M96/98Pc@public.gmane.org>
2017-12-20 21:01           ` Ruhl, Michael J
     [not found]             ` <14063C7AD467DE4B82DEDB5C278E86639F0E3917-AtyAts71sc88Ug9VwtkbtrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-12-20 21:11               ` Jason Gunthorpe
     [not found]                 ` <20171220211112.GG22908-uk2M96/98Pc@public.gmane.org>
2017-12-22 13:13                   ` Ruhl, Michael J
2017-12-19  3:56   ` [PATCH for-next 02/11] IB/hfi1: Check return value of strchr before using it Dennis Dalessandro
     [not found]     ` <20171219035621.2126.23093.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-12-20  8:25       ` Leon Romanovsky
     [not found]         ` <20171220082555.GN2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2018-01-03 15:05           ` Dennis Dalessandro
     [not found]             ` <f5849e2b-c8cd-b93b-f32f-f423bff9ae31-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2018-01-03 15:27               ` Leon Romanovsky
     [not found]                 ` <20180103152721.GT10145-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2018-01-03 15:42                   ` Dennis Dalessandro
     [not found]                     ` <4555c08f-a568-48ea-e183-2d49ebd36c7c-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2018-01-05 17:39                       ` Doug Ledford
2017-12-19  3:56   ` [PATCH for-next 03/11] IB/rdmavt: No need to cancel RNRNAK retry timer when it is running Dennis Dalessandro
2017-12-19  3:56   ` [PATCH for-next 04/11] IB/{rdmavt, hfi1, qib}: Self determine driver name Dennis Dalessandro
     [not found]     ` <20171219035635.2126.59763.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-12-19 20:59       ` Jason Gunthorpe
2017-12-19  3:56   ` [PATCH for-next 05/11] IB/{rdmavt, hfi1, qib}: Remove get_card_name() downcall Dennis Dalessandro
2017-12-19  3:56   ` [PATCH for-next 06/11] IB/rdmavt: Use correct numa node for SRQ allocation Dennis Dalessandro
     [not found]     ` <20171219035649.2126.1625.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-12-20  8:17       ` Leon Romanovsky
     [not found]         ` <20171220081720.GM2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-12-20  8:31           ` Leon Romanovsky
2017-12-19  3:56   ` [PATCH for-next 07/11] IB/hfi1: Fix infinite loop in 8051 command error path Dennis Dalessandro
     [not found]     ` <20171219035657.2126.88651.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-12-20  8:08       ` Leon Romanovsky
     [not found]         ` <20171220080854.GL2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-12-20 18:02           ` Sanchez, Sebastian
     [not found]             ` <5CDA63463B33C94CA80846587415F0772829387D-8oqHQFITsIGkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-12-20 18:12               ` Jason Gunthorpe
     [not found]                 ` <20171220181244.GD22908-uk2M96/98Pc@public.gmane.org>
2017-12-20 22:24                   ` Sanchez, Sebastian
2017-12-19  3:57   ` [PATCH for-next 08/11] IB/rdmavt: Allocate CQ memory on the correct node Dennis Dalessandro
2017-12-19  3:57   ` [PATCH for-next 09/11] rdma: Update maintainer contact for Intel RDMA drivers Dennis Dalessandro
     [not found]     ` <20171219035711.2126.47130.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-12-19 20:51       ` Jason Gunthorpe
2017-12-22 23:39       ` Jason Gunthorpe
2017-12-19  3:57   ` [PATCH for-next 10/11] IB/{hfi1, qib}: Fix a concurrency issue with device name in logging Dennis Dalessandro
2017-12-19  3:57   ` [PATCH for-next 11/11] IB/rdmavt: Add trace for RNRNAK timer Dennis Dalessandro
2018-01-05 18:36   ` [PATCH for-next 00/11] IB/hfi1, rdmavt, qib: Driver updates for 12/18/2017 Doug Ledford
