* rdma_cm NULL deref in 4.11.0+
@ 2017-05-21 13:59 Sagi Grimberg
       [not found] ` <acc72471-eb22-4474-04f5-db23227faadd-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Sagi Grimberg @ 2017-05-21 13:59 UTC (permalink / raw
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Just stepped on it,

Simple nvmf connect triggers it, is this known?
Also, rping client segfaults so librdmacm seems to be broken.

--
[   16.809498] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[   16.812570] IP: __radix_tree_lookup+0xe/0xf0
[   16.814172] PGD 0
[   16.814174] P4D 0

[   16.815052] Oops: 0000 [#1] SMP
[   16.815401] Modules linked in: nvme_loop nvme_fabrics nvme_core nvmet_rdma nvmet rdma_cm iw_cm null_blk mlx5_ib iscsi_target_mod ib_srpt ib_cm ib_core tcm_loop tcm_fc libfc tcm_qla2xxx qla2xxx scsi_transport_fc usb_f_tcm tcm_usb_gadget libcomposite udc_core vhost_scsi vhost target_core_file target_core_iblock target_core_pscsi target_core_mod configfs kvm_intel kvm irqbypass ppdev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd input_leds joydev serio_raw i2c_piix4 parport_pc parport mac_hid sunrpc autofs4 8139too cirrus ttm drm_kms_helper mlx5_core syscopyarea ptp sysfillrect psmouse sysimgblt fb_sys_fops pps_core drm floppy 8139cp mii pata_acpi
[   16.821972] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.11.0+ #158
[   16.822656] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   16.823630] Workqueue: ib_cm cm_work_handler [ib_cm]
[   16.824144] task: ffff8e013d9810c0 task.stack: ffff9afc801a4000
[   16.824754] RIP: 0010:__radix_tree_lookup+0xe/0xf0
[   16.825248] RSP: 0018:ffff9afc801a7b48 EFLAGS: 00010246
[   16.825791] RAX: ffff8e0135d70f80 RBX: ffff8e0137130a00 RCX: 0000000000000000
[   16.826497] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   16.827209] RBP: ffff9afc801a7b50 R08: ffff9afc801a7a48 R09: ffff8e0139b35030
[   16.827916] R10: 0000000000000000 R11: 0000000000000040 R12: ffff8e0137130a88
[   16.828631] R13: ffff8e0137130a88 R14: ffff8e0135786200 R15: ffff8e0137130c00
[   16.829317] FS:  0000000000000000(0000) GS:ffff8e013fc00000(0000) knlGS:0000000000000000
[   16.830084] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.830629] CR2: 0000000000000008 CR3: 000000000fe09000 CR4: 00000000003406f0
[   16.831278] Call Trace:
[   16.831511]  radix_tree_lookup+0xd/0x10
[   16.831865]  cma_ps_find+0x59/0x70 [rdma_cm]
[   16.832287]  cma_id_from_event+0xe8/0x5a0 [rdma_cm]
[   16.832734]  cma_req_handler+0x49/0x970 [rdma_cm]
[   16.833166]  ? cma_req_handler+0x49/0x970 [rdma_cm]
[   16.833612]  cm_process_work+0x25/0x120 [ib_cm]
[   16.834026]  ? cm_process_work+0x25/0x120 [ib_cm]
[   16.834455]  ? cm_get_bth_pkey.isra.36+0x3a/0xa0 [ib_cm]
[   16.834938]  cm_req_handler+0xad2/0xd30 [ib_cm]
[   16.835356]  cm_work_handler+0x196/0x16fa [ib_cm]
[   16.835785]  ? cm_work_handler+0x196/0x16fa [ib_cm]
[   16.836263]  process_one_work+0x156/0x3f0
[   16.836631]  worker_thread+0x4b/0x410
[   16.836969]  kthread+0x109/0x140
[   16.837268]  ? process_one_work+0x3f0/0x3f0
[   16.837650]  ? kthread_create_on_node+0x40/0x40
[   16.838070]  ret_from_fork+0x2c/0x40
[   16.838399] Code: ff 45 00 7e 03 e9 64 ff ff ff 4c 89 23 e9 0e ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 49 89 ca 41 bb 40 00 00 00 48 89 e5 53 <4c> 8b 47 08 4c 89 c0 83 e0 03 48 83 f8 01 0f 85 a9 00 00 00 4c
--
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
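
A note on reading the oops above: the faulting bytes marked <4c> 8b 47 08
in the Code: line, RDI = 0 and CR2 = 8 fit together. The sketch below
decodes just that one instruction form to show that it is a load from
8(%rdi) into %r8, so a NULL first argument (the tree root, by the x86-64
calling convention) would fault at address 8. This is an illustrative
decoder written for this note, not kernel code; struct mov_load,
decode_mov_load() and load_address() are made-up names, and only this
single encoding is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form of a "mov r64, [base + disp8]" instruction (hypothetical type). */
struct mov_load {
	int dst_reg;   /* destination register number, 0..15 (8 = r8) */
	int base_reg;  /* base register number, 0..15 (7 = rdi) */
	int8_t disp;   /* signed 8-bit displacement */
};

/*
 * Decode exactly one encoding: REX prefix with W=1, opcode 0x8b, and a
 * ModRM byte with mod == 01 (disp8) and no SIB byte.
 * Returns 0 on success, -1 for anything else.
 */
int decode_mov_load(const uint8_t *code, size_t len, struct mov_load *out)
{
	assert(code != NULL && out != NULL);

	if (len < 4)
		return -1;

	uint8_t rex = code[0], opcode = code[1], modrm = code[2];

	if ((rex & 0xf8) != 0x48)	/* REX layout is 0100WRXB; require W=1 */
		return -1;
	if (opcode != 0x8b)		/* mov r64, r/m64 */
		return -1;
	if ((modrm >> 6) != 1 || (modrm & 7) == 4)	/* disp8 form, no SIB */
		return -1;

	out->dst_reg = ((modrm >> 3) & 7) | ((rex & 0x04) ? 8 : 0);	/* REX.R */
	out->base_reg = (modrm & 7) | ((rex & 0x01) ? 8 : 0);		/* REX.B */
	out->disp = (int8_t)code[3];
	return 0;
}

/* Address the load touches, given the value held in the base register. */
uint64_t load_address(const struct mov_load *insn, uint64_t base_value)
{
	return base_value + (uint64_t)(int64_t)insn->disp;
}
```

Feeding it the bytes 4c 8b 47 08 gives destination register 8 (r8), base
register 7 (rdi) and displacement 8; with the base value 0 taken from the
RDI field of the register dump, load_address() yields 8, which matches CR2.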

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: rdma_cm NULL deref in 4.11.0+
       [not found] ` <acc72471-eb22-4474-04f5-db23227faadd-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2017-05-21 14:30   ` Parav Pandit
       [not found]     ` <VI1PR0502MB30083E6CB505CA90A9E047C0D1FB0-o1MPJYiShExKsLr+rGaxW8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Parav Pandit @ 2017-05-21 14:30 UTC (permalink / raw
  To: Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org


Hi Sagi,

Majd encountered the same issue some time back and reported it [1].
He has the fix and should be posting it soon.

Majd/Leon?

Parav

[1] https://www.spinics.net/lists/linux-rdma/msg49857.html



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: rdma_cm NULL deref in 4.11.0+
       [not found]     ` <VI1PR0502MB30083E6CB505CA90A9E047C0D1FB0-o1MPJYiShExKsLr+rGaxW8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-05-21 15:44       ` Leon Romanovsky
       [not found]         ` <20170521154443.GD17751-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2017-05-21 15:44 UTC (permalink / raw
  To: Parav Pandit
  Cc: Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org


On Sun, May 21, 2017 at 02:30:04PM +0000, Parav Pandit wrote:
> Hi Sagi,
>
> Majd encountered the same issue some time back and reported it [1].
> He has the fix and should be posting it soon.
>
> Majd/Leon?

The fix is in our rdma-rc branch.
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=f1fff656d55c52aeb12129f57347886b02f90e1d

I planned to submit it today.

Thanks



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: rdma_cm NULL deref in 4.11.0+
       [not found]         ` <20170521154443.GD17751-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-05-24 16:38           ` Jason Gunthorpe
       [not found]             ` <20170524163832.GA23034-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2017-05-24 16:38 UTC (permalink / raw
  To: Leon Romanovsky
  Cc: Parav Pandit, Sagi Grimberg,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Sun, May 21, 2017 at 06:44:43PM +0300, Leon Romanovsky wrote:
> On Sun, May 21, 2017 at 02:30:04PM +0000, Parav Pandit wrote:
> > Hi Sagi,
> >
> > Majd encountered the same issue some time back and reported it [1].
> > He has the fix and should be posting it soon.
> >
> > Majd/Leon?
> 
> The fix is in our rdma-rc branch.
> https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=f1fff656d55c52aeb12129f57347886b02f90e1d
> 
> I planned to submit it today.

Is someone going to look at the segfault in librdmacm?

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: rdma_cm NULL deref in 4.11.0+
       [not found]             ` <20170524163832.GA23034-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-05-24 18:52               ` Leon Romanovsky
       [not found]                 ` <20170524185256.GT17751-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
       [not found]                 ` <47078d0f-dc39-4a34-b641-0348877ca718@googlegroups.com>
  0 siblings, 2 replies; 12+ messages in thread
From: Leon Romanovsky @ 2017-05-24 18:52 UTC (permalink / raw
  To: Jason Gunthorpe
  Cc: Parav Pandit, Sagi Grimberg,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org


On Wed, May 24, 2017 at 10:38:32AM -0600, Jason Gunthorpe wrote:
> On Sun, May 21, 2017 at 06:44:43PM +0300, Leon Romanovsky wrote:
> > On Sun, May 21, 2017 at 02:30:04PM +0000, Parav Pandit wrote:
> > > Hi Sagi,
> > >
> > > Majd encountered the same issue some time back and reported it [1].
> > > He has the fix and should be posting it soon.
> > >
> > > Majd/Leon?
> >
> > The fix is in our rdma-rc branch.
> > https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=f1fff656d55c52aeb12129f57347886b02f90e1d
> >
> > I planned to submit it today.
>
> Is someone going to look at the segfault in librdmacm?

Does it reproduce with these two patches?
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=f1fff656d55c52aeb12129f57347886b02f90e1d
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=1d6af11df23f7df4963af7513a3dad109acbcd4c

We successfully ran our regression suite on my rdma-rc branch.
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=rdma-rc

>
> Jason


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: rdma_cm NULL deref in 4.11.0+
       [not found]                 ` <20170524185256.GT17751-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-05-24 19:19                   ` Jason Gunthorpe
       [not found]                     ` <20170524191900.GA25200-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2017-05-24 19:19 UTC (permalink / raw
  To: Leon Romanovsky
  Cc: Parav Pandit, Sagi Grimberg,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Wed, May 24, 2017 at 09:52:56PM +0300, Leon Romanovsky wrote:
> On Wed, May 24, 2017 at 10:38:32AM -0600, Jason Gunthorpe wrote:
> > On Sun, May 21, 2017 at 06:44:43PM +0300, Leon Romanovsky wrote:
> > > On Sun, May 21, 2017 at 02:30:04PM +0000, Parav Pandit wrote:
> > > > Hi Sagi,
> > > >
> > > > Majd encountered the same issue some time back and reported it [1].
> > > > He has the fix and should be posting it soon.
> > > >
> > > > Majd/Leon?
> > >
> > > The fix is in our rdma-rc branch.
> > > https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=f1fff656d55c52aeb12129f57347886b02f90e1d
> > >
> > > I planned to submit it today.
> >
> > Is someone going to look at the segfault in librdmacm?
> 
> Does it reproduce with these two patches?

I don't think librdmacm should segfault even if the kernel is
malfunctioning, should it?

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: rdma_cm NULL deref in 4.11.0+
       [not found]                     ` <20170524191900.GA25200-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-05-24 19:58                       ` Majd Dibbiny
       [not found]                         ` <F8149B59-E43D-4E54-A901-1351BED23946-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Majd Dibbiny @ 2017-05-24 19:58 UTC (permalink / raw
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Parav Pandit, Sagi Grimberg,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org


> On May 24, 2017, at 10:19 PM, Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote:
> 
>> On Wed, May 24, 2017 at 09:52:56PM +0300, Leon Romanovsky wrote:
>>> On Wed, May 24, 2017 at 10:38:32AM -0600, Jason Gunthorpe wrote:
>>>> On Sun, May 21, 2017 at 06:44:43PM +0300, Leon Romanovsky wrote:
>>>>> On Sun, May 21, 2017 at 02:30:04PM +0000, Parav Pandit wrote:
>>>>> Hi Sagi,
>>>>> 
>>>>> Majd encountered the same issue some time back and reported it [1].
>>>>> He has the fix and should be posting it soon.
>>>>> 
>>>>> Majd/Leon?
>>>> 
>>>> The fix is in our rdma-rc branch.
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=f1fff656d55c52aeb12129f57347886b02f90e1d
>>>> 
>>>> I planned to submit it today.
>>> 
>>> Is someone going to look at the segfault in librdmacm?
>> 
>> Does it reproduce with these two patches?
> 
> I don't think librdmacm should segfault even if the kernel is
> malfunctioning, should it?

The problem now is a kernel panic. Which segfault are you referring to?

> 
> Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: rdma_cm NULL deref in 4.11.0+
       [not found]                         ` <F8149B59-E43D-4E54-A901-1351BED23946-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-05-24 20:40                           ` Jason Gunthorpe
       [not found]                             ` <20170524204000.GA30878-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2017-05-24 20:40 UTC (permalink / raw
  To: Majd Dibbiny
  Cc: Leon Romanovsky, Parav Pandit, Sagi Grimberg,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Wed, May 24, 2017 at 07:58:17PM +0000, Majd Dibbiny wrote:
> 
> > On May 24, 2017, at 10:19 PM, Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > 
> >> On Wed, May 24, 2017 at 09:52:56PM +0300, Leon Romanovsky wrote:
> >>> On Wed, May 24, 2017 at 10:38:32AM -0600, Jason Gunthorpe wrote:
> >>>> On Sun, May 21, 2017 at 06:44:43PM +0300, Leon Romanovsky wrote:
> >>>>> On Sun, May 21, 2017 at 02:30:04PM +0000, Parav Pandit wrote:
> >>>>> Hi Sagi,
> >>>>> 
> >>>>> Majd encountered the same issue some time back and reported it [1].
> >>>>> He has the fix and should be posting it soon.
> >>>>> 
> >>>>> Majd/Leon?
> >>>> 
> >>>> The fix is in our rdma-rc branch.
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=f1fff656d55c52aeb12129f57347886b02f90e1d
> >>>> 
> >>>> I planned to submit it today.
> >>> 
> >>> Is someone going to look at the segfault in librdmacm?
> >> 
> >> Does it reproduce with these two patches?
> > 
> > I don't think librdmacm should segfault even if the kernel is
> > malfunctioning, should it?
> The problem now is a kernel panic. Which segfault are you referring
> to?

The original report referred to a user space seg fault, is that just
blowback from the kernel panic?

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: rdma_cm NULL deref in 4.11.0+
       [not found]                             ` <20170524204000.GA30878-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-05-24 21:13                               ` Hefty, Sean
       [not found]                                 ` <1828884A29C6694DAF28B7E6B8A82373AB120B2E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Hefty, Sean @ 2017-05-24 21:13 UTC (permalink / raw
  To: Jason Gunthorpe, Majd Dibbiny
  Cc: Leon Romanovsky, Parav Pandit, Sagi Grimberg,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

> The original report referred to a user space seg fault, is that just
> blowback from the kernel panic?

I think it was rping that was seg faulting, not librdmacm.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: rdma_cm NULL deref in 4.11.0+
       [not found]                                 ` <1828884A29C6694DAF28B7E6B8A82373AB120B2E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-05-24 21:21                                   ` Robert LeBlanc
       [not found]                                     ` <CAANLjFqb7CGJOu_T4_PhuJLOsX0mxTo6SO_R67daRHKpnMFdvQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Robert LeBlanc @ 2017-05-24 21:21 UTC (permalink / raw
  To: Hefty, Sean
  Cc: Jason Gunthorpe, Majd Dibbiny, Leon Romanovsky, Parav Pandit,
	Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Wed, May 24, 2017 at 3:13 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> The original report referred to a user space seg fault, is that just
>> blowback from the kernel panic?
>
> I think it was rping that was seg faulting, not librdmacm.

If this is in regard to "rdma_cm segfaults on RoCE with ConnectX-4
[WAS: Re: rping segfault with 4.9.28 on CentOS 7.3]", I think we have
narrowed it down to the node GUID being '0' or something along those
lines. We are still digging into it. We are not getting a kernel
backtrace when librdmacm segfaults.

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: rdma_cm NULL deref in 4.11.0+
       [not found]                                     ` <CAANLjFqb7CGJOu_T4_PhuJLOsX0mxTo6SO_R67daRHKpnMFdvQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-05-24 21:37                                       ` Hefty, Sean
  0 siblings, 0 replies; 12+ messages in thread
From: Hefty, Sean @ 2017-05-24 21:37 UTC (permalink / raw
  To: Robert LeBlanc
  Cc: Jason Gunthorpe, Majd Dibbiny, Leon Romanovsky, Parav Pandit,
	Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org


> If this is in regard to "rdma_cm segfaults on RoCE with ConnectX-4
> [WAS: Re: rping segfault with 4.9.28 on CentOS 7.3]", I think we have
> narrowed it down to the node GUID being '0' or something along those
> lines. We are still digging into it. We are not getting a kernel
> backtrace when librdmacm segfaults.

librdmacm associates user-space IDs with devices based on the node GUID, and the check that makes this association tests whether the node GUID from the kernel is non-zero.  There may be an assumption further on in the code that a device has been assigned when in fact none was, and no error was reported.

This could very well be coming from an error in the kernel not reporting the node_guid correctly, which prevents the librdmacm from making the expected device association.

- Sean
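
The failure mode described above can be sketched roughly as follows. This
is a hypothetical model, not librdmacm source: demo_device, demo_id,
demo_find_device() and demo_bind_id() are names invented for illustration.
It shows a lookup keyed on the node GUID that treats zero as "no device",
and a bind step that reports an error instead of silently leaving the id
without a device for later code to trip over.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a verbs device and a user-space CM id. */
struct demo_device {
	uint64_t node_guid;
	const char *name;
};

struct demo_id {
	struct demo_device *dev;	/* NULL until a device is associated */
};

/*
 * Find the device whose node GUID matches. A GUID of zero is treated as
 * "the kernel did not report a device" and never matches anything.
 */
struct demo_device *demo_find_device(struct demo_device *devs, size_t n,
				     uint64_t node_guid)
{
	assert(devs != NULL || n == 0);

	if (node_guid == 0)
		return NULL;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].node_guid == node_guid)
			return &devs[i];
	}
	return NULL;
}

/*
 * Associate an id with a device. Returning -ENODEV here makes the missing
 * association visible to the caller, rather than leaving id->dev NULL and
 * letting a later dereference crash.
 */
int demo_bind_id(struct demo_id *id, struct demo_device *devs, size_t n,
		 uint64_t guid_from_kernel)
{
	struct demo_device *dev = demo_find_device(devs, n, guid_from_kernel);

	if (dev == NULL) {
		id->dev = NULL;
		return -ENODEV;
	}
	id->dev = dev;
	return 0;
}
```

With a zero GUID the bind step fails with -ENODEV, so a caller that checks
the return value never goes on to use an id that has no device.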

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: rdma_cm NULL deref in 4.11.0+
       [not found]                   ` <47078d0f-dc39-4a34-b641-0348877ca718-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-05-29  5:27                     ` Leon Romanovsky
  0 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2017-05-29  5:27 UTC (permalink / raw
  To: sbranden-Re5JQEeQqe8AvxtiuMwx3w
  Cc: ml-mirrors, jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	parav-VPRAkNaXOzVWk0Htik3J/w, sagi-NQWnxTmZq1alnMjI0IkVqw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA


On Sun, May 28, 2017 at 01:54:59PM -0700, sbranden-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> Hi Leon,
>
> On Wednesday, May 24, 2017 at 11:53:07 AM UTC-7, Leon Romanovsky wrote:
> >
> > On Wed, May 24, 2017 at 10:38:32AM -0600, Jason Gunthorpe wrote:
> > > On Sun, May 21, 2017 at 06:44:43PM +0300, Leon Romanovsky wrote:
> > > > On Sun, May 21, 2017 at 02:30:04PM +0000, Parav Pandit wrote:
> > > > > Hi Sagi,
> > > > >
> > > > > > Majd encountered the same issue some time back and reported it [1].
> > > > > > He has the fix and should be posting it soon.
> > > > >
> > > > > Majd/Leon?
> > > >
> > > > The fix is in our rdma-rc branch.
> > > >
> > https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=f1fff656d55c52aeb12129f57347886b02f90e1d
> > > >
> > > > I planned to submit it today.
> > >
> > > Is someone going to look at the segfault in librdmacm?
> >
> > Does it reproduce with these two patches?
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=f1fff656d55c52aeb12129f57347886b02f90e1d
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=1d6af11df23f7df4963af7513a3dad109acbcd4c
> >
> > We successfully ran our regression suite on my rdma-rc branch.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=rdma-rc
> >
> Are these patches going to make it into linux-next soon or do they need
> any more changes?

From my point of view, they are ready.
I submitted them to the ML [1, 2] and they are supposed to be added to linux-next and forwarded to Linus by Doug.

[1] https://patchwork.kernel.org/patch/9739177/
[2] https://patchwork.kernel.org/patch/9739173/

Thanks


>
> > >
> > > Jason
> >
>
> Thanks,
>  Scott



^ permalink raw reply	[flat|nested] 12+ messages in thread
