From mboxrd@z Thu Jan 1 00:00:00 1970
From: Santosh Shilimkar
Subject: [PATCH 2/6] RDS: Fix rds MR reference count in rds_rdma_unuse()
Date: Tue, 25 Aug 2015 12:01:59 -0700
Message-ID: <1440529323-4171-3-git-send-email-santosh.shilimkar@oracle.com>
References: <1440529323-4171-1-git-send-email-santosh.shilimkar@oracle.com>
Cc: linux-kernel@vger.kernel.org, davem@davemloft.net, ssantosh@kernel.org, Santosh Shilimkar
To: netdev@vger.kernel.org
Return-path:
In-Reply-To: <1440529323-4171-1-git-send-email-santosh.shilimkar@oracle.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

rds_rdma_unuse() drops an MR reference count that it has not taken.
The correct way to remove an MR is to remove it from the tree, call
rds_destroy_mr() on it, and then call rds_mr_put() to decrement its
reference count. Whichever thread holds the last reference will free
the MR via rds_mr_put().

This bug was triggering weird NULL pointer crashes. One of the traces
is captured below.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000104
IP: [] rds_ib_free_mr+0x31/0x130 [rds_rdma]
PGD 4366fa067 PUD 4366f9067 PMD 0
Oops: 0000 [#1] SMP
[...]
task: ffff88046da6a000 ti: ffff88046da6c000 task.ti: ffff88046da6c000
RIP: 0010:[] [] rds_ib_free_mr+0x31/0x130 [rds_rdma]
RSP: 0018:ffff88046fa43bd8 EFLAGS: 00010286
RAX: 0000000071d38b80 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880079e7ff40
RBP: ffff88046fa43bf8 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88046fa43ca8 R11: ffff88046a802ed8 R12: ffff880079e7fa40
R13: 0000000000000000 R14: ffff880079e7ff40 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88046fa40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000104 CR3: 00000004366fb000 CR4: 00000000000006e0
Stack:
 ffff880079e7fa40 ffff880671d38f08 ffff880079e7ff40 0000000000000296
 ffff88046fa43c28 ffffffffa087a38b ffff880079e7fa40 ffff880671d38f10
 0000000000000000 0000000000000292 ffff88046fa43c48 ffffffffa087a3b6
Call Trace:
 [] rds_destroy_mr+0x8b/0xa0 [rds]
 [] __rds_put_mr_final+0x16/0x30 [rds]
 [] rds_rdma_unuse+0xc2/0x120 [rds]
 [] rds_recv_incoming_exthdrs+0x83/0xa0 [rds]
 [] rds_recv_incoming+0x92/0x200 [rds]
 [] rds_ib_process_recv+0x259/0x320 [rds_rdma]
 [] rds_ib_recv_tasklet_fn+0x1a8/0x490 [rds_rdma]
 [] ? __remove_hrtimer+0x58/0x90
 [] tasklet_action+0xb1/0xc0
 [] __do_softirq+0xe2/0x290
 [] irq_exit+0xa6/0xb0
 [] do_IRQ+0x65/0xf0
 [] common_interrupt+0x6b/0x6b

Signed-off-by: Santosh Shilimkar
Signed-off-by: Santosh Shilimkar
---
 net/rds/rdma.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/rds/rdma.c b/net/rds/rdma.c
index c1df9b1..4c93bad 100644
--- a/net/rds/rdma.c
+++ b/net/rds/rdma.c
@@ -435,9 +435,10 @@ void rds_rdma_unuse(struct rds_sock *rs, u32 r_key, int force)
 
 	/* If the MR was marked as invalidate, this will
 	 * trigger an async flush. */
-	if (zot_me)
+	if (zot_me) {
 		rds_destroy_mr(mr);
-	rds_mr_put(mr);
+		rds_mr_put(mr);
+	}
 }
 
 void rds_rdma_free_op(struct rm_rdma_op *ro)
-- 
1.9.1
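
For readers following the reference counting argument, here is a
minimal userspace sketch of the two flows. It is only a model, not
kernel code: struct mr_model and all the mr_model_*/unuse_* names are
hypothetical, and it assumes that the lookup tree holds exactly one
reference on the MR and that rds_rdma_unuse() takes none of its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of an MR; not the kernel's struct rds_mr. */
struct mr_model {
	int refcount;   /* references currently held */
	bool in_tree;   /* still reachable through a key lookup */
	bool destroyed; /* transport resources released */
	bool freed;     /* memory released by the final put */
};

/* A registered MR: the tree holds the only reference (assumption). */
void mr_model_init(struct mr_model *mr)
{
	mr->refcount = 1;
	mr->in_tree = true;
	mr->destroyed = false;
	mr->freed = false;
}

/* Stand-in for rds_destroy_mr(): release transport resources. */
void mr_model_destroy(struct mr_model *mr)
{
	mr->destroyed = true;
}

/* Stand-in for rds_mr_put(): whoever drops the last reference frees. */
void mr_model_put(struct mr_model *mr)
{
	assert(mr->refcount > 0);
	if (--mr->refcount == 0)
		mr->freed = true;
}

/* Flow before the patch: the put runs even if the MR stays in the tree. */
void unuse_before(struct mr_model *mr, bool zot_me)
{
	if (zot_me) {
		mr->in_tree = false; /* models the removal from the tree */
		mr_model_destroy(mr);
	}
	mr_model_put(mr); /* drops a reference this path never took */
}

/* Flow after the patch: the put only drops the tree's reference. */
void unuse_after(struct mr_model *mr, bool zot_me)
{
	if (zot_me) {
		mr->in_tree = false; /* models the removal from the tree */
		mr_model_destroy(mr);
		mr_model_put(mr);
	}
}
```

In this model, calling unuse_before() with zot_me false leaves the MR
freed while it is still in the tree, so a later lookup would return
freed memory. unuse_after() with zot_me false leaves the count
untouched, and with zot_me true it removes, destroys and frees the MR
exactly once.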