[PATCH] af_packet: Raw socket destruction warning fix

LKML Archive mirror
 help / color / mirror / Atom feed

* [PATCH] af_packet: Raw socket destruction warning fix
@ 2016-01-18  6:37 Maninder Singh
  2016-01-18  9:44 ` Daniel Borkmann
  0 siblings, 1 reply; 11+ messages in thread
From: Maninder Singh @ 2016-01-18  6:37 UTC (permalink / raw)
  To: davem, willemb, daniel, edumazet, eyal.birger, tklauser, fruggeri,
	dwmw2, netdev, linux-kernel
  Cc: pankaj.m, gh007.kim, hakbong5.lee, Maninder Singh, Vaneet Narang

Receieve queue is not purged when socket dectruction is called
results in kernel warning because of non zero sk_rmem_alloc.

WARNING: at net/packet/af_packet.c:1142 packet_sock_destruct

Backtrace:
WARN_ON(atomic_read(&sk->sk_rmem_alloc)
packet_sock_destruct
__sk_free
sock_wfree
skb_release_head_state
skb_release_all
__kfree_skb
net_tx_action
__do_softirq
run_ksoftirqd

Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
---
 net/packet/af_packet.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 81b4b81..bcb37ba 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1310,6 +1310,7 @@ static bool packet_rcv_has_room(struct packet_sock *po, struct sk_buff *skb)
 
 static void packet_sock_destruct(struct sock *sk)
 {
+	skb_queue_purge(&sk->sk_receive_queue);
 	skb_queue_purge(&sk->sk_error_queue);
 
 	WARN_ON(atomic_read(&sk->sk_rmem_alloc));
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
  2016-01-18  6:37 Maninder Singh
@ 2016-01-18  9:44 ` Daniel Borkmann
  2016-01-18 10:29   ` Daniel Borkmann
  2016-02-05 11:26   ` Eric Dumazet
  0 siblings, 2 replies; 11+ messages in thread
From: Daniel Borkmann @ 2016-01-18  9:44 UTC (permalink / raw)
  To: Maninder Singh
  Cc: davem, willemb, edumazet, eyal.birger, tklauser, fruggeri, dwmw2,
	netdev, linux-kernel, pankaj.m, gh007.kim, hakbong5.lee,
	Vaneet Narang

On 01/18/2016 07:37 AM, Maninder Singh wrote:
> Receieve queue is not purged when socket dectruction is called
> results in kernel warning because of non zero sk_rmem_alloc.
>
> WARNING: at net/packet/af_packet.c:1142 packet_sock_destruct
>
> Backtrace:
> WARN_ON(atomic_read(&sk->sk_rmem_alloc)
> packet_sock_destruct
> __sk_free
> sock_wfree
> skb_release_head_state
> skb_release_all
> __kfree_skb
> net_tx_action
> __do_softirq
> run_ksoftirqd
>
> Signed-off-by: Vaneet Narang <v.narang@samsung.com>
> Signed-off-by: Maninder Singh <maninder1.s@samsung.com>

Thanks for the fix. While it fixes the WARN_ON(), I believe some more
investigation is needed here on why it is happening:

We call first into packet_release(), which removes the socket hook from
the kernel (unregister_prot_hook()), later calls synchronize_net() to
make sure no more skbs will come in. The receive queue is purged right
after the synchronize_net() already.

packet_sock_destruct() will be called afterwards, when there are no more
refs on the socket anymore and no af_packet skbs in tx waiting for completion.
Only then, in sk_destruct(), we'll call into packet_sock_destruct().

So, eventually double purging the sk_receive_queue seems not the right
thing to do at first look, and w/o any deeper analysis in the commit description.

Could you look a bit further into the issue? Do you have a reproducer to
trigger it?

Thanks,
Daniel

> ---
>   net/packet/af_packet.c |    1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 81b4b81..bcb37ba 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1310,6 +1310,7 @@ static bool packet_rcv_has_room(struct packet_sock *po, struct sk_buff *skb)
>
>   static void packet_sock_destruct(struct sock *sk)
>   {
> +	skb_queue_purge(&sk->sk_receive_queue);
>   	skb_queue_purge(&sk->sk_error_queue);
>
>   	WARN_ON(atomic_read(&sk->sk_rmem_alloc));
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
@ 2016-01-18 10:11 Vaneet Narang
  2016-01-18 11:08 ` Daniel Borkmann
  0 siblings, 1 reply; 11+ messages in thread
From: Vaneet Narang @ 2016-01-18 10:11 UTC (permalink / raw)
  To: Daniel Borkmann, Maninder Singh
  Cc: davem@davemloft.net, willemb@google.com, edumazet@google.com,
	eyal.birger@gmail.com, tklauser@distanz.ch,
	fruggeri@aristanetworks.com, dwmw2@infradead.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	PANKAJ MISHRA, Geon-ho Kim, Hak-Bong Lee

Hi,

>> __do_softirq
>> run_ksoftirqd
>>
>> Signed-off-by: Vaneet Narang <v.narang@samsung.com>
>> Signed-off-by: Maninder Singh <maninder1.s@samsung.com>

> Thanks for the fix. While it fixes the WARN_ON(), I believe some more
> investigation is needed here on why it is happening:
>
> We call first into packet_release(), which removes the socket hook from
> the kernel (unregister_prot_hook()), later calls synchronize_net() to
> make sure no more skbs will come in. The receive queue is purged right
> after the synchronize_net() already.
>
> packet_sock_destruct() will be called afterwards, when there are no more
> refs on the socket anymore and no af_packet skbs in tx waiting for completion.
> Only then, in sk_destruct(), we'll call into packet_sock_destruct().
>
> So, eventually double purging the sk_receive_queue seems not the right
> thing to do at first look, and w/o any deeper analysis in the commit description.
>
> Could you look a bit further into the issue? Do you have a reproducer to
> trigger it?


It is Suspend Resume scenario and in this case close(sock_id) is 
not called and hence packet_release is also not called. 
In case of suspend, driver power down its ethernet port and release all the
sk_buff stored in RX and TX ring. driver calls dev_kfree_skb_any to release all
the sk_buff in tx ring and if last tx buff of socket is called then 
packet_sock_destruct() will be invoked and will result in warning if and recevive sk_buff is 
still in receive queue. 

Driver calls dev_kfree_skb_any->dev_kfree_skb_irq 
and it adds buffer in completion queue to free and raises softirq NET_TX_SOFTIRQ

net_tx_action->__kfree_skb->skb_release_all->skb_release_head_state->sock_wfree->
__sk_free->packet_sock_destruct 

Also purging of receive queue has been taken care in other protocols.
// IP protocol
void inet_sock_destruct(struct sock *sk)
{
        struct inet_sock *inet = inet_sk(sk);

        __skb_queue_purge(&sk->sk_receive_queue);   // Purge Receive queue
        __skb_queue_purge(&sk->sk_error_queue);

.... 

         WARN_ON(atomic_read(&sk->sk_rmem_alloc));
        WARN_ON(atomic_read(&sk->sk_wmem_alloc));
 }

So i think it should be done in Raw sockets also.

>> ---
>>   net/packet/af_packet.c |    1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index 81b4b81..bcb37ba 100644

Thanks
Vaneet Narang

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
  2016-01-18  9:44 ` Daniel Borkmann
@ 2016-01-18 10:29   ` Daniel Borkmann
  2016-02-05 11:26   ` Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread
From: Daniel Borkmann @ 2016-01-18 10:29 UTC (permalink / raw)
  To: Maninder Singh
  Cc: davem, willemb, edumazet, eyal.birger, tklauser, fruggeri, dwmw2,
	netdev, linux-kernel, pankaj.m, gh007.kim, hakbong5.lee,
	Vaneet Narang

On 01/18/2016 10:44 AM, Daniel Borkmann wrote:
> On 01/18/2016 07:37 AM, Maninder Singh wrote:
>> Receieve queue is not purged when socket dectruction is called
>> results in kernel warning because of non zero sk_rmem_alloc.
>>
>> WARNING: at net/packet/af_packet.c:1142 packet_sock_destruct
>>
>> Backtrace:
>> WARN_ON(atomic_read(&sk->sk_rmem_alloc)
>> packet_sock_destruct
>> __sk_free
>> sock_wfree
>> skb_release_head_state
>> skb_release_all
>> __kfree_skb
>> net_tx_action
>> __do_softirq
>> run_ksoftirqd
>>
>> Signed-off-by: Vaneet Narang <v.narang@samsung.com>
>> Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
>
> Thanks for the fix. While it fixes the WARN_ON(), I believe some more
> investigation is needed here on why it is happening:
>
> We call first into packet_release(), which removes the socket hook from
> the kernel (unregister_prot_hook()), later calls synchronize_net() to
> make sure no more skbs will come in. The receive queue is purged right
> after the synchronize_net() already.
>
> packet_sock_destruct() will be called afterwards, when there are no more
> refs on the socket anymore and no af_packet skbs in tx waiting for completion.

(...and in your above case, there seem to have been some skbs in tx from
{t,}packet_snd(), as we call __sk_free() via kfree_skb() (-> sock_wfree()).)

> Only then, in sk_destruct(), we'll call into packet_sock_destruct().
>
> So, eventually double purging the sk_receive_queue seems not the right
> thing to do at first look, and w/o any deeper analysis in the commit description.
>
> Could you look a bit further into the issue? Do you have a reproducer to
> trigger it?
>
> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
  2016-01-18 10:11 Vaneet Narang
@ 2016-01-18 11:08 ` Daniel Borkmann
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Borkmann @ 2016-01-18 11:08 UTC (permalink / raw)
  To: v.narang, Maninder Singh
  Cc: davem@davemloft.net, willemb@google.com, edumazet@google.com,
	eyal.birger@gmail.com, tklauser@distanz.ch,
	fruggeri@aristanetworks.com, dwmw2@infradead.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	PANKAJ MISHRA, Geon-ho Kim, Hak-Bong Lee

On 01/18/2016 11:11 AM, Vaneet Narang wrote:
> Hi,
>
>>> __do_softirq
>>> run_ksoftirqd
>>>
>>> Signed-off-by: Vaneet Narang <v.narang@samsung.com>
>>> Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
>
>> Thanks for the fix. While it fixes the WARN_ON(), I believe some more
>> investigation is needed here on why it is happening:
>>
>> We call first into packet_release(), which removes the socket hook from
>> the kernel (unregister_prot_hook()), later calls synchronize_net() to
>> make sure no more skbs will come in. The receive queue is purged right
>> after the synchronize_net() already.
>>
>> packet_sock_destruct() will be called afterwards, when there are no more
>> refs on the socket anymore and no af_packet skbs in tx waiting for completion.
>> Only then, in sk_destruct(), we'll call into packet_sock_destruct().
>>
>> So, eventually double purging the sk_receive_queue seems not the right
>> thing to do at first look, and w/o any deeper analysis in the commit description.
>>
>> Could you look a bit further into the issue? Do you have a reproducer to
>> trigger it?
>
> It is Suspend Resume scenario and in this case close(sock_id) is
> not called and hence packet_release is also not called.
> In case of suspend, driver power down its ethernet port and release all the
> sk_buff stored in RX and TX ring. driver calls dev_kfree_skb_any to release all
> the sk_buff in tx ring and if last tx buff of socket is called then
> packet_sock_destruct() will be invoked and will result in warning if and recevive sk_buff is
> still in receive queue.

Hmm, not quite. See 2b85a34e911b ("net: No more expensive sock_hold()/sock_put()
on each tx") on how it is supposed to work.

See packet_create(): sk_alloc() inits sk_wmem_alloc to 1, sock_init_data() sets
sk_refcnt to 1. sock_hold()/__sock_put() pair in packet sock is managed when we
register/unregister proto hooks.

The other sock_put() in packet_release() to drop the final ref and call into
sk_free(), which drops the 1 ref on the sk_wmem_alloc from init time. Since you
got into __sk_free() via sock_wfree() destructor, your socket must have invoked
packet_release() prior to this (perhaps kernel destroying the process).

What kernel do you use?

> Driver calls dev_kfree_skb_any->dev_kfree_skb_irq
> and it adds buffer in completion queue to free and raises softirq NET_TX_SOFTIRQ
>
> net_tx_action->__kfree_skb->skb_release_all->skb_release_head_state->sock_wfree->
> __sk_free->packet_sock_destruct
>
> Also purging of receive queue has been taken care in other protocols.
> // IP protocol
> void inet_sock_destruct(struct sock *sk)
> {
>          struct inet_sock *inet = inet_sk(sk);
>
>          __skb_queue_purge(&sk->sk_receive_queue);   // Purge Receive queue
>          __skb_queue_purge(&sk->sk_error_queue);
>
> ....
>
>           WARN_ON(atomic_read(&sk->sk_rmem_alloc));
>          WARN_ON(atomic_read(&sk->sk_wmem_alloc));
>   }
>
> So i think it should be done in Raw sockets also.
>
>>> ---
>>>    net/packet/af_packet.c |    1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>>> index 81b4b81..bcb37ba 100644
>
> Thanks
> Vaneet Narang
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
@ 2016-01-21 11:40 Maninder Singh
  2016-01-26  0:13 ` Daniel Borkmann
  0 siblings, 1 reply; 11+ messages in thread
From: Maninder Singh @ 2016-01-21 11:40 UTC (permalink / raw)
  To: Daniel Borkmann, Vaneet Narang
  Cc: davem@davemloft.net, willemb@google.com, edumazet@google.com,
	eyal.birger@gmail.com, tklauser@distanz.ch,
	fruggeri@aristanetworks.com, dwmw2@infradead.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	PANKAJ MISHRA, Geon-ho Kim, Hak-Bong Lee

Hi Daniel,

>The other sock_put() in packet_release() to drop the final ref and call into
>sk_free(), which drops the 1 ref on the sk_wmem_alloc from init time. Since you
>got into __sk_free() via sock_wfree() destructor, your socket must have invoked
>packet_release() prior to this (perhaps kernel destroying the process).
>
>What kernel do you use?

Issue is coming for 3.10.58.

>> Driver calls dev_kfree_skb_any->dev_kfree_skb_irq
>> and it adds buffer in completion queue to free and raises softirq NET_TX_SOFTIRQ
>>
>> net_tx_action->__kfree_skb->skb_release_all->skb_release_head_state->sock_wfree->
>> __sk_free->packet_sock_destruct
>>
>> Also purging of receive queue has been taken care in other protocols.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
  2016-01-21 11:40 Maninder Singh
@ 2016-01-26  0:13 ` Daniel Borkmann
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Borkmann @ 2016-01-26  0:13 UTC (permalink / raw)
  To: maninder1.s, Vaneet Narang
  Cc: davem@davemloft.net, willemb@google.com, edumazet@google.com,
	eyal.birger@gmail.com, tklauser@distanz.ch,
	fruggeri@aristanetworks.com, dwmw2@infradead.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	PANKAJ MISHRA, Geon-ho Kim, Hak-Bong Lee

On 01/21/2016 12:40 PM, Maninder Singh wrote:
>> The other sock_put() in packet_release() to drop the final ref and call into
>> sk_free(), which drops the 1 ref on the sk_wmem_alloc from init time. Since you
>> got into __sk_free() via sock_wfree() destructor, your socket must have invoked
>> packet_release() prior to this (perhaps kernel destroying the process).
>>
>> What kernel do you use?
>
> Issue is coming for 3.10.58.

[ sorry for late reply ]

What driver are you using (is that in-tree)? Can you reproduce the same issue
with a latest -net kernel, for example (or, a 'reasonably' recent one like 4.3 or
4.4)? There has been quite a bit of changes in err queue handling (which also
accounts rmem) as well. How reliably can you trigger the issue? Does it trigger
with a completely different in-tree network driver as well with your tests? Would
be useful to track/debug sk_rmem_alloc increases/decreases to see from which path
new rmem is being charged in the time between packet_release() and packet_sock_destruct()
for that socket ...

>>> Driver calls dev_kfree_skb_any->dev_kfree_skb_irq
>>> and it adds buffer in completion queue to free and raises softirq NET_TX_SOFTIRQ
>>>
>>> net_tx_action->__kfree_skb->skb_release_all->skb_release_head_state->sock_wfree->
>>> __sk_free->packet_sock_destruct
>>>
>>> Also purging of receive queue has been taken care in other protocols.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
@ 2016-02-05  5:23 Vaneet Narang
  0 siblings, 0 replies; 11+ messages in thread
From: Vaneet Narang @ 2016-02-05  5:23 UTC (permalink / raw)
  To: Daniel Borkmann, Maninder Singh
  Cc: davem@davemloft.net, willemb@google.com, edumazet@google.com,
	eyal.birger@gmail.com, tklauser@distanz.ch,
	fruggeri@aristanetworks.com, dwmw2@infradead.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	PANKAJ MISHRA, Geon-ho Kim, Hak-Bong Lee

Hi,

>>
>> Issue is coming for 3.10.58.
>

Sorry for late reply, we were trying to reproduce the issue but its not that frequent.

>What driver are you using (is that in-tree)?

we are facing issue with wireless driver. Is it possible that wireless driver
can cause any issue because rmem accounting is done by kernel.

>Can you reproduce the same issue
>with a latest -net kernel, for example (or, a 'reasonably' recent one like 4.3 or
>4.4)? There has been quite a bit of changes in err queue handling (which also
>accounts rmem) as well.

It difficult to port on 4.3 as of now but can you suggest some patches which can
be helpful. One more thing, can https://lkml.org/lkml/2015/9/15/634. change 
in linux kernel cause this issue. Just an observation (could be incorrect), after 
including this change we are facing this issue.

>How reliably can you trigger the issue? Does it trigger
>with a completely different in-tree network driver as well with your tests?

This issue is not easily reproducible. This issue gets reproduced in long term
testing. Yes wireless network driver is not in tree.

>Would be useful to track/debug sk_rmem_alloc increases/decreases to see from which path
>new rmem is being charged in the time between packet_release() and packet_sock_destruct()
>for that socket ...

As i mentioned, its not easily reproducible but we have added some debug patch in 
packet_sock_destruct to check the value of rmem_alloc. 
So as per logs, At the entry point rmem_alloc was 0 but after error queue purge 
it becomes some 576(seems a new packet added). Not sure which queue. 
Is it possible that kernel can still add packets in receive queue when socket is already closed, 
can you point the kernel code where this is avoided or is there any way this gets added to error
queue. 
As per my understanding rmem_alloc gets increased only if packets gets added to receive queue
or error queue. Is it any other queue which also changes rmem_alloc?

Thanks,
Vaneet Narang
.....................

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
  2016-01-18  9:44 ` Daniel Borkmann
  2016-01-18 10:29   ` Daniel Borkmann
@ 2016-02-05 11:26   ` Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2016-02-05 11:26 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Maninder Singh, davem, willemb, edumazet, eyal.birger, tklauser,
	fruggeri, dwmw2, netdev, linux-kernel, pankaj.m, gh007.kim,
	hakbong5.lee, Vaneet Narang

On Mon, 2016-01-18 at 10:44 +0100, Daniel Borkmann wrote:
> On 01/18/2016 07:37 AM, Maninder Singh wrote:
> > Receieve queue is not purged when socket dectruction is called
> > results in kernel warning because of non zero sk_rmem_alloc.
> >
> > WARNING: at net/packet/af_packet.c:1142 packet_sock_destruct
> >
> > Backtrace:
> > WARN_ON(atomic_read(&sk->sk_rmem_alloc)
> > packet_sock_destruct
> > __sk_free
> > sock_wfree
> > skb_release_head_state
> > skb_release_all
> > __kfree_skb
> > net_tx_action
> > __do_softirq
> > run_ksoftirqd
> >
> > Signed-off-by: Vaneet Narang <v.narang@samsung.com>
> > Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
> 
> Thanks for the fix. While it fixes the WARN_ON(), I believe some more
> investigation is needed here on why it is happening:
> 
> We call first into packet_release(), which removes the socket hook from
> the kernel (unregister_prot_hook()), later calls synchronize_net() to
> make sure no more skbs will come in. The receive queue is purged right
> after the synchronize_net() already.
> 
> packet_sock_destruct() will be called afterwards, when there are no more
> refs on the socket anymore and no af_packet skbs in tx waiting for completion.
> Only then, in sk_destruct(), we'll call into packet_sock_destruct().
> 
> So, eventually double purging the sk_receive_queue seems not the right
> thing to do at first look, and w/o any deeper analysis in the commit description.
> 
> Could you look a bit further into the issue? Do you have a reproducer to
> trigger it?

So while synchronize_net() makes sure no packets can be delivered from
normal packet processing (through packet hook, if driver is not horribly
buggy (like delivering packets while IFF_UPP is not there...)), we still
might have some TX packet in a cpu completion_queue (fed in
dev_kfree_skb_irq())

This can happen if the cpu having these TX skbs had to schedule
ksoftirqd under stress. RCU quiescent period is ended before ksoftirqd
could perform its work.

If sock_queue_err_skb() is called from __skb_complete_tx_timestamp() at
the wrong time, we might get into this state.

sock_queue_err_skb() might need to be more careful ?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
@ 2016-02-10 12:43 Vaneet Narang
  2016-02-10 14:56 ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Vaneet Narang @ 2016-02-10 12:43 UTC (permalink / raw)
  To: Daniel Borkmann, Maninder Singh, Vaneet Narang
  Cc: davem@davemloft.net, willemb@google.com, edumazet@google.com,
	eyal.birger@gmail.com, tklauser@distanz.ch,
	fruggeri@aristanetworks.com, dwmw2@infradead.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	PANKAJ MISHRA, Geon-ho Kim, Hak-Bong Lee, ajeet.y@samsung.com,
	AKHILESH KUMAR, AMIT NAGAL

Hi,

>What driver are you using (is that in-tree)? Can you reproduce the same issue
>with a latest -net kernel, for example (or, a 'reasonably' recent one like 4.3 or
>4.4)? There has been quite a bit of changes in err queue handling (which also
>accounts rmem) as well. How reliably can you trigger the issue? Does it trigger
>with a completely different in-tree network driver as well with your tests? Would
>be useful to track/debug sk_rmem_alloc increases/decreases to see from which path
>new rmem is being charged in the time between packet_release() and packet_sock_destruct()
>for that socket ...
>
It seems race condition to us between packet_rcv and packet_close, we have tried to reproduce
this issue by adding delay in skb_set_owner_r and issue gets reproduced quite frequently. 
we have added some traces and on analyzing we have realised following possible race condition.

packet_rcv                                        

                                                packet_close                       
skb_set_owner_r(skb, sk);

                                                  skb_queue_purge(&sk->sk_receive_queue);

spin_lock(&sk->sk_receive_queue.lock);
__skb_queue_tail(&sk->sk_receive_queue, skb);
spin_unlock(&sk->sk_receive_queue.lock);

Since packet was not added to receive queue so receive queue purge will no have any impact.
It will not free sk_buff stored in receive queue. So to fix this issue, we have make sure
skb_set_owner_r(skb, sk) & __skb_queue_tail(&sk->sk_receive_queue, skb) is called under 
receive queue lock and we have moved receive queue purge from packet_release to 
packet_sock_destruct.

we have added some traces in skb_set_owner_r & packet_sock_destruct. So this is
what we got
CPU 0
 sk = db6d17c0 sk->sk_flags = 0x320 Size = 1984
Backtrace:
 (dump_backtrace+0x0/0x128) from (show_stack+0x20/0x28) 
 (show_stack+0x0/0x28) from (dump_stack+0x24/0x28)
 (dump_stack+0x0/0x28) from (packet_rcv+0x480/0x488)
 (packet_rcv+0x0/0x488) from (__netif_receive_skb_core+0x53c/0x674)
 (__netif_receive_skb_core+0x0/0x674) from (__netif_receive_skb+0x20/0x74)
 (__netif_receive_skb+0x0/0x74) from (netif_receive_skb+0x2c/0xbc)
 (netif_receive_skb+0x0/0xbc) from (napi_gro_receive+0x90/0xc0)
......
 (net_rx_action+0x0/0x300) from(__do_softirq+0x160/0x340)
 (__do_softirq+0x0/0x340) from (do_softirq+0xc4/0xe0)
 (do_softirq+0x0/0xe0) from (irq_exit+0xc4/0xf8)
 (irq_exit+0x0/0xf8) from (handle_IRQ+0x88/0x10c)
 (handle_IRQ+0x0/0x10c) from  (gic_handle_irq+0x64/0xac)

 CPU 1
Backtrace:
sk = db6d17c0 sk->sk_rmem_alloc=1984  sk->sk_flags = 0x141
Receive Queue Empty = "Yes"  Error queue empty = "Yes"

 (packet_sock_destruct+0x0/0x1f4) from (__sk_free+0x28/0x18c)
 (__sk_free+0x0/0x18c) from (sk_free+0x40/0x48)
 (sk_free+0x0/0x48) from (packet_release+0x29c/0x310)
 (packet_release+0x0/0x310) from  (sock_release+0x30/0xb8)
 (sock_release+0x0/0xb8) from (sock_close+0x1c/0x28)
 (sock_close+0x0/0x28) from (__fput+0x9c/0x2b4)
 (__fput+0x0/0x2b4) from (____fput+0x18/0x20)
 (____fput+0x0/0x20) from (task_work_run+0xc0/0xfc)
 (task_work_run+0x0/0xfc) from (do_work_pending+0x108/0x114)
 (do_work_pending+0x0/0x114) from (work_pending+0xc/0x20)

>From this it appears packet_rcv was called when packet_release
was not done  as sk_flags = 0x320  (SOCK_DEAD is not set) & packet_sock_destruct was called 
when sk_rmem_alloc was increased but packet was not added to receive queue.
sk_rmem_alloc pending is same as size of last packet received on socket.

Kindly comment on the fix shared at following link.
http://www.spinics.net/lists/kernel/msg2184815.html

Thanks & Regards,
Vaneet Narang

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] af_packet: Raw socket destruction warning fix
  2016-02-10 12:43 Vaneet Narang
@ 2016-02-10 14:56 ` Eric Dumazet
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2016-02-10 14:56 UTC (permalink / raw)
  To: v.narang
  Cc: Daniel Borkmann, Maninder Singh, davem@davemloft.net,
	willemb@google.com, edumazet@google.com, eyal.birger@gmail.com,
	tklauser@distanz.ch, fruggeri@aristanetworks.com,
	dwmw2@infradead.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, PANKAJ MISHRA, Geon-ho Kim,
	Hak-Bong Lee, ajeet.y@samsung.com, AKHILESH KUMAR, AMIT NAGAL

On Wed, 2016-02-10 at 12:43 +0000, Vaneet Narang wrote:
> Hi,
> 
> >What driver are you using (is that in-tree)? Can you reproduce the same issue
> >with a latest -net kernel, for example (or, a 'reasonably' recent one like 4.3 or
> >4.4)? There has been quite a bit of changes in err queue handling (which also
> >accounts rmem) as well. How reliably can you trigger the issue? Does it trigger
> >with a completely different in-tree network driver as well with your tests? Would
> >be useful to track/debug sk_rmem_alloc increases/decreases to see from which path
> >new rmem is being charged in the time between packet_release() and packet_sock_destruct()
> >for that socket ...
> >
> It seems race condition to us between packet_rcv and packet_close, we have tried to reproduce
> this issue by adding delay in skb_set_owner_r and issue gets reproduced quite frequently. 
> we have added some traces and on analyzing we have realised following possible race condition.



Even if you add a delay in skb_set_owner_r(), this should not allow the
dismantle phase to complete, since at least one cpu is still in a
rcu_read_lock() section.

synchronize_rcu() must complete only when all cpus pass an rcu quiescent
point.

packet_close() should certainly not be called while another cpu is still
in the middle of packet_rcv()

Your patch does not address the root cause.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-02-10 14:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-05  5:23 [PATCH] af_packet: Raw socket destruction warning fix Vaneet Narang
  -- strict thread matches above, loose matches on Subject: below --
2016-02-10 12:43 Vaneet Narang
2016-02-10 14:56 ` Eric Dumazet
2016-01-21 11:40 Maninder Singh
2016-01-26  0:13 ` Daniel Borkmann
2016-01-18 10:11 Vaneet Narang
2016-01-18 11:08 ` Daniel Borkmann
2016-01-18  6:37 Maninder Singh
2016-01-18  9:44 ` Daniel Borkmann
2016-01-18 10:29   ` Daniel Borkmann
2016-02-05 11:26   ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).