From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: <pabeni@redhat.com>
Cc: <billy@starlabs.sg>, <davem@davemloft.net>, <edumazet@google.com>,
<kuba@kernel.org>, <kuni1840@gmail.com>, <kuniyu@amazon.com>,
<netdev@vger.kernel.org>
Subject: Re: [PATCH v1 net] af_unix: Update unix_sk(sk)->oob_skb under sk_receive_queue lock.
Date: Fri, 10 May 2024 18:11:38 +0900 [thread overview]
Message-ID: <20240510091138.23367-1-kuniyu@amazon.com> (raw)
In-Reply-To: <dc8e67fac99c7a1d2cb36bff2217515116bf58cf.camel@redhat.com>
From: Paolo Abeni <pabeni@redhat.com>
Date: Fri, 10 May 2024 09:53:25 +0200
> On Fri, 2024-05-10 at 14:03 +0900, Kuniyuki Iwashima wrote:
> > From: Paolo Abeni <pabeni@redhat.com>
> > Date: Thu, 09 May 2024 11:12:38 +0200
> > > On Tue, 2024-05-07 at 10:00 -0700, Kuniyuki Iwashima wrote:
> > > > Billy Jheng Bing-Jhong reported a race between __unix_gc() and
> > > > queue_oob().
> > > >
> > > > __unix_gc() tries to garbage-collect close()d inflight sockets,
> > > > and then if the socket has MSG_OOB in unix_sk(sk)->oob_skb, GC
> > > > will drop the reference and set it to NULL locklessly.
> > > >
> > > > However, the peer socket can still send a MSG_OOB message to the
> > > > GC candidate, and queue_oob() can update unix_sk(sk)->oob_skb
> > > > concurrently, resulting in a NULL pointer dereference. [0]
> > > >
> > > > To avoid the race, let's update unix_sk(sk)->oob_skb under the
> > > > sk_receive_queue's lock.
> > >
> > > I'm sorry to delay this fix but...
> > >
> > > AFAICS every time AF_UNIX touches the oob_skb, it's under the
> > > receiver's unix_state_lock. The only exception is __unix_gc(). What
> > > about just acquiring that lock there?
> >
> > In the new GC, there is a unix_state_lock -> gc_lock ordering, so
> > we would need another fix there.
> >
> > That's why I chose locking recvq for old GC too.
> > https://lore.kernel.org/netdev/20240507172606.85532-1-kuniyu@amazon.com/
> >
> > Also, Linus says:
> >
> > I really get the feeling that 'sb->oob_skb' should actually be forced
> > to always be in sync with the receive queue by always doing the
> > accesses under the receive_queue lock.
> >
> > ( That's in the security@ thread I added you, but I just noticed
> > Linus replied to the previous mail. I'll forward the mails to you. )
> >
> >
> > > Otherwise there are other chunks touching the oob_skb where this
> > > patch does not add the receive queue spin lock protection, e.g. in
> > > unix_stream_recv_urg(), making the code a bit inconsistent.
> >
> > Yes, now the receive path is protected by unix_state_lock() and the
> > send path by unix_state_lock() and the recvq lock.
> >
> > Ideally, as Linus suggested, we should acquire the recvq lock
> > everywhere oob_skb is touched and remove the additional refcount
> > taken by skb_get(), but I thought that was too much for a fix and
> > planned to do that refactoring in the next cycle.
> >
> > What do you think ?
>
> I missed/forgot the unix_state_lock -> gc_lock ordering on net-next.
>
> What about using the receive queue lock, and acquiring that everywhere
> oob_skb is touched, without the additional refcount refactor?
>
> It would be more consistent and reasonably small, and it should work
> on the new GC, too.
>
> The refcount refactor could later come on net-next, and will be less
> complex with the lock already in place.
Yeah, sounds good.
I will post v2 with additional recvq locks.
Thanks!
>
> Incremental patch on top of yours, completely untested:
> ---
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 9a6ad5974dff..a489f2aef29d 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -2614,8 +2614,10 @@ static int unix_stream_recv_urg(struct unix_stream_read_state *state)
>
> mutex_lock(&u->iolock);
> unix_state_lock(sk);
> + spin_lock(&sk->sk_receive_queue.lock);
>
> if (sock_flag(sk, SOCK_URGINLINE) || !u->oob_skb) {
> + spin_unlock(&sk->sk_receive_queue.lock);
> unix_state_unlock(sk);
> mutex_unlock(&u->iolock);
> return -EINVAL;
> @@ -2627,6 +2629,7 @@ static int unix_stream_recv_urg(struct unix_stream_read_state *state)
> WRITE_ONCE(u->oob_skb, NULL);
> else
> skb_get(oob_skb);
> + spin_unlock(&sk->sk_receive_queue.lock);
> unix_state_unlock(sk);
>
> chunk = state->recv_actor(oob_skb, 0, chunk, state);
> @@ -2655,6 +2658,7 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
> consume_skb(skb);
> skb = NULL;
> } else {
> + spin_lock(&sk->sk_receive_queue.lock);
> if (skb == u->oob_skb) {
> if (copied) {
> skb = NULL;
> @@ -2673,6 +2677,7 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
> skb = skb_peek(&sk->sk_receive_queue);
> }
> }
> + spin_unlock(&sk->sk_receive_queue.lock);
> }
> return skb;
> }
>
Thread overview: 5+ messages
2024-05-07 17:00 [PATCH v1 net] af_unix: Update unix_sk(sk)->oob_skb under sk_receive_queue lock Kuniyuki Iwashima
2024-05-09 9:12 ` Paolo Abeni
2024-05-10 5:03 ` Kuniyuki Iwashima
2024-05-10 7:53 ` Paolo Abeni
2024-05-10 9:11 ` Kuniyuki Iwashima [this message]