LKML Archive mirror
 help / color / mirror / Atom feed
* [PATCH net] ipv6: avoid atomic fragment on GSO packets
@ 2023-09-29 17:50 Yan Zhai
  2023-09-30 11:08 ` Florian Westphal
  0 siblings, 1 reply; 6+ messages in thread
From: Yan Zhai @ 2023-09-29 17:50 UTC (permalink / raw
  To: netdev
  Cc: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Aya Levin, Tariq Toukan, linux-kernel, kernel-team

GSO packets can contain a trailing segment that is smaller than
gso_size. When examining the dst MTU for such packet, if its gso_size
is too large, then all segments would be fragmented. However, there is a
good chance the trailing segment has smaller actual size than both
gso_size as well as the MTU, which leads to an "atomic fragment".
RFC-8021 explicitly recommend to deprecate such use case. An Existing
report from APNIC also shows that atomic fragments can be dropped
unexpectedly along the path [1].

Add an extra check in ip6_fragment to catch all possible generation of
atomic fragments. Skip atomic header if it is called on a packet no
larger than MTU.

Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
Reported-by: David Wragg <dwragg@cloudflare.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
---
 net/ipv6/ip6_output.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 951ba8089b5b..42f5f68a6e24 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -854,6 +854,13 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 	__be32 frag_id;
 	u8 *prevhdr, nexthdr = 0;
 
+	/* RFC-8021 recommended atomic fragments to be deprecated. Double check
+	 * the actual packet size before fragment it.
+	 */
+	mtu = ip6_skb_dst_mtu(skb);
+	if (unlikely(skb->len <= mtu))
+		return output(net, sk, skb);
+
 	err = ip6_find_1stfragopt(skb, &prevhdr);
 	if (err < 0)
 		goto fail;
@@ -861,7 +868,6 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 	nexthdr = *prevhdr;
 	nexthdr_offset = prevhdr - skb_network_header(skb);
 
-	mtu = ip6_skb_dst_mtu(skb);
 
 	/* We must not fragment if the socket is set to force MTU discovery
 	 * or if the skb it not generated by a local socket.
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net] ipv6: avoid atomic fragment on GSO packets
  2023-09-29 17:50 [PATCH net] ipv6: avoid atomic fragment on GSO packets Yan Zhai
@ 2023-09-30 11:08 ` Florian Westphal
  2023-10-02  6:52   ` Willem de Bruijn
  2023-10-02 15:47   ` Yan Zhai
  0 siblings, 2 replies; 6+ messages in thread
From: Florian Westphal @ 2023-09-30 11:08 UTC (permalink / raw
  To: Yan Zhai
  Cc: netdev, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Aya Levin, Tariq Toukan,
	linux-kernel, kernel-team

Yan Zhai <yan@cloudflare.com> wrote:
> GSO packets can contain a trailing segment that is smaller than
> gso_size. When examining the dst MTU for such packet, if its gso_size
> is too large, then all segments would be fragmented. However, there is a
> good chance the trailing segment has smaller actual size than both
> gso_size as well as the MTU, which leads to an "atomic fragment".
> RFC-8021 explicitly recommend to deprecate such use case. An Existing
> report from APNIC also shows that atomic fragments can be dropped
> unexpectedly along the path [1].
> 
> Add an extra check in ip6_fragment to catch all possible generation of
> atomic fragments. Skip atomic header if it is called on a packet no
> larger than MTU.
> 
> Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
> Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
> Reported-by: David Wragg <dwragg@cloudflare.com>
> Signed-off-by: Yan Zhai <yan@cloudflare.com>
> ---
>  net/ipv6/ip6_output.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 951ba8089b5b..42f5f68a6e24 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -854,6 +854,13 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
>  	__be32 frag_id;
>  	u8 *prevhdr, nexthdr = 0;
>  
> +	/* RFC-8021 recommended atomic fragments to be deprecated. Double check
> +	 * the actual packet size before fragment it.
> +	 */
> +	mtu = ip6_skb_dst_mtu(skb);
> +	if (unlikely(skb->len <= mtu))
> +		return output(net, sk, skb);
> +

This helper is also called for skbs where IP6CB(skb)->frag_max_size
exceeds the MTU, so this check looks wrong to me.

Same remark for dst_allfrag() check in __ip6_finish_output(),
after this patch, it would be ignored.

I think you should consider to first refactor __ip6_finish_output to make
the existing checks more readable (e.g. handle gso vs. non-gso in separate
branches) and then add the check to last seg in
ip6_finish_output_gso_slowpath_drop().

Alternatively you might be able to pass more info down to
ip6_fragment and move decisions there.

In any case we should make same frag-or-no-frag decisions,
regardless of this being the orig skb or a segmented one,

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] ipv6: avoid atomic fragment on GSO packets
  2023-09-30 11:08 ` Florian Westphal
@ 2023-10-02  6:52   ` Willem de Bruijn
  2023-10-02 15:51     ` Yan Zhai
  2023-10-02 15:47   ` Yan Zhai
  1 sibling, 1 reply; 6+ messages in thread
From: Willem de Bruijn @ 2023-10-02  6:52 UTC (permalink / raw
  To: Florian Westphal
  Cc: Yan Zhai, netdev, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Aya Levin, Tariq Toukan,
	linux-kernel, kernel-team

On Sat, Sep 30, 2023 at 1:09 PM Florian Westphal <fw@strlen.de> wrote:
>
> Yan Zhai <yan@cloudflare.com> wrote:
> > GSO packets can contain a trailing segment that is smaller than
> > gso_size. When examining the dst MTU for such packet, if its gso_size
> > is too large, then all segments would be fragmented. However, there is a
> > good chance the trailing segment has smaller actual size than both
> > gso_size as well as the MTU, which leads to an "atomic fragment".
> > RFC-8021 explicitly recommend to deprecate such use case. An Existing
> > report from APNIC also shows that atomic fragments can be dropped
> > unexpectedly along the path [1].
> >
> > Add an extra check in ip6_fragment to catch all possible generation of
> > atomic fragments. Skip atomic header if it is called on a packet no
> > larger than MTU.
> >
> > Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
> > Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
> > Reported-by: David Wragg <dwragg@cloudflare.com>
> > Signed-off-by: Yan Zhai <yan@cloudflare.com>
> > ---
> >  net/ipv6/ip6_output.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index 951ba8089b5b..42f5f68a6e24 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -854,6 +854,13 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
> >       __be32 frag_id;
> >       u8 *prevhdr, nexthdr = 0;
> >
> > +     /* RFC-8021 recommended atomic fragments to be deprecated. Double check
> > +      * the actual packet size before fragment it.
> > +      */
> > +     mtu = ip6_skb_dst_mtu(skb);
> > +     if (unlikely(skb->len <= mtu))
> > +             return output(net, sk, skb);
> > +
>
> This helper is also called for skbs where IP6CB(skb)->frag_max_size
> exceeds the MTU, so this check looks wrong to me.
>
> Same remark for dst_allfrag() check in __ip6_finish_output(),
> after this patch, it would be ignored.
>
> I think you should consider to first refactor __ip6_finish_output to make
> the existing checks more readable (e.g. handle gso vs. non-gso in separate
> branches) and then add the check to last seg in
> ip6_finish_output_gso_slowpath_drop().
>
> Alternatively you might be able to pass more info down to
> ip6_fragment and move decisions there.
>
> In any case we should make same frag-or-no-frag decisions,
> regardless of this being the orig skb or a segmented one,

To add to that: if this is a suggestion to update the algorithm to
match RFC 8021, not a fix for a bug in the current implementation,
then I think this should target net-next.

That will also make it easier to include the kind of refactoring that
Florian suggests.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] ipv6: avoid atomic fragment on GSO packets
  2023-09-30 11:08 ` Florian Westphal
  2023-10-02  6:52   ` Willem de Bruijn
@ 2023-10-02 15:47   ` Yan Zhai
  2023-10-02 17:11     ` Florian Westphal
  1 sibling, 1 reply; 6+ messages in thread
From: Yan Zhai @ 2023-10-02 15:47 UTC (permalink / raw
  To: Florian Westphal
  Cc: netdev, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Aya Levin, Tariq Toukan,
	linux-kernel, kernel-team

On Sat, Sep 30, 2023 at 6:09 AM Florian Westphal <fw@strlen.de> wrote:
>
> Yan Zhai <yan@cloudflare.com> wrote:
> > GSO packets can contain a trailing segment that is smaller than
> > gso_size. When examining the dst MTU for such packet, if its gso_size
> > is too large, then all segments would be fragmented. However, there is a
> > good chance the trailing segment has smaller actual size than both
> > gso_size as well as the MTU, which leads to an "atomic fragment".
> > RFC-8021 explicitly recommend to deprecate such use case. An Existing
> > report from APNIC also shows that atomic fragments can be dropped
> > unexpectedly along the path [1].
> >
> > Add an extra check in ip6_fragment to catch all possible generation of
> > atomic fragments. Skip atomic header if it is called on a packet no
> > larger than MTU.
> >
> > Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
> > Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
> > Reported-by: David Wragg <dwragg@cloudflare.com>
> > Signed-off-by: Yan Zhai <yan@cloudflare.com>
> > ---
> >  net/ipv6/ip6_output.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index 951ba8089b5b..42f5f68a6e24 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -854,6 +854,13 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
> >       __be32 frag_id;
> >       u8 *prevhdr, nexthdr = 0;
> >
> > +     /* RFC-8021 recommended atomic fragments to be deprecated. Double check
> > +      * the actual packet size before fragment it.
> > +      */
> > +     mtu = ip6_skb_dst_mtu(skb);
> > +     if (unlikely(skb->len <= mtu))
> > +             return output(net, sk, skb);
> > +
>
> This helper is also called for skbs where IP6CB(skb)->frag_max_size
> exceeds the MTU, so this check looks wrong to me.
>
> Same remark for dst_allfrag() check in __ip6_finish_output(),
> after this patch, it would be ignored.
>
Thanks for covering my carelessness. I was just considering the GSO
case so frag_max_size was overlooked. dst_allfrag is indeed a case
based on the code logic. But just out of curiosity, do we still see
any use of this feature? From commit messages it is set when PMTU
values signals smaller than min IPv6 MTU. But such PMTU values are
just dropped in __ip6_rt_update_pmtu now. Iproute2 code also does not
provide this route feature anymore. So it might be actually a dead
check?

> I think you should consider to first refactor __ip6_finish_output to make
> the existing checks more readable (e.g. handle gso vs. non-gso in separate
> branches) and then add the check to last seg in
> ip6_finish_output_gso_slowpath_drop().
>
Agree with refactoring to mirror what IPv4 code is doing. It might not
hurt if we check every segments in this case, since it is already the
slowpath and it will make code more compact.

> Alternatively you might be able to pass more info down to
> ip6_fragment and move decisions there.
>
> In any case we should make same frag-or-no-frag decisions,
> regardless of this being the orig skb or a segmented one,

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] ipv6: avoid atomic fragment on GSO packets
  2023-10-02  6:52   ` Willem de Bruijn
@ 2023-10-02 15:51     ` Yan Zhai
  0 siblings, 0 replies; 6+ messages in thread
From: Yan Zhai @ 2023-10-02 15:51 UTC (permalink / raw
  To: Willem de Bruijn
  Cc: Florian Westphal, netdev, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Aya Levin,
	Tariq Toukan, linux-kernel, kernel-team

On Mon, Oct 2, 2023 at 1:53 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> To add to that: if this is a suggestion to update the algorithm to
> match RFC 8021, not a fix for a bug in the current implementation,
> then I think this should target net-next.

Ack

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] ipv6: avoid atomic fragment on GSO packets
  2023-10-02 15:47   ` Yan Zhai
@ 2023-10-02 17:11     ` Florian Westphal
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Westphal @ 2023-10-02 17:11 UTC (permalink / raw
  To: Yan Zhai
  Cc: Florian Westphal, netdev, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Aya Levin,
	Tariq Toukan, linux-kernel, kernel-team

Yan Zhai <yan@cloudflare.com> wrote:
> On Sat, Sep 30, 2023 at 6:09 AM Florian Westphal <fw@strlen.de> wrote:
> >
> > This helper is also called for skbs where IP6CB(skb)->frag_max_size
> > exceeds the MTU, so this check looks wrong to me.
> >
> > Same remark for dst_allfrag() check in __ip6_finish_output(),
> > after this patch, it would be ignored.
> >
> Thanks for covering my carelessness. I was just considering the GSO
> case so frag_max_size was overlooked. dst_allfrag is indeed a case
> based on the code logic. But just out of curiosity, do we still see
> any use of this feature? From commit messages it is set when PMTU
> values signals smaller than min IPv6 MTU. But such PMTU values are
> just dropped in __ip6_rt_update_pmtu now. Iproute2 code also does not
> provide this route feature anymore. So it might be actually a dead
> check?

I don't think iproute2 ever exposed it, I think we can axe
dst_allfrag().

> > I think you should consider to first refactor __ip6_finish_output to make
> > the existing checks more readable (e.g. handle gso vs. non-gso in separate
> > branches) and then add the check to last seg in
> > ip6_finish_output_gso_slowpath_drop().
> >
> Agree with refactoring to mirror what IPv4 code is doing. It might not
> hurt if we check every segments in this case, since it is already the
> slowpath and it will make code more compact.

No objections from my side.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-10-02 17:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-29 17:50 [PATCH net] ipv6: avoid atomic fragment on GSO packets Yan Zhai
2023-09-30 11:08 ` Florian Westphal
2023-10-02  6:52   ` Willem de Bruijn
2023-10-02 15:51     ` Yan Zhai
2023-10-02 15:47   ` Yan Zhai
2023-10-02 17:11     ` Florian Westphal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).