LKML Archive mirror
 help / color / mirror / Atom feed
* [PATCH net v3 0/2] net: gro: add flush/flush_id checks and fix wrong offset in udp
@ 2024-04-24 16:30 Richard Gobert
  2024-04-24 16:30 ` [PATCH net v3 1/2] net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb Richard Gobert
  2024-04-24 16:30 ` [PATCH net v3 2/2] net: gro: add flush check in udp_gro_receive_segment Richard Gobert
  0 siblings, 2 replies; 7+ messages in thread
From: Richard Gobert @ 2024-04-24 16:30 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, dsahern, willemdebruijn.kernel,
	alobakin, netdev, linux-kernel
  Cc: Richard Gobert

This series fixes a bug in the complete phase of UDP in GRO, in which
socket lookup fails due to using network_header when parsing encapsulated
packets. The fix is to add network_offset and inner_network_offset to
napi_gro_cb and use these offsets for socket lookup.

In addition, p->flush/flush_id should be checked in all UDP flows. The 
same logic from tcp_gro_receive is applied for all flows in
udp_gro_receive_segment. This prevents packets with mismatching network
headers (flush/flush_id turned on) from merging in UDP GRO.

The original series includes a change to vxlan test which adds the local
parameter to prevent similar future bugs. I plan to submit it separately to
net-next.

This series is part of a previously submitted series to net-next:
https://lore.kernel.org/all/20240408141720.98832-1-richardbgobert@gmail.com/

v2 -> v3:
 - Add network_offsets and fix udp bug in a single commit to make backporting easier
 - Write to inner_network_offset in {inet,ipv6}_gro_receive
 - Use network_offsets union in tcp[46]_gro_complete as well
 - v2:
 https://lore.kernel.org/netdev/20240419153542.121087-1-richardbgobert@gmail.com/

v1 -> v2:
 - Use network_offsets instead of p_poff param as suggested by Willem
 - Check flush before postpull, and for all UDP GRO flows
 - v1:
 https://lore.kernel.org/netdev/20240412152120.115067-1-richardbgobert@gmail.com/

Richard Gobert (2):
  net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
  net: gro: add flush check in udp_gro_receive_segment

 include/net/gro.h        | 18 ++++++++++++++++--
 net/8021q/vlan_core.c    |  2 ++
 net/core/gro.c           |  1 +
 net/ipv4/af_inet.c       |  5 +----
 net/ipv4/tcp_offload.c   |  3 ++-
 net/ipv4/udp.c           |  3 ++-
 net/ipv4/udp_offload.c   | 15 +++++++++++++--
 net/ipv6/ip6_offload.c   |  6 +++---
 net/ipv6/tcpv6_offload.c |  3 ++-
 net/ipv6/udp.c           |  3 ++-
 net/ipv6/udp_offload.c   |  3 ++-
 11 files changed, 46 insertions(+), 16 deletions(-)

-- 
2.36.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net v3 1/2] net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
  2024-04-24 16:30 [PATCH net v3 0/2] net: gro: add flush/flush_id checks and fix wrong offset in udp Richard Gobert
@ 2024-04-24 16:30 ` Richard Gobert
  2024-04-25  2:15   ` Willem de Bruijn
  2024-04-24 16:30 ` [PATCH net v3 2/2] net: gro: add flush check in udp_gro_receive_segment Richard Gobert
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Gobert @ 2024-04-24 16:30 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, dsahern, willemdebruijn.kernel,
	alobakin, netdev, linux-kernel
  Cc: Richard Gobert

Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp:
additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the
complete phase of gro. The functions always return skb->network_header,
which in the case of encapsulated packets at the gro complete phase, is
always set to the innermost L3 of the packet. That means that calling
{ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in
gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_
L3/L4 may return an unexpected value.

This incorrect usage leads to a bug in GRO's UDP socket lookup.
udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These
*_hdr functions return network_header which will point to the innermost L3,
resulting in the wrong offset being used in __udp{4,6}_lib_lookup with
encapsulated packets.

This patch adds network_offset and inner_network_offset to napi_gro_cb, and
makes sure both are set correctly.

To fix the issue, network_offsets union is used inside napi_gro_cb, in
which both the outer and the inner network offsets are saved.

Reproduction example:

Endpoint configuration example (fou + local address bind)

    # ip fou add port 6666 ipproto 4
    # ip link add name tun1 type ipip remote 2.2.2.1 local 2.2.2.2 encap fou encap-dport 5555 encap-sport 6666 mode ipip
    # ip link set tun1 up
    # ip a add 1.1.1.2/24 dev tun1

Netperf TCP_STREAM result on net-next before patch is applied:

net-next main, GRO enabled:
    $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    131072  16384  16384    5.28        2.37

net-next main, GRO disabled:
    $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    131072  16384  16384    5.01     2745.06

patch applied, GRO enabled:
    $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    131072  16384  16384    5.01     2877.38

Fixes: 57c67ff4bd92 ("udp: additional GRO support")
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
---
 include/net/gro.h        | 18 ++++++++++++++++--
 net/8021q/vlan_core.c    |  2 ++
 net/core/gro.c           |  1 +
 net/ipv4/af_inet.c       |  5 +----
 net/ipv4/tcp_offload.c   |  3 ++-
 net/ipv4/udp.c           |  3 ++-
 net/ipv4/udp_offload.c   |  3 ++-
 net/ipv6/ip6_offload.c   |  6 +++---
 net/ipv6/tcpv6_offload.c |  3 ++-
 net/ipv6/udp.c           |  3 ++-
 net/ipv6/udp_offload.c   |  3 ++-
 11 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index 50f1e403dbbb..1faff23b66e8 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -87,6 +87,15 @@ struct napi_gro_cb {
 
 	/* used to support CHECKSUM_COMPLETE for tunneling protocols */
 	__wsum	csum;
+
+	/* L3 offsets */
+	union {
+		struct {
+			u16 network_offset;
+			u16 inner_network_offset;
+		};
+		u16 network_offsets[2];
+	};
 };
 
 #define NAPI_GRO_CB(skb) ((struct napi_gro_cb *)(skb)->cb)
@@ -172,12 +181,17 @@ static inline void *skb_gro_header(struct sk_buff *skb, unsigned int hlen,
 	return ptr;
 }
 
+static inline int skb_gro_network_offset(const struct sk_buff *skb)
+{
+	return NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark];
+}
+
 static inline void *skb_gro_network_header(const struct sk_buff *skb)
 {
 	if (skb_gro_may_pull(skb, skb_gro_offset(skb)))
-		return skb_gro_header_fast(skb, skb_network_offset(skb));
+		return skb_gro_header_fast(skb, skb_gro_network_offset(skb));
 
-	return skb_network_header(skb);
+	return skb->data + skb_gro_network_offset(skb);
 }
 
 static inline __wsum inet_gro_compute_pseudo(const struct sk_buff *skb,
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index f00158234505..9404dd551dfd 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -478,6 +478,8 @@ static struct sk_buff *vlan_gro_receive(struct list_head *head,
 	if (unlikely(!vhdr))
 		goto out;
 
+	NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark] = hlen;
+
 	type = vhdr->h_vlan_encapsulated_proto;
 
 	ptype = gro_find_receive_by_type(type);
diff --git a/net/core/gro.c b/net/core/gro.c
index 83f35d99a682..c7901253a1a8 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -371,6 +371,7 @@ static inline void skb_gro_reset_offset(struct sk_buff *skb, u32 nhoff)
 	const skb_frag_t *frag0;
 	unsigned int headlen;
 
+	NAPI_GRO_CB(skb)->network_offset = 0;
 	NAPI_GRO_CB(skb)->data_offset = 0;
 	headlen = skb_headlen(skb);
 	NAPI_GRO_CB(skb)->frag0 = skb->data;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 55bd72997b31..2daf635ab99e 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1568,10 +1568,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
 
 	NAPI_GRO_CB(skb)->is_atomic = !!(iph->frag_off & htons(IP_DF));
 	NAPI_GRO_CB(skb)->flush |= flush;
-	skb_set_network_header(skb, off);
-	/* The above will be needed by the transport layer if there is one
-	 * immediately following this IP hdr.
-	 */
+	NAPI_GRO_CB(skb)->inner_network_offset = off;
 
 	/* Note : No need to call skb_gro_postpull_rcsum() here,
 	 * as we already checked checksum over ipv4 header was 0
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index ebe4722bb020..be8ddf6da88c 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -332,7 +332,8 @@ struct sk_buff *tcp4_gro_receive(struct list_head *head, struct sk_buff *skb)
 
 INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
 {
-	const struct iphdr *iph = ip_hdr(skb);
+	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
+	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
 	struct tcphdr *th = tcp_hdr(skb);
 
 	th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c02bf011d4a6..1399fce82b3f 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -532,7 +532,8 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
 struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
 				 __be16 sport, __be16 dport)
 {
-	const struct iphdr *iph = ip_hdr(skb);
+	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
+	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
 	struct net *net = dev_net(skb->dev);
 	int iif, sdif;
 
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 3498dd1d0694..fd29d21d579c 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -718,7 +718,8 @@ EXPORT_SYMBOL(udp_gro_complete);
 
 INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
 {
-	const struct iphdr *iph = ip_hdr(skb);
+	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
+	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
 	struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
 
 	/* do fraglist only if there is no outer UDP encap (or we already processed it) */
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index b41e35af69ea..5d6b875a4638 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -67,7 +67,7 @@ static int ipv6_gro_pull_exthdrs(struct sk_buff *skb, int off, int proto)
 		off += len;
 	}
 
-	skb_gro_pull(skb, off - skb_network_offset(skb));
+	skb_gro_pull(skb, off - skb_gro_network_offset(skb));
 	return proto;
 }
 
@@ -236,7 +236,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
 	if (unlikely(!iph))
 		goto out;
 
-	skb_set_network_header(skb, off);
+	NAPI_GRO_CB(skb)->inner_network_offset = off;
 
 	flush += ntohs(iph->payload_len) != skb->len - hlen;
 
@@ -259,7 +259,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
 	NAPI_GRO_CB(skb)->proto = proto;
 
 	flush--;
-	nlen = skb_network_header_len(skb);
+	nlen = skb_gro_offset(skb) - off;
 
 	list_for_each_entry(p, head, list) {
 		const struct ipv6hdr *iph2;
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index 4b07d1e6c952..e70d60e0f86f 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -29,7 +29,8 @@ struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb)
 
 INDIRECT_CALLABLE_SCOPE int tcp6_gro_complete(struct sk_buff *skb, int thoff)
 {
-	const struct ipv6hdr *iph = ipv6_hdr(skb);
+	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
+	const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
 	struct tcphdr *th = tcp_hdr(skb);
 
 	th->check = ~tcp_v6_check(skb->len - thoff, &iph->saddr,
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 8b1dd7f51249..f7880e306410 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -272,7 +272,8 @@ static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
 struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 				 __be16 sport, __be16 dport)
 {
-	const struct ipv6hdr *iph = ipv6_hdr(skb);
+	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
+	const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
 	struct net *net = dev_net(skb->dev);
 	int iif, sdif;
 
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index bbd347de00b4..b41152dd4246 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -164,7 +164,8 @@ struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
 
 INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
 {
-	const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
+	const struct ipv6hdr *ipv6h = (struct ipv6hdr *)(skb->data + offset);
 	struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
 
 	/* do fraglist only if there is no outer UDP encap (or we already processed it) */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net v3 2/2] net: gro: add flush check in udp_gro_receive_segment
  2024-04-24 16:30 [PATCH net v3 0/2] net: gro: add flush/flush_id checks and fix wrong offset in udp Richard Gobert
  2024-04-24 16:30 ` [PATCH net v3 1/2] net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb Richard Gobert
@ 2024-04-24 16:30 ` Richard Gobert
  2024-04-25  2:45   ` Willem de Bruijn
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Gobert @ 2024-04-24 16:30 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, dsahern, willemdebruijn.kernel,
	alobakin, netdev, linux-kernel
  Cc: Richard Gobert

GRO-GSO path is supposed to be transparent and as such L3 flush checks are
relevant to all UDP flows merging in GRO. This patch uses the same logic
and code from tcp_gro_receive, terminating merge if flush is non zero.

Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
---
 net/ipv4/udp_offload.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index fd29d21d579c..8721fe5beca2 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -471,6 +471,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 	struct sk_buff *p;
 	unsigned int ulen;
 	int ret = 0;
+	int flush;
 
 	/* requires non zero csum, for symmetry with GSO */
 	if (!uh->check) {
@@ -504,13 +505,22 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 			return p;
 		}
 
+		flush = NAPI_GRO_CB(p)->flush;
+
+		if (NAPI_GRO_CB(p)->flush_id != 1 ||
+		    NAPI_GRO_CB(p)->count != 1 ||
+		    !NAPI_GRO_CB(p)->is_atomic)
+			flush |= NAPI_GRO_CB(p)->flush_id;
+		else
+			NAPI_GRO_CB(p)->is_atomic = false;
+
 		/* Terminate the flow on len mismatch or if it grow "too much".
 		 * Under small packet flood GRO count could elsewhere grow a lot
 		 * leading to excessive truesize values.
 		 * On len mismatch merge the first packet shorter than gso_size,
 		 * otherwise complete the GRO packet.
 		 */
-		if (ulen > ntohs(uh2->len)) {
+		if (ulen > ntohs(uh2->len) || flush) {
 			pp = p;
 		} else {
 			if (NAPI_GRO_CB(skb)->is_flist) {
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net v3 1/2] net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
  2024-04-24 16:30 ` [PATCH net v3 1/2] net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb Richard Gobert
@ 2024-04-25  2:15   ` Willem de Bruijn
  2024-04-25 11:59     ` Richard Gobert
  0 siblings, 1 reply; 7+ messages in thread
From: Willem de Bruijn @ 2024-04-25  2:15 UTC (permalink / raw)
  To: Richard Gobert, davem, edumazet, kuba, pabeni, dsahern,
	willemdebruijn.kernel, alobakin, netdev, linux-kernel
  Cc: Richard Gobert

Richard Gobert wrote:
> Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp:
> additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the
> complete phase of gro. The functions always return skb->network_header,
> which in the case of encapsulated packets at the gro complete phase, is
> always set to the innermost L3 of the packet. That means that calling
> {ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in
> gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_
> L3/L4 may return an unexpected value.
> 
> This incorrect usage leads to a bug in GRO's UDP socket lookup.
> udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These
> *_hdr functions return network_header which will point to the innermost L3,
> resulting in the wrong offset being used in __udp{4,6}_lib_lookup with
> encapsulated packets.
> 
> This patch adds network_offset and inner_network_offset to napi_gro_cb, and
> makes sure both are set correctly.
> 
> To fix the issue, network_offsets union is used inside napi_gro_cb, in
> which both the outer and the inner network offsets are saved.
> 
> Reproduction example:
> 
> Endpoint configuration example (fou + local address bind)
> 
>     # ip fou add port 6666 ipproto 4
>     # ip link add name tun1 type ipip remote 2.2.2.1 local 2.2.2.2 encap fou encap-dport 5555 encap-sport 6666 mode ipip
>     # ip link set tun1 up
>     # ip a add 1.1.1.2/24 dev tun1
> 
> Netperf TCP_STREAM result on net-next before patch is applied:
> 
> net-next main, GRO enabled:
>     $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
>     Recv   Send    Send
>     Socket Socket  Message  Elapsed
>     Size   Size    Size     Time     Throughput
>     bytes  bytes   bytes    secs.    10^6bits/sec
> 
>     131072  16384  16384    5.28        2.37
> 
> net-next main, GRO disabled:
>     $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
>     Recv   Send    Send
>     Socket Socket  Message  Elapsed
>     Size   Size    Size     Time     Throughput
>     bytes  bytes   bytes    secs.    10^6bits/sec
> 
>     131072  16384  16384    5.01     2745.06
> 
> patch applied, GRO enabled:
>     $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
>     Recv   Send    Send
>     Socket Socket  Message  Elapsed
>     Size   Size    Size     Time     Throughput
>     bytes  bytes   bytes    secs.    10^6bits/sec
> 
>     131072  16384  16384    5.01     2877.38
> 
> Fixes: 57c67ff4bd92 ("udp: additional GRO support")
> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
> ---
>  include/net/gro.h        | 18 ++++++++++++++++--
>  net/8021q/vlan_core.c    |  2 ++
>  net/core/gro.c           |  1 +
>  net/ipv4/af_inet.c       |  5 +----
>  net/ipv4/tcp_offload.c   |  3 ++-
>  net/ipv4/udp.c           |  3 ++-
>  net/ipv4/udp_offload.c   |  3 ++-
>  net/ipv6/ip6_offload.c   |  6 +++---
>  net/ipv6/tcpv6_offload.c |  3 ++-
>  net/ipv6/udp.c           |  3 ++-
>  net/ipv6/udp_offload.c   |  3 ++-
>  11 files changed, 35 insertions(+), 15 deletions(-)
> 
> diff --git a/include/net/gro.h b/include/net/gro.h
> index 50f1e403dbbb..1faff23b66e8 100644
> --- a/include/net/gro.h
> +++ b/include/net/gro.h
> @@ -87,6 +87,15 @@ struct napi_gro_cb {
>  
>  	/* used to support CHECKSUM_COMPLETE for tunneling protocols */
>  	__wsum	csum;
> +
> +	/* L3 offsets */
> +	union {
> +		struct {
> +			u16 network_offset;
> +			u16 inner_network_offset;
> +		};
> +		u16 network_offsets[2];
> +	};
>  };
>  
>  #define NAPI_GRO_CB(skb) ((struct napi_gro_cb *)(skb)->cb)
> @@ -172,12 +181,17 @@ static inline void *skb_gro_header(struct sk_buff *skb, unsigned int hlen,
>  	return ptr;
>  }
>  
> +static inline int skb_gro_network_offset(const struct sk_buff *skb)
> +{
> +	return NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark];
> +}
> +
>  static inline void *skb_gro_network_header(const struct sk_buff *skb)
>  {
>  	if (skb_gro_may_pull(skb, skb_gro_offset(skb)))
> -		return skb_gro_header_fast(skb, skb_network_offset(skb));
> +		return skb_gro_header_fast(skb, skb_gro_network_offset(skb));
>  
> -	return skb_network_header(skb);
> +	return skb->data + skb_gro_network_offset(skb);
>  }
>  
>  static inline __wsum inet_gro_compute_pseudo(const struct sk_buff *skb,
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index f00158234505..9404dd551dfd 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -478,6 +478,8 @@ static struct sk_buff *vlan_gro_receive(struct list_head *head,
>  	if (unlikely(!vhdr))
>  		goto out;
>  
> +	NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark] = hlen;
> +
>  	type = vhdr->h_vlan_encapsulated_proto;
>  
>  	ptype = gro_find_receive_by_type(type);
> diff --git a/net/core/gro.c b/net/core/gro.c
> index 83f35d99a682..c7901253a1a8 100644
> --- a/net/core/gro.c
> +++ b/net/core/gro.c
> @@ -371,6 +371,7 @@ static inline void skb_gro_reset_offset(struct sk_buff *skb, u32 nhoff)
>  	const skb_frag_t *frag0;
>  	unsigned int headlen;
>  
> +	NAPI_GRO_CB(skb)->network_offset = 0;
>  	NAPI_GRO_CB(skb)->data_offset = 0;
>  	headlen = skb_headlen(skb);
>  	NAPI_GRO_CB(skb)->frag0 = skb->data;
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 55bd72997b31..2daf635ab99e 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1568,10 +1568,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
>  
>  	NAPI_GRO_CB(skb)->is_atomic = !!(iph->frag_off & htons(IP_DF));
>  	NAPI_GRO_CB(skb)->flush |= flush;
> -	skb_set_network_header(skb, off);
> -	/* The above will be needed by the transport layer if there is one
> -	 * immediately following this IP hdr.
> -	 */
> +	NAPI_GRO_CB(skb)->inner_network_offset = off;
>  
>  	/* Note : No need to call skb_gro_postpull_rcsum() here,
>  	 * as we already checked checksum over ipv4 header was 0
> diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
> index ebe4722bb020..be8ddf6da88c 100644
> --- a/net/ipv4/tcp_offload.c
> +++ b/net/ipv4/tcp_offload.c
> @@ -332,7 +332,8 @@ struct sk_buff *tcp4_gro_receive(struct list_head *head, struct sk_buff *skb)
>  
>  INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
>  {
> -	const struct iphdr *iph = ip_hdr(skb);
> +	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
> +	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
>  	struct tcphdr *th = tcp_hdr(skb);
>  
>  	th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index c02bf011d4a6..1399fce82b3f 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -532,7 +532,8 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
>  struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
>  				 __be16 sport, __be16 dport)
>  {
> -	const struct iphdr *iph = ip_hdr(skb);
> +	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
> +	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
>  	struct net *net = dev_net(skb->dev);
>  	int iif, sdif;
>  
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 3498dd1d0694..fd29d21d579c 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -718,7 +718,8 @@ EXPORT_SYMBOL(udp_gro_complete);
>  
>  INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
>  {
> -	const struct iphdr *iph = ip_hdr(skb);
> +	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
> +	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
>  	struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
>  
>  	/* do fraglist only if there is no outer UDP encap (or we already processed it) */
> diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
> index b41e35af69ea..5d6b875a4638 100644
> --- a/net/ipv6/ip6_offload.c
> +++ b/net/ipv6/ip6_offload.c
> @@ -67,7 +67,7 @@ static int ipv6_gro_pull_exthdrs(struct sk_buff *skb, int off, int proto)
>  		off += len;
>  	}
>  
> -	skb_gro_pull(skb, off - skb_network_offset(skb));
> +	skb_gro_pull(skb, off - skb_gro_network_offset(skb));
>  	return proto;
>  }
>  
> @@ -236,7 +236,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
>  	if (unlikely(!iph))
>  		goto out;
>  
> -	skb_set_network_header(skb, off);
> +	NAPI_GRO_CB(skb)->inner_network_offset = off;
>  
>  	flush += ntohs(iph->payload_len) != skb->len - hlen;
>  
> @@ -259,7 +259,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
>  	NAPI_GRO_CB(skb)->proto = proto;
>  
>  	flush--;
> -	nlen = skb_network_header_len(skb);
> +	nlen = skb_gro_offset(skb) - off;
>  
>  	list_for_each_entry(p, head, list) {
>  		const struct ipv6hdr *iph2;
> diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
> index 4b07d1e6c952..e70d60e0f86f 100644
> --- a/net/ipv6/tcpv6_offload.c
> +++ b/net/ipv6/tcpv6_offload.c
> @@ -29,7 +29,8 @@ struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb)
>  
>  INDIRECT_CALLABLE_SCOPE int tcp6_gro_complete(struct sk_buff *skb, int thoff)
>  {
> -	const struct ipv6hdr *iph = ipv6_hdr(skb);
> +	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
> +	const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
>  	struct tcphdr *th = tcp_hdr(skb);
>  

Only udp code is affected, as only that can be used as tunnel.

For bug fixes, let's try to avoid touching other code. Also that vlan.

As a minimal patch all that is needed is the following, right?

- add the fields
- store in inet_gro_receive + ipv6_gro_receive
- read in udp[46]_gro_complete and udp[46]_lib_lookup_skb

+++ b/include/net/gro.h
@@ -87,6 +87,8 @@ struct napi_gro_cb {
 
        /* used to support CHECKSUM_COMPLETE for tunneling protocols */
        __wsum  csum;
+
+       u16 network_offsets[2];
 };
 
 #define NAPI_GRO_CB(skb) ((struct napi_gro_cb *)(skb)->cb)
@@ -180,6 +182,11 @@ static inline void *skb_gro_network_header(const struct sk_buff *skb)
        return skb_network_header(skb);
 }
 
+static inline void *skb_gro_complete_network_header(const struct sk_buff *skb)
+{
+       return skb->data + NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
+}
+
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index a7cfeda28bb2..4294efdd5613 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1569,6 +1569,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)

        NAPI_GRO_CB(skb)->is_atomic = !!(iph->frag_off & htons(IP_DF));
        NAPI_GRO_CB(skb)->flush |= flush;
+       NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark] = off;
        skb_set_network_header(skb, off);
        /* The above will be needed by the transport layer if there is one
         * immediately following this IP hdr.
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 944922172c98..1469e19bafd5 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -543,7 +543,7 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
 struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
                                 __be16 sport, __be16 dport)
 {
-       const struct iphdr *iph = ip_hdr(skb);
+       const struct iphdr *iph = skb_gro_complete_network_header(skb);
        struct net *net = dev_net(skb->dev);
        int iif, sdif;
 diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 3498dd1d0694..e93ddd42a091 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -718,7 +718,7 @@ EXPORT_SYMBOL(udp_gro_complete);

 INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
 {
-       const struct iphdr *iph = ip_hdr(skb);
+       const struct iphdr *iph = skb_gro_complete_network_header(skb);
        struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
 
        /* do fraglist only if there is no outer UDP encap (or we already processed it) */
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index b41e35af69ea..5033ebfd5e92 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -236,6 +236,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
        if (unlikely(!iph))
                goto out;
 
+       NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark] = off;
        skb_set_network_header(skb, off);
 
        flush += ntohs(iph->payload_len) != skb->len - hlen;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 9914e73f4785..c012f6c41444 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -285,7 +285,7 @@ static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
 struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
                                 __be16 sport, __be16 dport)
 {
-       const struct ipv6hdr *iph = ipv6_hdr(skb);
+       const struct ipv6hdr *iph = skb_gro_complete_network_header(skb);
        struct net *net = dev_net(skb->dev);
        int iif, sdif;

diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index bbd347de00b4..274dd2983216 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -164,7 +164,7 @@ struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)

 INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
 {
-       const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+       const struct ipv6hdr *ipv6h = skb_gro_complete_network_header(skb);
        struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
 

The part in udp_gro_complete that clears encap_mark and sets
encapsulation is definitely something to watch out for.

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net v3 2/2] net: gro: add flush check in udp_gro_receive_segment
  2024-04-24 16:30 ` [PATCH net v3 2/2] net: gro: add flush check in udp_gro_receive_segment Richard Gobert
@ 2024-04-25  2:45   ` Willem de Bruijn
  0 siblings, 0 replies; 7+ messages in thread
From: Willem de Bruijn @ 2024-04-25  2:45 UTC (permalink / raw)
  To: Richard Gobert, davem, edumazet, kuba, pabeni, dsahern,
	willemdebruijn.kernel, alobakin, netdev, linux-kernel
  Cc: Richard Gobert

Richard Gobert wrote:
> GRO-GSO path is supposed to be transparent and as such L3 flush checks are
> relevant to all UDP flows merging in GRO. This patch uses the same logic
> and code from tcp_gro_receive, terminating merge if flush is non zero.
> 
> Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")

Should this be

Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")

>  net/ipv4/udp_offload.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index fd29d21d579c..8721fe5beca2 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -471,6 +471,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
>  	struct sk_buff *p;
>  	unsigned int ulen;
>  	int ret = 0;
> +	int flush;
>  
>  	/* requires non zero csum, for symmetry with GSO */
>  	if (!uh->check) {
> @@ -504,13 +505,22 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
>  			return p;
>  		}
>  
> +		flush = NAPI_GRO_CB(p)->flush;
> +
> +		if (NAPI_GRO_CB(p)->flush_id != 1 ||
> +		    NAPI_GRO_CB(p)->count != 1 ||
> +		    !NAPI_GRO_CB(p)->is_atomic)
> +			flush |= NAPI_GRO_CB(p)->flush_id;
> +		else
> +			NAPI_GRO_CB(p)->is_atomic = false;
> +
>  		/* Terminate the flow on len mismatch or if it grow "too much".
>  		 * Under small packet flood GRO count could elsewhere grow a lot
>  		 * leading to excessive truesize values.
>  		 * On len mismatch merge the first packet shorter than gso_size,
>  		 * otherwise complete the GRO packet.
>  		 */
> -		if (ulen > ntohs(uh2->len)) {
> +		if (ulen > ntohs(uh2->len) || flush) {
>  			pp = p;

I suppose this branch already could immediately return p. And avoid
a level of indentation below. But agreed to not change that in this
bug fix.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net v3 1/2] net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
  2024-04-25  2:15   ` Willem de Bruijn
@ 2024-04-25 11:59     ` Richard Gobert
  2024-04-25 13:56       ` Willem de Bruijn
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Gobert @ 2024-04-25 11:59 UTC (permalink / raw)
  To: Willem de Bruijn, davem, edumazet, kuba, pabeni, dsahern,
	alobakin, netdev, linux-kernel

Willem de Bruijn wrote:
> Richard Gobert wrote:
>> Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp:
>> additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the
>> complete phase of gro. The functions always return skb->network_header,
>> which in the case of encapsulated packets at the gro complete phase, is
>> always set to the innermost L3 of the packet. That means that calling
>> {ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in
>> gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_
>> L3/L4 may return an unexpected value.
>>
>> This incorrect usage leads to a bug in GRO's UDP socket lookup.
>> udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These
>> *_hdr functions return network_header which will point to the innermost L3,
>> resulting in the wrong offset being used in __udp{4,6}_lib_lookup with
>> encapsulated packets.
>>
>> This patch adds network_offset and inner_network_offset to napi_gro_cb, and
>> makes sure both are set correctly.
>>
>> To fix the issue, network_offsets union is used inside napi_gro_cb, in
>> which both the outer and the inner network offsets are saved.
>>
>> Reproduction example:
>>
>> Endpoint configuration example (fou + local address bind)
>>
>>     # ip fou add port 6666 ipproto 4
>>     # ip link add name tun1 type ipip remote 2.2.2.1 local 2.2.2.2 encap fou encap-dport 5555 encap-sport 6666 mode ipip
>>     # ip link set tun1 up
>>     # ip a add 1.1.1.2/24 dev tun1
>>
>> Netperf TCP_STREAM result on net-next before patch is applied:
>>
>> net-next main, GRO enabled:
>>     $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
>>     Recv   Send    Send
>>     Socket Socket  Message  Elapsed
>>     Size   Size    Size     Time     Throughput
>>     bytes  bytes   bytes    secs.    10^6bits/sec
>>
>>     131072  16384  16384    5.28        2.37
>>
>> net-next main, GRO disabled:
>>     $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
>>     Recv   Send    Send
>>     Socket Socket  Message  Elapsed
>>     Size   Size    Size     Time     Throughput
>>     bytes  bytes   bytes    secs.    10^6bits/sec
>>
>>     131072  16384  16384    5.01     2745.06
>>
>> patch applied, GRO enabled:
>>     $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
>>     Recv   Send    Send
>>     Socket Socket  Message  Elapsed
>>     Size   Size    Size     Time     Throughput
>>     bytes  bytes   bytes    secs.    10^6bits/sec
>>
>>     131072  16384  16384    5.01     2877.38
>>
>> Fixes: 57c67ff4bd92 ("udp: additional GRO support")
>> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
>> ---
>>  include/net/gro.h        | 18 ++++++++++++++++--
>>  net/8021q/vlan_core.c    |  2 ++
>>  net/core/gro.c           |  1 +
>>  net/ipv4/af_inet.c       |  5 +----
>>  net/ipv4/tcp_offload.c   |  3 ++-
>>  net/ipv4/udp.c           |  3 ++-
>>  net/ipv4/udp_offload.c   |  3 ++-
>>  net/ipv6/ip6_offload.c   |  6 +++---
>>  net/ipv6/tcpv6_offload.c |  3 ++-
>>  net/ipv6/udp.c           |  3 ++-
>>  net/ipv6/udp_offload.c   |  3 ++-
>>  11 files changed, 35 insertions(+), 15 deletions(-)
>>
>> diff --git a/include/net/gro.h b/include/net/gro.h
>> index 50f1e403dbbb..1faff23b66e8 100644
>> --- a/include/net/gro.h
>> +++ b/include/net/gro.h
>> @@ -87,6 +87,15 @@ struct napi_gro_cb {
>>  
>>  	/* used to support CHECKSUM_COMPLETE for tunneling protocols */
>>  	__wsum	csum;
>> +
>> +	/* L3 offsets */
>> +	union {
>> +		struct {
>> +			u16 network_offset;
>> +			u16 inner_network_offset;
>> +		};
>> +		u16 network_offsets[2];
>> +	};
>>  };
>>  
>>  #define NAPI_GRO_CB(skb) ((struct napi_gro_cb *)(skb)->cb)
>> @@ -172,12 +181,17 @@ static inline void *skb_gro_header(struct sk_buff *skb, unsigned int hlen,
>>  	return ptr;
>>  }
>>  
>> +static inline int skb_gro_network_offset(const struct sk_buff *skb)
>> +{
>> +	return NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark];
>> +}
>> +
>>  static inline void *skb_gro_network_header(const struct sk_buff *skb)
>>  {
>>  	if (skb_gro_may_pull(skb, skb_gro_offset(skb)))
>> -		return skb_gro_header_fast(skb, skb_network_offset(skb));
>> +		return skb_gro_header_fast(skb, skb_gro_network_offset(skb));
>>  
>> -	return skb_network_header(skb);
>> +	return skb->data + skb_gro_network_offset(skb);
>>  }
>>  
>>  static inline __wsum inet_gro_compute_pseudo(const struct sk_buff *skb,
>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>> index f00158234505..9404dd551dfd 100644
>> --- a/net/8021q/vlan_core.c
>> +++ b/net/8021q/vlan_core.c
>> @@ -478,6 +478,8 @@ static struct sk_buff *vlan_gro_receive(struct list_head *head,
>>  	if (unlikely(!vhdr))
>>  		goto out;
>>  
>> +	NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark] = hlen;
>> +
>>  	type = vhdr->h_vlan_encapsulated_proto;
>>  
>>  	ptype = gro_find_receive_by_type(type);
>> diff --git a/net/core/gro.c b/net/core/gro.c
>> index 83f35d99a682..c7901253a1a8 100644
>> --- a/net/core/gro.c
>> +++ b/net/core/gro.c
>> @@ -371,6 +371,7 @@ static inline void skb_gro_reset_offset(struct sk_buff *skb, u32 nhoff)
>>  	const skb_frag_t *frag0;
>>  	unsigned int headlen;
>>  
>> +	NAPI_GRO_CB(skb)->network_offset = 0;
>>  	NAPI_GRO_CB(skb)->data_offset = 0;
>>  	headlen = skb_headlen(skb);
>>  	NAPI_GRO_CB(skb)->frag0 = skb->data;
>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
>> index 55bd72997b31..2daf635ab99e 100644
>> --- a/net/ipv4/af_inet.c
>> +++ b/net/ipv4/af_inet.c
>> @@ -1568,10 +1568,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
>>  
>>  	NAPI_GRO_CB(skb)->is_atomic = !!(iph->frag_off & htons(IP_DF));
>>  	NAPI_GRO_CB(skb)->flush |= flush;
>> -	skb_set_network_header(skb, off);
>> -	/* The above will be needed by the transport layer if there is one
>> -	 * immediately following this IP hdr.
>> -	 */
>> +	NAPI_GRO_CB(skb)->inner_network_offset = off;
>>  
>>  	/* Note : No need to call skb_gro_postpull_rcsum() here,
>>  	 * as we already checked checksum over ipv4 header was 0
>> diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
>> index ebe4722bb020..be8ddf6da88c 100644
>> --- a/net/ipv4/tcp_offload.c
>> +++ b/net/ipv4/tcp_offload.c
>> @@ -332,7 +332,8 @@ struct sk_buff *tcp4_gro_receive(struct list_head *head, struct sk_buff *skb)
>>  
>>  INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
>>  {
>> -	const struct iphdr *iph = ip_hdr(skb);
>> +	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
>> +	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
>>  	struct tcphdr *th = tcp_hdr(skb);
>>  
>>  	th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
>> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
>> index c02bf011d4a6..1399fce82b3f 100644
>> --- a/net/ipv4/udp.c
>> +++ b/net/ipv4/udp.c
>> @@ -532,7 +532,8 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
>>  struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
>>  				 __be16 sport, __be16 dport)
>>  {
>> -	const struct iphdr *iph = ip_hdr(skb);
>> +	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
>> +	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
>>  	struct net *net = dev_net(skb->dev);
>>  	int iif, sdif;
>>  
>> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>> index 3498dd1d0694..fd29d21d579c 100644
>> --- a/net/ipv4/udp_offload.c
>> +++ b/net/ipv4/udp_offload.c
>> @@ -718,7 +718,8 @@ EXPORT_SYMBOL(udp_gro_complete);
>>  
>>  INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
>>  {
>> -	const struct iphdr *iph = ip_hdr(skb);
>> +	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
>> +	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
>>  	struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
>>  
>>  	/* do fraglist only if there is no outer UDP encap (or we already processed it) */
>> diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
>> index b41e35af69ea..5d6b875a4638 100644
>> --- a/net/ipv6/ip6_offload.c
>> +++ b/net/ipv6/ip6_offload.c
>> @@ -67,7 +67,7 @@ static int ipv6_gro_pull_exthdrs(struct sk_buff *skb, int off, int proto)
>>  		off += len;
>>  	}
>>  
>> -	skb_gro_pull(skb, off - skb_network_offset(skb));
>> +	skb_gro_pull(skb, off - skb_gro_network_offset(skb));
>>  	return proto;
>>  }
>>  
>> @@ -236,7 +236,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
>>  	if (unlikely(!iph))
>>  		goto out;
>>  
>> -	skb_set_network_header(skb, off);
>> +	NAPI_GRO_CB(skb)->inner_network_offset = off;
>>  
>>  	flush += ntohs(iph->payload_len) != skb->len - hlen;
>>  
>> @@ -259,7 +259,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
>>  	NAPI_GRO_CB(skb)->proto = proto;
>>  
>>  	flush--;
>> -	nlen = skb_network_header_len(skb);
>> +	nlen = skb_gro_offset(skb) - off;
>>  
>>  	list_for_each_entry(p, head, list) {
>>  		const struct ipv6hdr *iph2;
>> diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
>> index 4b07d1e6c952..e70d60e0f86f 100644
>> --- a/net/ipv6/tcpv6_offload.c
>> +++ b/net/ipv6/tcpv6_offload.c
>> @@ -29,7 +29,8 @@ struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb)
>>  
>>  INDIRECT_CALLABLE_SCOPE int tcp6_gro_complete(struct sk_buff *skb, int thoff)
>>  {
>> -	const struct ipv6hdr *iph = ipv6_hdr(skb);
>> +	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
>> +	const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
>>  	struct tcphdr *th = tcp_hdr(skb);
>>  
> 
> Only udp code is affected, as only that can be used as tunnel.
> 
> For bug fixes, let's try to avoid touching other code. Also that vlan.
> 
> As a minimal patch all that is needed is the following, right?
> 
> - add the fields
> - store in inet_gro_receive + ipv6_gro_receive
> - read in udp[46]_gro_complete and udp[46]_lib_lookup_skb
> 

This approach is smaller, thanks for writing it down.

What do you think about doing this and still writing to
inner_network_offset exclusively in {inet,ipv6}_gro_receive? I still
prefer it for reasons discussed in the previous series. The code line
in vlan_gro_receive will still be there, but that will be the only
addition to your snippet.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net v3 1/2] net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
  2024-04-25 11:59     ` Richard Gobert
@ 2024-04-25 13:56       ` Willem de Bruijn
  0 siblings, 0 replies; 7+ messages in thread
From: Willem de Bruijn @ 2024-04-25 13:56 UTC (permalink / raw)
  To: Richard Gobert, Willem de Bruijn, davem, edumazet, kuba, pabeni,
	dsahern, alobakin, netdev, linux-kernel

> >> --- a/net/ipv6/tcpv6_offload.c
> >> +++ b/net/ipv6/tcpv6_offload.c
> >> @@ -29,7 +29,8 @@ struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb)
> >>  
> >>  INDIRECT_CALLABLE_SCOPE int tcp6_gro_complete(struct sk_buff *skb, int thoff)
> >>  {
> >> -	const struct ipv6hdr *iph = ipv6_hdr(skb);
> >> +	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
> >> +	const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
> >>  	struct tcphdr *th = tcp_hdr(skb);
> >>  
> > 
> > Only udp code is affected, as only that can be used as tunnel.
> > 
> > For bug fixes, let's try to avoid touching other code. Also that vlan.
> > 
> > As a minimal patch all that is needed is the following, right?
> > 
> > - add the fields
> > - store in inet_gro_receive + ipv6_gro_receive
> > - read in udp[46]_gro_complete and udp[46]_lib_lookup_skb
> > 
> 
> This approach is smaller, thanks for writing it down.
> 
> What do you think about doing this and still writing to
> inner_network_offset exclusively in {inet,ipv6}_gro_receive? I still
> prefer it for reasons discussed in the previous series. The code line
> in vlan_gro_receive will still be there, but that will be the only
> addition to your snippet.

That sounds fine, thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-04-25 13:56 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-24 16:30 [PATCH net v3 0/2] net: gro: add flush/flush_id checks and fix wrong offset in udp Richard Gobert
2024-04-24 16:30 ` [PATCH net v3 1/2] net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb Richard Gobert
2024-04-25  2:15   ` Willem de Bruijn
2024-04-25 11:59     ` Richard Gobert
2024-04-25 13:56       ` Willem de Bruijn
2024-04-24 16:30 ` [PATCH net v3 2/2] net: gro: add flush check in udp_gro_receive_segment Richard Gobert
2024-04-25  2:45   ` Willem de Bruijn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).