All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/3] bpf: support setting max RTO for bpf_setsockopt
@ 2025-02-13  0:43 Jason Xing
  2025-02-13  0:43 ` [PATCH net-next 1/3] tcp: add TCP_RTO_MAX_MIN_SEC definition Jason Xing
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Jason Xing @ 2025-02-13  0:43 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, dsahern, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, horms, ncardwell, kuniyu
  Cc: bpf, netdev, Jason Xing

Support max RTO set by BPF program calling bpf_setsockopt().
Add corresponding selftests.

Jason Xing (3):
  tcp: add TCP_RTO_MAX_MIN_SEC definition
  bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  selftests/bpf: add rto max for bpf_setsockopt test

 Documentation/networking/ip-sysctl.rst        |  3 +-
 include/net/tcp.h                             |  1 +
 include/uapi/linux/bpf.h                      |  2 ++
 net/core/filter.c                             |  6 ++++
 net/ipv4/sysctl_net_ipv4.c                    |  3 +-
 net/ipv4/tcp.c                                |  3 +-
 tools/include/uapi/linux/bpf.h                |  2 ++
 .../bpf/prog_tests/tcp_hdr_options.c          | 28 +++++++++++++------
 .../bpf/progs/test_tcp_hdr_options.c          | 26 +++++++++++++++++
 .../selftests/bpf/test_tcp_hdr_options.h      |  3 ++
 10 files changed, 66 insertions(+), 11 deletions(-)

-- 
2.43.5


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH net-next 1/3] tcp: add TCP_RTO_MAX_MIN_SEC definition
  2025-02-13  0:43 [PATCH net-next 0/3] bpf: support setting max RTO for bpf_setsockopt Jason Xing
@ 2025-02-13  0:43 ` Jason Xing
  2025-02-13  2:19   ` Kuniyuki Iwashima
  2025-02-13  0:43 ` [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt Jason Xing
  2025-02-13  0:43 ` [PATCH net-next 3/3] selftests/bpf: add rto max for bpf_setsockopt test Jason Xing
  2 siblings, 1 reply; 24+ messages in thread
From: Jason Xing @ 2025-02-13  0:43 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, dsahern, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, horms, ncardwell, kuniyu
  Cc: bpf, netdev, Jason Xing

Add minimum value definition as the lower bound of RTO MAX
set by users. In the next patch, bpf_sol_tcp_setsockopt()
will use this in the test statement.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
---
 include/net/tcp.h          | 1 +
 net/ipv4/sysctl_net_ipv4.c | 3 ++-
 net/ipv4/tcp.c             | 3 ++-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7fd2d7fa4532..b6bedbe68636 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -143,6 +143,7 @@ static_assert((1 << ATO_BITS) > TCP_DELACK_MAX);
 #define TCP_DELACK_MIN	4U
 #define TCP_ATO_MIN	4U
 #endif
+#define TCP_RTO_MAX_MIN_SEC 1
 #define TCP_RTO_MAX_SEC 120
 #define TCP_RTO_MAX	((unsigned)(TCP_RTO_MAX_SEC * HZ))
 #define TCP_RTO_MIN	((unsigned)(HZ / 5))
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 3a43010d726f..53942c225e0b 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -28,6 +28,7 @@ static int tcp_adv_win_scale_max = 31;
 static int tcp_app_win_max = 31;
 static int tcp_min_snd_mss_min = TCP_MIN_SND_MSS;
 static int tcp_min_snd_mss_max = 65535;
+static int tcp_rto_max_min = TCP_RTO_MAX_MIN_SEC * MSEC_PER_SEC;
 static int tcp_rto_max_max = TCP_RTO_MAX_SEC * MSEC_PER_SEC;
 static int ip_privileged_port_min;
 static int ip_privileged_port_max = 65535;
@@ -1590,7 +1591,7 @@ static struct ctl_table ipv4_net_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_ONE_THOUSAND,
+		.extra1		= &tcp_rto_max_min,
 		.extra2		= &tcp_rto_max_max,
 	},
 };
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 992d5c9b2487..2373ab1a1d47 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3812,7 +3812,8 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
 					   TCP_RTO_MAX / HZ));
 		return 0;
 	case TCP_RTO_MAX_MS:
-		if (val < MSEC_PER_SEC || val > TCP_RTO_MAX_SEC * MSEC_PER_SEC)
+		if (val < TCP_RTO_MAX_MIN_SEC * MSEC_PER_SEC ||
+		    val > TCP_RTO_MAX_SEC * MSEC_PER_SEC)
 			return -EINVAL;
 		WRITE_ONCE(inet_csk(sk)->icsk_rto_max, msecs_to_jiffies(val));
 		return 0;
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-13  0:43 [PATCH net-next 0/3] bpf: support setting max RTO for bpf_setsockopt Jason Xing
  2025-02-13  0:43 ` [PATCH net-next 1/3] tcp: add TCP_RTO_MAX_MIN_SEC definition Jason Xing
@ 2025-02-13  0:43 ` Jason Xing
  2025-02-13  2:25   ` Kuniyuki Iwashima
  2025-02-13 23:41   ` Stanislav Fomichev
  2025-02-13  0:43 ` [PATCH net-next 3/3] selftests/bpf: add rto max for bpf_setsockopt test Jason Xing
  2 siblings, 2 replies; 24+ messages in thread
From: Jason Xing @ 2025-02-13  0:43 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, dsahern, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, horms, ncardwell, kuniyu
  Cc: bpf, netdev, Jason Xing

Support bpf_setsockopt() to set the maximum value of RTO for
BPF program.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
---
 Documentation/networking/ip-sysctl.rst | 3 ++-
 include/uapi/linux/bpf.h               | 2 ++
 net/core/filter.c                      | 6 ++++++
 tools/include/uapi/linux/bpf.h         | 2 ++
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 054561f8dcae..78eb0959438a 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
 
 tcp_rto_max_ms - INTEGER
 	Maximal TCP retransmission timeout (in ms).
-	Note that TCP_RTO_MAX_MS socket option has higher precedence.
+	Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
+	higher precedence for configuring this setting.
 
 	When changing tcp_rto_max_ms, it is important to understand
 	that tcp_retries2 might need a change.
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2acf9b336371..8ab6ef144217 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2868,6 +2868,7 @@ union bpf_attr {
  * 		  **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
  * 		  **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
  *		  **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
+ *		  **TCP_BPF_RTO_MAX**
  * 		* **IPPROTO_IP**, which supports *optname* **IP_TOS**.
  * 		* **IPPROTO_IPV6**, which supports the following *optname*\ s:
  * 		  **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7091,6 +7092,7 @@ enum {
 	TCP_BPF_SYN_IP		= 1006, /* Copy the IP[46] and TCP header */
 	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
 	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Get or Set TCP sock ops flags */
+	TCP_BPF_RTO_MAX		= 1009, /* Max delay ack in msecs */
 };
 
 enum {
diff --git a/net/core/filter.c b/net/core/filter.c
index 2ec162dd83c4..a21a147e0a86 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5303,6 +5303,12 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
 			return -EINVAL;
 		tp->bpf_sock_ops_cb_flags = val;
 		break;
+	case TCP_BPF_RTO_MAX:
+		if (val > TCP_RTO_MAX_SEC * MSEC_PER_SEC ||
+		    val < TCP_RTO_MAX_MIN_SEC * MSEC_PER_SEC)
+			return -EINVAL;
+		inet_csk(sk)->icsk_rto_max = msecs_to_jiffies(val);
+		break;
 	default:
 		return -EINVAL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 2acf9b336371..8ab6ef144217 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2868,6 +2868,7 @@ union bpf_attr {
  * 		  **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
  * 		  **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
  *		  **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
+ *		  **TCP_BPF_RTO_MAX**
  * 		* **IPPROTO_IP**, which supports *optname* **IP_TOS**.
  * 		* **IPPROTO_IPV6**, which supports the following *optname*\ s:
  * 		  **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7091,6 +7092,7 @@ enum {
 	TCP_BPF_SYN_IP		= 1006, /* Copy the IP[46] and TCP header */
 	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
 	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Get or Set TCP sock ops flags */
+	TCP_BPF_RTO_MAX		= 1009, /* Max delay ack in msecs */
 };
 
 enum {
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net-next 3/3] selftests/bpf: add rto max for bpf_setsockopt test
  2025-02-13  0:43 [PATCH net-next 0/3] bpf: support setting max RTO for bpf_setsockopt Jason Xing
  2025-02-13  0:43 ` [PATCH net-next 1/3] tcp: add TCP_RTO_MAX_MIN_SEC definition Jason Xing
  2025-02-13  0:43 ` [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt Jason Xing
@ 2025-02-13  0:43 ` Jason Xing
  2025-02-13  2:30   ` Kuniyuki Iwashima
  2 siblings, 1 reply; 24+ messages in thread
From: Jason Xing @ 2025-02-13  0:43 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, dsahern, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, horms, ncardwell, kuniyu
  Cc: bpf, netdev, Jason Xing

Add TCP_BPF_RTO_MAX selftests for active and passive flows
in the BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB and
BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB bpf callbacks.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
---
 .../bpf/prog_tests/tcp_hdr_options.c          | 28 +++++++++++++------
 .../bpf/progs/test_tcp_hdr_options.c          | 26 +++++++++++++++++
 .../selftests/bpf/test_tcp_hdr_options.h      |  3 ++
 3 files changed, 49 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c b/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c
index 56685fc03c7e..714d48df6b3a 100644
--- a/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c
+++ b/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c
@@ -60,8 +60,9 @@ static void print_hdr_stg(const struct hdr_stg *hdr_stg, const char *prefix)
 
 static void print_option(const struct bpf_test_option *opt, const char *prefix)
 {
-	fprintf(stderr, "%s{flags:0x%x, max_delack_ms:%u, rand:0x%x}\n",
-		prefix ? : "", opt->flags, opt->max_delack_ms, opt->rand);
+	fprintf(stderr, "%s{flags:0x%x, max_delack_ms:%u, rand:0x%x}, max_rto_sec:%u\n",
+		prefix ? : "", opt->flags, opt->max_delack_ms, opt->rand,
+		opt->max_rto_sec);
 }
 
 static void sk_fds_close(struct sk_fds *sk_fds)
@@ -300,13 +301,17 @@ static void fastopen_estab(void)
 	hdr_stg_map_fd = bpf_map__fd(skel->maps.hdr_stg_map);
 	lport_linum_map_fd = bpf_map__fd(skel->maps.lport_linum_map);
 
-	exp_passive_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS;
+	exp_passive_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS |
+				     OPTION_F_MAX_RTO_SEC;
 	exp_passive_estab_in.rand = 0xfa;
 	exp_passive_estab_in.max_delack_ms = 11;
+	exp_passive_estab_in.max_rto_sec = 1;
 
-	exp_active_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS;
+	exp_active_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS |
+				    OPTION_F_MAX_RTO_SEC;
 	exp_active_estab_in.rand = 0xce;
 	exp_active_estab_in.max_delack_ms = 22;
+	exp_active_estab_in.max_rto_sec = 2;
 
 	exp_passive_hdr_stg.fastopen = true;
 
@@ -337,14 +342,17 @@ static void syncookie_estab(void)
 	hdr_stg_map_fd = bpf_map__fd(skel->maps.hdr_stg_map);
 	lport_linum_map_fd = bpf_map__fd(skel->maps.lport_linum_map);
 
-	exp_passive_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS;
+	exp_passive_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS |
+				     OPTION_F_MAX_RTO_SEC;
 	exp_passive_estab_in.rand = 0xfa;
 	exp_passive_estab_in.max_delack_ms = 11;
+	exp_passive_estab_in.max_rto_sec = 1;
 
 	exp_active_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS |
-					OPTION_F_RESEND;
+				    OPTION_F_RESEND | OPTION_F_MAX_RTO_SEC;
 	exp_active_estab_in.rand = 0xce;
 	exp_active_estab_in.max_delack_ms = 22;
+	exp_active_estab_in.max_rto_sec = 2;
 
 	exp_passive_hdr_stg.syncookie = true;
 	exp_active_hdr_stg.resend_syn = true;
@@ -413,13 +421,17 @@ static void __simple_estab(bool exprm)
 	hdr_stg_map_fd = bpf_map__fd(skel->maps.hdr_stg_map);
 	lport_linum_map_fd = bpf_map__fd(skel->maps.lport_linum_map);
 
-	exp_passive_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS;
+	exp_passive_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS |
+				     OPTION_F_MAX_RTO_SEC;
 	exp_passive_estab_in.rand = 0xfa;
 	exp_passive_estab_in.max_delack_ms = 11;
+	exp_passive_estab_in.max_rto_sec = 1;
 
-	exp_active_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS;
+	exp_active_estab_in.flags = OPTION_F_RAND | OPTION_F_MAX_DELACK_MS |
+				    OPTION_F_MAX_RTO_SEC;
 	exp_active_estab_in.rand = 0xce;
 	exp_active_estab_in.max_delack_ms = 22;
+	exp_active_estab_in.max_rto_sec = 2;
 
 	prepare_out();
 
diff --git a/tools/testing/selftests/bpf/progs/test_tcp_hdr_options.c b/tools/testing/selftests/bpf/progs/test_tcp_hdr_options.c
index 5f4e87ee949a..92da239adb49 100644
--- a/tools/testing/selftests/bpf/progs/test_tcp_hdr_options.c
+++ b/tools/testing/selftests/bpf/progs/test_tcp_hdr_options.c
@@ -80,6 +80,9 @@ static void write_test_option(const struct bpf_test_option *test_opt,
 
 	if (TEST_OPTION_FLAGS(test_opt->flags, OPTION_RAND))
 		data[offset++] = test_opt->rand;
+
+	if (TEST_OPTION_FLAGS(test_opt->flags, OPTION_MAX_RTO_SEC))
+		data[offset++] = test_opt->max_rto_sec;
 }
 
 static int store_option(struct bpf_sock_ops *skops,
@@ -124,6 +127,9 @@ static int parse_test_option(struct bpf_test_option *opt, const __u8 *start)
 	if (TEST_OPTION_FLAGS(opt->flags, OPTION_RAND))
 		opt->rand = *start++;
 
+	if (TEST_OPTION_FLAGS(opt->flags, OPTION_MAX_RTO_SEC))
+		opt->max_rto_sec = *start++;
+
 	return 0;
 }
 
@@ -411,6 +417,14 @@ static int set_rto_min(struct bpf_sock_ops *skops, __u8 peer_max_delack_ms)
 			      sizeof(min_rto_us));
 }
 
+static int set_rto_max(struct bpf_sock_ops *skops, __u8 max_rto_sec)
+{
+	__u32 max_rto_ms = max_rto_sec * 1000;
+
+	return bpf_setsockopt(skops, SOL_TCP, TCP_BPF_RTO_MAX, &max_rto_ms,
+			      sizeof(max_rto_ms));
+}
+
 static int handle_active_estab(struct bpf_sock_ops *skops)
 {
 	struct hdr_stg init_stg = {
@@ -459,6 +473,12 @@ static int handle_active_estab(struct bpf_sock_ops *skops)
 			RET_CG_ERR(err);
 	}
 
+	if (active_estab_in.max_rto_sec) {
+		err = set_rto_max(skops, active_estab_in.max_rto_sec);
+		if (err)
+			RET_CG_ERR(err);
+	}
+
 	return CG_OK;
 }
 
@@ -525,6 +545,12 @@ static int handle_passive_estab(struct bpf_sock_ops *skops)
 			RET_CG_ERR(err);
 	}
 
+	if (passive_estab_in.max_rto_sec) {
+		err = set_rto_max(skops, passive_estab_in.max_rto_sec);
+		if (err)
+			RET_CG_ERR(err);
+	}
+
 	return CG_OK;
 }
 
diff --git a/tools/testing/selftests/bpf/test_tcp_hdr_options.h b/tools/testing/selftests/bpf/test_tcp_hdr_options.h
index 56c9f8a3ad3d..c91fad861f84 100644
--- a/tools/testing/selftests/bpf/test_tcp_hdr_options.h
+++ b/tools/testing/selftests/bpf/test_tcp_hdr_options.h
@@ -8,18 +8,21 @@ struct bpf_test_option {
 	__u8 flags;
 	__u8 max_delack_ms;
 	__u8 rand;
+	__u8 max_rto_sec;
 } __attribute__((packed));
 
 enum {
 	OPTION_RESEND,
 	OPTION_MAX_DELACK_MS,
 	OPTION_RAND,
+	OPTION_MAX_RTO_SEC,
 	__NR_OPTION_FLAGS,
 };
 
 #define OPTION_F_RESEND		(1 << OPTION_RESEND)
 #define OPTION_F_MAX_DELACK_MS	(1 << OPTION_MAX_DELACK_MS)
 #define OPTION_F_RAND		(1 << OPTION_RAND)
+#define OPTION_F_MAX_RTO_SEC	(1 << OPTION_MAX_RTO_SEC)
 #define OPTION_MASK		((1 << __NR_OPTION_FLAGS) - 1)
 
 #define TEST_OPTION_FLAGS(flags, option) (1 & ((flags) >> (option)))
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 1/3] tcp: add TCP_RTO_MAX_MIN_SEC definition
  2025-02-13  0:43 ` [PATCH net-next 1/3] tcp: add TCP_RTO_MAX_MIN_SEC definition Jason Xing
@ 2025-02-13  2:19   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-13  2:19 UTC (permalink / raw)
  To: kerneljasonxing
  Cc: andrii, ast, bpf, daniel, davem, dsahern, eddyz87, edumazet,
	haoluo, horms, john.fastabend, jolsa, kpsingh, kuba, kuniyu,
	martin.lau, ncardwell, netdev, pabeni, sdf, song, yonghong.song

From: Jason Xing <kerneljasonxing@gmail.com>
Date: Thu, 13 Feb 2025 08:43:52 +0800
> Add minimum value definition as the lower bound of RTO MAX
> set by users. In the next patch, bpf_sol_tcp_setsockopt()
> will use this in the test statement.
> 
> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-13  0:43 ` [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt Jason Xing
@ 2025-02-13  2:25   ` Kuniyuki Iwashima
  2025-02-13  2:32     ` Kuniyuki Iwashima
  2025-02-13 23:41   ` Stanislav Fomichev
  1 sibling, 1 reply; 24+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-13  2:25 UTC (permalink / raw)
  To: kerneljasonxing
  Cc: andrii, ast, bpf, daniel, davem, dsahern, eddyz87, edumazet,
	haoluo, horms, john.fastabend, jolsa, kpsingh, kuba, kuniyu,
	martin.lau, ncardwell, netdev, pabeni, sdf, song, yonghong.song

From: Jason Xing <kerneljasonxing@gmail.com>
Date: Thu, 13 Feb 2025 08:43:53 +0800
> Support bpf_setsockopt() to set the maximum value of RTO for
> BPF program.
> 
> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> ---
>  Documentation/networking/ip-sysctl.rst | 3 ++-
>  include/uapi/linux/bpf.h               | 2 ++
>  net/core/filter.c                      | 6 ++++++
>  tools/include/uapi/linux/bpf.h         | 2 ++
>  4 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> index 054561f8dcae..78eb0959438a 100644
> --- a/Documentation/networking/ip-sysctl.rst
> +++ b/Documentation/networking/ip-sysctl.rst
> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
>  
>  tcp_rto_max_ms - INTEGER
>  	Maximal TCP retransmission timeout (in ms).
> -	Note that TCP_RTO_MAX_MS socket option has higher precedence.
> +	Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
> +	higher precedence for configuring this setting.
>  
>  	When changing tcp_rto_max_ms, it is important to understand
>  	that tcp_retries2 might need a change.
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 2acf9b336371..8ab6ef144217 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2868,6 +2868,7 @@ union bpf_attr {
>   * 		  **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
>   * 		  **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
>   *		  **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
> + *		  **TCP_BPF_RTO_MAX**
>   * 		* **IPPROTO_IP**, which supports *optname* **IP_TOS**.
>   * 		* **IPPROTO_IPV6**, which supports the following *optname*\ s:
>   * 		  **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
> @@ -7091,6 +7092,7 @@ enum {
>  	TCP_BPF_SYN_IP		= 1006, /* Copy the IP[46] and TCP header */
>  	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
>  	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Get or Set TCP sock ops flags */
> +	TCP_BPF_RTO_MAX		= 1009, /* Max delay ack in msecs */
>  };
>  
>  enum {
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 2ec162dd83c4..a21a147e0a86 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5303,6 +5303,12 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
>  			return -EINVAL;
>  		tp->bpf_sock_ops_cb_flags = val;
>  		break;
> +	case TCP_BPF_RTO_MAX:
> +		if (val > TCP_RTO_MAX_SEC * MSEC_PER_SEC ||
> +		    val < TCP_RTO_MAX_MIN_SEC * MSEC_PER_SEC)
> +			return -EINVAL;
> +		inet_csk(sk)->icsk_rto_max = msecs_to_jiffies(val);
> +		break;
>  	default:
>  		return -EINVAL;
>  	}

You need not define TCP_BPF_RTO_MAX because TCP_RTO_MAX is not a
BPF specific option, and you can just reuse do_tcp_setsockopt(),
then bpf_getsockopt() also works.

---8<---
diff --git a/net/core/filter.c b/net/core/filter.c
index 2ec162dd83c4..77732f10097c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5382,6 +5382,7 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
 	case TCP_USER_TIMEOUT:
 	case TCP_NOTSENT_LOWAT:
 	case TCP_SAVE_SYN:
+	case TCP_RTO_MAX:
 		if (*optlen != sizeof(int))
 			return -EINVAL;
 		break;
---8<---

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 3/3] selftests/bpf: add rto max for bpf_setsockopt test
  2025-02-13  0:43 ` [PATCH net-next 3/3] selftests/bpf: add rto max for bpf_setsockopt test Jason Xing
@ 2025-02-13  2:30   ` Kuniyuki Iwashima
  2025-02-13  3:13     ` Jason Xing
  0 siblings, 1 reply; 24+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-13  2:30 UTC (permalink / raw)
  To: kerneljasonxing
  Cc: andrii, ast, bpf, daniel, davem, dsahern, eddyz87, edumazet,
	haoluo, horms, john.fastabend, jolsa, kpsingh, kuba, kuniyu,
	martin.lau, ncardwell, netdev, pabeni, sdf, song, yonghong.song

From: Jason Xing <kerneljasonxing@gmail.com>
Date: Thu, 13 Feb 2025 08:43:54 +0800
> Add TCP_BPF_RTO_MAX selftests for active and passive flows
> in the BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB and
> BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB bpf callbacks.
> 
> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> ---
>  .../bpf/prog_tests/tcp_hdr_options.c          | 28 +++++++++++++------
>  .../bpf/progs/test_tcp_hdr_options.c          | 26 +++++++++++++++++
>  .../selftests/bpf/test_tcp_hdr_options.h      |  3 ++

It would be nice to update sol_tcp_testsp[] with TCP_RTO_MAX_MS
in tools/testing/selftests/bpf/progs/setget_sockopt.c.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-13  2:25   ` Kuniyuki Iwashima
@ 2025-02-13  2:32     ` Kuniyuki Iwashima
  2025-02-13  3:14       ` Jason Xing
  0 siblings, 1 reply; 24+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-13  2:32 UTC (permalink / raw)
  To: kuniyu
  Cc: andrii, ast, bpf, daniel, davem, dsahern, eddyz87, edumazet,
	haoluo, horms, john.fastabend, jolsa, kerneljasonxing, kpsingh,
	kuba, martin.lau, ncardwell, netdev, pabeni, sdf, song,
	yonghong.song

From: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Thu, 13 Feb 2025 11:25:01 +0900
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 2ec162dd83c4..a21a147e0a86 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -5303,6 +5303,12 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
> >  			return -EINVAL;
> >  		tp->bpf_sock_ops_cb_flags = val;
> >  		break;
> > +	case TCP_BPF_RTO_MAX:
> > +		if (val > TCP_RTO_MAX_SEC * MSEC_PER_SEC ||
> > +		    val < TCP_RTO_MAX_MIN_SEC * MSEC_PER_SEC)
> > +			return -EINVAL;
> > +		inet_csk(sk)->icsk_rto_max = msecs_to_jiffies(val);
> > +		break;
> >  	default:
> >  		return -EINVAL;
> >  	}
> 
> You need not define TCP_BPF_RTO_MAX because TCP_RTO_MAX is not a
> BPF specific option, and you can just reuse do_tcp_setsockopt(),
> then bpf_getsockopt() also works.
> 
> ---8<---
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 2ec162dd83c4..77732f10097c 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5382,6 +5382,7 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
>  	case TCP_USER_TIMEOUT:
>  	case TCP_NOTSENT_LOWAT:
>  	case TCP_SAVE_SYN:
> +	case TCP_RTO_MAX:

s/TCP_RTO_MAX/TCP_RTO_MAX_MS/ :)


>  		if (*optlen != sizeof(int))
>  			return -EINVAL;
>  		break;
> ---8<---

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 3/3] selftests/bpf: add rto max for bpf_setsockopt test
  2025-02-13  2:30   ` Kuniyuki Iwashima
@ 2025-02-13  3:13     ` Jason Xing
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Xing @ 2025-02-13  3:13 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: andrii, ast, bpf, daniel, davem, dsahern, eddyz87, edumazet,
	haoluo, horms, john.fastabend, jolsa, kpsingh, kuba, martin.lau,
	ncardwell, netdev, pabeni, sdf, song, yonghong.song

On Thu, Feb 13, 2025 at 10:30 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From: Jason Xing <kerneljasonxing@gmail.com>
> Date: Thu, 13 Feb 2025 08:43:54 +0800
> > Add TCP_BPF_RTO_MAX selftests for active and passive flows
> > in the BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB and
> > BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB bpf callbacks.
> >
> > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> > ---
> >  .../bpf/prog_tests/tcp_hdr_options.c          | 28 +++++++++++++------
> >  .../bpf/progs/test_tcp_hdr_options.c          | 26 +++++++++++++++++
> >  .../selftests/bpf/test_tcp_hdr_options.h      |  3 ++
>
> It would be nice to update sol_tcp_testsp[] with TCP_RTO_MAX_MS
> in tools/testing/selftests/bpf/progs/setget_sockopt.c.

Sure, it's not hard to add this test.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-13  2:32     ` Kuniyuki Iwashima
@ 2025-02-13  3:14       ` Jason Xing
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Xing @ 2025-02-13  3:14 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: andrii, ast, bpf, daniel, davem, dsahern, eddyz87, edumazet,
	haoluo, horms, john.fastabend, jolsa, kpsingh, kuba, martin.lau,
	ncardwell, netdev, pabeni, sdf, song, yonghong.song

On Thu, Feb 13, 2025 at 10:32 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@amazon.com>
> Date: Thu, 13 Feb 2025 11:25:01 +0900
> > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > index 2ec162dd83c4..a21a147e0a86 100644
> > > --- a/net/core/filter.c
> > > +++ b/net/core/filter.c
> > > @@ -5303,6 +5303,12 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
> > >                     return -EINVAL;
> > >             tp->bpf_sock_ops_cb_flags = val;
> > >             break;
> > > +   case TCP_BPF_RTO_MAX:
> > > +           if (val > TCP_RTO_MAX_SEC * MSEC_PER_SEC ||
> > > +               val < TCP_RTO_MAX_MIN_SEC * MSEC_PER_SEC)
> > > +                   return -EINVAL;
> > > +           inet_csk(sk)->icsk_rto_max = msecs_to_jiffies(val);
> > > +           break;
> > >     default:
> > >             return -EINVAL;
> > >     }
> >
> > You need not define TCP_BPF_RTO_MAX because TCP_RTO_MAX is not a
> > BPF specific option, and you can just reuse do_tcp_setsockopt(),
> > then bpf_getsockopt() also works.
> >
> > ---8<---
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 2ec162dd83c4..77732f10097c 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -5382,6 +5382,7 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
> >       case TCP_USER_TIMEOUT:
> >       case TCP_NOTSENT_LOWAT:
> >       case TCP_SAVE_SYN:
> > +     case TCP_RTO_MAX:
>
> s/TCP_RTO_MAX/TCP_RTO_MAX_MS/ :)

Thanks for spotting this simpler approach. It works.

Thanks,
Jason

>
>
> >               if (*optlen != sizeof(int))
> >                       return -EINVAL;
> >               break;
> > ---8<---

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-13  0:43 ` [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt Jason Xing
  2025-02-13  2:25   ` Kuniyuki Iwashima
@ 2025-02-13 23:41   ` Stanislav Fomichev
  2025-02-13 23:57     ` Jason Xing
  1 sibling, 1 reply; 24+ messages in thread
From: Stanislav Fomichev @ 2025-02-13 23:41 UTC (permalink / raw)
  To: Jason Xing
  Cc: davem, edumazet, kuba, pabeni, dsahern, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf, netdev

On 02/13, Jason Xing wrote:
> Support bpf_setsockopt() to set the maximum value of RTO for
> BPF program.
> 
> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> ---
>  Documentation/networking/ip-sysctl.rst | 3 ++-
>  include/uapi/linux/bpf.h               | 2 ++
>  net/core/filter.c                      | 6 ++++++
>  tools/include/uapi/linux/bpf.h         | 2 ++
>  4 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> index 054561f8dcae..78eb0959438a 100644
> --- a/Documentation/networking/ip-sysctl.rst
> +++ b/Documentation/networking/ip-sysctl.rst
> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
>  
>  tcp_rto_max_ms - INTEGER
>  	Maximal TCP retransmission timeout (in ms).
> -	Note that TCP_RTO_MAX_MS socket option has higher precedence.
> +	Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
> +	higher precedence for configuring this setting.
 
The cover letter needs more explanation about the motivation. And
the precedence as well.

WRT precedence, can you install setsockopt cgroup program and filter out
calls to TCP_RTO_MAX_MS?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-13 23:41   ` Stanislav Fomichev
@ 2025-02-13 23:57     ` Jason Xing
  2025-02-14  2:14       ` Martin KaFai Lau
  2025-02-14 15:48       ` Stanislav Fomichev
  0 siblings, 2 replies; 24+ messages in thread
From: Jason Xing @ 2025-02-13 23:57 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: davem, edumazet, kuba, pabeni, dsahern, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf, netdev

On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
>
> On 02/13, Jason Xing wrote:
> > Support bpf_setsockopt() to set the maximum value of RTO for
> > BPF program.
> >
> > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> > ---
> >  Documentation/networking/ip-sysctl.rst | 3 ++-
> >  include/uapi/linux/bpf.h               | 2 ++
> >  net/core/filter.c                      | 6 ++++++
> >  tools/include/uapi/linux/bpf.h         | 2 ++
> >  4 files changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> > index 054561f8dcae..78eb0959438a 100644
> > --- a/Documentation/networking/ip-sysctl.rst
> > +++ b/Documentation/networking/ip-sysctl.rst
> > @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
> >
> >  tcp_rto_max_ms - INTEGER
> >       Maximal TCP retransmission timeout (in ms).
> > -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
> > +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
> > +     higher precedence for configuring this setting.
>
> The cover letter needs more explanation about the motivation. And
> the precedence as well.

I am targeting the net-next tree because of recent changes[1] made by
Eric. It probably hasn't merged into the bpf-next tree.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc

>
> WRT precedence, can you install setsockopt cgroup program and filter out
> calls to TCP_RTO_MAX_MS?

Yesterday, as suggested by Kuniyuki, I decided to re-use the same
logic of TCP_RTO_MAX_MS for bpf_setsockopt():
diff --git a/net/core/filter.c b/net/core/filter.c
index 2ec162dd83c4..ffec7b4357f9 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5382,6 +5382,7 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
        case TCP_USER_TIMEOUT:
        case TCP_NOTSENT_LOWAT:
        case TCP_SAVE_SYN:
+       case TCP_RTO_MAX_MS:
                if (*optlen != sizeof(int))
                        return -EINVAL;
                break;

Are you referring to using the previous way (by introducing a new flag
for BPF) because we need to know the explicit precedence between
setsockopt() and bpf_setsockopt() or other reasons? If so, I think
there are more places than setsockopt() to modify.

And, sorry that I don't follow what you meant by saying "install
setsockopt cgroup program" here. Please provide more hints.

Thanks for the review:)

Thanks,
Jason

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-13 23:57     ` Jason Xing
@ 2025-02-14  2:14       ` Martin KaFai Lau
  2025-02-14  3:09         ` Jason Xing
  2025-02-14 15:48       ` Stanislav Fomichev
  1 sibling, 1 reply; 24+ messages in thread
From: Martin KaFai Lau @ 2025-02-14  2:14 UTC (permalink / raw)
  To: Jason Xing
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On 2/13/25 3:57 PM, Jason Xing wrote:
> On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev<stfomichev@gmail.com> wrote:
>> On 02/13, Jason Xing wrote:
>>> Support bpf_setsockopt() to set the maximum value of RTO for
>>> BPF program.
>>>
>>> Signed-off-by: Jason Xing<kerneljasonxing@gmail.com>
>>> ---
>>>   Documentation/networking/ip-sysctl.rst | 3 ++-
>>>   include/uapi/linux/bpf.h               | 2 ++
>>>   net/core/filter.c                      | 6 ++++++
>>>   tools/include/uapi/linux/bpf.h         | 2 ++
>>>   4 files changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
>>> index 054561f8dcae..78eb0959438a 100644
>>> --- a/Documentation/networking/ip-sysctl.rst
>>> +++ b/Documentation/networking/ip-sysctl.rst
>>> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
>>>
>>>   tcp_rto_max_ms - INTEGER
>>>        Maximal TCP retransmission timeout (in ms).
>>> -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
>>> +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
>>> +     higher precedence for configuring this setting.
>> The cover letter needs more explanation about the motivation.

+1

I haven't looked at the patches. The cover letter has no word on the use case. 
Using test_tcp_hdr_options.c as the test is unnecessarily complicated just for 
adding a new optname support. setget_sockopt.c is the right test to reuse.


> I am targeting the net-next tree because of recent changes[1] made by
> Eric. It probably hasn't merged into the bpf-next tree.

There is the bpf-next/net tree. It should have the needed changes.

pw-bot: cr

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-14  2:14       ` Martin KaFai Lau
@ 2025-02-14  3:09         ` Jason Xing
  2025-02-14  5:41           ` Martin KaFai Lau
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Xing @ 2025-02-14  3:09 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On Fri, Feb 14, 2025 at 10:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 2/13/25 3:57 PM, Jason Xing wrote:
> > On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev<stfomichev@gmail.com> wrote:
> >> On 02/13, Jason Xing wrote:
> >>> Support bpf_setsockopt() to set the maximum value of RTO for
> >>> BPF program.
> >>>
> >>> Signed-off-by: Jason Xing<kerneljasonxing@gmail.com>
> >>> ---
> >>>   Documentation/networking/ip-sysctl.rst | 3 ++-
> >>>   include/uapi/linux/bpf.h               | 2 ++
> >>>   net/core/filter.c                      | 6 ++++++
> >>>   tools/include/uapi/linux/bpf.h         | 2 ++
> >>>   4 files changed, 12 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> >>> index 054561f8dcae..78eb0959438a 100644
> >>> --- a/Documentation/networking/ip-sysctl.rst
> >>> +++ b/Documentation/networking/ip-sysctl.rst
> >>> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
> >>>
> >>>   tcp_rto_max_ms - INTEGER
> >>>        Maximal TCP retransmission timeout (in ms).
> >>> -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
> >>> +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
> >>> +     higher precedence for configuring this setting.
> >> The cover letter needs more explanation about the motivation.
>
> +1
>
> I haven't looked at the patches. The cover letter has no word on the use case.

I will add and copy some words from Eric's patch series :)

> Using test_tcp_hdr_options.c as the test is unnecessarily complicated just for
> adding a new optname support. setget_sockopt.c is the right test to reuse.

Will use setget_sockopt.c only.

>
>
> > I am targeting the net-next tree because of recent changes[1] made by
> > Eric. It probably hasn't merged into the bpf-next tree.
>
> There is the bpf-next/net tree. It should have the needed changes.

[1] was recently merged in the net-next tree, so the only one branch I
can target is net-next.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc

Am I missing something?

Thanks,
Jason

>
> pw-bot: cr

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-14  3:09         ` Jason Xing
@ 2025-02-14  5:41           ` Martin KaFai Lau
  2025-02-14  6:12             ` Jason Xing
  0 siblings, 1 reply; 24+ messages in thread
From: Martin KaFai Lau @ 2025-02-14  5:41 UTC (permalink / raw)
  To: Jason Xing
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On 2/13/25 7:09 PM, Jason Xing wrote:
> On Fri, Feb 14, 2025 at 10:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 2/13/25 3:57 PM, Jason Xing wrote:
>>> On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev<stfomichev@gmail.com> wrote:
>>>> On 02/13, Jason Xing wrote:
>>>>> Support bpf_setsockopt() to set the maximum value of RTO for
>>>>> BPF program.
>>>>>
>>>>> Signed-off-by: Jason Xing<kerneljasonxing@gmail.com>
>>>>> ---
>>>>>    Documentation/networking/ip-sysctl.rst | 3 ++-
>>>>>    include/uapi/linux/bpf.h               | 2 ++
>>>>>    net/core/filter.c                      | 6 ++++++
>>>>>    tools/include/uapi/linux/bpf.h         | 2 ++
>>>>>    4 files changed, 12 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
>>>>> index 054561f8dcae..78eb0959438a 100644
>>>>> --- a/Documentation/networking/ip-sysctl.rst
>>>>> +++ b/Documentation/networking/ip-sysctl.rst
>>>>> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
>>>>>
>>>>>    tcp_rto_max_ms - INTEGER
>>>>>         Maximal TCP retransmission timeout (in ms).
>>>>> -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
>>>>> +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
>>>>> +     higher precedence for configuring this setting.
>>>> The cover letter needs more explanation about the motivation.
>>
>> +1
>>
>> I haven't looked at the patches. The cover letter has no word on the use case.

The question was your _use case_ in bpf. Not what the TCP_RTO_MAX_MS does. Your 
current use case is to have bpf setting it after reading the tcp header option, 
like the selftest in patch 3?

> 
> I will add and copy some words from Eric's patch series :)


>>> I am targeting the net-next tree because of recent changes[1] made by
>>> Eric. It probably hasn't merged into the bpf-next tree.
>>
>> There is the bpf-next/net tree. It should have the needed changes.
> 
> [1] was recently merged in the net-next tree, so the only one branch I
> can target is net-next.
> 
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc
> 
> Am I missing something?

There is a net branch:
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-14  5:41           ` Martin KaFai Lau
@ 2025-02-14  6:12             ` Jason Xing
  2025-02-14  6:40               ` Martin KaFai Lau
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Xing @ 2025-02-14  6:12 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On Fri, Feb 14, 2025 at 1:41 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 2/13/25 7:09 PM, Jason Xing wrote:
> > On Fri, Feb 14, 2025 at 10:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>
> >> On 2/13/25 3:57 PM, Jason Xing wrote:
> >>> On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev<stfomichev@gmail.com> wrote:
> >>>> On 02/13, Jason Xing wrote:
> >>>>> Support bpf_setsockopt() to set the maximum value of RTO for
> >>>>> BPF program.
> >>>>>
> >>>>> Signed-off-by: Jason Xing<kerneljasonxing@gmail.com>
> >>>>> ---
> >>>>>    Documentation/networking/ip-sysctl.rst | 3 ++-
> >>>>>    include/uapi/linux/bpf.h               | 2 ++
> >>>>>    net/core/filter.c                      | 6 ++++++
> >>>>>    tools/include/uapi/linux/bpf.h         | 2 ++
> >>>>>    4 files changed, 12 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> >>>>> index 054561f8dcae..78eb0959438a 100644
> >>>>> --- a/Documentation/networking/ip-sysctl.rst
> >>>>> +++ b/Documentation/networking/ip-sysctl.rst
> >>>>> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
> >>>>>
> >>>>>    tcp_rto_max_ms - INTEGER
> >>>>>         Maximal TCP retransmission timeout (in ms).
> >>>>> -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
> >>>>> +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
> >>>>> +     higher precedence for configuring this setting.
> >>>> The cover letter needs more explanation about the motivation.
> >>
> >> +1
> >>
> >> I haven't looked at the patches. The cover letter has no word on the use case.
>
> The question was your _use case_ in bpf. Not what the TCP_RTO_MAX_MS does. Your
> current use case is to have bpf setting it after reading the tcp header option,
> like the selftest in patch 3?

Oops, I misunderstood the real situation of the tcp header option
test. My intention is to bpf_setsockopt() just like setget_sockopt
does.

Thanks for reminding me. I will totally remove the header test in the
next version.

>
> >
> > I will add and copy some words from Eric's patch series :)
>
>
> >>> I am targeting the net-next tree because of recent changes[1] made by
> >>> Eric. It probably hasn't merged into the bpf-next tree.
> >>
> >> There is the bpf-next/net tree. It should have the needed changes.
> >
> > [1] was recently merged in the net-next tree, so the only one branch I
> > can target is net-next.
> >
> > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc
> >
> > Am I missing something?
>
> There is a net branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

But this branch hasn't included the rto max feature. I was trying to
say that what I wrote is based on the rto max feature which only
exists in the net-next tree for now.

I wonder whether I need to introduce a new flag like this patch or
reuse the TCP_RTO_MAX_MS socket option for bpf?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-14  6:12             ` Jason Xing
@ 2025-02-14  6:40               ` Martin KaFai Lau
  2025-02-14  6:56                 ` Jason Xing
  0 siblings, 1 reply; 24+ messages in thread
From: Martin KaFai Lau @ 2025-02-14  6:40 UTC (permalink / raw)
  To: Jason Xing
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On 2/13/25 10:12 PM, Jason Xing wrote:
> On Fri, Feb 14, 2025 at 1:41 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 2/13/25 7:09 PM, Jason Xing wrote:
>>> On Fri, Feb 14, 2025 at 10:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>>
>>>> On 2/13/25 3:57 PM, Jason Xing wrote:
>>>>> On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev<stfomichev@gmail.com> wrote:
>>>>>> On 02/13, Jason Xing wrote:
>>>>>>> Support bpf_setsockopt() to set the maximum value of RTO for
>>>>>>> BPF program.
>>>>>>>
>>>>>>> Signed-off-by: Jason Xing<kerneljasonxing@gmail.com>
>>>>>>> ---
>>>>>>>     Documentation/networking/ip-sysctl.rst | 3 ++-
>>>>>>>     include/uapi/linux/bpf.h               | 2 ++
>>>>>>>     net/core/filter.c                      | 6 ++++++
>>>>>>>     tools/include/uapi/linux/bpf.h         | 2 ++
>>>>>>>     4 files changed, 12 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
>>>>>>> index 054561f8dcae..78eb0959438a 100644
>>>>>>> --- a/Documentation/networking/ip-sysctl.rst
>>>>>>> +++ b/Documentation/networking/ip-sysctl.rst
>>>>>>> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
>>>>>>>
>>>>>>>     tcp_rto_max_ms - INTEGER
>>>>>>>          Maximal TCP retransmission timeout (in ms).
>>>>>>> -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
>>>>>>> +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
>>>>>>> +     higher precedence for configuring this setting.
>>>>>> The cover letter needs more explanation about the motivation.
>>>>
>>>> +1
>>>>
>>>> I haven't looked at the patches. The cover letter has no word on the use case.
>>
>> The question was your _use case_ in bpf. Not what the TCP_RTO_MAX_MS does. Your
>> current use case is to have bpf setting it after reading the tcp header option,
>> like the selftest in patch 3?
> 
> Oops, I misunderstood the real situation of the tcp header option
> test. My intention is to bpf_setsockopt() just like setget_sockopt
> does.
> 
> Thanks for reminding me. I will totally remove the header test in the
> next version.

If your use case was in the header, it is ok although it won't be the first 
useful place I have in my mind. Regardless, it is useful to say a few words 
where you are planning to set it in the bpf. During a cb in sockops or during 
socket create ...etc. Without it, we can only guess from the selftest :(

> 
>>
>>>
>>> I will add and copy some words from Eric's patch series :)
>>
>>
>>>>> I am targeting the net-next tree because of recent changes[1] made by
>>>>> Eric. It probably hasn't merged into the bpf-next tree.
>>>>
>>>> There is the bpf-next/net tree. It should have the needed changes.
>>>
>>> [1] was recently merged in the net-next tree, so the only one branch I
>>> can target is net-next.
>>>
>>> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc
>>>
>>> Am I missing something?
>>
>> There is a net branch:
               ^^^

>> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
> 
> But this branch hasn't included the rto max feature. I was trying to

Which branch? I was talking about the **net** branch. Not the master branch. Try 
to pull again if your local copy does not have it. The net branch should have 
the TCP_RTO_MAX_MS patches.

> say that what I wrote is based on the rto max feature which only
> exists in the net-next tree for now.
> 



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-14  6:40               ` Martin KaFai Lau
@ 2025-02-14  6:56                 ` Jason Xing
  2025-02-14 23:44                   ` Martin KaFai Lau
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Xing @ 2025-02-14  6:56 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On Fri, Feb 14, 2025 at 2:40 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 2/13/25 10:12 PM, Jason Xing wrote:
> > On Fri, Feb 14, 2025 at 1:41 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>
> >> On 2/13/25 7:09 PM, Jason Xing wrote:
> >>> On Fri, Feb 14, 2025 at 10:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>>>
> >>>> On 2/13/25 3:57 PM, Jason Xing wrote:
> >>>>> On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev<stfomichev@gmail.com> wrote:
> >>>>>> On 02/13, Jason Xing wrote:
> >>>>>>> Support bpf_setsockopt() to set the maximum value of RTO for
> >>>>>>> BPF program.
> >>>>>>>
> >>>>>>> Signed-off-by: Jason Xing<kerneljasonxing@gmail.com>
> >>>>>>> ---
> >>>>>>>     Documentation/networking/ip-sysctl.rst | 3 ++-
> >>>>>>>     include/uapi/linux/bpf.h               | 2 ++
> >>>>>>>     net/core/filter.c                      | 6 ++++++
> >>>>>>>     tools/include/uapi/linux/bpf.h         | 2 ++
> >>>>>>>     4 files changed, 12 insertions(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> >>>>>>> index 054561f8dcae..78eb0959438a 100644
> >>>>>>> --- a/Documentation/networking/ip-sysctl.rst
> >>>>>>> +++ b/Documentation/networking/ip-sysctl.rst
> >>>>>>> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
> >>>>>>>
> >>>>>>>     tcp_rto_max_ms - INTEGER
> >>>>>>>          Maximal TCP retransmission timeout (in ms).
> >>>>>>> -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
> >>>>>>> +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
> >>>>>>> +     higher precedence for configuring this setting.
> >>>>>> The cover letter needs more explanation about the motivation.
> >>>>
> >>>> +1
> >>>>
> >>>> I haven't looked at the patches. The cover letter has no word on the use case.
> >>
> >> The question was your _use case_ in bpf. Not what the TCP_RTO_MAX_MS does. Your
> >> current use case is to have bpf setting it after reading the tcp header option,
> >> like the selftest in patch 3?
> >
> > Oops, I misunderstood the real situation of the tcp header option
> > test. My intention is to bpf_setsockopt() just like setget_sockopt
> > does.
> >
> > Thanks for reminding me. I will totally remove the header test in the
> > next version.
>
> If your use case was in the header, it is ok although it won't be the first

I was planning to add a simple test to only see if the rto max for bpf
feature works, so I found the rto min selftests and then did a similar
one.

> useful place I have in my mind. Regardless, it is useful to say a few words
> where you are planning to set it in the bpf. During a cb in sockops or during
> socket create ...etc. Without it, we can only guess from the selftest :(

I see your point. After evaluating and comparing those two tests, I
think the setsock_opt is a better place to go. Do we even apply the
use of rto min to setsock_opt as well?

What do you think?

>
> >
> >>
> >>>
> >>> I will add and copy some words from Eric's patch series :)
> >>
> >>
> >>>>> I am targeting the net-next tree because of recent changes[1] made by
> >>>>> Eric. It probably hasn't merged into the bpf-next tree.
> >>>>
> >>>> There is the bpf-next/net tree. It should have the needed changes.
> >>>
> >>> [1] was recently merged in the net-next tree, so the only one branch I
> >>> can target is net-next.
> >>>
> >>> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc
> >>>
> >>> Am I missing something?
> >>
> >> There is a net branch:
>                ^^^
>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
> >
> > But this branch hasn't included the rto max feature. I was trying to
>
> Which branch? I was talking about the **net** branch. Not the master branch. Try
> to pull again if your local copy does not have it. The net branch should have
> the TCP_RTO_MAX_MS patches.

Oh, I always use the master branch, never heard of net branch. You're
right, I checked out the net branch and then found it. Thanks.

One more thing I have to ask in advance is that in this case what the
title looks like? [patch bpf] or [patch bpf net]?

Thanks,
Jason

>
> > say that what I wrote is based on the rto max feature which only
> > exists in the net-next tree for now.
> >
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-13 23:57     ` Jason Xing
  2025-02-14  2:14       ` Martin KaFai Lau
@ 2025-02-14 15:48       ` Stanislav Fomichev
  2025-02-14 23:21         ` Jason Xing
  1 sibling, 1 reply; 24+ messages in thread
From: Stanislav Fomichev @ 2025-02-14 15:48 UTC (permalink / raw)
  To: Jason Xing
  Cc: davem, edumazet, kuba, pabeni, dsahern, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf, netdev

On 02/14, Jason Xing wrote:
> On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
> >
> > On 02/13, Jason Xing wrote:
> > > Support bpf_setsockopt() to set the maximum value of RTO for
> > > BPF program.
> > >
> > > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> > > ---
> > >  Documentation/networking/ip-sysctl.rst | 3 ++-
> > >  include/uapi/linux/bpf.h               | 2 ++
> > >  net/core/filter.c                      | 6 ++++++
> > >  tools/include/uapi/linux/bpf.h         | 2 ++
> > >  4 files changed, 12 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> > > index 054561f8dcae..78eb0959438a 100644
> > > --- a/Documentation/networking/ip-sysctl.rst
> > > +++ b/Documentation/networking/ip-sysctl.rst
> > > @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
> > >
> > >  tcp_rto_max_ms - INTEGER
> > >       Maximal TCP retransmission timeout (in ms).
> > > -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
> > > +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
> > > +     higher precedence for configuring this setting.
> >
> > The cover letter needs more explanation about the motivation. And
> > the precedence as well.
> 
> I am targeting the net-next tree because of recent changes[1] made by
> Eric. It probably hasn't merged into the bpf-next tree.
> 
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc
> 
> >
> > WRT precedence, can you install setsockopt cgroup program and filter out
> > calls to TCP_RTO_MAX_MS?
> 
> Yesterday, as suggested by Kuniyuki, I decided to re-use the same
> logic of TCP_RTO_MAX_MS for bpf_setsockopt():
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 2ec162dd83c4..ffec7b4357f9 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5382,6 +5382,7 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
>         case TCP_USER_TIMEOUT:
>         case TCP_NOTSENT_LOWAT:
>         case TCP_SAVE_SYN:
> +       case TCP_RTO_MAX_MS:
>                 if (*optlen != sizeof(int))
>                         return -EINVAL;
>                 break;
> 
> Are you referring to using the previous way (by introducing a new flag
> for BPF) because we need to know the explicit precedence between
> setsockopt() and bpf_setsockopt() or other reasons? If so, I think
> there are more places than setsockopt() to modify.
> 
> And, sorry that I don't follow what you meant by saying "install
> setsockopt cgroup program" here. Please provide more hints.

Ah, sorry, I misread it as bpf options taking precedence over tcp ones;
ignore the suggestion about setsockopt cgroup prog.

And yes, reusing the logic of TCP_RTO_MAX_MS looks better!

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-14 15:48       ` Stanislav Fomichev
@ 2025-02-14 23:21         ` Jason Xing
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Xing @ 2025-02-14 23:21 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: davem, edumazet, kuba, pabeni, dsahern, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf, netdev

On Fri, Feb 14, 2025 at 11:48 PM Stanislav Fomichev
<stfomichev@gmail.com> wrote:
>
> On 02/14, Jason Xing wrote:
> > On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
> > >
> > > On 02/13, Jason Xing wrote:
> > > > Support bpf_setsockopt() to set the maximum value of RTO for
> > > > BPF program.
> > > >
> > > > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> > > > ---
> > > >  Documentation/networking/ip-sysctl.rst | 3 ++-
> > > >  include/uapi/linux/bpf.h               | 2 ++
> > > >  net/core/filter.c                      | 6 ++++++
> > > >  tools/include/uapi/linux/bpf.h         | 2 ++
> > > >  4 files changed, 12 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> > > > index 054561f8dcae..78eb0959438a 100644
> > > > --- a/Documentation/networking/ip-sysctl.rst
> > > > +++ b/Documentation/networking/ip-sysctl.rst
> > > > @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
> > > >
> > > >  tcp_rto_max_ms - INTEGER
> > > >       Maximal TCP retransmission timeout (in ms).
> > > > -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
> > > > +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
> > > > +     higher precedence for configuring this setting.
> > >
> > > The cover letter needs more explanation about the motivation. And
> > > the precedence as well.
> >
> > I am targeting the net-next tree because of recent changes[1] made by
> > Eric. It probably hasn't merged into the bpf-next tree.
> >
> > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc
> >
> > >
> > > WRT precedence, can you install setsockopt cgroup program and filter out
> > > calls to TCP_RTO_MAX_MS?
> >
> > Yesterday, as suggested by Kuniyuki, I decided to re-use the same
> > logic of TCP_RTO_MAX_MS for bpf_setsockopt():
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 2ec162dd83c4..ffec7b4357f9 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -5382,6 +5382,7 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
> >         case TCP_USER_TIMEOUT:
> >         case TCP_NOTSENT_LOWAT:
> >         case TCP_SAVE_SYN:
> > +       case TCP_RTO_MAX_MS:
> >                 if (*optlen != sizeof(int))
> >                         return -EINVAL;
> >                 break;
> >
> > Are you referring to using the previous way (by introducing a new flag
> > for BPF) because we need to know the explicit precedence between
> > setsockopt() and bpf_setsockopt() or other reasons? If so, I think
> > there are more places than setsockopt() to modify.
> >
> > And, sorry that I don't follow what you meant by saying "install
> > setsockopt cgroup program" here. Please provide more hints.
>
> Ah, sorry, I misread it as bpf options taking precedence over tcp ones;
> ignore the suggestion about setsockopt cgroup prog.
>
> And yes, reusing the logic of TCP_RTO_MAX_MS looks better!

Okay, then I will send a patch soon. BTW, which tree should this
series go in? Should I use the prefix '[patch bpf-next]' or something
else in the title?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-14  6:56                 ` Jason Xing
@ 2025-02-14 23:44                   ` Martin KaFai Lau
  2025-02-14 23:53                     ` Jason Xing
  0 siblings, 1 reply; 24+ messages in thread
From: Martin KaFai Lau @ 2025-02-14 23:44 UTC (permalink / raw)
  To: Jason Xing
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On 2/13/25 10:56 PM, Jason Xing wrote:
> On Fri, Feb 14, 2025 at 2:40 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 2/13/25 10:12 PM, Jason Xing wrote:
>>> On Fri, Feb 14, 2025 at 1:41 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>>
>>>> On 2/13/25 7:09 PM, Jason Xing wrote:
>>>>> On Fri, Feb 14, 2025 at 10:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>>>>
>>>>>> On 2/13/25 3:57 PM, Jason Xing wrote:
>>>>>>> On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev<stfomichev@gmail.com> wrote:
>>>>>>>> On 02/13, Jason Xing wrote:
>>>>>>>>> Support bpf_setsockopt() to set the maximum value of RTO for
>>>>>>>>> BPF program.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jason Xing<kerneljasonxing@gmail.com>
>>>>>>>>> ---
>>>>>>>>>      Documentation/networking/ip-sysctl.rst | 3 ++-
>>>>>>>>>      include/uapi/linux/bpf.h               | 2 ++
>>>>>>>>>      net/core/filter.c                      | 6 ++++++
>>>>>>>>>      tools/include/uapi/linux/bpf.h         | 2 ++
>>>>>>>>>      4 files changed, 12 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
>>>>>>>>> index 054561f8dcae..78eb0959438a 100644
>>>>>>>>> --- a/Documentation/networking/ip-sysctl.rst
>>>>>>>>> +++ b/Documentation/networking/ip-sysctl.rst
>>>>>>>>> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
>>>>>>>>>
>>>>>>>>>      tcp_rto_max_ms - INTEGER
>>>>>>>>>           Maximal TCP retransmission timeout (in ms).
>>>>>>>>> -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
>>>>>>>>> +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
>>>>>>>>> +     higher precedence for configuring this setting.
>>>>>>>> The cover letter needs more explanation about the motivation.
>>>>>>
>>>>>> +1
>>>>>>
>>>>>> I haven't looked at the patches. The cover letter has no word on the use case.
>>>>
>>>> The question was your _use case_ in bpf. Not what the TCP_RTO_MAX_MS does. Your
>>>> current use case is to have bpf setting it after reading the tcp header option,
>>>> like the selftest in patch 3?
>>>
>>> Oops, I misunderstood the real situation of the tcp header option
>>> test. My intention is to bpf_setsockopt() just like setget_sockopt
>>> does.
>>>
>>> Thanks for reminding me. I will totally remove the header test in the
>>> next version.
>>
>> If your use case was in the header, it is ok although it won't be the first
> 
> I was planning to add a simple test to only see if the rto max for bpf
> feature works, so I found the rto min selftests and then did a similar
> one.
> 
>> useful place I have in my mind. Regardless, it is useful to say a few words
>> where you are planning to set it in the bpf. During a cb in sockops or during
>> socket create ...etc. Without it, we can only guess from the selftest :(
> 
> I see your point. After evaluating and comparing those two tests, I
> think the setsock_opt is a better place to go. Do we even apply the
> use of rto min to setsock_opt as well?
> 
> What do you think?

Adding to sol_tcp_tests[] as Kuniyuki suggested should be the straight forward way.

Please still describe how you are going to use it in bpf in the cover letter.

> 
>>
>>>
>>>>
>>>>>
>>>>> I will add and copy some words from Eric's patch series :)
>>>>
>>>>
>>>>>>> I am targeting the net-next tree because of recent changes[1] made by
>>>>>>> Eric. It probably hasn't merged into the bpf-next tree.
>>>>>>
>>>>>> There is the bpf-next/net tree. It should have the needed changes.
>>>>>
>>>>> [1] was recently merged in the net-next tree, so the only one branch I
>>>>> can target is net-next.
>>>>>
>>>>> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc
>>>>>
>>>>> Am I missing something?
>>>>
>>>> There is a net branch:
>>                 ^^^
>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
>>>
>>> But this branch hasn't included the rto max feature. I was trying to
>>
>> Which branch? I was talking about the **net** branch. Not the master branch. Try
>> to pull again if your local copy does not have it. The net branch should have
>> the TCP_RTO_MAX_MS patches.
> 
> Oh, I always use the master branch, never heard of net branch. You're
> right, I checked out the net branch and then found it. Thanks.
> 
> One more thing I have to ask in advance is that in this case what the
> title looks like? [patch bpf] or [patch bpf net]?

[PATCH bpf-next]


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-14 23:44                   ` Martin KaFai Lau
@ 2025-02-14 23:53                     ` Jason Xing
  2025-02-15  2:39                       ` Martin KaFai Lau
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Xing @ 2025-02-14 23:53 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On Sat, Feb 15, 2025 at 7:45 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 2/13/25 10:56 PM, Jason Xing wrote:
> > On Fri, Feb 14, 2025 at 2:40 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>
> >> On 2/13/25 10:12 PM, Jason Xing wrote:
> >>> On Fri, Feb 14, 2025 at 1:41 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>>>
> >>>> On 2/13/25 7:09 PM, Jason Xing wrote:
> >>>>> On Fri, Feb 14, 2025 at 10:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>>>>>
> >>>>>> On 2/13/25 3:57 PM, Jason Xing wrote:
> >>>>>>> On Fri, Feb 14, 2025 at 7:41 AM Stanislav Fomichev<stfomichev@gmail.com> wrote:
> >>>>>>>> On 02/13, Jason Xing wrote:
> >>>>>>>>> Support bpf_setsockopt() to set the maximum value of RTO for
> >>>>>>>>> BPF program.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Jason Xing<kerneljasonxing@gmail.com>
> >>>>>>>>> ---
> >>>>>>>>>      Documentation/networking/ip-sysctl.rst | 3 ++-
> >>>>>>>>>      include/uapi/linux/bpf.h               | 2 ++
> >>>>>>>>>      net/core/filter.c                      | 6 ++++++
> >>>>>>>>>      tools/include/uapi/linux/bpf.h         | 2 ++
> >>>>>>>>>      4 files changed, 12 insertions(+), 1 deletion(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> >>>>>>>>> index 054561f8dcae..78eb0959438a 100644
> >>>>>>>>> --- a/Documentation/networking/ip-sysctl.rst
> >>>>>>>>> +++ b/Documentation/networking/ip-sysctl.rst
> >>>>>>>>> @@ -1241,7 +1241,8 @@ tcp_rto_min_us - INTEGER
> >>>>>>>>>
> >>>>>>>>>      tcp_rto_max_ms - INTEGER
> >>>>>>>>>           Maximal TCP retransmission timeout (in ms).
> >>>>>>>>> -     Note that TCP_RTO_MAX_MS socket option has higher precedence.
> >>>>>>>>> +     Note that TCP_BPF_RTO_MAX and TCP_RTO_MAX_MS socket option have the
> >>>>>>>>> +     higher precedence for configuring this setting.
> >>>>>>>> The cover letter needs more explanation about the motivation.
> >>>>>>
> >>>>>> +1
> >>>>>>
> >>>>>> I haven't looked at the patches. The cover letter has no word on the use case.
> >>>>
> >>>> The question was your _use case_ in bpf. Not what the TCP_RTO_MAX_MS does. Your
> >>>> current use case is to have bpf setting it after reading the tcp header option,
> >>>> like the selftest in patch 3?
> >>>
> >>> Oops, I misunderstood the real situation of the tcp header option
> >>> test. My intention is to bpf_setsockopt() just like setget_sockopt
> >>> does.
> >>>
> >>> Thanks for reminding me. I will totally remove the header test in the
> >>> next version.
> >>
> >> If your use case was in the header, it is ok although it won't be the first
> >
> > I was planning to add a simple test to only see if the rto max for bpf
> > feature works, so I found the rto min selftests and then did a similar
> > one.
> >
> >> useful place I have in my mind. Regardless, it is useful to say a few words
> >> where you are planning to set it in the bpf. During a cb in sockops or during
> >> socket create ...etc. Without it, we can only guess from the selftest :(
> >
> > I see your point. After evaluating and comparing those two tests, I
> > think the setsock_opt is a better place to go. Do we even apply the
> > use of rto min to setsock_opt as well?
> >
> > What do you think?
>
> Adding to sol_tcp_tests[] as Kuniyuki suggested should be the straight forward way.
>
> Please still describe how you are going to use it in bpf in the cover letter.

Sure, I will.

Another related topic about rto min test, do you think it's necessary
to add TCP_BPF_RTO_MIN into the setget_sockopt test?

>
> >
> >>
> >>>
> >>>>
> >>>>>
> >>>>> I will add and copy some words from Eric's patch series :)
> >>>>
> >>>>
> >>>>>>> I am targeting the net-next tree because of recent changes[1] made by
> >>>>>>> Eric. It probably hasn't merged into the bpf-next tree.
> >>>>>>
> >>>>>> There is the bpf-next/net tree. It should have the needed changes.
> >>>>>
> >>>>> [1] was recently merged in the net-next tree, so the only one branch I
> >>>>> can target is net-next.
> >>>>>
> >>>>> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ae9b3c0e79bc
> >>>>>
> >>>>> Am I missing something?
> >>>>
> >>>> There is a net branch:
> >>                 ^^^
> >>
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
> >>>
> >>> But this branch hasn't included the rto max feature. I was trying to
> >>
> >> Which branch? I was talking about the **net** branch. Not the master branch. Try
> >> to pull again if your local copy does not have it. The net branch should have
> >> the TCP_RTO_MAX_MS patches.
> >
> > Oh, I always use the master branch, never heard of net branch. You're
> > right, I checked out the net branch and then found it. Thanks.
> >
> > One more thing I have to ask in advance is that in this case what the
> > title looks like? [patch bpf] or [patch bpf net]?
>
> [PATCH bpf-next]

Thanks. Will do it.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-14 23:53                     ` Jason Xing
@ 2025-02-15  2:39                       ` Martin KaFai Lau
  2025-02-15  2:52                         ` Jason Xing
  0 siblings, 1 reply; 24+ messages in thread
From: Martin KaFai Lau @ 2025-02-15  2:39 UTC (permalink / raw)
  To: Jason Xing
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On 2/14/25 3:53 PM, Jason Xing wrote:
> Another related topic about rto min test, do you think it's necessary
> to add TCP_BPF_RTO_MIN into the setget_sockopt test?

hmm... not sure why it is related to the existing TCP_BPF_RTO_MIN.
I thought this patch is adding the new TCP_RTO_MAX_MS...

or you want to say, while adding a TCP_RTO_MAX_MS test, add a test for the 
existing TCP_BPF_RTO_MIN also because it is missing in the setget_sockopt?
iirc, I added setget_sockopt.c to test a patch that reuses the kernel 
do_*_{set,get}sockopt. Thus, it assumes the optname supports both set and get. 
TCP_BPF_RTO_MIN does not support get, so I suspect setget_sockopt will not be a 
good fit. They are unrelated, so I would leave it out of your patch for now.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt
  2025-02-15  2:39                       ` Martin KaFai Lau
@ 2025-02-15  2:52                         ` Jason Xing
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Xing @ 2025-02-15  2:52 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Stanislav Fomichev, davem, edumazet, kuba, pabeni, dsahern, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, horms, ncardwell, kuniyu, bpf,
	netdev

On Sat, Feb 15, 2025 at 10:39 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 2/14/25 3:53 PM, Jason Xing wrote:
> > Another related topic about rto min test, do you think it's necessary
> > to add TCP_BPF_RTO_MIN into the setget_sockopt test?
>
> hmm... not sure why it is related to the existing TCP_BPF_RTO_MIN.
> I thought this patch is adding the new TCP_RTO_MAX_MS...
>
> or you want to say, while adding a TCP_RTO_MAX_MS test, add a test for the
> existing TCP_BPF_RTO_MIN also because it is missing in the setget_sockopt?
> iirc, I added setget_sockopt.c to test a patch that reuses the kernel
> do_*_{set,get}sockopt. Thus, it assumes the optname supports both set and get.
> TCP_BPF_RTO_MIN does not support get, so I suspect setget_sockopt will not be a
> good fit. They are unrelated, so I would leave it out of your patch for now.

From my perspective, rto min or max is quite similar and only related
to the time limitation of RTO, so I assume they can have the same
usage... Right, what you said is exactly what I would like to know. As
you said, handling rto min will not be included in this series.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-02-15  2:52 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-13  0:43 [PATCH net-next 0/3] bpf: support setting max RTO for bpf_setsockopt Jason Xing
2025-02-13  0:43 ` [PATCH net-next 1/3] tcp: add TCP_RTO_MAX_MIN_SEC definition Jason Xing
2025-02-13  2:19   ` Kuniyuki Iwashima
2025-02-13  0:43 ` [PATCH net-next 2/3] bpf: add TCP_BPF_RTO_MAX for bpf_setsockopt Jason Xing
2025-02-13  2:25   ` Kuniyuki Iwashima
2025-02-13  2:32     ` Kuniyuki Iwashima
2025-02-13  3:14       ` Jason Xing
2025-02-13 23:41   ` Stanislav Fomichev
2025-02-13 23:57     ` Jason Xing
2025-02-14  2:14       ` Martin KaFai Lau
2025-02-14  3:09         ` Jason Xing
2025-02-14  5:41           ` Martin KaFai Lau
2025-02-14  6:12             ` Jason Xing
2025-02-14  6:40               ` Martin KaFai Lau
2025-02-14  6:56                 ` Jason Xing
2025-02-14 23:44                   ` Martin KaFai Lau
2025-02-14 23:53                     ` Jason Xing
2025-02-15  2:39                       ` Martin KaFai Lau
2025-02-15  2:52                         ` Jason Xing
2025-02-14 15:48       ` Stanislav Fomichev
2025-02-14 23:21         ` Jason Xing
2025-02-13  0:43 ` [PATCH net-next 3/3] selftests/bpf: add rto max for bpf_setsockopt test Jason Xing
2025-02-13  2:30   ` Kuniyuki Iwashima
2025-02-13  3:13     ` Jason Xing

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.