From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Zhengchao Shao <shaozhengchao@huawei.com>
Subject: [PATCH 4.19 69/77] tcp: Fix NEW_SYN_RECV handling in inet_twsk_purge()
Date: Tue, 30 Apr 2024 12:39:48 +0200
Message-ID: <20240430103043.176513161@linuxfoundation.org>
In-Reply-To: <20240430103041.111219002@linuxfoundation.org>

4.19-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

commit 1c4e97dd2d3c9a3e84f7e26346aa39bc426d3249 upstream.

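inet_twsk_purge() uses RCU to find TIME_WAIT and NEW_SYN_RECV
objects to purge.

These objects use SLAB_TYPESAFE_BY_RCU semantics and need special
care: their memory can be reused for another object of the same type
while we are still looking at it, so we need to take a reference with
refcount_inc_not_zero(&sk->sk_refcnt).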

Reuse the existing correct logic I wrote for TIME_WAIT,
because both structures have common locations for
sk_state, sk_family, and netns pointer.

If, after the refcount_inc_not_zero(), the object fields no longer match
the keys, use sock_gen_put(sk) to release the refcount.
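
The lookup pattern described above can be sketched in userspace C11.
This is only an illustration under stated assumptions, not kernel code:
struct obj, ref_inc_not_zero(), ref_put() and try_get() are hypothetical
names, and C11 atomics stand in for the kernel's refcount_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a type-stable object; not struct sock. */
struct obj {
	atomic_int refcnt;	/* 0 means the object is dead or being freed */
	int family;		/* lookup key; changes if the slot is reused */
};

/* Userspace analogue of refcount_inc_not_zero(): never revive a dead object. */
static bool ref_inc_not_zero(atomic_int *r)
{
	int old = atomic_load(r);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(r, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; a real implementation would free on the last put. */
static void ref_put(atomic_int *r)
{
	int old = atomic_fetch_sub(r, 1);

	assert(old > 0);
	(void)old;
}

/* Returns true only if a reference is held AND the key still matches. */
static bool try_get(struct obj *o, int family)
{
	if (o->family != family)	/* cheap pre-check, not authoritative */
		return false;
	if (!ref_inc_not_zero(&o->refcnt))
		return false;
	if (o->family != family) {	/* recheck: memory may have been reused */
		ref_put(&o->refcnt);
		return false;
	}
	return true;
}
```

The second key comparison is the important part: only once a reference
is held can the object no longer be recycled, so only then is the
comparison trustworthy.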

Then we can call inet_twsk_deschedule_put() for TIME_WAIT,
inet_csk_reqsk_queue_drop_and_put() for NEW_SYN_RECV sockets,
with BH disabled.

Then we need to restart the loop because we had to drop rcu_read_lock().

Fixes: 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in inet_twsk_purge()")
Link: https://lore.kernel.org/netdev/CANn89iLvFuuihCtt9PME2uS1WJATnf5fKjDToa1WzVnRzHnPfg@mail.gmail.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240308200122.64357-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[shaozhengchao: resolved conflicts in 5.10]
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/inet_timewait_sock.c |   41 +++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 22 deletions(-)

--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -253,12 +253,12 @@ void __inet_twsk_schedule(struct inet_ti
 }
 EXPORT_SYMBOL_GPL(__inet_twsk_schedule);
 
+/* Remove all non full sockets (TIME_WAIT and NEW_SYN_RECV) for dead netns */
 void inet_twsk_purge(struct inet_hashinfo *hashinfo, int family)
 {
-	struct inet_timewait_sock *tw;
-	struct sock *sk;
 	struct hlist_nulls_node *node;
 	unsigned int slot;
+	struct sock *sk;
 
 	for (slot = 0; slot <= hashinfo->ehash_mask; slot++) {
 		struct inet_ehash_bucket *head = &hashinfo->ehash[slot];
@@ -267,38 +267,35 @@ restart_rcu:
 		rcu_read_lock();
 restart:
 		sk_nulls_for_each_rcu(sk, node, &head->chain) {
-			if (sk->sk_state != TCP_TIME_WAIT) {
-				/* A kernel listener socket might not hold refcnt for net,
-				 * so reqsk_timer_handler() could be fired after net is
-				 * freed.  Userspace listener and reqsk never exist here.
-				 */
-				if (unlikely(sk->sk_state == TCP_NEW_SYN_RECV &&
-					     hashinfo->pernet)) {
-					struct request_sock *req = inet_reqsk(sk);
-
-					inet_csk_reqsk_queue_drop_and_put(req->rsk_listener, req);
-				}
+			int state = inet_sk_state_load(sk);
 
+			if ((1 << state) & ~(TCPF_TIME_WAIT |
+					     TCPF_NEW_SYN_RECV))
 				continue;
-			}
 
-			tw = inet_twsk(sk);
-			if ((tw->tw_family != family) ||
-				refcount_read(&twsk_net(tw)->count))
+			if (sk->sk_family != family ||
+			    refcount_read(&sock_net(sk)->count))
 				continue;
 
-			if (unlikely(!refcount_inc_not_zero(&tw->tw_refcnt)))
+			if (unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
 				continue;
 
-			if (unlikely((tw->tw_family != family) ||
-				     refcount_read(&twsk_net(tw)->count))) {
-				inet_twsk_put(tw);
+			if (unlikely(sk->sk_family != family ||
+				     refcount_read(&sock_net(sk)->count))) {
+				sock_gen_put(sk);
 				goto restart;
 			}
 
 			rcu_read_unlock();
 			local_bh_disable();
-			inet_twsk_deschedule_put(tw);
+			if (state == TCP_TIME_WAIT) {
+				inet_twsk_deschedule_put(inet_twsk(sk));
+			} else {
+				struct request_sock *req = inet_reqsk(sk);
+
+				inet_csk_reqsk_queue_drop_and_put(req->rsk_listener,
+								  req);
+			}
 			local_bh_enable();
 			goto restart_rcu;
 		}
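
The state filter in the hunk above accepts a socket only if its state is
one of two values, using a single bitmask test. A minimal standalone
sketch of that idiom follows. The ST_* names, STF() and
is_purge_candidate() are hypothetical; the numeric values are assumed to
mirror the kernel's TCP state enum (TCP_TIME_WAIT = 6,
TCP_NEW_SYN_RECV = 12) and should be checked against
include/net/tcp_states.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's TCP state numbering (illustration only). */
enum {
	ST_ESTABLISHED	= 1,
	ST_TIME_WAIT	= 6,
	ST_LISTEN	= 10,
	ST_NEW_SYN_RECV	= 12,
};

/* One bit per state, like the kernel's TCPF_* flags. */
#define STF(s)	(1u << (s))

/* True when 'state' is one of the two non-full socket states. */
static bool is_purge_candidate(int state)
{
	return (STF(state) & (STF(ST_TIME_WAIT) | STF(ST_NEW_SYN_RECV))) != 0;
}
```

The patch writes the inverse test, (1 << state) & ~(mask), and skips the
entry with continue when it is non-zero; both forms select the same set
of states.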


