From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Ido Schimmel <idosch@nvidia.com>,
	Alexander Zubkov <green@qrator.net>,
	Petr Machata <petrm@nvidia.com>, Simon Horman <horms@kernel.org>,
	Jakub Kicinski <kuba@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.4 072/107] mlxsw: spectrum_acl_tcam: Fix memory leak during rehash
Date: Tue, 30 Apr 2024 12:40:32 +0200
Message-ID: <20240430103046.781038178@linuxfoundation.org>
In-Reply-To: <20240430103044.655968143@linuxfoundation.org>

5.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ido Schimmel <idosch@nvidia.com>

[ Upstream commit 8ca3f7a7b61393804c46f170743c3b839df13977 ]

The rehash delayed work migrates filters from one region to another.
This is done by iterating over all chunks (all the filters with the same
priority) in the region and in each chunk iterating over all the
filters.

If the migration fails, the code tries to migrate the filters back to
the old region. However, the rollback itself can also fail in which case
another migration will be erroneously performed. Besides the fact that
this ping pong is not a very good idea, it also creates a problem.

Each virtual chunk references two chunks: The currently used one
('vchunk->chunk') and a backup ('vchunk->chunk2'). During migration the
first holds the chunk we want to migrate filters to and the second holds
the chunk we are migrating filters from.

The code currently assumes - but does not verify - that the backup chunk
does not exist (NULL) if the currently used chunk does not reference the
target region. This assumption breaks when we are trying to roll back a
rollback, resulting in the backup chunk being overwritten and leaked
[1].

Fix by not rolling back a failed rollback and add a warning to avoid
future cases.
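
For readers who want to see the pointer overwrite in isolation, the
following is a minimal userspace sketch, not the driver code. The names
'struct chunk', 'struct vchunk', 'chunk_create()', 'migrate_start()' and
'simulate_failed_rollback()' are hypothetical simplifications, and the
sketch assumes that when the rollback fails the current chunk still
points at the old region (0) while the backup points at the new
region (1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct chunk {
	int region;
};

struct vchunk {
	struct chunk *chunk;	/* currently used chunk */
	struct chunk *chunk2;	/* backup chunk */
};

static int live_chunks;	/* chunks allocated and not yet freed */
static int warnings;	/* stands in for WARN_ON() hits */

static struct chunk *chunk_create(int region)
{
	struct chunk *c = malloc(sizeof(*c));

	if (!c)
		abort();
	c->region = region;
	live_chunks++;
	return c;
}

static void chunk_destroy(struct chunk *c)
{
	free(c);
	live_chunks--;
}

/* Same shape as the start of a migration: the current chunk becomes
 * the backup and a new chunk is created in the target region.
 */
static void migrate_start(struct vchunk *v, int region)
{
	struct chunk *new_chunk;

	if (v->chunk2)
		warnings++;	/* models WARN_ON(vchunk->chunk2) */

	new_chunk = chunk_create(region);
	v->chunk2 = v->chunk;	/* a previous backup is lost here */
	v->chunk = new_chunk;
}

/* Returns how many chunks are allocated but no longer reachable from
 * the virtual chunk after a failed rollback. The leaked chunk is
 * deliberately never freed, since the leak is the point of the demo.
 */
static int simulate_failed_rollback(bool retry_after_failed_rollback)
{
	struct vchunk v;
	int reachable, leaked;

	live_chunks = 0;
	warnings = 0;

	/* Assumed state when the rollback towards region 0 fails. */
	v.chunk = chunk_create(0);
	v.chunk2 = chunk_create(1);

	/* Old behaviour: migrate towards region 1 once more. */
	if (retry_after_failed_rollback && v.chunk->region != 1)
		migrate_start(&v, 1);

	assert(v.chunk && v.chunk2);
	reachable = 2;
	leaked = live_chunks - reachable;

	chunk_destroy(v.chunk);
	chunk_destroy(v.chunk2);
	return leaked;
}
```

In this model, simulate_failed_rollback(true) reports one unreachable
chunk and one warning, while simulate_failed_rollback(false), which
corresponds to returning the error instead of retrying, reports none.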

[1]
WARNING: CPU: 5 PID: 1063 at lib/parman.c:291 parman_destroy+0x17/0x20
Modules linked in:
CPU: 5 PID: 1063 Comm: kworker/5:11 Tainted: G        W          6.9.0-rc2-custom-00784-gc6a05c468a0b #14
Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:parman_destroy+0x17/0x20
[...]
Call Trace:
 <TASK>
 mlxsw_sp_acl_atcam_region_fini+0x19/0x60
 mlxsw_sp_acl_tcam_region_destroy+0x49/0xf0
 mlxsw_sp_acl_tcam_vregion_rehash_work+0x1f1/0x470
 process_one_work+0x151/0x370
 worker_thread+0x2cb/0x3e0
 kthread+0xd0/0x100
 ret_from_fork+0x34/0x50
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Fixes: 843500518509 ("mlxsw: spectrum_acl: Do rollback as another call to mlxsw_sp_acl_tcam_vchunk_migrate_all()")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/d5edd4f4503934186ae5cfe268503b16345b4e0f.1713797103.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index 5175ed6cdae08..abfa4b44f468d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -1271,6 +1271,8 @@ mlxsw_sp_acl_tcam_vchunk_migrate_start(struct mlxsw_sp *mlxsw_sp,
 {
 	struct mlxsw_sp_acl_tcam_chunk *new_chunk;
 
+	WARN_ON(vchunk->chunk2);
+
 	new_chunk = mlxsw_sp_acl_tcam_chunk_create(mlxsw_sp, vchunk, region);
 	if (IS_ERR(new_chunk))
 		return PTR_ERR(new_chunk);
@@ -1405,6 +1407,8 @@ mlxsw_sp_acl_tcam_vregion_migrate(struct mlxsw_sp *mlxsw_sp,
 	err = mlxsw_sp_acl_tcam_vchunk_migrate_all(mlxsw_sp, vregion,
 						   ctx, credits);
 	if (err) {
+		if (ctx->this_is_rollback)
+			return err;
 		/* In case migration was not successful, we need to swap
 		 * so the original region pointer is assigned again
 		 * to vregion->region.
-- 
2.43.0



