Linux-XFS Archive mirror
 help / color / mirror / Atom feed
* [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll
  2024-05-07  5:13 [PATCHv3 0/1] xfs: soft lockups in unmapping and reflink remapping path Ritesh Harjani (IBM)
@ 2024-05-07  5:13 ` Ritesh Harjani (IBM)
  2024-05-07  6:06   ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-05-07  5:13 UTC (permalink / raw
  To: linux-xfs
  Cc: Dave Chinner, Darrick J . Wong, Christoph Hellwig, Ojaswin Mujoo,
	Ritesh Harjani

An async dio write to a sparse file can generate a lot of extents
and when we unlink this file (using rm), the kernel can be busy in umapping
and freeing those extents as part of transaction processing.

Similarly xfs reflink remapping path also goes through
xfs_defer_finish_noroll() for transaction processing. Since we can busy loop
in this function, so let's add cond_resched() to avoid softlockup messages
like these.

watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:82435]
CPU: 1 PID: 82435 Comm: kworker/1:0 Tainted: G S  L   6.9.0-rc5-0-default #1
Workqueue: xfs-inodegc/sda2 xfs_inodegc_worker
NIP [c000000000beea10] xfs_extent_busy_trim+0x100/0x290
LR [c000000000bee958] xfs_extent_busy_trim+0x48/0x290
Call Trace:
  xfs_alloc_get_rec+0x54/0x1b0 (unreliable)
  xfs_alloc_compute_aligned+0x5c/0x144
  xfs_alloc_ag_vextent_size+0x238/0x8d4
  xfs_alloc_fix_freelist+0x540/0x694
  xfs_free_extent_fix_freelist+0x84/0xe0
  __xfs_free_extent+0x74/0x1ec
  xfs_extent_free_finish_item+0xcc/0x214
  xfs_defer_finish_one+0x194/0x388
  xfs_defer_finish_noroll+0x1b4/0x5c8
  xfs_defer_finish+0x2c/0xc4
  xfs_bunmapi_range+0xa4/0x100
  xfs_itruncate_extents_flags+0x1b8/0x2f4
  xfs_inactive_truncate+0xe0/0x124
  xfs_inactive+0x30c/0x3e0
  xfs_inodegc_worker+0x140/0x234
  process_scheduled_works+0x240/0x57c
  worker_thread+0x198/0x468
  kthread+0x138/0x140
  start_kernel_thread+0x14/0x18

run fstests generic/175 at 2024-02-02 04:40:21
[   C17] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
 watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
 CPU: 17 PID: 7679 Comm: xfs_io Kdump: loaded Tainted: G X 6.4.0
 NIP [c008000005e3ec94] xfs_rmapbt_diff_two_keys+0x54/0xe0 [xfs]
 LR [c008000005e08798] xfs_btree_get_leaf_keys+0x110/0x1e0 [xfs]
 Call Trace:
  0xc000000014107c00 (unreliable)
  __xfs_btree_updkeys+0x8c/0x2c0 [xfs]
  xfs_btree_update_keys+0x150/0x170 [xfs]
  xfs_btree_lshift+0x534/0x660 [xfs]
  xfs_btree_make_block_unfull+0x19c/0x240 [xfs]
  xfs_btree_insrec+0x4e4/0x630 [xfs]
  xfs_btree_insert+0x104/0x2d0 [xfs]
  xfs_rmap_insert+0xc4/0x260 [xfs]
  xfs_rmap_map_shared+0x228/0x630 [xfs]
  xfs_rmap_finish_one+0x2d4/0x350 [xfs]
  xfs_rmap_update_finish_item+0x44/0xc0 [xfs]
  xfs_defer_finish_noroll+0x2e4/0x740 [xfs]
  __xfs_trans_commit+0x1f4/0x400 [xfs]
  xfs_reflink_remap_extent+0x2d8/0x650 [xfs]
  xfs_reflink_remap_blocks+0x154/0x320 [xfs]
  xfs_file_remap_range+0x138/0x3a0 [xfs]
  do_clone_file_range+0x11c/0x2f0
  vfs_clone_file_range+0x60/0x1c0
  ioctl_file_clone+0x78/0x140
  sys_ioctl+0x934/0x1270
  system_call_exception+0x158/0x320
  system_call_vectored_common+0x15c/0x2ec

Cc: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/xfs/libxfs/xfs_defer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index c13276095cc0..cb185b97447d 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -705,6 +705,7 @@ xfs_defer_finish_noroll(
 		error = xfs_defer_finish_one(*tp, dfp);
 		if (error && error != -EAGAIN)
 			goto out_shutdown;
+		cond_resched();
 	}

 	/* Requeue the paused items in the outgoing transaction. */
--
2.44.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll
  2024-05-07  5:13 ` [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll Ritesh Harjani (IBM)
@ 2024-05-07  6:06   ` Dave Chinner
  0 siblings, 0 replies; 3+ messages in thread
From: Dave Chinner @ 2024-05-07  6:06 UTC (permalink / raw
  To: Ritesh Harjani (IBM)
  Cc: linux-xfs, Darrick J . Wong, Christoph Hellwig, Ojaswin Mujoo

On Tue, May 07, 2024 at 10:43:07AM +0530, Ritesh Harjani (IBM) wrote:
> An async dio write to a sparse file can generate a lot of extents
> and when we unlink this file (using rm), the kernel can be busy in umapping
> and freeing those extents as part of transaction processing.
> 
> Similarly xfs reflink remapping path also goes through
> xfs_defer_finish_noroll() for transaction processing. Since we can busy loop
> in this function, so let's add cond_resched() to avoid softlockup messages
> like these.

You proposed this change two weeks ago: I said:

"The problem is the number of extents being processed without
yielding, not the time spent processing each individual deferred
work chain to free the extent. Hence the explicit rescheduling
should be at the top level loop where it can be easily explained and
understand, not hidden deep inside the defer chain mechanism...."

https://lore.kernel.org/linux-xfs/ZiWjbWrD60W%2F0s%2FF@dread.disaster.area/

> watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:82435]
> CPU: 1 PID: 82435 Comm: kworker/1:0 Tainted: G S  L   6.9.0-rc5-0-default #1
> Workqueue: xfs-inodegc/sda2 xfs_inodegc_worker
> NIP [c000000000beea10] xfs_extent_busy_trim+0x100/0x290
> LR [c000000000bee958] xfs_extent_busy_trim+0x48/0x290
> Call Trace:
>   xfs_alloc_get_rec+0x54/0x1b0 (unreliable)
>   xfs_alloc_compute_aligned+0x5c/0x144
>   xfs_alloc_ag_vextent_size+0x238/0x8d4
>   xfs_alloc_fix_freelist+0x540/0x694
>   xfs_free_extent_fix_freelist+0x84/0xe0
>   __xfs_free_extent+0x74/0x1ec
>   xfs_extent_free_finish_item+0xcc/0x214
>   xfs_defer_finish_one+0x194/0x388
>   xfs_defer_finish_noroll+0x1b4/0x5c8
>   xfs_defer_finish+0x2c/0xc4
>   xfs_bunmapi_range+0xa4/0x100
>   xfs_itruncate_extents_flags+0x1b8/0x2f4
>   xfs_inactive_truncate+0xe0/0x124
>   xfs_inactive+0x30c/0x3e0

This is the same issue as you originally reported two weeks ago. My
comments about it still stand.

>   xfs_inodegc_worker+0x140/0x234
>   process_scheduled_works+0x240/0x57c
>   worker_thread+0x198/0x468
>   kthread+0x138/0x140
>   start_kernel_thread+0x14/0x18
> 
> run fstests generic/175 at 2024-02-02 04:40:21
> [   C17] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
>  watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
>  CPU: 17 PID: 7679 Comm: xfs_io Kdump: loaded Tainted: G X 6.4.0
>  NIP [c008000005e3ec94] xfs_rmapbt_diff_two_keys+0x54/0xe0 [xfs]
>  LR [c008000005e08798] xfs_btree_get_leaf_keys+0x110/0x1e0 [xfs]
>  Call Trace:
>   0xc000000014107c00 (unreliable)
>   __xfs_btree_updkeys+0x8c/0x2c0 [xfs]
>   xfs_btree_update_keys+0x150/0x170 [xfs]
>   xfs_btree_lshift+0x534/0x660 [xfs]
>   xfs_btree_make_block_unfull+0x19c/0x240 [xfs]
>   xfs_btree_insrec+0x4e4/0x630 [xfs]
>   xfs_btree_insert+0x104/0x2d0 [xfs]
>   xfs_rmap_insert+0xc4/0x260 [xfs]
>   xfs_rmap_map_shared+0x228/0x630 [xfs]
>   xfs_rmap_finish_one+0x2d4/0x350 [xfs]
>   xfs_rmap_update_finish_item+0x44/0xc0 [xfs]
>   xfs_defer_finish_noroll+0x2e4/0x740 [xfs]
>   __xfs_trans_commit+0x1f4/0x400 [xfs]
>   xfs_reflink_remap_extent+0x2d8/0x650 [xfs]
>   xfs_reflink_remap_blocks+0x154/0x320 [xfs]

And this is the same case of iterating millions of extents without
ever blocking on a lock, a buffer cache miss, transaction
reservation, etc. IOWs, xfs_reflink_remap_blocks() is another high
level extent iterator that needs the cond_resched() calls.

I can't wait for the kernel to always be pre-emptible so this whole
class of nasty hacks can go away forever. But in the mean time, they
still need to be placed appropriately.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll
@ 2024-05-07  7:22 Ritesh Harjani
  0 siblings, 0 replies; 3+ messages in thread
From: Ritesh Harjani @ 2024-05-07  7:22 UTC (permalink / raw
  To: Dave Chinner
  Cc: linux-xfs, Darrick J . Wong, Christoph Hellwig, Ojaswin Mujoo

Dave Chinner <david@fromorbit.com> writes:

> On Tue, May 07, 2024 at 10:43:07AM +0530, Ritesh Harjani (IBM) wrote:
>> An async dio write to a sparse file can generate a lot of extents
>> and when we unlink this file (using rm), the kernel can be busy in umapping
>> and freeing those extents as part of transaction processing.
>> 
>> Similarly xfs reflink remapping path also goes through
>> xfs_defer_finish_noroll() for transaction processing. Since we can busy loop
>> in this function, so let's add cond_resched() to avoid softlockup messages
>> like these.
>
> You proposed this change two weeks ago: I said:
>
> "The problem is the number of extents being processed without
> yielding, not the time spent processing each individual deferred
> work chain to free the extent. Hence the explicit rescheduling
> should be at the top level loop where it can be easily explained and
> understand, not hidden deep inside the defer chain mechanism...."
>
> https://lore.kernel.org/linux-xfs/ZiWjbWrD60W%2F0s%2FF@dread.disaster.area/
>
>> watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:82435]
>> CPU: 1 PID: 82435 Comm: kworker/1:0 Tainted: G S  L   6.9.0-rc5-0-default #1
>> Workqueue: xfs-inodegc/sda2 xfs_inodegc_worker
>> NIP [c000000000beea10] xfs_extent_busy_trim+0x100/0x290
>> LR [c000000000bee958] xfs_extent_busy_trim+0x48/0x290
>> Call Trace:
>>   xfs_alloc_get_rec+0x54/0x1b0 (unreliable)
>>   xfs_alloc_compute_aligned+0x5c/0x144
>>   xfs_alloc_ag_vextent_size+0x238/0x8d4
>>   xfs_alloc_fix_freelist+0x540/0x694
>>   xfs_free_extent_fix_freelist+0x84/0xe0
>>   __xfs_free_extent+0x74/0x1ec
>>   xfs_extent_free_finish_item+0xcc/0x214
>>   xfs_defer_finish_one+0x194/0x388
>>   xfs_defer_finish_noroll+0x1b4/0x5c8
>>   xfs_defer_finish+0x2c/0xc4
>>   xfs_bunmapi_range+0xa4/0x100
>>   xfs_itruncate_extents_flags+0x1b8/0x2f4
>>   xfs_inactive_truncate+0xe0/0x124
>>   xfs_inactive+0x30c/0x3e0
>
> This is the same issue as you originally reported two weeks ago. My
> comments about it still stand.
>
>>   xfs_inodegc_worker+0x140/0x234
>>   process_scheduled_works+0x240/0x57c
>>   worker_thread+0x198/0x468
>>   kthread+0x138/0x140
>>   start_kernel_thread+0x14/0x18
>> 
>> run fstests generic/175 at 2024-02-02 04:40:21
>> [   C17] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
>>  watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
>>  CPU: 17 PID: 7679 Comm: xfs_io Kdump: loaded Tainted: G X 6.4.0
>>  NIP [c008000005e3ec94] xfs_rmapbt_diff_two_keys+0x54/0xe0 [xfs]
>>  LR [c008000005e08798] xfs_btree_get_leaf_keys+0x110/0x1e0 [xfs]
>>  Call Trace:
>>   0xc000000014107c00 (unreliable)
>>   __xfs_btree_updkeys+0x8c/0x2c0 [xfs]
>>   xfs_btree_update_keys+0x150/0x170 [xfs]
>>   xfs_btree_lshift+0x534/0x660 [xfs]
>>   xfs_btree_make_block_unfull+0x19c/0x240 [xfs]
>>   xfs_btree_insrec+0x4e4/0x630 [xfs]
>>   xfs_btree_insert+0x104/0x2d0 [xfs]
>>   xfs_rmap_insert+0xc4/0x260 [xfs]
>>   xfs_rmap_map_shared+0x228/0x630 [xfs]
>>   xfs_rmap_finish_one+0x2d4/0x350 [xfs]
>>   xfs_rmap_update_finish_item+0x44/0xc0 [xfs]
>>   xfs_defer_finish_noroll+0x2e4/0x740 [xfs]
>>   __xfs_trans_commit+0x1f4/0x400 [xfs]
>>   xfs_reflink_remap_extent+0x2d8/0x650 [xfs]
>>   xfs_reflink_remap_blocks+0x154/0x320 [xfs]
>
> And this is the same case of iterating millions of extents without
> ever blocking on a lock, a buffer cache miss, transaction
> reservation, etc. IOWs, xfs_reflink_remap_blocks() is another high
> level extent iterator that needs the cond_resched() calls.

ok. I was thinking if we add this to xfs_defer_finish_noroll(), then we
can just take care of all such cases. But I agree with your point here.

I will test adding cond_resched() to xfs_relink_remap_blocks() loop
and will re-submit this patch.

>
> I can't wait for the kernel to always be pre-emptible so this whole
> class of nasty hacks can go away forever. But in the mean time, they
> still need to be placed appropriately.

Sure Dave. Thanks for the review!

-ritesh

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-05-07  7:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-07  7:22 [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll Ritesh Harjani
  -- strict thread matches above, loose matches on Subject: below --
2024-05-07  5:13 [PATCHv3 0/1] xfs: soft lockups in unmapping and reflink remapping path Ritesh Harjani (IBM)
2024-05-07  5:13 ` [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll Ritesh Harjani (IBM)
2024-05-07  6:06   ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).