From: Dave Chinner <david@fromorbit.com>
To: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: linux-xfs@vger.kernel.org, "Darrick J . Wong" <djwong@kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	Ojaswin Mujoo <ojaswin@linux.ibm.com>
Subject: Re: [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll
Date: Tue, 7 May 2024 16:06:46 +1000
Message-ID: <ZjnE9huCrF61JpTn@dread.disaster.area>
In-Reply-To: <fe5326ac64e9ba3e10e5521fa97061b45632e516.1715057896.git.ritesh.list@gmail.com>

On Tue, May 07, 2024 at 10:43:07AM +0530, Ritesh Harjani (IBM) wrote:
> An async dio write to a sparse file can generate a lot of extents,
> and when we unlink this file (using rm), the kernel can be busy
> unmapping and freeing those extents as part of transaction
> processing.
> 
> Similarly, the xfs reflink remapping path also goes through
> xfs_defer_finish_noroll() for transaction processing. Since we can
> busy-loop in this function, let's add cond_resched() to avoid
> softlockup messages like these.
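
The patch as proposed buries the yield deep inside the deferred-ops
machinery, roughly like this (a sketch of the idea, not the actual
diff):

	/* in xfs_defer_finish_noroll(), for every deferred work item */
	while (!list_empty(&dop_pending) ||
	       !list_empty(&(*tp)->t_dfops)) {
		cond_resched();		/* proposed yield point */
		...
	}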

You proposed this change two weeks ago, and I said:

"The problem is the number of extents being processed without
yielding, not the time spent processing each individual deferred
work chain to free the extent. Hence the explicit rescheduling
should be at the top level loop where it can be easily explained and
understood, not hidden deep inside the defer chain mechanism...."

https://lore.kernel.org/linux-xfs/ZiWjbWrD60W%2F0s%2FF@dread.disaster.area/
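
That is, the cond_resched() belongs in the loops that iterate the
extents, e.g. something like this in xfs_bunmapi_range(), the high
level iterator visible in the unmap trace below (a sketch of the
placement, not a tested patch):

	/* top level unmap loop: one extent unmapped and freed per pass */
	while (unmap_len > 0) {
		/* explicit, easily explained yield point */
		cond_resched();

		error = __xfs_bunmapi(*tpp, ip, startoff, &unmap_len,
				flags, XFS_ITRUNC_MAX_EXTENTS);
		if (error)
			break;

		/* finish the deferred frees for this extent */
		error = xfs_defer_finish(tpp);
		if (error)
			break;
	}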

> watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:82435]
> CPU: 1 PID: 82435 Comm: kworker/1:0 Tainted: G S  L   6.9.0-rc5-0-default #1
> Workqueue: xfs-inodegc/sda2 xfs_inodegc_worker
> NIP [c000000000beea10] xfs_extent_busy_trim+0x100/0x290
> LR [c000000000bee958] xfs_extent_busy_trim+0x48/0x290
> Call Trace:
>   xfs_alloc_get_rec+0x54/0x1b0 (unreliable)
>   xfs_alloc_compute_aligned+0x5c/0x144
>   xfs_alloc_ag_vextent_size+0x238/0x8d4
>   xfs_alloc_fix_freelist+0x540/0x694
>   xfs_free_extent_fix_freelist+0x84/0xe0
>   __xfs_free_extent+0x74/0x1ec
>   xfs_extent_free_finish_item+0xcc/0x214
>   xfs_defer_finish_one+0x194/0x388
>   xfs_defer_finish_noroll+0x1b4/0x5c8
>   xfs_defer_finish+0x2c/0xc4
>   xfs_bunmapi_range+0xa4/0x100
>   xfs_itruncate_extents_flags+0x1b8/0x2f4
>   xfs_inactive_truncate+0xe0/0x124
>   xfs_inactive+0x30c/0x3e0

This is the same issue as you originally reported two weeks ago. My
comments about it still stand.

>   xfs_inodegc_worker+0x140/0x234
>   process_scheduled_works+0x240/0x57c
>   worker_thread+0x198/0x468
>   kthread+0x138/0x140
>   start_kernel_thread+0x14/0x18
> 
> run fstests generic/175 at 2024-02-02 04:40:21
> [   C17] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
>  CPU: 17 PID: 7679 Comm: xfs_io Kdump: loaded Tainted: G X 6.4.0
>  NIP [c008000005e3ec94] xfs_rmapbt_diff_two_keys+0x54/0xe0 [xfs]
>  LR [c008000005e08798] xfs_btree_get_leaf_keys+0x110/0x1e0 [xfs]
>  Call Trace:
>   0xc000000014107c00 (unreliable)
>   __xfs_btree_updkeys+0x8c/0x2c0 [xfs]
>   xfs_btree_update_keys+0x150/0x170 [xfs]
>   xfs_btree_lshift+0x534/0x660 [xfs]
>   xfs_btree_make_block_unfull+0x19c/0x240 [xfs]
>   xfs_btree_insrec+0x4e4/0x630 [xfs]
>   xfs_btree_insert+0x104/0x2d0 [xfs]
>   xfs_rmap_insert+0xc4/0x260 [xfs]
>   xfs_rmap_map_shared+0x228/0x630 [xfs]
>   xfs_rmap_finish_one+0x2d4/0x350 [xfs]
>   xfs_rmap_update_finish_item+0x44/0xc0 [xfs]
>   xfs_defer_finish_noroll+0x2e4/0x740 [xfs]
>   __xfs_trans_commit+0x1f4/0x400 [xfs]
>   xfs_reflink_remap_extent+0x2d8/0x650 [xfs]
>   xfs_reflink_remap_blocks+0x154/0x320 [xfs]

And this is the same case of iterating millions of extents without
ever blocking on a lock, a buffer cache miss, transaction
reservation, etc. IOWs, xfs_reflink_remap_blocks() is another high
level extent iterator that needs the cond_resched() calls.
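
i.e. the yield wants to sit up in that remap loop, something like
this (again a sketch of the placement, not a tested patch):

	/* top level remap loop in xfs_reflink_remap_blocks(): one
	 * extent remapped per pass, each pass committing its own
	 * transaction via xfs_reflink_remap_extent() */
	while (len > 0) {
		cond_resched();
		...
		error = xfs_reflink_remap_extent(dest, &imap, ...);
		if (error)
			break;
		...
	}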

I can't wait for the kernel to always be pre-emptible so this whole
class of nasty hacks can go away forever. But in the meantime, they
still need to be placed appropriately.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
