From: Zhang Yi <yi.zhang@huaweicloud.com>
To: Chandan Babu R <chandanbabu@kernel.org>,
Dave Chinner <david@fromorbit.com>,
djwong@kernel.org, Christoph Hellwig <hch@infradead.org>
Cc: brauner@kernel.org, Linux-XFS mailing list <linux-xfs@vger.kernel.org>
Subject: Re: [BUG REPORT] generic/561 fails when testing xfs on next-20240506 kernel
Date: Sat, 11 May 2024 11:11:32 +0800
Message-ID: <6c2c5235-d19e-202c-67cf-2609db932d5a@huaweicloud.com>
In-Reply-To: <87ttj8ircu.fsf@debian-BULLSEYE-live-builder-AMD64>
On 2024/5/8 17:01, Chandan Babu R wrote:
> Hi,
>
> generic/561 fails when testing XFS on a next-20240506 kernel as shown below,
>
> # ./check generic/561
> FSTYP -- xfs (debug)
> PLATFORM -- Linux/x86_64 xfs-crc-rtdev-extsize-28k 6.9.0-rc7-next-20240506+ #1 SMP PREEMPT_DYNAMIC Mon May 6 07:53:46 GMT 2024
> MKFS_OPTIONS -- -f -rrtdev=/dev/loop14 -f -m reflink=0,rmapbt=0, -d rtinherit=1 -r extsize=28k /dev/loop5
> MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 -ortdev=/dev/loop14 /dev/loop5 /media/scratch
>
> generic/561 - output mismatch (see /var/lib/xfstests/results/xfs-crc-rtdev-extsize-28k/6.9.0-rc7-next-20240506+/xfs_crc_rtdev_extsize_28k/generic/561.out.bad)
> --- tests/generic/561.out 2024-05-06 08:18:09.681430366 +0000
> +++ /var/lib/xfstests/results/xfs-crc-rtdev-extsize-28k/6.9.0-rc7-next-20240506+/xfs_crc_rtdev_extsize_28k/generic/561.out.bad 2024-05-08 09:14:24.908010133 +0000
> @@ -1,2 +1,5 @@
> QA output created by 561
> +/media/scratch/dir/p0/d0XXXXXXXXXXXXXXXXXXXXXXX/d486/d4bXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d5bXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d212XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d11XXXXXXXXX/d54/de4/d158/d27f/d895/d1307XXX/d8a4/d832XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/r112fXXXXXXXXXXX: FAILED
> +/media/scratch/dir/p0/d0XXXXXXXXXXXXXXXXXXXXXXX/d486/d4bXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d5bXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d212XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d11XXXXXXXXX/d54/de4/d158/d27f/d13a3XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d13c0XXXXXXXX/d2301X/d222bXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d1240XXXXXXXXXXXXXXXXXXXXXXXX/d722XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d1380XXXXXXXXXXXXXXXX/dc62XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/r10d5: FAILED
> +md5sum: WARNING: 2 computed checksums did NOT match
> Silence is golden
> ...
> (Run 'diff -u /var/lib/xfstests/tests/generic/561.out /var/lib/xfstests/results/xfs-crc-rtdev-extsize-28k/6.9.0-rc7-next-20240506+/xfs_crc_rtdev_extsize_28k/generic/561.out.bad' to see the entire diff)
> Ran: generic/561
> Failures: generic/561
> Failed 1 of 1 tests
>
Sorry about this regression. After debugging and analyzing the code, I noticed
that this problem can only happen on XFS realtime inodes. The real problem
is realtime extent alignment.
Assume we have a file that contains a written extent [A, D), and we truncate
the file to an unaligned offset B in the middle of this written extent.
A              B                  D
+wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
After the truncate, i_size is set to B, but because of sb_rextsize,
xfs_itruncate_extents() truncates the written extent at the aligned offset C,
so the data in [B, C) is not zeroed and becomes stale.
A              B     C
+wwwwwwwwwwwwwwSSSSSS
               ^
               EOF
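To put numbers on B and C, here is a small sketch of the rounding, using the
28k extent size from the reported MKFS_OPTIONS. The truncate target of 10000
bytes and the helper names are made up for illustration, and the sketch
assumes file offset 0 sits on a realtime extent boundary.

```python
# Illustrative arithmetic only; not kernel code.
RT_EXTSIZE = 28 * 1024  # realtime extent size, from "-r extsize=28k"


def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple (fine for non-power-of-two sizes)."""
    return -(-value // multiple) * multiple


def stale_range(new_size: int, rt_extsize: int = RT_EXTSIZE) -> tuple[int, int]:
    """Return (B, C): the range that stays mapped beyond EOF after truncating to B."""
    return new_size, round_up(new_size, rt_extsize)


if __name__ == "__main__":
    b, c = stale_range(10_000)  # hypothetical truncate target B
    print(f"B = {b}, C = {c}, stale bytes = {c - b}")
    # prints: B = 10000, C = 28672, stale bytes = 18672
```

When B is already a multiple of the extent size, B equals C and the range is
empty, which matches the problem showing up only for unaligned truncates.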
Then, if we write [E, F) beyond this written extent, xfs_file_write_checks()->
xfs_zero_range() zeroes [B, C) in the page cache, but since we no longer
increase i_size in iomap_zero_iter(), writeback does not write the zeroed data
to disk. After the write, the data in [B, C) on disk is still stale, so once
the page cache is dropped, this stale data is exposed.
A              B     C    E       F
+wwwwwwwwwwwwwwSSSSSS     wwwwwwww
This problem doesn't occur on normal inodes because a normal inode doesn't
have a written extent beyond EOF. For realtime inodes, I guess it's not
enough to just zero the EOF block (xfs_setattr_size()->xfs_truncate_page());
we should also zero the extra blocks up to the realtime extent size boundary
before updating i_size. Any suggestions?
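The sequence above can be replayed in a byte-granular toy model. This is only
a sketch under loud assumptions: ToyFile and run() are invented names, the
offsets are arbitrary, writeback is assumed to run between the zeroing and the
i_size update and to drop dirty data at or beyond EOF, and the proposed extra
zeroing is modelled as zeroing the on-disk bytes directly.

```python
class ToyFile:
    """Toy model: on-disk bytes, a page cache overlay, and i_size."""

    def __init__(self, size: int):
        self.disk = bytearray(b"w") * size  # written extent [A, D)
        self.cache = {}                     # offset -> cached byte
        self.dirty = set()                  # cached offsets not yet on disk
        self.i_size = size

    def truncate(self, new_size, extent_end, zero_to_extent_end=False):
        """Set i_size to B, but trim the extent only to the aligned offset C."""
        if zero_to_extent_end:              # models the idea floated above
            for off in range(new_size, extent_end):
                self.disk[off] = 0
        del self.disk[extent_end:]
        self.cache = {o: v for o, v in self.cache.items() if o < new_size}
        self.dirty = {o for o in self.dirty if o < new_size}
        self.i_size = new_size

    def zero_range(self, start, end):
        """Zero mapped bytes in the page cache without changing i_size."""
        for off in range(start, min(end, len(self.disk))):
            self.cache[off] = 0
            self.dirty.add(off)

    def write(self, start, data):
        """Buffered write that extends i_size; holes read back as zero."""
        end = start + len(data)
        if end > len(self.disk):
            self.disk.extend(bytes(end - len(self.disk)))
        for i, byte in enumerate(data):
            self.cache[start + i] = byte
            self.dirty.add(start + i)
        self.i_size = max(self.i_size, end)

    def writeback(self):
        """Flush dirty bytes below i_size; dirty bytes beyond EOF are dropped."""
        for off in self.dirty:
            if off < self.i_size:
                self.disk[off] = self.cache[off]
        self.dirty.clear()

    def read(self, start, end):
        return bytes(self.cache.get(o, self.disk[o]) for o in range(start, end))


def run(zero_to_extent_end: bool) -> bytes:
    b, c, d, e, f_end = 14, 20, 33, 25, 33   # arbitrary offsets for B, C, D, E, F
    f = ToyFile(d)
    f.truncate(b, c, zero_to_extent_end)
    f.zero_range(b, e)            # zeroing ahead of the write, i_size is still B
    f.writeback()                 # assumed timing: the zeroes never reach disk
    f.write(e, b"w" * (f_end - e))
    f.writeback()
    f.cache.clear()               # drop the page cache
    return f.read(b, c)


if __name__ == "__main__":
    print(run(False))   # b'wwwwww': stale data in [B, C) is exposed
    print(run(True))    # six zero bytes once [B, C) is zeroed at truncate time
```

In this model a read of [B, C) returns zeroes until the cache is dropped; only
then does the stale on-disk data become visible.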
Thanks,
Yi.
> The following was the fstest configuration used for the test run,
>
> FSTYP=xfs
> TEST_DIR=/media/test
> SCRATCH_MNT=/media/scratch
> TEST_DEV=/dev/loop16
> TEST_LOGDEV=/dev/loop13
> SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12"
> MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0 -lsize=1g'
> TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13"
> MOUNT_OPTIONS='-o usrquota,grpquota,prjquota'
> TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota"
> SCRATCH_LOGDEV=/dev/loop15
> USE_EXTERNAL=yes
> LOGWRITES_DEV=/dev/loop15
>
> Git bisect produced the following as the first bad commit,
>
> commit 943bc0882cebf482422640924062a7daac5a27ba
> Author: Zhang Yi <yi.zhang@huawei.com>
> Date: Wed Mar 20 19:05:45 2024 +0800
>
> iomap: don't increase i_size if it's not a write operation
>
> Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not
> needed, the caller should handle it. Especially, when truncate partial
> block, we should not increase i_size beyond the new EOF here. It doesn't
> affect xfs and gfs2 now because they set the new file size after zero
> out, it doesn't matter that a transient increase in i_size, but it will
> affect ext4 because it set file size before truncate. So move the i_size
> updating logic to iomap_write_iter().
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> Link: https://lore.kernel.org/r/20240320110548.2200662-7-yi.zhang@huaweicloud.com
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
>
> fs/iomap/buffered-io.c | 50 +++++++++++++++++++++++++-------------------------
> 1 file changed, 25 insertions(+), 25 deletions(-)
>
>