From: Qu Wenruo <wqu@suse.com>
To: Jared Van Bortel <jared.e.vb@gmail.com>,
	Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: system drive corruption, btrfs check failure
Date: Sun, 19 May 2024 13:04:51 +0930
Message-ID: <cb56ae39-71be-4bbc-b1c2-b9d0ab7241a4@suse.com>
In-Reply-To: <053ca275b81228acf1259047d6d8bac67efc256f.camel@gmail.com>



On 2024/5/19 11:47, Jared Van Bortel wrote:
> On Sat, 2024-03-30 at 10:12 +1030, Qu Wenruo wrote:
>>
>>
[...]
>>
>> Do you have any dmesg of that incident?
> 
> Hi, sorry for the delay. I finally got around to running the lowmem
> check on the old drives.
> 
> Firstly, there was nothing relevant in dmesg. When I first saw your
> reply, I checked the system journal from the time of the incident and
> there was nothing disk-related from the kernel between mount and
> shutdown - medium errors, btrfs I/O errors, or anything like that.

I believe the fs flipped read-only (RO) due to the extent tree
corruption, thus nothing could really be recorded in the journal.
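
If your journal is persistent, you can confirm this by pulling the
previous boot's kernel messages, which should indeed come back empty of
btrfs/nvme errors; something like:

  $ journalctl -k -b -1 | grep -iE 'btrfs|nvme'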

[...]
>>
>> Corrupted extent tree, this can lead to fs falling back to read-only
>> halfway.
> 
> This fs actually still mounts writable without any issue, FWIW. Although
> the error counters are not zeroed:

Only when the corrupted extent backref gets modified would the kernel
throw a lot of errors and then flip the fs to RO.

> 
> bdev /dev/nvme0n1p2 errs: wr 51, rd 0, flush 0, corrupt 0, gen 5

This shows two new findings:

1. There are some write failures (wr 51), which may or may not be the
root cause, but I'd be extra cautious either way.

Even if they are the root cause, they merely exposed some corner cases
where the btrfs kernel module didn't handle the errors correctly.
Even when we hit write errors, that should never lead to a corrupted
extent tree.

2. There are also generation errors recorded (gen 5), another hint that
stale metadata was read at some point.
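
If you want to re-check these counters later, "btrfs device stats"
reads (and with -z resets) the same numbers:

  $ sudo btrfs device stats /dev/nvme0n1p2      # wr/rd/flush/corrupt/gen
  $ sudo btrfs device stats -z /dev/nvme0n1p2   # print, then zero them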

> 
> It's not clear to me when these errors occurred - wouldn't they have
> been logged to dmesg at the time?

As explained, once the root fs flipped RO, the journal could no longer
really be written to disk.

[...]
>> Mind to run "btrfs check --mode=lowmem" on that fs, and save both
>> stderr
>> and stdout?
> 
> Here is the output:
> 
> $ sudo btrfs check --mode=lowmem /dev/nvme0n1p2
> Opening filesystem to check...
> Checking filesystem on /dev/nvme0n1p2
> UUID: 76721faa-8c32-4e70-8a9e-859dece0aec1
> [1/7] checking root items
> [2/7] checking extents
[...]
> ERROR: shared extent[1249401454592 16384] lost its parent (parent: 2368656916480, level: 0)
[...]
> ERROR: shared extent 1624013307904 referencer lost (parent: 1252056268800)
[...]

So yes, there is something wrong with the extent tree.

The "lost its parent" line means for the shared metadata backref item, 
the parent can not be found.

Thus although you can still mount the fs, if you delete or COW that 
extent, the fs would flip RO.

The later "referencer lost" is for data backref item. It is  mostly just 
caused by the previous "lost its parent" line.
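
If you're curious, the offending backref can be inspected read-only with
dump-tree (bytenr taken from the error above; best done while the fs is
unmounted):

  $ sudo btrfs inspect-internal dump-tree -t extent /dev/nvme0n1p2 \
      | grep -B2 -A6 1249401454592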

> ERROR: errors found in extent allocation tree or chunk allocation
> [3/7] checking free space cache
> block group 1342751899648 has wrong amount of free space, free space cache has 34193408 block group has 42893312
> failed to load free space cache for block group 1342751899648
> block group 1343825641472 has wrong amount of free space, free space cache has 23257088 block group has 25903104
> failed to load free space cache for block group 1343825641472
> block group 1351341834240 has wrong amount of free space, free space cache has 22396928 block group has 42348544
> failed to load free space cache for block group 1351341834240
> block group 1566090199040 has wrong amount of free space, free space cache has 48562176 block group has 50651136
> failed to load free space cache for block group 1566090199040
> block group 1572532649984 has wrong amount of free space, free space cache has 12173312 block group has 15945728
> failed to load free space cache for block group 1572532649984
> block group 1580048842752 has wrong amount of free space, free space cache has 29745152 block group has 33087488
> failed to load free space cache for block group 1580048842752
> block group 1584343810048 has wrong amount of free space, free space cache has 56512512 block group has 58601472
> failed to load free space cache for block group 1584343810048
> block group 1602597421056 has wrong amount of free space, free space cache has 87953408 block group has 90349568
> failed to load free space cache for block group 1602597421056
> block group 1744331341824 has wrong amount of free space, free space cache has 339968 block group has 393216
> failed to load free space cache for block group 1744331341824
> block group 2666675568640 has wrong amount of free space, free space cache has 602112 block group has 782336
> failed to load free space cache for block group 2666675568640
> block group 2909341220864 has wrong amount of free space, free space cache has 65536 block group has 151552
> failed to load free space cache for block group 2909341220864
> block group 3904699891712 has wrong amount of free space, free space cache has 172032 block group has 221184
> failed to load free space cache for block group 3904699891712
> block group 3941207113728 has wrong amount of free space, free space cache has 1728512 block group has 1826816
> failed to load free space cache for block group 3941207113728
> block group 4085088518144 has wrong amount of free space, free space cache has 5697536 block group has 5754880
> failed to load free space cache for block group 4085088518144
> block group 4241854824448 has wrong amount of free space, free space cache has 23293952 block group has 28966912
> failed to load free space cache for block group 4241854824448
> block group 4838855278592 has wrong amount of free space, free space cache has 86016 block group has 118784
> failed to load free space cache for block group 4838855278592
> block group 4847445213184 has wrong amount of free space, free space cache has 49152 block group has 110592
> failed to load free space cache for block group 4847445213184
> block group 4897911078912 has wrong amount of free space, free space cache has 7475200 block group has 7577600
> failed to load free space cache for block group 4897911078912
> block group 5010008178688 has wrong amount of free space, free space cache has 69632 block group has 106496
> failed to load free space cache for block group 5010008178688
> block group 5062655082496 has wrong amount of free space, free space cache has 5836800 block group has 5890048
> failed to load free space cache for block group 5062655082496
> block group 5268813512704 has wrong amount of free space, free space cache has 135168 block group has 221184
> failed to load free space cache for block group 5268813512704

These are minor errors, but I'm not sure whether they would lead to
more problems.

Overall I'd recommend migrating to the v2 space cache (the free space
tree).
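
A minimal migration sketch (the clear step needs the fs unmounted; /mnt
is just an example mountpoint):

  $ sudo btrfs check --clear-space-cache v1 /dev/nvme0n1p2
  $ sudo mount -o space_cache=v2 /dev/nvme0n1p2 /mnt   # builds the free space tree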

> [4/7] checking fs roots
> ERROR: root 259 EXTENT_DATA[1522634 4096] gap exists, expected: EXTENT_DATA[1522634 128]
> ERROR: root 259 EXTENT_DATA[1522636 4096] gap exists, expected: EXTENT_DATA[1522636 128]
> ERROR: root 407 EXTENT_DATA[398831 4096] gap exists, expected: EXTENT_DATA[398831 25]
> ERROR: root 407 EXTENT_DATA[398973 4096] gap exists, expected: EXTENT_DATA[398973 25]
> ERROR: root 407 EXTENT_DATA[398975 4096] gap exists, expected: EXTENT_DATA[398975 25]
> ERROR: root 407 EXTENT_DATA[398976 4096] gap exists, expected: EXTENT_DATA[398976 25]
> ERROR: root 407 EXTENT_DATA[418307 4096] gap exists, expected: EXTENT_DATA[418307 25]
> ERROR: root 407 EXTENT_DATA[418316 4096] gap exists, expected: EXTENT_DATA[418316 25]
> ERROR: root 407 EXTENT_DATA[418317 4096] gap exists, expected: EXTENT_DATA[418317 25]
> ERROR: root 407 EXTENT_DATA[420660 4096] gap exists, expected: EXTENT_DATA[420660 25]
> ERROR: root 407 EXTENT_DATA[420673 4096] gap exists, expected: EXTENT_DATA[420673 25]
> ERROR: root 407 EXTENT_DATA[439382 4096] gap exists, expected: EXTENT_DATA[439382 25]
> ERROR: root 407 EXTENT_DATA[439383 4096] gap exists, expected: EXTENT_DATA[439383 25]
> ERROR: root 407 EXTENT_DATA[451252 4096] gap exists, expected: EXTENT_DATA[451252 25]
> ERROR: root 407 EXTENT_DATA[451264 4096] gap exists, expected: EXTENT_DATA[451264 25]
> ERROR: root 407 EXTENT_DATA[451265 4096] gap exists, expected: EXTENT_DATA[451265 25]
> ERROR: root 407 EXTENT_DATA[452326 4096] gap exists, expected: EXTENT_DATA[452326 25]
> ERROR: root 407 EXTENT_DATA[452332 4096] gap exists, expected: EXTENT_DATA[452332 25]
> ERROR: root 407 EXTENT_DATA[452339 4096] gap exists, expected: EXTENT_DATA[452339 25]
> ERROR: root 407 EXTENT_DATA[4293157 4096] gap exists, expected: EXTENT_DATA[4293157 25]
> ERROR: root 407 EXTENT_DATA[4293570 4096] gap exists, expected: EXTENT_DATA[4293570 25]
> ERROR: root 407 EXTENT_DATA[4293571 4096] gap exists, expected: EXTENT_DATA[4293571 25]
> ERROR: root 407 EXTENT_DATA[4293572 4096] gap exists, expected: EXTENT_DATA[4293572 25]
> ERROR: root 407 EXTENT_DATA[4302136 4096] gap exists, expected: EXTENT_DATA[4302136 25]
> ERROR: root 407 EXTENT_DATA[4302148 4096] gap exists, expected: EXTENT_DATA[4302148 25]
> ERROR: root 407 EXTENT_DATA[4302149 4096] gap exists, expected: EXTENT_DATA[4302149 25]
> ERROR: root 407 EXTENT_DATA[4302150 4096] gap exists, expected: EXTENT_DATA[4302150 25]
> ERROR: root 407 EXTENT_DATA[5970391 4096] gap exists, expected: EXTENT_DATA[5970391 25]

These are minor problems that would not cause anything wrong; the kernel can easily handle them.

So far the fs looks pretty old (v1 space cache, no NO_HOLES feature),
thus the corruption may have been caused by some older kernel bug.
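
You can double-check which features the fs was created with from the
superblock:

  $ sudo btrfs inspect-internal dump-super /dev/nvme0n1p2 | grep -i flags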

After backing up any critical data, you can try "btrfs check --repair"
to see if it can fix the extent tree corruption.
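
That is, roughly (fs unmounted, and only after the backup, as --repair
modifies the fs):

  $ sudo btrfs check --repair /dev/nvme0n1p2
  $ sudo btrfs check /dev/nvme0n1p2             # verify afterwards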

> ERROR: errors found in fs roots
> found 2397613547520 bytes used, error(s) found
> total csum bytes: 1840478932
> total tree bytes: 13337329664
> total fs tree bytes: 10208378880
> total extent tree bytes: 874070016
> btree space waste bytes: 2240708820
> file data blocks allocated: 24819271946240
>   referenced 2695187488768
> 
> 
> Hopefully that means something to you. I'm still curious to know to what
> degree I should still trust these drives if I were to wipe the fs and
> start over. I suppose I could run a SMART test or something, right?

So far I do not think the problem is related to the device itself; the
only thing that could point to the drive is the write errors.

The extent tree corruption looks more like a bug in a (possibly older)
kernel.

At least the minor problems would never occur with the current default
mkfs options.
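
If you still want to sanity-check the drive itself, a SMART query with
smartmontools on the parent NVMe device is cheap:

  $ sudo smartctl -a /dev/nvme0n1          # health, media errors, wear
  $ sudo smartctl -t long /dev/nvme0n1     # self-test, if the drive supports it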

Thanks,
Qu

> 
> Thanks,
> Jared
> 
>>
>> Thanks,
>> Qu
>>
>>>
>>> Thanks,
>>> Jared
> 
> 
