From: Chris Murphy <lists@colorremedies.com>
To: Sebastian Roller <sebastian.roller@gmail.com>
Cc: Chris Murphy <lists@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: All files are damaged after btrfs restore
Date: Thu, 4 Mar 2021 20:01:59 -0700
Message-ID: <CAJCQCtR8pXnfVwrtBEvbvm8qrDwMyqyckZyNNgrSwO8++ShfdA@mail.gmail.com>
In-Reply-To: <CALS+qHN8cL1sQt4kjP_n_TrzqO84qV5X-hP2zhnRLjigTq0g2g@mail.gmail.com>

On Thu, Mar 4, 2021 at 8:35 AM Sebastian Roller
<sebastian.roller@gmail.com> wrote:
>
> > I don't know. The exact nature of the damage of a failing controller
> > is adding a significant unknown component to it. If it was just a
> > matter of not writing anything at all, then there'd be no problem. But
> > it sounds like it wrote spurious or corrupt data, possibly into
> > locations that weren't even supposed to be written to.
>
> Unfortunately I cannot figure out exactly what happened. Logs end
> Friday night while the backup script was running -- which also
> includes a finalizing balancing of the device. Monday morning after
> some exchange of hardware the machine came up being unable to mount
> the device.

It's probably not discernible from logs anyway. What does hardware do
when it goes berserk? It's chaos. And all file systems have write
order requirements. It's fine if at a certain point writes just
abruptly stop reaching stable media. But if things are written out of
order, or if the hardware acknowledges critical metadata writes that
were actually dropped, it's bad. For all file systems.


> OK -- I now had the chance to temporarily switch to 5.11.2. Output
> looks cleaner, but the error stays the same.
>
> root@hikitty:/mnt$ mount -o ro,rescue=all /dev/sdi1 hist/
>
> [ 3937.815083] BTRFS info (device sdi1): enabling all of the rescue options
> [ 3937.815090] BTRFS info (device sdi1): ignoring data csums
> [ 3937.815093] BTRFS info (device sdi1): ignoring bad roots
> [ 3937.815095] BTRFS info (device sdi1): disabling log replay at mount time
> [ 3937.815098] BTRFS info (device sdi1): disk space caching is enabled
> [ 3937.815100] BTRFS info (device sdi1): has skinny extents
> [ 3938.903454] BTRFS error (device sdi1): bad tree block start, want
> 122583416078336 have 0
> [ 3938.994662] BTRFS error (device sdi1): bad tree block start, want
> 99593231630336 have 0
> [ 3939.201321] BTRFS error (device sdi1): bad tree block start, want
> 124762809384960 have 0
> [ 3939.221395] BTRFS error (device sdi1): bad tree block start, want
> 124762809384960 have 0
> [ 3939.221476] BTRFS error (device sdi1): failed to read block groups: -5
> [ 3939.268928] BTRFS error (device sdi1): open_ctree failed

This looks like a super is expecting something that just isn't there
at all. If the spurious behavior lasted only briefly during the
hardware failure, there's a chance of recovery. But that chance
diminishes greatly if the chaotic behavior was ongoing for a while,
many seconds or a few minutes.
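
If I'm remembering the tool options correctly, it's also worth
comparing the superblock copies (there are up to three, at 64KiB,
64MiB, and 256GiB into the device):

btrfs inspect-internal dump-super -f -a /dev/sdi1

where -a prints every copy and -f includes the backup roots. If the
generations disagree between the copies, that's another hint the
controller was dropping or mangling writes near the end.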


> I still hope that there might be some error in the fs created by the
> crash, which can be resolved instead of real damage to all the data in
> the FS trees. I used a lot of snapshots and deduplication on that
> device, so I would expect some damage from a hardware error. But I find
> it hard to believe that every file got damaged.

Correct. They aren't actually damaged.

However, there's maybe 5-15 MiB of critical metadata on Btrfs, and if
it gets corrupt, the keys to the maze are lost. And it becomes
difficult, sometimes impossible, to "bootstrap" the file system. There
are backup entry points, but depending on the workload, they go stale
in seconds to a few minutes, and can be subject to being overwritten.

That 'btrfs restore' does a partial recovery ending up with a lot of
damage and holes tells me it has found stale parts of the file system
- it's on old rails, so to speak: there's nothing available to tell it
that a given portion of the tree is simply old and no longer valid (or
only partially valid). Also, the restore code is designed to be more
tolerant of errors, because otherwise it would often do nothing at
all.

I think if you're able to find the most recent root node for a
snapshot you want to restore, along with an intact chunk tree it
should be possible to get data out of that snapshot. The difficulty is
finding it, because it could be almost anywhere.
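
A rough sketch of how that search usually goes (I'm going from memory
on the exact switches, so double-check the man pages):

btrfs-find-root /dev/sdi1

scans the device and prints candidate tree root bytenrs along with
their generations. It can take a long time and most hits will be
stale. A promising bytenr can then be tried with restore in dry-run
mode first, something like

btrfs restore -t <bytenr> -D -i /dev/sdi1 /some/empty/dir

so nothing gets written until you see whether it walks a sane tree.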

OK, so you said there's an original and a backup file system. Are
they both in equally bad shape, having been on the same controller?
Are they both Btrfs?

What do you get for

btrfs insp dump-s -f /dev/sdXY

There might be a backup tree root in there that can be used with btrfs
restore -t
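
To be concrete about what to look for there (going from memory on the
field names): with -f the output has a backup_roots section with up
to four slots, each listing a backup_tree_root address and its
generation, so something like

btrfs insp dump-s -f /dev/sdi1 | grep backup_tree_root

narrows it down. The address with the highest generation that still
reads cleanly is the one worth handing to restore -t.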

Also, it's sometimes easier to do this on IRC, on freenode.net in the channel #btrfs.


-- 
Chris Murphy


Thread overview: 18+ messages
2021-02-23 15:45 All files are damaged after btrfs restore Sebastian Roller
2021-02-25  5:40 ` Chris Murphy
2021-02-25  5:52   ` Chris Murphy
2021-02-26 16:01     ` Sebastian Roller
2021-02-27  1:04       ` Chris Murphy
2021-03-04 15:34         ` Sebastian Roller
2021-03-05  3:01           ` Chris Murphy [this message]
2021-03-07 13:58             ` Sebastian Roller
2021-03-08  0:56               ` Chris Murphy
2021-03-09 17:02                 ` Sebastian Roller
2021-03-09 20:34                   ` Chris Murphy
2021-03-16  9:35                     ` Sebastian Roller
2021-03-16 19:34                       ` Chris Murphy
2021-03-17  1:38 ` Qu Wenruo
2021-03-17  2:59   ` Chris Murphy
2021-03-17  9:01     ` Sebastian Roller
2021-03-17  1:54 ` Dāvis Mosāns
2021-03-17 10:50   ` Sebastian Roller
