From: Chris Murphy
Date: Thu, 4 Mar 2021 20:01:59 -0700
Subject: Re: All files are damaged after btrfs restore
To: Sebastian Roller
Cc: Chris Murphy, Btrfs BTRFS
X-Mailing-List: linux-btrfs@vger.kernel.org

On Thu, Mar 4, 2021 at 8:35 AM Sebastian Roller wrote:
>
> > I don't know. The exact nature of the damage from a failing controller
> > adds a significant unknown component to it. If it were just a matter
> > of not writing anything at all, then there'd be no problem. But it
> > sounds like it wrote spurious or corrupt data, possibly into locations
> > that weren't even supposed to be written to.
>
> Unfortunately I cannot figure out exactly what happened. Logs end
> Friday night while the backup script was running -- which also
> includes a finalizing balance of the device. Monday morning, after
> some exchange of hardware, the machine came up unable to mount the
> device.

It's probably not discernible from logs anyway. What does hardware do
when it goes berserk? It's chaos. And all file systems have write-order
requirements. It's fine if, at a certain point, writes just abruptly
stop reaching stable media. But if things are written out of order, or
if the hardware acknowledges critical metadata writes that were
actually dropped, it's bad. For all file systems.

> OK -- I now had the chance to temporarily switch to 5.11.2. Output
> looks cleaner, but the error stays the same.
>
> root@hikitty:/mnt$ mount -o ro,rescue=all /dev/sdi1 hist/
>
> [ 3937.815083] BTRFS info (device sdi1): enabling all of the rescue options
> [ 3937.815090] BTRFS info (device sdi1): ignoring data csums
> [ 3937.815093] BTRFS info (device sdi1): ignoring bad roots
> [ 3937.815095] BTRFS info (device sdi1): disabling log replay at mount time
> [ 3937.815098] BTRFS info (device sdi1): disk space caching is enabled
> [ 3937.815100] BTRFS info (device sdi1): has skinny extents
> [ 3938.903454] BTRFS error (device sdi1): bad tree block start, want 122583416078336 have 0
> [ 3938.994662] BTRFS error (device sdi1): bad tree block start, want 99593231630336 have 0
> [ 3939.201321] BTRFS error (device sdi1): bad tree block start, want 124762809384960 have 0
> [ 3939.221395] BTRFS error (device sdi1): bad tree block start, want 124762809384960 have 0
> [ 3939.221476] BTRFS error (device sdi1): failed to read block groups: -5
> [ 3939.268928] BTRFS error (device sdi1): open_ctree failed

This looks like a super that expects something which just isn't there
at all. If the spurious behavior lasted only briefly during the
hardware failure, there's a chance of recovery. But that chance
diminishes greatly if the chaotic behavior was ongoing for a while --
many seconds or a few minutes.
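If you want to gauge how far the damage goes, one thing that may be
worth trying is comparing all three superblock copies. Just a
suggestion -- the device name here is taken from your output above,
adjust as needed:

    btrfs inspect-internal dump-super -fa /dev/sdi1

If the generation or root bytenr differs between the copies, that's a
hint one of them is stale, which narrows down when writes stopped
reaching the disk.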
> I still hope that there might be some error in the fs created by the
> crash which can be resolved, rather than real damage to all the data
> in the FS trees. I used a lot of snapshots and deduplication on that
> device, so I expect some damage from a hardware error. But I find it
> hard to believe that every file got damaged.

Correct. They aren't actually damaged. However, there's maybe 5-15 MiB
of critical metadata on Btrfs, and if it gets corrupted, the keys to
the maze are lost. It then becomes difficult, sometimes impossible, to
"bootstrap" the file system. There are backup entry points, but
depending on the workload they go stale within seconds to a few
minutes, and they can be overwritten.

When 'btrfs restore' does a partial recovery that ends up with a lot
of damage and holes, that tells me it has found stale parts of the
file system. It's on old rails, so to speak: nothing is available to
tell it that a given portion of the tree is just old and not valid
anymore (or only partially valid). The restore code is also designed
to be more tolerant of errors, because otherwise it would do nothing
at all.

I think if you're able to find the most recent root node for a
snapshot you want to restore, along with an intact chunk tree, it
should be possible to get data out of that snapshot. The difficulty is
finding that root node, because it could be almost anywhere.
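One way to go hunting, if you want to attempt it, is btrfs-find-root,
feeding the candidate bytenrs it reports into a dry run of restore.
This is only a sketch -- the device name is from your output above,
<bytenr> is a placeholder for each candidate, and the destination
path doesn't matter for a dry run:

    btrfs-find-root /dev/sdi1
    btrfs restore -t <bytenr> -D -v /dev/sdi1 /tmp/scratch

With -D (dry run) nothing is written, so you can walk the candidates
from newest to oldest generation and see which one produces a
sane-looking file list before doing a real restore to scratch space.
Fair warning: btrfs-find-root can take a while on a large file system.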
OK, so you said there are an original and a backup file system. Are
they both in equally bad shape, having been on the same controller?
Are they both Btrfs? What do you get for

    btrfs insp dump-s -f /dev/sdXY

There might be a backup tree root in there that can be used with
btrfs restore -t.

Also, it's sometimes easier to do this on IRC, on freenode.net in the
channel #btrfs.

--
Chris Murphy