All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Roman Anasal | BDSU <roman.anasal@bdsu.de>
To: "fdmanana@gmail.com" <fdmanana@gmail.com>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 2/2] btrfs: send: fix invalid commands for inodes with changed type but same gen
Date: Tue, 12 Jan 2021 13:10:57 +0000	[thread overview]
Message-ID: <0ce2d415308ab40874aff535031e9871a442cd9a.camel@bdsu.de> (raw)
In-Reply-To: <CAL3q7H6YOPgcdgJKX8OETqrKqmfz8GRkQykPOQBMmnNSsc4sxw@mail.gmail.com>

On Tue, 2021-01-12 at 11:27 +0000, Filipe Manana wrote:
> You get all these issues because you are using an incremental send in
> a way it's not supposed to.
> You are using two subvolumes that are completely unrelated.
Yes, I am aware of that and know that this is not a designed for use
case.

> My surprise here is that we actually allow a user to try that,
> instead
> of giving an error complaining that subvol1 and subvol2 aren't
> snapshots of the same subvolume.
This could be one way of solving it. But this is already harder than it
sounds, since the same issue may also happen *even when* the subvolumes
share a parent/are related, here an example reproducer:

  btrfs subvolume create subvol1
  btrfs subvolume snapshot subvol1 subvol2
  mkdir subvol1/a
  echo foo > subvol2/a
  btrfs property set subvol1 ro true
  btrfs property set subvol2 ro true
  btrfs send -p subvol1 subvol2 | btrfs receive --dump

This will produce a stream that tries to write data into the cloned
directory inode.

So we not only had to check the parent relationships but also ensure
that the descendant snapshot was not modified, i.e. was never set to
ro=false.
As far as I know there is no flag tracking this on-disk? So this would
need additional work, too.


> An incremental is supposed to work on snapshots of the same subvolume
> (hence, have the same parent uuid).
> That's what the entire code relies on to work correctly, and that's
> what makes sense - to compute and send the differences between two
> points in time of a subvolume.
> That's why the code base assumes that inodes with the same number and
> same generation must refer to the same inode in both the parent and
> the send root.
> 
> What I think that needs to be answered is:
> 
> 1) Are there actually people using incremental sends in that way?
> (It's the first time I see such use case)
Well, I did (:
But admittedly in a kind of experimental setup testing out the limits
of btrfs.

> 2) If so, why? That is completely unreliable, not only it can lead to
> failure to apply the streams, but can result in all sorts of
> weirdness
> (logical inconsistencies, etc) if applying such streams doesn't cause
> an error.
In my case I was moving around subvolumes between multiple disks and
deduplicating as much as possible. Trying to preserve already done
deduplication and purging some intermediate subvolumes I ended up using
"unrelated" subvolumes as parents.

Thinking of it, maybe just using them as clone sources would have just
worked? That would then of course produce much larger streams since
*all* meta data had to be transfered.


> 3) Making sure such use cases work reliably would require many, many
> changes to the send implementation, as it goes against what it
> currently expects.
>     Snapshot a subvolume, change the subvolume, snapshot it again,
> then use both snapshots for the incremental send, that's the expected
> scenario.
I actually don't think that it is really that much work since besides
from some edge cases it already *does* work - I tried ;)


> In other words, what I think we should have is a check that forbids
> using two roots for an incremental send that are not snapshots of the
> same subvolume (have different parent uuids).
I'd like to argue against that:
   1. I don't think allowing this requires that much work
   2. explicitly forbiding it requires work, too (and maybe even changes
      to the on-disk format?)
   3. fixing bugs for this unexpected use case will probably also fix bugs
      for the expected scenario which may only happen in very rare and
      extremly unlikely - though still possible - cases and thus make the
      code overall more resilient:

For example the assumption that inodes with the same number must refer
to the same inode doesn't even hold for direct snapshots - almost
always it does but in some rare conditions it doesn't, which is why
there already is an additional check for that.

This already caused bugs before. And the bug I hit with my unexpected
use of btrfs-send and originally set out to find you had just fixed a
few weeks before [1], i.e. if you hadn't fixed it I would - because of
the unexpected use.
The bug my patch fixes I only discovered by reading the code and having
my scenario in mind.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git/commit/?id=0b3f407e6728d990ae1630a02c7b952c21c288d3

That's why I think the work fixing arising bugs for that case is better
invested than in trying to just block it completely.
But given its very rare occurence I would agree to not put any priority
on it.

> Thanks.
> 

  parent reply	other threads:[~2021-01-12 13:11 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-11 19:02 [PATCH 0/2] btrfs: send: correctly recreate changed inodes Roman Anasal
2021-01-11 19:02 ` [PATCH 1/2] btrfs: send: rename send_ctx.cur_inode_new_gen to cur_inode_recreated Roman Anasal
2021-01-11 19:02 ` [PATCH 2/2] btrfs: send: fix invalid commands for inodes with changed type but same gen Roman Anasal
2021-01-11 19:30   ` Andrei Borzenkov
2021-01-11 20:53     ` Roman Anasal | BDSU
2021-01-12 11:27       ` Filipe Manana
2021-01-12 12:07         ` Graham Cobb
2021-01-12 12:30           ` Filipe Manana
2021-01-12 13:10         ` Roman Anasal | BDSU [this message]
2021-01-12 13:54           ` Filipe Manana
2021-01-12 14:37             ` Roman Anasal | BDSU
2021-01-12 15:08               ` Filipe Manana
2021-01-11 21:15   ` David Sterba
2021-01-11 21:31     ` Roman Anasal | BDSU
2021-01-11 21:19   ` kernel test robot
2021-01-11 21:19     ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0ce2d415308ab40874aff535031e9871a442cd9a.camel@bdsu.de \
    --to=roman.anasal@bdsu.de \
    --cc=fdmanana@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.