Corrupted system due to imbalanced metadata chunks

* Corrupted system due to imbalanced metadata chunks
@ 2016-05-17 15:45 Peter Kese
  2016-05-17 18:03 ` Austin S. Hemmelgarn
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Peter Kese @ 2016-05-17 15:45 UTC
  To: linux-btrfs

I've been using btrfs on my main system for a few months. I know btrfs
is a little bit beta, but I thought not using any fancy features like
quotas, snapshotting, raid, etc. would keep me on the safe side.

Then I tried a software upgrade (Ubuntu 15.10 -> 16.04) and it turned
out that while there was more than 100 GB (45%) of free disk space,
the upgrade process broke down somewhere in the middle reporting IO
errors and lack of free disk space.

As I have learned later on, my problem was lack of available metadata
blocks and a couple of tries at btrfs-balance remedied the space
problem, but I nevertheless ended up with a broken Ubuntu distribution
(there were broken packages and apt-get/dpkg hacking failed to fix the
problem).

So there wasn't any major data loss (apart from some .deb packages
missing some files, my personal data is intact). But I'd still
consider this a major loss, because I'll end up having to reinstall
the whole system.

Now here's what I think:
 1) I may have been a bit unfortunate to experience this particular
issue but there's a large audience of people who might get bitten as
well,
 2) I find it hard to blame it on Ubuntu's upgrade process, as it does
check for free space availability before starting the upgrade,
 3) A file system should not refuse to store files (during system
upgrade or any other time), when there is 100 GB of free disk space
available,
 4) Not anywhere in any btrfs documentation (not even in btrfs
Gotchas) did I read any bold text saying *If installing btrfs, you
should always keep an eye on free space for metadata and perform
regular balances or otherwise you may corrupt your system.*

And finally my question:

 Is there a plan to detect such a situation and perform an automatic
inline rebalance rather than reporting out-of-disk-space when there's
actually lots of free disk space available?

Thanks,

Peter

* Re: Corrupted system due to imbalanced metadata chunks
  2016-05-17 15:45 Corrupted system due to imbalanced metadata chunks Peter Kese
@ 2016-05-17 18:03 ` Austin S. Hemmelgarn
  2016-05-17 20:06   ` Chris Murphy
  2016-05-17 19:52 ` Chris Murphy
  2016-05-18  5:09 ` Chris Murphy
  2 siblings, 1 reply; 9+ messages in thread
From: Austin S. Hemmelgarn @ 2016-05-17 18:03 UTC
  To: Peter Kese, linux-btrfs

On 2016-05-17 11:45, Peter Kese wrote:
> I've been using btrfs on my main system for a few months. I know btrfs
> is a little bit beta, but I thought not using any fancy features like
> quotas, snapshotting, raid, etc. would keep me on the safe side.
>
> Then I tried a software upgrade (Ubuntu 15.10 -> 16.04) and it turned
> out that while there was more than 100 GB (45%) of free disk space,
> the upgrade process broke down somewhere in the middle reporting IO
> errors and lack of free disk space.
>
> As I have learned later on, my problem was lack of available metadata
> blocks and a couple of tries at btrfs-balance remedied the space
> problem, but I nevertheless ended up with a broken Ubuntu distribution
> (there were broken packages and apt-get/dpkg hacking failed to fix the
> problem).
>
> So there wasn't any major data loss (apart from some .deb packages
> missing some files, my personal data is intact). But I'd still
> consider this a major loss, because I'll end up having to reinstall
> the whole system.
>
> Now here's what I think:
>  1) I may have been a bit unfortunate to experience this particular
> issue but there's a large audience of people who might get bitten as
> well,
>  2) I find it hard to blame it on Ubuntu's upgrade process, as it does
> check for free space availability before starting the upgrade,
The upgrade process is also naive and only checks what df says about 
free space.  It could stand to be taught to pay better attention and 
check repeatedly throughout the process.
>  3) A file system should not refuse to store files (during system
> upgrade or any other time), when there is 100 GB of free disk space
> available,
If you're checking just df, then that is by no means the full story.  In 
BTRFS and some other filesystems, df is advisory, not authoritative, and 
it doesn't provide any way to say things like 'you have a bunch of free 
space, but can only store lots of really small files right now', which 
is exactly the situation you were in.
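
To see the per-type numbers that df hides, one option is to read the
output of 'btrfs filesystem df'. A minimal sketch follows, assuming the
usual "Kind, profile: total=..., used=..." line format (which may vary
between btrfs-progs versions). The SAMPLE text and the helper names are
made up for illustration and are not taken from this thread.

```python
import re

# Matches lines such as "Metadata, DUP: total=2.00GiB, used=1.95GiB".
LINE_RE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>[\w-]+): "
    r"total=(?P<total>[\d.]+)(?P<tunit>B|[KMGT]iB), "
    r"used=(?P<used>[\d.]+)(?P<uunit>B|[KMGT]iB)$"
)

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def parse_btrfs_df(text):
    """Return {kind: (total_bytes, used_bytes)} from 'btrfs filesystem df' text."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            total = float(m["total"]) * UNITS[m["tunit"]]
            used = float(m["used"]) * UNITS[m["uunit"]]
            result[m["kind"]] = (total, used)
    return result


def fill_ratio(parsed, kind):
    """Fraction of the space allocated to 'kind' that is actually used."""
    total, used = parsed[kind]
    return used / total if total else 0.0


# Illustrative numbers only (assumption, not from the thread).
SAMPLE = """\
Data, single: total=220.00GiB, used=120.00GiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=2.00GiB, used=1.98GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
"""

usage = parse_btrfs_df(SAMPLE)
print(f"data     {fill_ratio(usage, 'Data'):.0%} full")
print(f"metadata {fill_ratio(usage, 'Metadata'):.0%} full")
```

In a sample like this the data chunks are roughly half empty while the
metadata chunks are almost full, which plain df would not reveal.
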
>  4) Not anywhere in any btrfs documentation (not even in btrfs
> Gotchas) did I read any bold text saying *If installing btrfs, you
> should always keep an eye on free space for metadata and perform
> regular balances or otherwise you may corrupt your system.*
>
> And finally my question:
>
>  Is there a plan to detect such a situation and perform an automatic
> inline rebalance rather than reporting out-of-disk-space when there's
> actually lots of free disk space available?
There are some things already in place to try and prevent this on recent 
kernels (for example, completely empty chunks are automatically 
deallocated), but it's not easy to solve completely without making 
performance absolutely horrible.  Installing large numbers of packages 
at once (like a distro upgrade) is a particularly bad case for this, 
because most package managers unpack to a temporary location on-disk 
before copying the files in, and that tends to leave a lot of free space 
fragmentation within the chunks.  Ideally, this free space gets 
back-filled by new data, but that may not happen depending on numerous 
factors.

One thing I would suggest in the future though is to run a full balance 
just before doing the upgrade.  It's not very likely that just the 
upgrade was fully responsible for this, which would mean that the 
problem existed at least partially before the upgrade.  As such, running 
a full balance just before the upgrade should help prevent this from 
happening.

* Re: Corrupted system due to imbalanced metadata chunks
  2016-05-17 15:45 Corrupted system due to imbalanced metadata chunks Peter Kese
  2016-05-17 18:03 ` Austin S. Hemmelgarn
@ 2016-05-17 19:52 ` Chris Murphy
  2016-05-17 21:07   ` Goni Zahavy
  2016-05-18  5:09 ` Chris Murphy
  2 siblings, 1 reply; 9+ messages in thread
From: Chris Murphy @ 2016-05-17 19:52 UTC
  To: Peter Kese; +Cc: Btrfs BTRFS

On Tue, May 17, 2016 at 9:45 AM, Peter Kese <peter.kese@viidea.com> wrote:
> I've been using btrfs on my main system for a few months. I know btrfs
> is a little bit beta, but I thought not using any fancy features like
> quotas, snapshotting, raid, etc. would keep me on the safe side.
>
> Then I tried a software upgrade (Ubuntu 15.10 -> 16.04) and it turned
> out that while there was more than 100 GB (45%) of free disk space,
> the upgrade process broke down somewhere in the middle reporting IO
> errors and lack of free disk space.

Yeah it's a weak area still that only completely unused data chunks
become unallocated free space again, from which metadata chunks can be
allocated. I think general consumption of Btrfs is difficult until
there's better behavior; maybe a trigger that can happen before such
enospc that causes the equivalent of filtered balance, e.g. -dusage=5,
which should be quite fast even on HDD, and free up a lot of space or
at least enough to allocate a metadata chunk in order to complete
something like an OS upgrade.

> As I have learned later on, my problem was lack of available metadata
> blocks and a couple of tries at btrfs-balance remedied the space
> problem, but I nevertheless ended up with a broken Ubuntu distribution
> (there were broken packages and apt-get/dpkg hacking failed to fix the
> problem).
>
> So there wasn't any major data loss (apart from some .deb packages
> missing some files, my personal data is intact). But I'd still
> consider this a major loss, because I'll end up having to reinstall
> the whole system.

The criticism is valid. But there is more than one valid criticism.
The non-atomic upgrade process is also a problem, one that has needed
improvement for a very long time now.

Ironically a snapshot would probably have helped because then worst
case scenario you could 'btrfs dev add' some small device like even a
2G USB stick to get out of the no space situation, and then delete the
subvolume(s) containing the failed upgrade, then later delete the
unneeded USB stick. So Btrfs can help give OS upgrades a more modern
atomic way of doing updates and upgrades, almost for free, so that a
failure can be rolled back to a known good point.
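
The rescue sequence described above can be sketched as a dry-run plan.
This only builds the command lines and runs nothing. The device path,
mount point and subvolume name are hypothetical placeholders, and
'btrfs device remove' is spelled 'btrfs device delete' in older
btrfs-progs.

```python
def rescue_plan(mountpoint, spare_device, failed_subvolumes):
    """Return the command lines for the add-device rescue, in order."""
    # 1. Add a small spare device so new chunks can be allocated again.
    plan = [["btrfs", "device", "add", spare_device, mountpoint]]
    # 2. Delete the subvolume(s) that hold the failed upgrade.
    for subvol in failed_subvolumes:
        plan.append(["btrfs", "subvolume", "delete", subvol])
    # 3. Remove the spare device once space has been freed.
    plan.append(["btrfs", "device", "remove", spare_device, mountpoint])
    return plan


# Hypothetical names, for illustration only.
for cmd in rescue_plan("/mnt", "/dev/sdX", ["/mnt/@upgrade-failed"]):
    print(" ".join(cmd))
```

Each list could be handed to subprocess.run(cmd, check=True) as root
after reviewing it. Rolling back this way assumes a known good snapshot
was taken before the upgrade started.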

But there are other reasons why updates can fail, other than running
out of space, and that's why they need to be better designed to be
fail safe.

>
> Now here's what I think:
>  1) I may have been a bit unfortunate to experience this particular
> issue but there's a large audience of people who might get bitten as
> well,
>  2) I find it hard to blame it on Ubuntu's upgrade process, as it does
> check for free space availability before starting the upgrade,
>  3) A file system should not refuse to store files (during system
> upgrade or any other time), when there is 100 GB of free disk space
> available,
>  4) Not anywhere in any btrfs documentation (not even in btrfs
> Gotchas) did I read any bold text saying *If installing btrfs, you
> should always keep an eye on free space for metadata and perform
> regular balances or otherwise you may corrupt your system.*

1 Definitely.
2 Dual blame.
3 It's technically not free disk space, it's unused but allocated
space, and it's allocated for a specific purpose (data chunk or
metadata chunk) and right now there's no automatic migration (balance)
of extents in order to repurpose that space as needed. So, yeah Btrfs
needs to get better in this area for sure but it's a difficult problem
or it'd already be solved by now.
4 Technically the file system itself is not corrupt, it's just that
upon enospc the updater face plants and I guess has some recovery
problems identifying the system's in-between state and ability to fix
it. I agree with the part that the user shouldn't need to know about
keeping an eye on data vs metadata balancing act for a stable and
production recommended fs.
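
Point 3 can be illustrated with a toy model. This is a deliberately
simplified sketch, not how the kernel allocator works: it tracks one
pool per chunk type instead of individual chunks, assumes deletions
leave every chunk partly used (so none is returned to unallocated
space), and the chunk sizes are assumed typical values.

```python
import errno

GIB = 2**30
MIB = 2**20
DATA_CHUNK = 1 * GIB       # assumed data chunk size
META_CHUNK = 256 * MIB     # assumed metadata chunk size


class ToyFs:
    """Very simplified two-level allocator: device -> chunks -> extents."""

    def __init__(self, device_bytes):
        self.unallocated = device_bytes
        self.total = {"data": 0, "metadata": 0}   # bytes reserved as chunks
        self.used = {"data": 0, "metadata": 0}    # bytes actually in use

    def write(self, kind, nbytes):
        chunk = DATA_CHUNK if kind == "data" else META_CHUNK
        # Allocate new chunks of this kind until the write fits.
        while self.total[kind] - self.used[kind] < nbytes:
            if self.unallocated < chunk:
                raise OSError(errno.ENOSPC, f"ENOSPC: no room for a new {kind} chunk")
            self.unallocated -= chunk
            self.total[kind] += chunk
        self.used[kind] += nbytes

    def delete(self, kind, nbytes):
        # Freed space stays inside its chunks; nothing becomes unallocated.
        self.used[kind] -= nbytes

    def unused(self):
        return self.unallocated + sum(self.total[k] - self.used[k] for k in self.total)


fs = ToyFs(10 * GIB + META_CHUNK)
fs.write("metadata", 200 * MIB)   # one metadata chunk
fs.write("data", 10 * GIB)        # the rest of the device becomes data chunks
fs.delete("data", 5 * GIB)        # half the data goes away again
try:
    fs.write("metadata", 100 * MIB)
    outcome = "ok"
except OSError as exc:
    outcome = exc.strerror
print(outcome, "with", fs.unused() // MIB, "MiB unused")
```

In this model the metadata write fails even though more than 5 GiB is
unused, because all of that space sits inside data chunks. A balance
is what would hand some of it back to the unallocated pool.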

But we need better behavior in updaters at least as much. Quite a lot
of them, it seems, make excessive use of fsync for reasons that are
probably no longer very good.

-- 
Chris Murphy

* Re: Corrupted system due to imbalanced metadata chunks
  2016-05-17 18:03 ` Austin S. Hemmelgarn
@ 2016-05-17 20:06   ` Chris Murphy
       [not found]     ` <CAJVJNe4Ek8UR3Qq+p9fAu3bU-2dCdDcothgR=S6CsQKOo6zKFg@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Murphy @ 2016-05-17 20:06 UTC
  To: Austin S. Hemmelgarn; +Cc: Peter Kese, Btrfs BTRFS

On Tue, May 17, 2016 at 12:03 PM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2016-05-17 11:45, Peter Kese wrote:
>>
>> I've been using btrfs on my main system for a few months. I know btrfs
>> is a little bit beta, but I thought not using any fancy features like
>> quotas, snapshotting, raid, etc. would keep me on the safe side.
>>
>> Then I tried a software upgrade (Ubuntu 15.10 -> 16.04) and it turned
>> out that while there was more than 100 GB (45%) of free disk space,
>> the upgrade process broke down somewhere in the middle reporting IO
>> errors and lack of free disk space.
>>
>> As I have learned later on, my problem was lack of available metadata
>> blocks and a couple of tries at btrfs-balance remedied the space
>> problem, but I nevertheless ended up with a broken Ubuntu distribution
>> (there were broken packages and apt-get/dpkg hacking failed to fix the
>> problem).
>>
>> So there wasn't any major data loss (apart from some .deb packages
>> missing some files, my personal data is intact). But I'd still
>> consider this a major loss, because I'll end up having to reinstall
>> the whole system.
>>
>> Now here's what I think:
>>  1) I may have been a bit unfortunate to experience this particular
>> issue but there's a large audience of people who might get bitten as
>> well,
>>  2) I find it hard to blame it on Ubuntu's upgrade process, as it does
>> check for free space availability before starting the upgrade,
>
> The upgrade process is also naive and only checks what df says about free
> space.  It could stand to be taught to pay better attention and check
> repeatedly throughout the process.

Yeah I don't know what the right design is to check for free space
that's fs agnostic. If only it were simple to do a fallocate for 3000
8KiB files and 4000 2MiB files and if either of those fails, don't
start the upgrade. It would have to be some kind of virtual fallocate,
I bet 7000 fallocates at once is not so fast.
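
A real (not virtual) version of that probe could look like the sketch
below. It is only an illustration under assumptions: the function name
and the (count, size) interface are invented here, the probe is as slow
as the number of files suggests, and on Btrfs preallocated extents may
not exercise the inline small-file path at all, so passing the probe is
no guarantee.

```python
import os
import tempfile


def can_reserve(directory, specs):
    """Try to preallocate files described by (count, size_bytes) pairs.

    Returns True if every allocation succeeded. All probe files are
    removed again before returning.
    """
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        index = 0
        for count, size in specs:
            for _ in range(count):
                path = os.path.join(scratch, f"probe-{index}")
                index += 1
                try:
                    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
                except OSError:
                    return False        # could not even create the file
                try:
                    os.posix_fallocate(fd, 0, size)
                except OSError:
                    return False        # the filesystem refused the reservation
                finally:
                    os.close(fd)
    return True
```

For the numbers in the message this would be called as
can_reserve(target, [(3000, 8 * 1024), (4000, 2 * 1024 * 1024)]),
which temporarily reserves roughly 8 GiB.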

>>
>>  3) A file system should not refuse to store files (during system
>> upgrade or any other time), when there is 100 GB of free disk space
>> available,
>
> If you're checking just df, then that is by no means the full story.  In
> BTRFS and some other filesystems, df is advisory, not authoritative, and it
> doesn't provide any way to say things like 'you have a bunch of free space,
> but can only store lots of really small files right now', which is exactly
> the situation you were in.

I *think* he was in the opposite situation, where a bunch of near empty
data chunks were allocated and the metadata chunks were nearly full. So
actually a bunch of big files was no problem, but an OS upgrade tends
to leverage Btrfs inline data, which is probably why it ran out of
space. Just a guess.


>>
>>  4) Not anywhere in any btrfs documentation (not even in btrfs
>> Gotchas) did I read any bold text saying *If installing btrfs, you
>> should always keep an eye on free space for metadata and perform
>> regular balances or otherwise you may corrupt your system.*
>>
>> And finally my question:
>>
>>  Is there a plan to detect such a situation and perform an automatic
>> inline rebalance rather than reporting out-of-disk-space when there's
>> actually lots of free disk space available?
>
> There are some things already in place to try and prevent this on recent
> kernels (for example, completely empty chunks are automatically
> deallocated), but it's not easy to solve completely without making
> performance absolutely horrible.  Installing large numbers of packages at
> once (like a distro upgrade) is a particularly bad case for this, because
> most package managers unpack to a temporary location on-disk before copying
> the files in, and that tends to leave a lot of free space fragmentation
> within the chunks.  Ideally, this free space gets back-filled by new data,
> but that may not happen depending on numerous factors.

Yeah there's all sorts of crusty behaviors in OS installers and
updaters on all platforms that really need to be refactored but that's
a lot of work for something that doesn't happen that often.


>
> One thing I would suggest in the future though is to run a full balance just
> before doing the upgrade.  It's not very likely that just the upgrade was
> fully responsible for this, which would mean that the problem existed at
> least partially before the upgrade.  As such, running a full balance just
> before the upgrade should help prevent this from happening.

In some sense maybe btrfs-progs should ship with an upstream
maintained version of opensuse's btrfsmaintenance-refresh.service?
That has gotten stale for example:

- snapshot aware defrag was pulled out of btrfs a while ago due to
problems, so I question the value and appropriateness of
btrfs-defrag.sh being run on a regular basis when opensuse uses
snapper by default, resulting in many dozens or hundreds of read only
snapshots in short order
- btrfs-trim.sh is obsoleted by systemd provided fstrim.timer, which
is enabled by default, so there's no good reason to run both of these;
- btrfs-balance.sh uses filters -dusage=0 and -musage=0 which is now
handled by the kernel, this should probably be something like
-dusage=5 and -musage=15 to consolidate extents from minimally used
chunks and then revert them to unallocated space.
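
The suggested filters can be turned into a command line like this. A
small sketch: the helper name and its defaults are illustrative, taken
from the values proposed above, and the script that a maintenance
package would actually ship may differ.

```python
import shlex


def balance_command(mountpoint, data_usage=5, metadata_usage=15):
    """Build a filtered 'btrfs balance start' command line as a list."""
    for value in (data_usage, metadata_usage):
        if not 0 <= value <= 100:
            raise ValueError("usage filters are percentages from 0 to 100")
    return [
        "btrfs", "balance", "start",
        f"-dusage={data_usage}",       # only data chunks at most this % full
        f"-musage={metadata_usage}",   # only metadata chunks at most this % full
        mountpoint,
    ]


cmd = balance_command("/")
print(shlex.join(cmd))
```

Run from a timer as root, for example with subprocess.run(cmd,
check=True), this only relocates chunks that are nearly empty, so it
should be much cheaper than a full balance.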


Until such time there's an in-kernel fix for this...



-- 
Chris Murphy

* Re: Corrupted system due to imbalanced metadata chunks
  2016-05-17 19:52 ` Chris Murphy
@ 2016-05-17 21:07   ` Goni Zahavy
  0 siblings, 0 replies; 9+ messages in thread
From: Goni Zahavy @ 2016-05-17 21:07 UTC
  To: Chris Murphy; +Cc: Peter Kese, Btrfs BTRFS

On 5/17/16, Chris Murphy <lists@colorremedies.com> wrote:
> On Tue, May 17, 2016 at 9:45 AM, Peter Kese <peter.kese@viidea.com> wrote:
>> I've been using btrfs on my main system for a few months. I know btrfs
>> is a little bit beta, but I thought not using any fancy features like
>> quotas, snapshotting, raid, etc. would keep me on the safe side.
>>
>> Then I tried a software upgrade (Ubuntu 15.10 -> 16.04) and it turned
>> out that while there was more than 100 GB (45%) of free disk space,
>> the upgrade process broke down somewhere in the middle reporting IO
>> errors and lack of free disk space.
>
> Yeah it's a weak area still that only completely unused data chunks
> become unallocated free space again, from which metadata chunks can be
> allocated. I think general consumption of Btrfs is difficult until
> there's better behavior; maybe a trigger that can happen before such
> enospc that causes the equivalent of filtered balance, e.g. -dusage=5,
> which should be quite fast even on HDD, and free up a lot of space or
> at least enough to allocate a metadata chunk in order to complete
> something like an OS upgrade.
>
>
>
>> As I have learned later on, my problem was lack of available metadata
>> blocks and a couple of tries at btrfs-balance remedied the space
>> problem, but I nevertheless ended up with a broken Ubuntu distribution
>> (there were broken packages and apt-get/dpkg hacking failed to fix the
>> problem).
>>
>> So there wasn't any major data loss (apart from some .deb packages
>> missing some files, my personal data is intact). But I'd still
>> consider this a major loss, because I'll end up having to reinstall
>> the whole system.
>
> The criticism is valid. But there is more than one valid criticism.
> The non-atomic upgrade process is also a problem, one that has needed
> improvement for a very long time now.
>
> Ironically a snapshot would probably have helped because then worst
> case scenario you could 'btrfs dev add' some small device like even a
> 2G USB stick to get out of the no space situation, and then delete the
> subvolume(s) containing the failed upgrade, then later delete the
> unneeded USB stick. So Btrfs can help give OS upgrades a more modern
> atomic way of doing updates and upgrades, almost for free, so that a
> failure can be rolled back to a known good point.
>
> But there are other reasons why updates can fail, other than running
> out of space, and that's why they need to be better designed to be
> fail safe.
>
>
>
>
>>
>> Now here's what I think:
>>  1) I may have been a bit unfortunate to experience this particular
>> issue but there's a large audience of people who might get bitten as
>> well,
>>  2) I find it hard to blame it on Ubuntu's upgrade process, as it does
>> check for free space availability before starting the upgrade,
>>  3) A file system should not refuse to store files (during system
>> upgrade or any other time), when there is 100 GB of free disk space
>> available,
>>  4) Not anywhere in any btrfs documentation (not even in btrfs
>> Gotchas) did I read any bold text saying *If installing btrfs, you
>> should always keep an eye on free space for metadata and perform
>> regular balances or otherwise you may corrupt your system.*
>
> 1 Definitely.
> 2 Dual blame.
> 3 It's technically not free disk space, it's unused but allocated
> space, and it's allocated for a specific purpose (data chunk or
> metadata chunk) and right now there's no automatic migration (balance)
> of extents in order to repurpose that space as needed. So, yeah Btrfs
> needs to get better in this area for sure but it's a difficult problem
> or it'd already be solved by now.
> 4 Technically the file system itself is not corrupt, it's just that
> upon enospc the updater face plants and I guess has some recovery
> problems identifying the system's in-between state and ability to fix
> it. I agree with the part that the user shouldn't need to know about
> keeping an eye on data vs metadata balancing act for a stable and
> production recommended fs.
>
> But we need better behavior in updaters at least as much. Quite a lot
> of them, it seems, make excessive use of fsync for reasons that are
> probably no longer very good.
>
> --
> Chris Murphy
>

Guys, just remember that this crappy-installer behavior works
"as-expected" on every other filesystem.
I think we need to treat this as a trigger to unwanted/unexpected
behavior on btrfs's part, especially in this "dead-simple" setup, in
order to gain a bug/behavior fix in the near future.
In my opinion, we simply cannot "blame" the user's actions or some
installer's code in the current state of btrfs especially in the
"considered stable enough" feature sets.

Goni Zahavy

* Re: Corrupted system due to imbalanced metadata chunks
       [not found]     ` <CAJVJNe4Ek8UR3Qq+p9fAu3bU-2dCdDcothgR=S6CsQKOo6zKFg@mail.gmail.com>
@ 2016-05-17 21:27       ` Chris Murphy
  2016-05-17 21:58         ` Goni Zahavy
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Murphy @ 2016-05-17 21:27 UTC
  To: Goni Zahavy; +Cc: Chris Murphy, Austin S. Hemmelgarn, Peter Kese, Btrfs BTRFS

On Tue, May 17, 2016 at 2:27 PM, Goni Zahavy <goni1993@gmail.com> wrote:
> Guys, just remember that this crappy-installer behavior works "as-expected"
> on every other filesystem.

It also fails as expected on every other file system if the upgrade is
interrupted for reasons that have nothing to do with the file system;
what's in common is the lack of a fail safe upgrade design. And this
is not a ding on Ubuntu, it's a widespread and well recognized risk.
All that's happening here is Btrfs has one more vector to expose that
deficiency.

> I think we need to treat this as a trigger to unwanted/unexpected behavior
> on btrfs's part, especially in this "dead-simple" setup, in order for us to
> gain a bug/behavior fix in the near future.

There are workarounds:

use -M when creating the file system, accepting some loss of
performance and inline small file packing efficiency;

run a preemptive filtered balance on a timer.

Fixing it in the kernel is a great idea, but if that were trivial I
think it'd have been done by now. It's not like this problem is
unknown on this list.

> In my opinion, we simply cannot "blame" the user/installer code in the
> current state of btrfs especially in the "considered stable enough" feature
> sets.

OK blame was the wrong word for me to use, but the root cause of the
problem is shared.

I agree the user is least to blame, but so long as the status of the
fs overall is benchmarking and review it's difficult to say the user
has no role whatsoever in helping prevent it, as tedious as that is.

-- 
Chris Murphy

* Re: Corrupted system due to imbalanced metadata chunks
  2016-05-17 21:27       ` Chris Murphy
@ 2016-05-17 21:58         ` Goni Zahavy
  2016-05-18  3:54           ` Chris Murphy
  0 siblings, 1 reply; 9+ messages in thread
From: Goni Zahavy @ 2016-05-17 21:58 UTC
  To: Chris Murphy; +Cc: Austin S. Hemmelgarn, Peter Kese, Btrfs BTRFS

On 5/18/16, Chris Murphy <lists@colorremedies.com> wrote:
> On Tue, May 17, 2016 at 2:27 PM, Goni Zahavy <goni1993@gmail.com> wrote:
>> Guys, just remember that this crappy-installer behavior works
>> "as-expected"
>> on every other filesystem.
>
> It also fails as expected on every other file system if the upgrade is
> interrupted for reasons that have nothing to do with the file system;
> what's in common is the lack of a fail safe upgrade design. And this
> is not a ding on Ubuntu, it's a widespread and well recognized risk.
> All that's happening here is Btrfs has one more vector to expose that
> deficiency.

I fully agree, but I talked about the "many small files unpacked to
temp on disk location then copied.." which is just plain badness

>
>> I think we need to treat this as a trigger to unwanted/unexpected
>> behavior
>> on btrfs's part, especially in this "dead-simple" setup, in order for us
>> to
>> gain a bug/behavior fix in the near future.
>
> There are workarounds:
>
> use -M when creating the file system, accepting some loss of
> performance and inline small file packing efficiency;
>
> run a preemptive filtered balance on a timer.
>
> Fixing it in the kernel is a great idea, but if that were trivial I
> think it'd have been done by now. It's not like this problem is
> unknown on this list.
>
>> In my opinion, we simply cannot "blame" the user/installer code in the
>> current state of btrfs especially in the "considered stable enough"
>> feature
>> sets.
>
>
> OK blame was the wrong word for me to use, but the root cause of the
> problem is shared.
>
> I agree the user is least to blame, but so long as the status of the
> fs overall is benchmarking and review it's difficult to say the user
> has no role whatsoever in helping prevent it, as tedious as that is.
>
> --
> Chris Murphy
>

It's great that we have workarounds for the user/distribution to
apply. And if this is a known issue, I would expect that Peter Kese
would be answered here with "it's a known issue, here is how you can
work around it the next time. And it was written in underlined bold
warning on this wiki page. Stay tuned.".

In addition, I would expect bugs like this one that hit even simple
single-mode setups to be prioritized way way higher than any new
feature or even some existing advanced feature enhancements..

Just to clarify, I think that your work is divine and I'm glad for it.
I'm just trying to contribute here since I'm unable to contribute in
code/tests :)

Goni Zahavy

* Re: Corrupted system due to imbalanced metadata chunks
  2016-05-17 21:58         ` Goni Zahavy
@ 2016-05-18  3:54           ` Chris Murphy
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Murphy @ 2016-05-18  3:54 UTC
  To: Goni Zahavy; +Cc: Chris Murphy, Austin S. Hemmelgarn, Peter Kese, Btrfs BTRFS

On Tue, May 17, 2016 at 3:58 PM, Goni Zahavy <goni1993@gmail.com> wrote:
> On 5/18/16, Chris Murphy <lists@colorremedies.com> wrote:
>> On Tue, May 17, 2016 at 2:27 PM, Goni Zahavy <goni1993@gmail.com> wrote:
>>> Guys, just remember that this crappy-installer behavior works
>>> "as-expected"
>>> on every other filesystem.
>>
>> It also fails as expected on every other file system if the upgrade is
>> interrupted for reasons that have nothing to do with the file system;
>> what's in common is the lack of a fail safe upgrade design. And this
>> is not a ding on Ubuntu, it's a widespread and well recognized risk.
>> All that's happening here is Btrfs has one more vector to expose that
>> deficiency.
>
> I fully agree, but I talked about the "many small files unpacked to
> temp on disk location then copied.." which is just plain badness

I think the idea is that if it's overwriting in the live tree as it
unpacks, there's a long time where it can be interrupted and any
interruption puts the system in an ambiguous state with a mix of old
and new binaries.

Whereas if the unpacking happens elsewhere and is then moved, it's not
committed until fsync of files and containing directory. While it can
still be interrupted the time frame for this is much smaller so
there's a better chance you get all old files or all updated files,
and not some in between state.
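
For a single file, the unpack-elsewhere-then-move pattern can be
sketched as below. This is a generic illustration of the technique, not
what dpkg or any particular updater does. The temporary file must live
on the same filesystem as the target for the rename to be atomic.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace 'path' with 'data' so readers see the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target (same filesystem).
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())      # file contents reach stable storage
        os.replace(tmp_path, path)      # atomic rename over the old file
    except BaseException:
        try:
            os.unlink(tmp_path)         # do not leave a stray temp file behind
        except FileNotFoundError:
            pass
        raise
    # Persist the rename itself by syncing the containing directory.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

This narrows the window per file, but an upgrade touching thousands of
files is still not atomic as a whole. The per-file fsync calls are also
the kind of overhead mentioned elsewhere in the thread.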

It might be a case of necessity rather than preferred design.

> It's great that we have workarounds for the user/distribution to
> apply. And if this is a known issue, I would expect that Peter Kese
> would be answered here with "it's a known issue, here is how you can
> work around it the next time. And it was written in underlined bold
> warning on this wiki page. Stay tuned.".

Known issue yes.

https://btrfs.wiki.kernel.org/index.php/FAQ#Help.21_Btrfs_claims_I.27m_out_of_space.2C_but_it_looks_like_I_should_have_lots_left.21

But it can be triggered by different workloads and it may not always
be that the fs can't allocate a new metadata chunk. It's a tricky
problem. And it doesn't always happen.

> In addition, I would expect bugs like this one that hit even simple
> single-mode setups to be prioritized way way higher than any new
> feature or even some existing advanced feature enhancements.

Well I know at least one of the three upstream maintainers has been
working on the enospc problem since at least 2009, until at least
March of this year when his "Enospc rework" patches for review popped
up on the list. That work is in linux-next right now, so it looks like
it will appear in kernel 4.7.

But that's maybe orthogonal to the trickiness of getting df to report
something both semi-sane and consistently so, for the inevitable and
enduring apps that'll remain naive, as Austin aptly puts it, about the
actual space available on the file system. The problem there is that
there are in effect two free spaces on Btrfs depending on what and how
much is going to get written.

-- 
Chris Murphy

* Re: Corrupted system due to imbalanced metadata chunks
  2016-05-17 15:45 Corrupted system due to imbalanced metadata chunks Peter Kese
  2016-05-17 18:03 ` Austin S. Hemmelgarn
  2016-05-17 19:52 ` Chris Murphy
@ 2016-05-18  5:09 ` Chris Murphy
  2 siblings, 0 replies; 9+ messages in thread
From: Chris Murphy @ 2016-05-18  5:09 UTC
  To: Btrfs BTRFS

On Tue, May 17, 2016 at 9:45 AM, Peter Kese <peter.kese@viidea.com> wrote:

> Then I tried a software upgrade (Ubuntu 15.10 -> 16.04) and it turned
> out that while there was more than 100 GB (45%) of free disk space,
> the upgrade process broke down somewhere in the middle reporting IO
> errors and lack of free disk space.

Kinda interesting, this appears in kernel 4.4.4

    btrfs: statfs: report zero available if metadata are exhausted
http://lkml.iu.edu/hypermail/linux/kernel/1603.0/01148.html

I think the OS upgrade case may still fail because there were probably
a lot of small files written inline in metadata chunks. So unless
there were a constant statfs check by the upgrader (like a per package
install check) then the first statfs check probably still overstates
the free space because it has no way of knowing there's about to be a
pile of small file writes.
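
A per-package check of the kind described could be sketched like this.
The install_one callback, the margin and the package list format are
hypothetical, and on Btrfs the statvfs numbers remain an estimate, so
this narrows the problem rather than solving it.

```python
import os


def available_bytes(path):
    """Bytes that statvfs() reports as available to unprivileged users."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def install_all(packages, target, install_one, margin=64 * 2**20):
    """Install (name, size_bytes) packages, re-checking space before each.

    'install_one' is a hypothetical callback that performs the real work.
    Returns the names installed before space ran short.
    """
    done = []
    for name, size in packages:
        # Re-query the filesystem every time instead of trusting the
        # figure obtained before the upgrade started.
        if available_bytes(target) < size + margin:
            break
        install_one(name)
        done.append(name)
    return done
```

With the 4.4.4 change mentioned above, statfs reports zero available
once metadata is exhausted, so a loop like this would stop at the next
package instead of failing in the middle of one.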

And I've seen the reverse happen where it was metadata chunks with
free space, the data chunks were full and the enospc happened because
no data chunk could be allocated. So it's not as simple as adjusting
the data/metadata allocation ratio.

Huh. I wonder if after say, 90 or 95% full, Btrfs just switches to
creating a mixed-bg?

-- 
Chris Murphy
