replacing a disk in a btrfs multi disk array with raid10

* replacing a disk in a btrfs multi disk array with raid10
@ 2020-08-03  5:26 Norbert Preining
  2020-08-03  6:15 ` Chris Murphy
  0 siblings, 1 reply; 4+ messages in thread
From: Norbert Preining @ 2020-08-03  5:26 UTC
  To: linux-btrfs

Hi all

(please Cc)

I am running Linux 5.7 or 5.8 on a btrfs array of 7 disks, with metadata
and data both on raid1, which contains the complete system.
(btrfs balance start -dconvert=raid1 -mconvert=raid1 /)

Although btrfs device stats / doesn't show any errors, SMART warns about
one disk (reallocated sector count property) and I was pondering
replacing the device.

What is the currently suggested method, given that I cannot plug another
disk into the computer? All slots are used up (thus a btrfs replace will
not work, as far as I understand).

Do I need to:
- shutdown
- physically replace disk
- reboot into rescue system
- mount in degraded mode
- add the new device
- resize the file system (new disk would be bigger)
- start a new rebalancing
	(for the rebalance, do I need to give the
	same -dconvert=raid1 -mconvert=raid1 arguments?)

Thanks for any guidance (and please Cc)

All the best

Norbert

--
PREINING Norbert                              https://www.preining.info
Accelia Inc. + IFMGA ProGuide + TU Wien + JAIST + TeX Live + Debian Dev
GPG: 0x860CDC13   fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13

* Re: replacing a disk in a btrfs multi disk array with raid10
  2020-08-03  5:26 replacing a disk in a btrfs multi disk array with raid10 Norbert Preining
@ 2020-08-03  6:15 ` Chris Murphy
  2020-08-03  7:47   ` Norbert Preining
  2020-10-09  4:20   ` Norbert Preining
  0 siblings, 2 replies; 4+ messages in thread
From: Chris Murphy @ 2020-08-03  6:15 UTC
  To: Norbert Preining; +Cc: Btrfs BTRFS

On Sun, Aug 2, 2020 at 11:51 PM Norbert Preining <norbert@preining.info> wrote:
>
> Hi all
>
> (please Cc)
>
> I am running Linux 5.7 or 5.8 on a btrfs array of 7 disks, with metadata
> and data both on raid1, which contains the complete system.
> (btrfs balance start -dconvert=raid1 -mconvert=raid1 /)
>
> Although btrfs device stats / doesn't show any errors, SMART warns about
> one disk (reallocated sector count property) and I was pondering
> replacing the device.

Some of these are considered normal. I suggest making sure each
drive's SCT ERC value is less than the SCSI command timer. You want
the drive to give up on reading a sector before the kernel considers
the command "overdue" and does a link reset - losing the contents of
the command queue. Upon read error, the drive reports the sector LBA
so that Btrfs can automatically do a fixup.

More info here. It applies to mdadm, lvm, and Btrfs raid.
https://raid.wiki.kernel.org/index.php/Timeout_Mismatch

Once you've done that, do a btrfs scrub.
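
A minimal sketch of the timeout check described above, assuming a drive
that supports SCT ERC and a placeholder device name sdX (neither is from
this thread). smartctl reports the ERC limit in deciseconds, while the
kernel's command timer in sysfs is in seconds, so the comparison has to
convert units. The helper name erc_below_timer is made up for illustration.

```shell
#!/bin/sh
# erc_below_timer ERC_DECISECONDS TIMER_SECONDS
# Succeeds only if ERC is enabled (non-zero) and the drive gives up on a
# bad sector before the kernel's SCSI command timer expires.
erc_below_timer() {
    [ "$1" -gt 0 ] && [ "$1" -lt $(( $2 * 10 )) ]
}

# Reading or changing the real values needs root and real hardware, so those
# steps are shown only as comments (sdX is a placeholder):
#   smartctl -l scterc /dev/sdX             # show the current read/write ERC
#   cat /sys/block/sdX/device/timeout       # kernel command timer, in seconds
#   smartctl -l scterc,70,70 /dev/sdX       # set ERC to 7.0 s, if supported
#   btrfs scrub start -B /                  # then scrub, staying in the foreground

# Example values: ERC of 70 deciseconds against a 30 second timer.
if erc_below_timer 70 30; then
    echo "drive gives up before the kernel timer: ok"
else
    echo "timeout mismatch: lower ERC or raise the kernel timer"
fi
```

If a drive does not support SCT ERC at all, the wiki page linked above
describes raising the kernel timer instead.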

>
> What is the currently suggested method given that I cannot plug in
> another disk into the computer, all slots are used up (thus a btrfs
> replace will not work as far as I understand).

btrfs replace will work whether the drive is present or not. It's just
safer to do it with the drive present because you don't have to mount
degraded.
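
As a rough sketch of the two cases above: the helper below only prints the
commands (a dry run) rather than running them. The function name, device
paths, devid and mount point are all placeholders, not values from this
thread. When the old drive has already been pulled, it is referred to by
its numeric devid and the filesystem is first mounted degraded from any
remaining member device.

```shell
#!/bin/sh
# replace_cmds SRC NEW MNT [MEMBER]
#   SRC    old device path if still attached, otherwise its numeric devid
#   NEW    the replacement device
#   MNT    mount point of the filesystem
#   MEMBER any remaining member device (only needed for the degraded mount)
replace_cmds() {
    src=$1 new=$2 mnt=$3 member=${4:-}
    case $src in
        /dev/*) ;;  # old drive still attached: the filesystem mounts normally
        *) echo "mount -o degraded $member $mnt" ;;  # old drive gone
    esac
    echo "btrfs replace start $src $new $mnt"
    echo "btrfs replace status $mnt"
}

echo "# old drive still present:"
replace_cmds /dev/sdc /dev/sdh /mnt

echo "# old drive already removed (devid 3 is a placeholder):"
replace_cmds 3 /dev/sdh /mnt /dev/sdb
```

The devid of each member device is listed by 'btrfs filesystem show'.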

> Do I need to:
> - shutdown
> - physically replace disk
> - reboot into rescue system
> - mount in degraded mode
> - add the new device

Use 'btrfs replace'

> - resize the file system (new disk would be bigger)

Currently 'btrfs replace' does require a separate resize step. 'device
add' doesn't; the resize is implied by the command.
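
A small sketch of that separate resize step, again as a dry run that only
prints the command. The helper name and the devid 3 are placeholders. The
'<devid>:max' form grows one member device to the full size of the
underlying disk.

```shell
#!/bin/sh
# resize_to_max DEVID MNT
# Print the command that grows device DEVID of the btrfs mounted at MNT
# to the maximum size its disk or partition allows.
resize_to_max() {
    echo "btrfs filesystem resize $1:max $2"
}

# devid 3 and / are placeholders; take the devid from 'btrfs filesystem show'.
resize_to_max 3 /
```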

> - start a new rebalancing
>         (for the rebalance, do I need to give the
>         same -dconvert=raid1 -mconvert=raid1 arguments?)

Not necessary. But it's worth checking 'btrfs fi us -T' and making
sure everything is raid1 as you expect.
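
One way to automate that check, as a sketch only: the filter below reads
the simpler 'btrfs filesystem df' output rather than the 'btrfs fi us -T'
table mentioned above (a swapped-in command, chosen because its
one-line-per-profile format is easy to parse). The function name and the
sample numbers are invented for illustration.

```shell
#!/bin/sh
# non_raid1_profiles: read 'btrfs filesystem df' style output on stdin and
# print every Data/Metadata/System line whose profile is not RAID1.
# Exit status is 0 when all of them are RAID1, 1 otherwise.
# GlobalReserve is ignored; it is always reported as "single".
non_raid1_profiles() {
    awk -F'[,:] *' '
        $1 ~ /^(Data|Metadata|System)$/ && $2 != "RAID1" { print; bad = 1 }
        END { exit bad + 0 }
    '
}

# Illustrative sample input (made-up sizes), with one leftover single chunk.
sample='Data, RAID1: total=1.00GiB, used=512.00MiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, single: total=1.00GiB, used=112.00MiB
GlobalReserve, single: total=16.00MiB, used=0.00B'

printf '%s\n' "$sample" | non_raid1_profiles \
    || echo "some chunks are not raid1; a convert balance would be needed"

# On a live system: btrfs filesystem df / | non_raid1_profiles
```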

-- 
Chris Murphy

* Re: replacing a disk in a btrfs multi disk array with raid10
  2020-08-03  6:15 ` Chris Murphy
@ 2020-08-03  7:47   ` Norbert Preining
  2020-10-09  4:20   ` Norbert Preining
  1 sibling, 0 replies; 4+ messages in thread
From: Norbert Preining @ 2020-08-03  7:47 UTC
  To: Btrfs BTRFS

Hi Chris,

thanks for your answer, that is very much appreciated.

On Mon, 03 Aug 2020, Chris Murphy wrote:
> Some of these are considered normal. I suggest making sure each
> https://raid.wiki.kernel.org/index.php/Timeout_Mismatch

Thanks, will read up on that.

> Once you've done that, do a btrfs scrub.

Happening regularly, but I will kick one off anyway.

> btrfs replace will work whether the drive is present or not. It's just
> safer to do it with the drive present because you don't have to mount
> degraded.

Ok.

I wasn't sure whether I could mount without -o degraded, given that all
the metadata and data are on raid1. And then, I don't know what the
Debian initramfs does in that case - that is probably the more
interesting surprise.

> > - add the new device
> 
> Use 'btrfs replace'

Thanks, noted.

> Currently 'btrfs replace' does require a separate resize step. 'device
> add' doesn't; the resize is implied by the command.

That is a somewhat logical approach, I agree.

> > - start a new rebalancing
> >         (for the rebalance, do I need to give the
> >         same -dconvert=raid1 -mconvert=raid1 arguments?)
> 
> Not necessary. But it's worth checking 'btrfs fi us -T' and making
> sure everything is raid1 as you expect.

Thanks, good to know.


Again, thanks a lot for all the details - I couldn't deduce most of them
from the wiki page on multiple devices. Your email is extremely helpful!

All the best

Norbert

* Re: replacing a disk in a btrfs multi disk array with raid10
  2020-08-03  6:15 ` Chris Murphy
  2020-08-03  7:47   ` Norbert Preining
@ 2020-10-09  4:20   ` Norbert Preining
  1 sibling, 0 replies; 4+ messages in thread
From: Norbert Preining @ 2020-10-09  4:20 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS

Hi Chris,

(please Cc)

sorry for the late reply - real life.

It turned out that the disk model I use is well known to misreport the
reallocated sector count, and thus the SMART warning can be ignored.

But I had to deal with (temporary) loss of one disk. Fortunately, 
Debian's initramfs dropped me into a proper shell where I could mount
the array in degraded mode and just remove the device.
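
For reference, a dry-run sketch of what such a recovery could look like
from a rescue shell. It only prints the commands. The helper name, the
member device /dev/sda1 and the mount point /mnt are placeholders, not the
values actually used here.

```shell
#!/bin/sh
# drop_missing_cmds MEMBER MNT
# Print the commands to mount a btrfs array that lacks one device and then
# drop the absent device from the filesystem. 'missing' is a keyword that
# btrfs understands; it stands for the device that is no longer present.
drop_missing_cmds() {
    echo "mount -o degraded $1 $2"
    echo "btrfs device remove missing $2"
}

drop_missing_cmds /dev/sda1 /mnt
```

With raid1, removing the missing device re-creates the lost copies on the
remaining disks, so it needs enough free space there.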

Just one hiccup I ran into: **after** some time I could re-connect the
one disk from the array that was missing (I needed an x1 NVMe extender
which I didn't have at the beginning). I thought reconnecting would be as
simple as
	btrfs device add -f /dev/nvme0n1p1 /
but it turned out that, because that disk had been part of the array, it
was rejected. Even using the -f option did not work. In the end I had to
fdisk the drive and trash the partition table and btrfs info to get it
ready to be re-added.
Full story https://www.preining.info/blog/2020/09/dealing-with-lost-disks-in-a-btrfs-array/
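
A possibly less drastic alternative to trashing the partition table, shown
as a dry-run sketch: wipefs (from util-linux) erases the filesystem
signature on the partition. This is not what was done in this thread, and
it is an untested assumption that it is enough to make 'btrfs device add'
accept the device in this situation. The helper name is made up; the
device and mount point are taken from the command above.

```shell
#!/bin/sh
# readd_cmds DEV MNT
# Print the commands to erase stale filesystem signatures from DEV and then
# add it to the btrfs mounted at MNT as a fresh device.
# WARNING: wipefs --all is destructive; double-check DEV before running it.
readd_cmds() {
    echo "wipefs --all $1"
    echo "btrfs device add $1 $2"
}

readd_cmds /dev/nvme0n1p1 /
```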

Anyway, all surprisingly smooth. Thanks to all of you.

Best

Norbert
