* BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
@ 2015-06-17  7:16 Marc MERLIN
  2015-06-17 10:11 ` Hugo Mills
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Marc MERLIN @ 2015-06-17  7:16 UTC (permalink / raw)
  To: linux-btrfs

I had a few power offs due to a faulty power supply, and my mdadm raid5
got into fail mode after 2 drives got kicked out since their sequence
numbers didn't match due to the abrupt power offs.

I brought the swraid5 back up by force assembling it with 4 drives (one
was really only a few sequence numbers behind), and it's doing a full
parity rebuild on the 5th drive that was farther behind.
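
For reference, the recovery was roughly along these lines (the device and
array names below are only placeholders, not the real ones here):

mdadm --stop /dev/md0                              # stop the failed array
mdadm --assemble --force /dev/md0 /dev/sd[b-e]1    # force-assemble the 4 close-in-sync drives
mdadm /dev/md0 --add /dev/sdf1                     # re-add the stale drive; kicks off the rebuild
cat /proc/mdstat                                   # watch rebuild progress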

So I can understand how I may have had a few blocks that are in a bad
state.
I'm getting a few (not many) of those messages in syslog.
BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)

Filesystem looks like this:
Label: 'btrfs_pool1'  uuid: 6358304a-2234-4243-b02d-4944c9af47d7
        Total devices 1 FS bytes used 8.29TiB
        devid    1 size 14.55TiB used 8.32TiB path /dev/mapper/dshelf1

gargamel:~# btrfs fi df /mnt/btrfs_pool1
Data, single: total=8.29TiB, used=8.28TiB
System, DUP: total=8.00MiB, used=920.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=14.00GiB, used=10.58GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

Kernel 3.19.8.

Just to make sure I understand, do those messages in syslog mean that my
metadata got corrupted a bit, but because I have 2 copies, btrfs can fix
the bad copy by using the good one?

Also, if my actual data got corrupted, am I correct that btrfs will
detect the checksum failure and give me a different error message of a
read error that cannot be corrected?

I'll do a scrub later, for now I have to wait 20 hours for the raid rebuild
first.
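
For the scrub itself the plan is roughly this (same mountpoint as the
btrfs fi df above; -B runs it in the foreground, -d prints per-device stats):

btrfs scrub start -Bd /mnt/btrfs_pool1
btrfs scrub status /mnt/btrfs_pool1    # progress/result if it's backgrounded instead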

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
  2015-06-17  7:16 BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432) Marc MERLIN
@ 2015-06-17 10:11 ` Hugo Mills
  2015-06-17 10:58   ` Sander
  2015-06-17 13:51 ` Duncan
  2015-06-17 14:58 ` Chris Murphy
  2 siblings, 1 reply; 8+ messages in thread
From: Hugo Mills @ 2015-06-17 10:11 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: linux-btrfs


On Wed, Jun 17, 2015 at 12:16:54AM -0700, Marc MERLIN wrote:
> I had a few power offs due to a faulty power supply, and my mdadm raid5
> got into fail mode after 2 drives got kicked out since their sequence
> numbers didn't match due to the abrupt power offs.
> 
> I brought the swraid5 back up by force assembling it with 4 drives (one
> was really only a few sequence numbers behind), and it's doing a full
> parity rebuild on the 5th drive that was farther behind.
> 
> So I can understand how I may have had a few blocks that are in a bad
> state.
> I'm getting a few (not many) of those messages in syslog.
> BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
> 
> Filesystem looks like this:
> Label: 'btrfs_pool1'  uuid: 6358304a-2234-4243-b02d-4944c9af47d7
>         Total devices 1 FS bytes used 8.29TiB
>         devid    1 size 14.55TiB used 8.32TiB path /dev/mapper/dshelf1
> 
> gargamel:~# btrfs fi df /mnt/btrfs_pool1
> Data, single: total=8.29TiB, used=8.28TiB
> System, DUP: total=8.00MiB, used=920.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=14.00GiB, used=10.58GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> Kernel 3.19.8.
> 
> Just to make sure I understand, do those messages in syslog mean that my
> metadata got corrupted a bit, but because I have 2 copies, btrfs can fix
> the bad copy by using the good one?
> 
> Also, if my actual data got corrupted, am I correct that btrfs will
> detect the checksum failure and give me a different error message of a
> read error that cannot be corrected?

   Yes, that's my reading of the situation. Note that the 3.19 kernel
is the earliest I would expect this to be able to happen, as it's the
first kernel that actually had the full set of parity RAID repair code
in it.

> I'll do a scrub later, for now I have to wait 20 hours for the raid rebuild
> first.

   You'll probably find that the rebuild is equivalent to a scrub anyway.

   Hugo.

-- 
Hugo Mills             | If you're not part of the solution, you're part of
hugo@... carfax.org.uk | the precipitate.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |



* Re: BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
  2015-06-17 10:11 ` Hugo Mills
@ 2015-06-17 10:58   ` Sander
  2015-06-17 11:01     ` Hugo Mills
  2015-06-17 16:19     ` Marc MERLIN
  0 siblings, 2 replies; 8+ messages in thread
From: Sander @ 2015-06-17 10:58 UTC (permalink / raw)
  To: Hugo Mills, Marc MERLIN, linux-btrfs

Hugo Mills wrote (ao):
> On Wed, Jun 17, 2015 at 12:16:54AM -0700, Marc MERLIN wrote:
> > I had a few power offs due to a faulty power supply, and my mdadm raid5
> > got into fail mode after 2 drives got kicked out since their sequence
> > numbers didn't match due to the abrupt power offs.

> > gargamel:~# btrfs fi df /mnt/btrfs_pool1
> > Data, single: total=8.29TiB, used=8.28TiB
> > System, DUP: total=8.00MiB, used=920.00KiB
> > System, single: total=4.00MiB, used=0.00B
> > Metadata, DUP: total=14.00GiB, used=10.58GiB
> > Metadata, single: total=8.00MiB, used=0.00B
> > GlobalReserve, single: total=512.00MiB, used=0.00B

> > I'll do a scrub later, for now I have to wait 20 hours for the raid
> > rebuild first.
> 
>    You'll probably find that the rebuild is equivalent to a scrub anyway.

He has mdadm raid, which is rebuilding. This is obviously not equivalent
to a btrfs scrub.

	Sander


* Re: BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
  2015-06-17 10:58   ` Sander
@ 2015-06-17 11:01     ` Hugo Mills
  2015-06-17 16:19     ` Marc MERLIN
  1 sibling, 0 replies; 8+ messages in thread
From: Hugo Mills @ 2015-06-17 11:01 UTC (permalink / raw)
  To: Sander; +Cc: Marc MERLIN, linux-btrfs


On Wed, Jun 17, 2015 at 12:58:35PM +0200, Sander wrote:
> Hugo Mills wrote (ao):
> > On Wed, Jun 17, 2015 at 12:16:54AM -0700, Marc MERLIN wrote:
> > > I had a few power offs due to a faulty power supply, and my mdadm raid5
> > > got into fail mode after 2 drives got kicked out since their sequence
> > > numbers didn't match due to the abrupt power offs.
> 
> > > gargamel:~# btrfs fi df /mnt/btrfs_pool1
> > > Data, single: total=8.29TiB, used=8.28TiB
> > > System, DUP: total=8.00MiB, used=920.00KiB
> > > System, single: total=4.00MiB, used=0.00B
> > > Metadata, DUP: total=14.00GiB, used=10.58GiB
> > > Metadata, single: total=8.00MiB, used=0.00B
> > > GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> > > I'll do a scrub later, for now I have to wait 20 hours for the raid
> > > rebuild first.
> > 
> >    You'll probably find that the rebuild is equivalent to a scrub anyway.
> 
> He has mdadm raid, which is rebuilding. This is obviously not equivalent
> to a btrfs scrub.

   Ah, thanks for the correction. Note to self: read more carefully
before replying.

   Hugo.

-- 
Hugo Mills             | If you're not part of the solution, you're part of
hugo@... carfax.org.uk | the precipitate.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |



* Re: BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
  2015-06-17  7:16 BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432) Marc MERLIN
  2015-06-17 10:11 ` Hugo Mills
@ 2015-06-17 13:51 ` Duncan
  2015-06-17 14:58 ` Chris Murphy
  2 siblings, 0 replies; 8+ messages in thread
From: Duncan @ 2015-06-17 13:51 UTC (permalink / raw)
  To: linux-btrfs

Marc MERLIN posted on Wed, 17 Jun 2015 00:16:54 -0700 as excerpted:

> I had a few power offs due to a faulty power supply, and my mdadm raid5
> got into fail mode after 2 drives got kicked out since their sequence
> numbers didn't match due to the abrupt power offs.
> 
> I brought the swraid5 back up by force assembling it with 4 drives (one
> was really only a few sequence numbers behind), and it's doing a full
> parity rebuild on the 5th drive that was farther behind.
> 
> So I can understand how I may have had a few blocks that are in a bad
> state.
> I'm getting a few (not many) of those messages in syslog.
> BTRFS: read error corrected: ino 1 off 226840576 (dev
> /dev/mapper/dshelf1 sector 459432)
> 
> Filesystem looks like this:
> Label: 'btrfs_pool1'  uuid: 6358304a-2234-4243-b02d-4944c9af47d7
>         Total devices 1 FS bytes used 8.29TiB
>         devid    1 size 14.55TiB used 8.32TiB path /dev/mapper/dshelf1
> 
> gargamel:~# btrfs fi df /mnt/btrfs_pool1
> Data, single: total=8.29TiB, used=8.28TiB
> System, DUP: total=8.00MiB, used=920.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=14.00GiB, used=10.58GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> Kernel 3.19.8.
> 
> Just to make sure I understand, do those messages in syslog mean that my
> metadata got corrupted a bit, but because I have 2 copies, btrfs can fix
> the bad copy by using the good one?

Yes.  Despite the confusion between btrfs raid5 and mdraid5, Hugo was 
correct there.  It's just the 3.19 kernel bit that he got wrong, since he 
was thinking btrfs raid.  Btrfs dup mode should be good going back many 
kernels.

> Also, if my actual data got corrupted, am I correct that btrfs will
> detect the checksum failure and give me a different error message of a
> read error that cannot be corrected?
> 
> I'll do a scrub later, for now I have to wait 20 hours for the raid
> rebuild first.

Yes again.

As I mentioned in a different thread a few hours ago, I have an SSD that 
is slowly going bad, relocating sectors, etc. (200-some relocated at this 
point by raw value; that attribute dropped to 100 "cooked" value on the 
first relocation and is now at 98, with a threshold of 36, so I figure it 
should be good for a few thousand relocations if I let it go that far).  
But it's in a btrfs raid1 with a reliable (no relocations yet) paired SSD, 
and I've been able to scrub-fix the errors so far.  I also have things 
backed up and a replacement ready to insert when I decide it's time, so 
I'm able to watch in more or less morbid fascination as the thing slowly 
dies, a sector at a time.  

The interesting thing is that with btrfs' checksumming and data integrity 
feature, I can continue to use the drive in raid1 even tho it's 
definitely bad enough to be all but unusable with ordinary filesystems.

Anyway, as a result of that, I'm getting lots of experience with scrubs 
and corrected errors.

One thing I'd strongly recommend.  Once the rebuild is complete and you 
do the scrub, there may well be both read/corrected errors, and 
unverified errors.  AFAIK, the unverified errors are a result of bad 
metadata blocks, so missing checksums for what they covered.  So once you 
finish the first scrub and have corrected most of the metadata block 
errors, do another scrub.  The idea is to repeat until there are no more 
unverified errors and everything is either corrected (the dup metadata) or 
uncorrectable (the single-copy data).  That's what I'm doing here, with both 
data and metadata as raid1 and thus correctable, tho in some instances the 
device triggers a new relocation on the second and occasionally (once?) 
third scrub.  That means more scrubs than I'd need if the problem were 
entirely in the past, as it sounds like yours is, or will be once the 
mdraid rebuild is done, anyway.
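
In shell terms the idea is roughly this -- an untested sketch, using the 
mountpoint from your btrfs fi df above, and assuming the -B completion 
summary prints a line like "corrected errors: ..., uncorrectable errors: 
..., unverified errors: ...":

# untested sketch: rerun the scrub until the completion summary stops
# reporting unverified errors
while btrfs scrub start -Bd /mnt/btrfs_pool1 | grep -q 'unverified errors: [1-9]'; do
        echo "unverified errors remain, scrubbing again"
done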

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
  2015-06-17  7:16 BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432) Marc MERLIN
  2015-06-17 10:11 ` Hugo Mills
  2015-06-17 13:51 ` Duncan
@ 2015-06-17 14:58 ` Chris Murphy
  2 siblings, 0 replies; 8+ messages in thread
From: Chris Murphy @ 2015-06-17 14:58 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: linux-btrfs

On Wed, Jun 17, 2015 at 1:16 AM, Marc MERLIN <marc@merlins.org> wrote:

> So I can understand how I may have had a few blocks that are in a bad
> state.
> I'm getting a few (not many) of those messages in syslog.
> BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)

I think more information is needed from around the time of this entry, maybe
the previous 20 entries or so. Btrfs thinking there is a read error is
different from it thinking there's a checksum error. For example, when I
willfully corrupt one sector that I know contains metadata, then read the
file or do a scrub:

[48466.824770] BTRFS: checksum error at logical 20971520 on dev
/dev/sdb, sector 57344: metadata leaf (level 0) in tree 3
[48466.829900] BTRFS: checksum error at logical 20971520 on dev
/dev/sdb, sector 57344: metadata leaf (level 0) in tree 3
[48466.834944] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[48466.853589] BTRFS: fixed up error at logical 20971520 on dev /dev/sdb


I'd expect in your case you have a bdev line that reads rd 1, corrupt 0.
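
You can also check the cumulative per-device counters directly, something
like the following (the exact field names here are from memory):

btrfs device stats /mnt/btrfs_pool1    # per-device write_io_errs / read_io_errs /
                                       # flush_io_errs / corruption_errs / generation_errs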


> Just to make sure I understand, do those messages in syslog mean that my
> metadata got corrupted a bit, but because I have 2 copies, btrfs can fix
> the bad copy by using the good one?

Probably but not enough information has been given to conclude that.


> Also, if my actual data got corrupted, am I correct that btrfs will
> detect the checksum failure and give me a different error message of a
> read error that cannot be corrected?

Yes, it looks like this when the file is directly read:

[ 1457.231316] BTRFS warning (device sdb): csum failed ino 257 off 0
csum 3703877302 expected csum 1978138932
[ 1457.231842] BTRFS warning (device sdb): csum failed ino 257 off 0
csum 3703877302 expected csum 1978138932

It looks like this during a scrub:

[ 1540.865520] BTRFS: checksum error at logical 12845056 on dev
/dev/sdb, sector 25088, root 5, inode 257, offset 0, length 4096,
links 1 (path: grub2-install)
[ 1540.865534] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[ 1540.866944] BTRFS: unable to fixup (regular) error at logical
12845056 on dev /dev/sdb


-- 
Chris Murphy


* Re: BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
  2015-06-17 10:58   ` Sander
  2015-06-17 11:01     ` Hugo Mills
@ 2015-06-17 16:19     ` Marc MERLIN
  2015-06-18  4:32       ` Duncan
  1 sibling, 1 reply; 8+ messages in thread
From: Marc MERLIN @ 2015-06-17 16:19 UTC (permalink / raw)
  To: Sander, Duncan, Chris Murphy; +Cc: Hugo Mills, linux-btrfs

On Wed, Jun 17, 2015 at 01:51:26PM +0000, Duncan wrote:
> > Also, if my actual data got corrupted, am I correct that btrfs will
> > detect the checksum failure and give me a different error message of a
> > read error that cannot be corrected?
> > 
> > I'll do a scrub later, for now I have to wait 20 hours for the raid
> > rebuild first.
> 
> Yes again.
 
Great, thanks for confirming.
Makes me happy to know that checksums and metadata DUP are helping me
out here.
With ext4 I'd have been worse off for sure.

> One thing I'd strongly recommend.  Once the rebuild is complete and you 
> do the scrub, there may well be both read/corrected errors, and 
> unverified errors.  AFAIK, the unverified errors are a result of bad 
> metadata blocks, so missing checksums for what they covered.  So once you 

I'm slightly confused here. If I have metadata DUP and checksums, how
can metadata blocks be unverified?
Data blocks being unverified I understand: it would mean the data or
checksum is bad, but I expect that's a different error message I haven't
seen yet.
(I use sec.pl, which emails me any btrfs kernel log line that's not
whitelisted as being known/OK.)

On Wed, Jun 17, 2015 at 08:58:18AM -0600, Chris Murphy wrote:
> I think more information is needed at the time of this entry, maybe
> the previous 20 entries or so. That Btrfs thinks there is a read error
> is different than when it thinks there's a checksum error. For example
> when I willfully corrupt one sector that I know contains metadata,
> then read the file or do a scrub:
> 
> [48466.824770] BTRFS: checksum error at logical 20971520 on dev
> /dev/sdb, sector 57344: metadata leaf (level 0) in tree 3
> [48466.829900] BTRFS: checksum error at logical 20971520 on dev
> /dev/sdb, sector 57344: metadata leaf (level 0) in tree 3
> [48466.834944] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
> [48466.853589] BTRFS: fixed up error at logical 20971520 on dev /dev/sdb
> 
> I'd expect in your case you have a bdev line that reads rd 1, corrupt 0.

Thankfully I have 0 checksum errors.

I looked at all the warnings I got last night, and none are repeated, so it
looks like it was just minor corruption from re-adding a raid drive that wasn't
100% in sync with the others, with btrfs nicely fixing things for me.
Nice! 

BTRFS: read error corrected: ino 1 off 205565952 (dev /dev/mapper/dshelf1 sector 417880)
BTRFS: read error corrected: ino 1 off 216420352 (dev /dev/mapper/dshelf1 sector 439080)
BTRFS: read error corrected: ino 1 off 216424448 (dev /dev/mapper/dshelf1 sector 439088)
BTRFS: read error corrected: ino 1 off 216473600 (dev /dev/mapper/dshelf1 sector 439184)
BTRFS: read error corrected: ino 1 off 226656256 (dev /dev/mapper/dshelf1 sector 459072)
BTRFS: read error corrected: ino 1 off 226693120 (dev /dev/mapper/dshelf1 sector 459144)
BTRFS: read error corrected: ino 1 off 226729984 (dev /dev/mapper/dshelf1 sector 459216)
BTRFS: read error corrected: ino 1 off 226734080 (dev /dev/mapper/dshelf1 sector 459224)
BTRFS: read error corrected: ino 1 off 226742272 (dev /dev/mapper/dshelf1 sector 459240)
BTRFS: read error corrected: ino 1 off 226758656 (dev /dev/mapper/dshelf1 sector 459272)
BTRFS: read error corrected: ino 1 off 226783232 (dev /dev/mapper/dshelf1 sector 459320)
BTRFS: read error corrected: ino 1 off 226811904 (dev /dev/mapper/dshelf1 sector 459376)
BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
BTRFS: read error corrected: ino 1 off 226865152 (dev /dev/mapper/dshelf1 sector 459480)
BTRFS: read error corrected: ino 1 off 226975744 (dev /dev/mapper/dshelf1 sector 459696)
BTRFS: read error corrected: ino 1 off 228589568 (dev /dev/mapper/dshelf1 sector 462848)
BTRFS: read error corrected: ino 1 off 228601856 (dev /dev/mapper/dshelf1 sector 462872)
BTRFS: read error corrected: ino 1 off 228610048 (dev /dev/mapper/dshelf1 sector 462888)
BTRFS: read error corrected: ino 1 off 228614144 (dev /dev/mapper/dshelf1 sector 462896)
BTRFS: read error corrected: ino 1 off 228618240 (dev /dev/mapper/dshelf1 sector 462904)
BTRFS: read error corrected: ino 1 off 228626432 (dev /dev/mapper/dshelf1 sector 462920)
BTRFS: read error corrected: ino 1 off 228630528 (dev /dev/mapper/dshelf1 sector 462928)
BTRFS: read error corrected: ino 1 off 228638720 (dev /dev/mapper/dshelf1 sector 462944)
BTRFS: read error corrected: ino 1 off 228642816 (dev /dev/mapper/dshelf1 sector 462952)
BTRFS: read error corrected: ino 1 off 228646912 (dev /dev/mapper/dshelf1 sector 462960)
BTRFS: read error corrected: ino 1 off 228651008 (dev /dev/mapper/dshelf1 sector 462968)
BTRFS: read error corrected: ino 1 off 228708352 (dev /dev/mapper/dshelf1 sector 463080)
BTRFS: read error corrected: ino 1 off 228712448 (dev /dev/mapper/dshelf1 sector 463088)
BTRFS: read error corrected: ino 1 off 228716544 (dev /dev/mapper/dshelf1 sector 463096)
BTRFS: read error corrected: ino 1 off 228720640 (dev /dev/mapper/dshelf1 sector 463104)
BTRFS: read error corrected: ino 1 off 228724736 (dev /dev/mapper/dshelf1 sector 463112)
BTRFS: read error corrected: ino 1 off 228728832 (dev /dev/mapper/dshelf1 sector 463120)
BTRFS: read error corrected: ino 1 off 228732928 (dev /dev/mapper/dshelf1 sector 463128)
BTRFS: read error corrected: ino 1 off 228737024 (dev /dev/mapper/dshelf1 sector 463136)
BTRFS: read error corrected: ino 1 off 228741120 (dev /dev/mapper/dshelf1 sector 463144)
BTRFS: read error corrected: ino 1 off 228745216 (dev /dev/mapper/dshelf1 sector 463152)
BTRFS: read error corrected: ino 1 off 228749312 (dev /dev/mapper/dshelf1 sector 463160)
BTRFS: read error corrected: ino 1 off 228753408 (dev /dev/mapper/dshelf1 sector 463168)
BTRFS: read error corrected: ino 1 off 228761600 (dev /dev/mapper/dshelf1 sector 463184)
BTRFS: read error corrected: ino 1 off 228765696 (dev /dev/mapper/dshelf1 sector 463192)
BTRFS: read error corrected: ino 1 off 228769792 (dev /dev/mapper/dshelf1 sector 463200)
BTRFS: read error corrected: ino 1 off 228782080 (dev /dev/mapper/dshelf1 sector 463224)
BTRFS: read error corrected: ino 1 off 228786176 (dev /dev/mapper/dshelf1 sector 463232)
BTRFS: read error corrected: ino 1 off 228794368 (dev /dev/mapper/dshelf1 sector 463248)
BTRFS: read error corrected: ino 1 off 228802560 (dev /dev/mapper/dshelf1 sector 463264)
BTRFS: read error corrected: ino 1 off 228810752 (dev /dev/mapper/dshelf1 sector 463280)
BTRFS: read error corrected: ino 1 off 228814848 (dev /dev/mapper/dshelf1 sector 463288)
BTRFS: read error corrected: ino 1 off 228818944 (dev /dev/mapper/dshelf1 sector 463296)
BTRFS: read error corrected: ino 1 off 228941824 (dev /dev/mapper/dshelf1 sector 463536)
BTRFS: read error corrected: ino 1 off 228954112 (dev /dev/mapper/dshelf1 sector 463560)
BTRFS: read error corrected: ino 1 off 228958208 (dev /dev/mapper/dshelf1 sector 463568)
BTRFS: read error corrected: ino 1 off 228974592 (dev /dev/mapper/dshelf1 sector 463600)
BTRFS: read error corrected: ino 1 off 228978688 (dev /dev/mapper/dshelf1 sector 463608)
BTRFS: read error corrected: ino 1 off 228982784 (dev /dev/mapper/dshelf1 sector 463616)
BTRFS: read error corrected: ino 1 off 229011456 (dev /dev/mapper/dshelf1 sector 463672)
BTRFS: read error corrected: ino 1 off 229060608 (dev /dev/mapper/dshelf1 sector 463768)
BTRFS: read error corrected: ino 1 off 229068800 (dev /dev/mapper/dshelf1 sector 463784)
BTRFS: read error corrected: ino 1 off 229093376 (dev /dev/mapper/dshelf1 sector 463832)
BTRFS: read error corrected: ino 1 off 229113856 (dev /dev/mapper/dshelf1 sector 463872)
BTRFS: read error corrected: ino 1 off 229158912 (dev /dev/mapper/dshelf1 sector 463960)
BTRFS: read error corrected: ino 1 off 229371904 (dev /dev/mapper/dshelf1 sector 464376)
BTRFS: read error corrected: ino 1 off 229445632 (dev /dev/mapper/dshelf1 sector 464520)
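
(Quick way to check for repeats, more or less -- adjust the log path to
wherever your syslog actually lands:

grep 'read error corrected' /var/log/syslog | grep -o 'sector [0-9]*' | sort | uniq -d

Empty output means no sector shows up twice.)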

> Yes, it looks like this when the file is directly read:
> 
> [ 1457.231316] BTRFS warning (device sdb): csum failed ino 257 off 0
> csum 3703877302 expected csum 1978138932
> [ 1457.231842] BTRFS warning (device sdb): csum failed ino 257 off 0
> csum 3703877302 expected csum 1978138932
> 
> It looks like this during a scrub:
> 
> [ 1540.865520] BTRFS: checksum error at logical 12845056 on dev
> /dev/sdb, sector 25088, root 5, inode 257, offset 0, length 4096,
> links 1 (path: grub2-install)
> [ 1540.865534] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
> [ 1540.866944] BTRFS: unable to fixup (regular) error at logical
> 12845056 on dev /dev/sdb

The only csum issue I have is this one:
BTRFS: csum mismatch on free space cache

and btrfs knows how to handle that one.

Thanks for the answers, it sounds like I'm all good.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: BTRFS: read error corrected: ino 1 off 226840576 (dev /dev/mapper/dshelf1 sector 459432)
  2015-06-17 16:19     ` Marc MERLIN
@ 2015-06-18  4:32       ` Duncan
  0 siblings, 0 replies; 8+ messages in thread
From: Duncan @ 2015-06-18  4:32 UTC (permalink / raw)
  To: linux-btrfs

Marc MERLIN posted on Wed, 17 Jun 2015 09:19:36 -0700 as excerpted:

> On Wed, Jun 17, 2015 at 01:51:26PM +0000, Duncan wrote:
>> > Also, if my actual data got corrupted, am I correct that btrfs will
>> > detect the checksum failure and give me a different error message of
>> > a read error that cannot be corrected?
>> > 
>> > I'll do a scrub later, for now I have to wait 20 hours for the raid
>> > rebuild first.
>> 
>> Yes again.
>  
> Great, thanks for confirming.
> Makes me happy to know that checksums and metadata DUP are helping me
> out here.  With ext4 I'd have been worse off for sure.
> 
>> One thing I'd strongly recommend.  Once the rebuild is complete and you
>> do the scrub, there may well be both read/corrected errors, and
>> unverified errors.  AFAIK, the unverified errors are a result of bad
>> metadata blocks, so missing checksums for what they covered.  So once
>> you
> 
> I'm slightly confused here. If I have metadata DUP and checksums, how
> can metadata blocks be unverified?
> Data blocks being unverified, I understand, it would mean the data or
> checksum is bad, but I expect that's a different error message I haven't
> seen yet.

Backing up a bit to better explain what I'm seeing here...

What I'm getting here, when the sectors go unreadable on the (slowly) 
failing SSD, is actually a SATA level timeout, which btrfs (correctly) 
interprets as a read error.  But it wouldn't really matter whether it was 
a read error or a corruption error, btrfs would respond the same -- 
because both data and metadata are btrfs raid1 here, it would fetch and 
verify the other copy of the block from the raid1 mirror device, and 
assuming it verified (which it should since the other device is still in 
great condition, zero relocations), rewrite it over the one it couldn't 
read.

Back on the failing device, the rewrite triggers a sector relocation, and 
assuming it doesn't fall in the bad area too, that block is now clean.  
(If it does fall in the defective area, I simply have to repeat the scrub 
another time or two, until there are no more errors.)


But, and this is what I was trying to explain earlier but skipped a step 
I figured was more obvious than it apparently was, btrfs works with 
trees, including a metadata tree.  So each block of metadata that has 
checksums covering actual data is in turn itself checksummed by a 
metadata block one step closer to the metadata root block, multiple 
levels deep.

I should mention here that this is my non-coder understanding.  If a dev 
says it works differently...

It's these multiple metadata levels and the chained checksums for them 
that I was referencing.  Suppose it's a metadata block that fails, not a 
data block.  That metadata block will be checksummed, and will in turn 
contain checksums for other blocks, which might be either data blocks, or 
other metadata blocks, a level closer to the data (and further from the 
root) than the failed block.

Because the metadata block failed (either checksum failure or read 
error, shouldn't matter at this point), whatever checksums it contained, 
whether for data, or for other metadata blocks, will be unverified.  If 
the affected metadata block is close to the root of the tree, the effect 
could in theory domino thru to several further levels.

These checksum unverified blocks (because the block containing the 
checksums failed) will show up as unverified errors, and whatever that 
checksum was supposed to cover, whether other metadata blocks or data 
blocks, won't be checked in that scrub round, because the level above it 
can't be verified.

Given a checksum-verified raid1 copy on the mirror device, the original 
failed block will be rewritten.  But if it's metadata, whatever checksums 
it in turn contained will still not be verified in that scrub round.  
Again, these show up as unverified errors.

By running scrub repeatedly, however, now that the first error has been 
fixed by the rewrite from the good copy, the checksums it contained can 
now in turn be checked.  If they all verify, great.  If not, another 
rewrite will be triggered, fixing them; but if those checksums were in 
turn for other metadata blocks, now /those/ will need to be checked and will 
show up as unverified.

So depending on where the bad metadata block was located in the 
metadata tree, a second, third, possibly even fourth, scrub may be 
needed in order to correct all the errors at all levels of the 
metadata tree, thereby fixing in turn each level of unverified errors 
exposed as the level above it (closer to root) was fixed.


Of course, if your scrub listed only corrected errors (metadata, since it's 
dup in your case) or uncorrectable errors (data, since it's single in your 
case, or metadata with both copies bad), and no unverified errors, then at 
least in theory, a second scrub shouldn't find any further errors to 
correct.  Only if you see unverified errors should it be necessary to 
repeat the scrub, but then you might need to repeat it several times as 
each run will expose another level to checksum verification that was 
previously unverified.
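
The per-device raw counters from scrub make it easy to see which case
you're in after each run -- something like this, if memory serves on the
flag and field names:

btrfs scrub status -R /mnt/btrfs_pool1    # raw stats, including read_errors, csum_errors,
                                          # verify_errors, corrected_errors,
                                          # uncorrectable_errors and unverified_errors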

Of course, an extra scrub run shouldn't hurt anything in any case.  It'll 
just have nothing it can fix, and will only cost time.  (Tho on multi-TB 
spinning rust that time could be significant!)


Hopefully it makes more sense now, given that I've included the critical 
information about multi-level metadata trees that I had skipped as 
obvious the first time.  Again, this is my understanding as a btrfs 
using admin and list regular, not a coder.  If a dev says the code 
doesn't work that way, he's most likely correct.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


