From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Brian Foster <bfoster@redhat.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/8 V2] xfs: log fixes for for-next
Date: Fri, 18 Jun 2021 09:43:08 +1000	[thread overview]
Message-ID: <20210617234308.GH664593@dread.disaster.area> (raw)
In-Reply-To: <20210617190519.GV158209@locust>

On Thu, Jun 17, 2021 at 12:05:19PM -0700, Darrick J. Wong wrote:
> On Thu, Jun 17, 2021 at 02:32:30PM -0400, Brian Foster wrote:
> > On Thu, Jun 17, 2021 at 06:26:09PM +1000, Dave Chinner wrote:
> > > Hi folks,
> > > 
> > > This is followup from the first set of log fixes for for-next that
> > > were posted here:
> > > 
> > > https://lore.kernel.org/linux-xfs/20210615175719.GD158209@locust/T/#mde2cf0bb7d2ac369815a7e9371f0303efc89f51b
> > > 
> > > The first two patches of this series are updates for those patches,
> > > change log below. The rest is the fix for the bigger issue we
> > > uncovered in investigating the generic/019 failures, being that
> > > we're triggering a zero-day bug in the way log recovery assigns LSNs
> > > to checkpoints.
> > > 
> > > The "simple" fix of using the same ordering code as the commit
> > > record for the start records in the CIL push turned into a lot of
> > > patches once I started cleaning it up, separating out all the
> > > different bits and finally realising all the things I needed to
> > > change to avoid unintentional logic/behavioural changes. Hence
> > > there's some code movement, some factoring, API changes to
> > > xlog_write(), changing where we attach callbacks to commit iclogs so
> > > they remain correctly ordered if there are multiple commit records
> > > in the one iclog and then, finally, strictly ordering the start
> > > records....
> > > 
> > > The original "simple fix" I tested last night ran almost a thousand
> > > cycles of generic/019 without a log hang or recovery failure of any
> > > kind. The refactored patchset has run a couple hundred cycles of
> > > g/019 and g/475 over the last few hours without a failure, so I'm
> > > posting this so we can get a review iteration done while I sleep so
> > > we can - hopefully - get this sorted out before the end of the week.
> > > 
> > 
> > My first spin of this included generic/019 and generic/475, ran for 18
> > or so iterations and 475 exploded with a stream of asserts followed by a
> > NULL pointer crash:
> > 
> > # grep -e Assertion -e BUG dmesg.out
> > ...
> > [ 7951.878058] XFS: Assertion failed: atomic_read(&buip->bui_refcount) > 0, file: fs/xfs/xfs_bmap_item.c, line: 57
> > [ 7952.261251] XFS: Assertion failed: atomic_read(&buip->bui_refcount) > 0, file: fs/xfs/xfs_bmap_item.c, line: 57
> > [ 7952.644444] XFS: Assertion failed: atomic_read(&buip->bui_refcount) > 0, file: fs/xfs/xfs_bmap_item.c, line: 57
> > [ 7953.027626] XFS: Assertion failed: atomic_read(&buip->bui_refcount) > 0, file: fs/xfs/xfs_bmap_item.c, line: 57
> > [ 7953.410804] BUG: kernel NULL pointer dereference, address: 000000000000031f
> > [ 7954.118973] BUG: unable to handle page fault for address: ffffa57ccf99fa98
> > 
> > I don't know if this is a regression, but I've not seen it before. I've
> > attempted to spin generic/475 since then to see if it reproduces again,
> > but so far I'm only running into some of the preexisting issues
> > associated with that test.

I've not seen anything like that. I can't see how the changes in the
patchset would affect BUI reference counting in any way. That seems
more like an underlying intent item shutdown reference count issue
to me (and we've had a *lot* of them in the past)....

> By any chance, do the two log recovery fixes I sent yesterday make those
> problems go away?
> 
> > I'll let it go a while more and probably
> > switch it back to running both sometime before the end of the day for an
> > overnight test.
> 
> Also, do the CIL livelocks go away if you apply only patches 1-2?
> 
> > A full copy of the assert and NULL pointer BUG splat is included below
> > for reference. It looks like the fault BUG splat ended up interspersed
> > or otherwise mangled, but I suspect that one is just fallout from the
> > immediately previous crash.
> 
> I have a question about the composition of this 8-patch series --
> which patches fix the new cil code, and which ones fix the out of order
> recovery problems?  I suspect that patches 1-2 are for the new CIL code,
> and 3-8 are to fix the recovery problems.

Yes. But don't think of 3-8 as fixing recovery problems - they are
fixing potential runtime data integrity issues (log force LSNs for
fsync are based on start LSNs) and journal head->tail overwrite
issues (because AIL ordering is start LSN based).

So, basically, we get the recovery fixes for free when we fix the
runtime start LSN ordering issues...

> Thinking with my distro kernel not-maintainer hat on, I'm considering
> how to backport whatever fixes emerge for the recovery ordering issue
> into existing kernels.  The way I see things right now, the CIL changes
> (+ fixes) and the ordering bug fixes are separate issues.  The log
> ordering problems should get fixed as soon as we have a practical
> solution; the CIL changes could get deferred if need be since it's a
> medium-high risk; and the real question is how to sequence all this?

The CIL changes in patches 1-2 are low risk - that's just a hang
caused by a logic error, and we fix that sort of thing all the time.

> (Or to put it another way: I'm still stuck going "oh wowwww this is a
> lot more change" while trying to understand patch 4)

It's not unreasonable given the amount of change that was made in
the first place. Really, though, once you take the tracing and code
movement out of it, the actual logic change is much, much smaller...

/me wonders if anyone remembers that I said up front that I
considered the changes to the log code completely unreviewable and
that there would be bugs that slip through both my testing and
review?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 50+ messages
2021-06-17  8:26 [PATCH 0/8 V2] xfs: log fixes for for-next Dave Chinner
2021-06-17  8:26 ` [PATCH 1/8] xfs: add iclog state trace events Dave Chinner
2021-06-17 16:45   ` Darrick J. Wong
2021-06-18 14:09   ` Christoph Hellwig
2021-06-17  8:26 ` [PATCH 2/8] xfs: don't wait on future iclogs when pushing the CIL Dave Chinner
2021-06-17 17:49   ` Darrick J. Wong
2021-06-17 21:55     ` Dave Chinner
2021-06-17  8:26 ` [PATCH 3/8] xfs: move xlog_commit_record to xfs_log_cil.c Dave Chinner
2021-06-17 12:57   ` kernel test robot
2021-06-17 12:57     ` kernel test robot
2021-06-17 17:50   ` Darrick J. Wong
2021-06-17 21:56     ` Dave Chinner
2021-06-18 14:16   ` Christoph Hellwig
2021-06-17  8:26 ` [PATCH 4/8] xfs: pass a CIL context to xlog_write() Dave Chinner
2021-06-17 14:46   ` kernel test robot
2021-06-17 14:46     ` kernel test robot
2021-06-17 20:24   ` Darrick J. Wong
2021-06-17 22:03     ` Dave Chinner
2021-06-17 22:18       ` Darrick J. Wong
2021-06-18 14:23   ` Christoph Hellwig
2021-06-17  8:26 ` [PATCH 5/8] xfs: factor out log write ordering from xlog_cil_push_work() Dave Chinner
2021-06-17 19:59   ` Darrick J. Wong
2021-06-18 14:27     ` Christoph Hellwig
2021-06-18 22:34       ` Dave Chinner
2021-06-17  8:26 ` [PATCH 6/8] xfs: separate out setting CIL context LSNs from xlog_write Dave Chinner
2021-06-17 20:28   ` Darrick J. Wong
2021-06-17 22:10     ` Dave Chinner
2021-06-17  8:26 ` [PATCH 7/8] xfs: attached iclog callbacks in xlog_cil_set_ctx_write_state() Dave Chinner
2021-06-17 20:55   ` Darrick J. Wong
2021-06-17 22:20     ` Dave Chinner
2021-06-17  8:26 ` [PATCH 8/8] xfs: order CIL checkpoint start records Dave Chinner
2021-06-17 21:31   ` Darrick J. Wong
2021-06-17 22:49     ` Dave Chinner
2021-06-17 18:32 ` [PATCH 0/8 V2] xfs: log fixes for for-next Brian Foster
2021-06-17 19:05   ` Darrick J. Wong
2021-06-17 20:06     ` Brian Foster
2021-06-17 20:26       ` Darrick J. Wong
2021-06-17 23:31         ` Brian Foster
2021-06-17 23:43     ` Dave Chinner [this message]
2021-06-18 13:08       ` Brian Foster
2021-06-18 13:55         ` Christoph Hellwig
2021-06-18 14:02           ` Christoph Hellwig
2021-06-18 22:28           ` Dave Chinner
2021-06-18 22:15         ` Dave Chinner
2021-06-18 22:48 ` Dave Chinner
2021-06-19 20:22   ` Darrick J. Wong
2021-06-20 22:18     ` Dave Chinner
