* [PATCH 0/2] xfs: fix buffer use after free on unpin abort
From: Brian Foster @ 2021-05-11 13:52 UTC
To: linux-xfs

Hi all,

Here's a proper v1 of the previously posted RFC to address a subtle
buffer use after free in the unpin abort sequence for buffer log items.
Dave had previously suggested that the underlying problem here is that
bli's are effectively used by the AIL unreferenced. While this makes a
lot of sense, this is a long standing design detail that subtly impacts
code related to log item processing, AIL processing, buffer I/O, as well
as potentially log recovery. In contrast, the immediate problem that
leads to the use after free is lack of a buffer hold in a context that
already explicitly acquires a hold for the problematic simulated I/O
failure sequence. Given the significant cost/risk vs. benefit imbalance
of a design rework, I've opted to make the minimal change to fix this
bug and defer broader rework to a standalone effort.

Patch 1 basically reorders the preexisting buffer hold to accommodate
the flaw that the AIL does not hold a reference to the bli (and thus
does not maintain the associated buffer hold). This preserves the
existing isolation logic and prevents the associated UAF. This survives
an fstests run and is going on 6k iterations of generic/019 (which
previously reproduced the problem in 2-3k iterations) without any
explosions.

Thoughts, reviews, flames appreciated.

Brian

v1:
- Rework patch 1 to hold conditionally in the abort case and document
  the underlying design flaw.
- Add patch 2 to remove some unused code.
rfc: https://lore.kernel.org/linux-xfs/20210503121816.561340-1-bfoster@redhat.com/

Brian Foster (2):
  xfs: hold buffer across unpin and potential shutdown processing
  xfs: remove dead stale buf unpin handling code

 fs/xfs/xfs_buf_item.c | 57 +++++++++++++++++--------------------------
 1 file changed, 22 insertions(+), 35 deletions(-)

--
2.26.3
* [PATCH 1/2] xfs: hold buffer across unpin and potential shutdown processing
From: Brian Foster @ 2021-05-11 13:52 UTC
To: linux-xfs

The special processing used to simulate a buffer I/O failure on fs
shutdown has a difficult to reproduce race that can result in a use
after free of the associated buffer. Consider a buffer that has been
committed to the on-disk log and thus is AIL resident. The buffer
lands on the writeback delwri queue, but is subsequently locked,
committed and pinned by another transaction before submitted for
I/O. At this point, the buffer is stuck on the delwri queue as it
cannot be submitted for I/O until it is unpinned. A log checkpoint
I/O failure occurs sometime later, which aborts the bli. The unpin
handler is called with the aborted log item, drops the bli reference
count, the pin count, and falls into the I/O failure simulation
path.

The potential problem here is that once the pin count falls to zero
in ->iop_unpin(), xfsaild is free to retry delwri submission of the
buffer at any time, before the unpin handler even completes. If
delwri queue submission wins the race to the buffer lock, it
observes the shutdown state and simulates the I/O failure itself.
This releases both the bli and delwri queue holds and frees the
buffer while xfs_buf_item_unpin() sits on xfs_buf_lock() waiting to
run through the same failure sequence. This problem is rare and
requires many iterations of fstest generic/019 (which simulates disk
I/O failures) to reproduce.

To avoid this problem, grab a hold on the buffer before the log item
is unpinned if the associated item has been aborted and will require
a simulated I/O failure. The hold is already required for the
simulated I/O failure, so the ordering simply guarantees the unpin
handler access to the buffer before it is unpinned and thus
processed by the AIL. This particular ordering is required so long
as the AIL does not acquire a reference on the bli, which is the
long term solution to this problem.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_buf_item.c | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index fb69879e4b2b..7ff31788512b 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -475,17 +475,8 @@ xfs_buf_item_pin(
 }
 
 /*
- * This is called to unpin the buffer associated with the buf log
- * item which was previously pinned with a call to xfs_buf_item_pin().
- *
- * Also drop the reference to the buf item for the current transaction.
- * If the XFS_BLI_STALE flag is set and we are the last reference,
- * then free up the buf log item and unlock the buffer.
- *
- * If the remove flag is set we are called from uncommit in the
- * forced-shutdown path. If that is true and the reference count on
- * the log item is going to drop to zero we need to free the item's
- * descriptor in the transaction.
+ * This is called to unpin the buffer associated with the buf log item which
+ * was previously pinned with a call to xfs_buf_item_pin().
  */
 STATIC void
 xfs_buf_item_unpin(
@@ -502,12 +493,26 @@ xfs_buf_item_unpin(
 
 	trace_xfs_buf_item_unpin(bip);
 
+	/*
+	 * Drop the bli ref associated with the pin and grab the hold required
+	 * for the I/O simulation failure in the abort case. We have to do this
+	 * before the pin count drops because the AIL doesn't acquire a bli
+	 * reference. Therefore if the refcount drops to zero, the bli could
+	 * still be AIL resident and the buffer submitted for I/O (and freed on
+	 * completion) at any point before we return. This can be removed once
+	 * the AIL properly holds a reference on the bli.
+	 */
 	freed = atomic_dec_and_test(&bip->bli_refcount);
-
+	if (freed && !stale && remove)
+		xfs_buf_hold(bp);
 	if (atomic_dec_and_test(&bp->b_pin_count))
 		wake_up_all(&bp->b_waiters);
 
-	if (freed && stale) {
+	/* nothing to do but drop the pin count if the bli is active */
+	if (!freed)
+		return;
+
+	if (stale) {
 		ASSERT(bip->bli_flags & XFS_BLI_STALE);
 		ASSERT(xfs_buf_islocked(bp));
 		ASSERT(bp->b_flags & XBF_STALE);
@@ -550,13 +555,13 @@ xfs_buf_item_unpin(
 			ASSERT(bp->b_log_item == NULL);
 		}
 		xfs_buf_relse(bp);
-	} else if (freed && remove) {
+	} else if (remove) {
 		/*
 		 * The buffer must be locked and held by the caller to simulate
-		 * an async I/O failure.
+		 * an async I/O failure. We acquired the hold for this case
+		 * before the buffer was unpinned.
 		 */
 		xfs_buf_lock(bp);
-		xfs_buf_hold(bp);
 		bp->b_flags |= XBF_ASYNC;
 		xfs_buf_ioend_fail(bp);
 	}
--
2.26.3
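For readers following the ordering argument in patch 1, here is a rough userspace sketch of why the hold has to be taken before the pin count drops. Everything in it is hypothetical: `toy_buf`, `toy_hold()`, `toy_rele()` and `toy_unpin_abort()` are stand-ins invented for illustration, not the kernel's structures or helpers, and the race is modelled deterministically by letting the "other side" run at the worst possible moment (right after the pin count reaches zero), where it drops the two holds the commit message mentions (bli and delwri queue).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a buffer; not the kernel's xfs_buf. */
struct toy_buf {
	int hold;	/* models the buffer hold (reference) count */
	int pin;	/* models the pin count */
	bool freed;	/* set when the last hold is dropped */
};

static void toy_hold(struct toy_buf *bp)
{
	assert(!bp->freed);	/* taking a hold on a freed buffer is the bug */
	bp->hold++;
}

static void toy_rele(struct toy_buf *bp)
{
	assert(!bp->freed && bp->hold > 0);
	if (--bp->hold == 0)
		bp->freed = true;
}

/*
 * Models the racing delwri submission: it observes the shutdown and
 * drops the holds it is responsible for (assumed to be two here).
 */
static void toy_racing_ioend(struct toy_buf *bp)
{
	toy_rele(bp);
	toy_rele(bp);
}

/*
 * Models the abort branch of the unpin handler. Returns true if the
 * handler would have touched a buffer that was already freed.
 */
bool toy_unpin_abort(struct toy_buf *bp, bool hold_first)
{
	if (hold_first)
		toy_hold(bp);		/* patched order: hold, then unpin */

	if (--bp->pin == 0)
		toy_racing_ioend(bp);	/* worst case: the race fires here */

	if (bp->freed)
		return true;		/* old order: buffer is already gone */

	if (!hold_first)
		toy_hold(bp);		/* old order: hold taken too late */

	toy_rele(bp);			/* simulated I/O failure drops our hold */
	return false;
}
```

Starting from two holds and one pin, the old ordering reports a use after free in this model, while the patched ordering keeps the buffer alive until the unpin path drops its own hold. Real kernel timing is of course not deterministic like this; the sketch only fixes the interleaving that the commit message describes.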
* Re: [PATCH 1/2] xfs: hold buffer across unpin and potential shutdown processing
From: Darrick J. Wong @ 2021-05-12 1:52 UTC
To: Brian Foster; +Cc: linux-xfs

On Tue, May 11, 2021 at 09:52:56AM -0400, Brian Foster wrote:
> The special processing used to simulate a buffer I/O failure on fs
> shutdown has a difficult to reproduce race that can result in a use
> after free of the associated buffer. Consider a buffer that has been
> committed to the on-disk log and thus is AIL resident. The buffer
> lands on the writeback delwri queue, but is subsequently locked,
> committed and pinned by another transaction before submitted for
> I/O. At this point, the buffer is stuck on the delwri queue as it
> cannot be submitted for I/O until it is unpinned. A log checkpoint
> I/O failure occurs sometime later, which aborts the bli. The unpin
> handler is called with the aborted log item, drops the bli reference
> count, the pin count, and falls into the I/O failure simulation
> path.
>
> The potential problem here is that once the pin count falls to zero
> in ->iop_unpin(), xfsaild is free to retry delwri submission of the
> buffer at any time, before the unpin handler even completes. If
> delwri queue submission wins the race to the buffer lock, it
> observes the shutdown state and simulates the I/O failure itself.
> This releases both the bli and delwri queue holds and frees the
> buffer while xfs_buf_item_unpin() sits on xfs_buf_lock() waiting to
> run through the same failure sequence. This problem is rare and
> requires many iterations of fstest generic/019 (which simulates disk
> I/O failures) to reproduce.
>
> To avoid this problem, grab a hold on the buffer before the log item
> is unpinned if the associated item has been aborted and will require
> a simulated I/O failure. The hold is already required for the
> simulated I/O failure, so the ordering simply guarantees the unpin
> handler access to the buffer before it is unpinned and thus
> processed by the AIL. This particular ordering is required so long
> as the AIL does not acquire a reference on the bli, which is the
> long term solution to this problem.

Are you working on that too, or are we just going to let that lie for
the time being? :)

> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_buf_item.c | 37 +++++++++++++++++++++----------------
>  1 file changed, 21 insertions(+), 16 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index fb69879e4b2b..7ff31788512b 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -475,17 +475,8 @@ xfs_buf_item_pin(
>  }
>
>  /*
> - * This is called to unpin the buffer associated with the buf log
> - * item which was previously pinned with a call to xfs_buf_item_pin().
> - *
> - * Also drop the reference to the buf item for the current transaction.
> - * If the XFS_BLI_STALE flag is set and we are the last reference,
> - * then free up the buf log item and unlock the buffer.
> - *
> - * If the remove flag is set we are called from uncommit in the
> - * forced-shutdown path. If that is true and the reference count on
> - * the log item is going to drop to zero we need to free the item's
> - * descriptor in the transaction.
> + * This is called to unpin the buffer associated with the buf log item which
> + * was previously pinned with a call to xfs_buf_item_pin().
>   */
>  STATIC void
>  xfs_buf_item_unpin(
> @@ -502,12 +493,26 @@ xfs_buf_item_unpin(
>
>  	trace_xfs_buf_item_unpin(bip);
>
> +	/*
> +	 * Drop the bli ref associated with the pin and grab the hold required
> +	 * for the I/O simulation failure in the abort case. We have to do this
> +	 * before the pin count drops because the AIL doesn't acquire a bli
> +	 * reference. Therefore if the refcount drops to zero, the bli could
> +	 * still be AIL resident and the buffer submitted for I/O (and freed on
> +	 * completion) at any point before we return. This can be removed once
> +	 * the AIL properly holds a reference on the bli.
> +	 */
>  	freed = atomic_dec_and_test(&bip->bli_refcount);
> -
> +	if (freed && !stale && remove)
> +		xfs_buf_hold(bp);
>  	if (atomic_dec_and_test(&bp->b_pin_count))
>  		wake_up_all(&bp->b_waiters);
>
> -	if (freed && stale) {
> +	/* nothing to do but drop the pin count if the bli is active */
> +	if (!freed)
> +		return;

Hmm, this all seems convoluted as promised, but if I'm reading the code
correctly, you're moving the buffer hold above where we wake the
pincount waiters, because the AIL could be in xfs_buf_wait_unpin,
holding the only reference? So if we wake it and the write is quick,
the AIL's ioend will nuke the buffer before this thread (which is trying
to kill a transaction and shut down the system?) gets a chance to
free the buffer via _buf_ioend_fail?

If I got that right,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> +
> +	if (stale) {
>  		ASSERT(bip->bli_flags & XFS_BLI_STALE);
>  		ASSERT(xfs_buf_islocked(bp));
>  		ASSERT(bp->b_flags & XBF_STALE);
> @@ -550,13 +555,13 @@ xfs_buf_item_unpin(
>  			ASSERT(bp->b_log_item == NULL);
>  		}
>  		xfs_buf_relse(bp);
> -	} else if (freed && remove) {
> +	} else if (remove) {
>  		/*
>  		 * The buffer must be locked and held by the caller to simulate
> -		 * an async I/O failure.
> +		 * an async I/O failure. We acquired the hold for this case
> +		 * before the buffer was unpinned.
>  		 */
>  		xfs_buf_lock(bp);
> -		xfs_buf_hold(bp);
>  		bp->b_flags |= XBF_ASYNC;
>  		xfs_buf_ioend_fail(bp);
>  	}
> --
> 2.26.3
>
* Re: [PATCH 1/2] xfs: hold buffer across unpin and potential shutdown processing
From: Christoph Hellwig @ 2021-05-12 12:22 UTC
To: Darrick J. Wong; +Cc: Brian Foster, linux-xfs

On Tue, May 11, 2021 at 06:52:44PM -0700, Darrick J. Wong wrote:
> > is unpinned if the associated item has been aborted and will require
> > a simulated I/O failure. The hold is already required for the
> > simulated I/O failure, so the ordering simply guarantees the unpin
> > handler access to the buffer before it is unpinned and thus
> > processed by the AIL. This particular ordering is required so long
> > as the AIL does not acquire a reference on the bli, which is the
> > long term solution to this problem.
>
> Are you working on that too, or are we just going to let that lie for
> the time being? :)

Wouldn't that be as simple as something like the untested patch below?

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index fb69879e4b2b..07e08713ecd4 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -471,6 +471,7 @@ xfs_buf_item_pin(
 	trace_xfs_buf_item_pin(bip);
 
 	atomic_inc(&bip->bli_refcount);
+	xfs_buf_hold(bip->bli_buf);
 	atomic_inc(&bip->bli_buf->b_pin_count);
 }
 
@@ -552,14 +553,15 @@ xfs_buf_item_unpin(
 		xfs_buf_relse(bp);
 	} else if (freed && remove) {
 		/*
-		 * The buffer must be locked and held by the caller to simulate
-		 * an async I/O failure.
+		 * The buffer must be locked to simulate an async I/O failure.
+		 * xfs_buf_ioend_fail will drop our buffer reference.
 		 */
 		xfs_buf_lock(bp);
-		xfs_buf_hold(bp);
 		bp->b_flags |= XBF_ASYNC;
 		xfs_buf_ioend_fail(bp);
+		return;
 	}
+	xfs_buf_rele(bp);
 }
 
 STATIC uint
* Re: [PATCH 1/2] xfs: hold buffer across unpin and potential shutdown processing
From: Brian Foster @ 2021-05-12 14:29 UTC
To: Christoph Hellwig; +Cc: Darrick J. Wong, linux-xfs

On Wed, May 12, 2021 at 01:22:49PM +0100, Christoph Hellwig wrote:
> On Tue, May 11, 2021 at 06:52:44PM -0700, Darrick J. Wong wrote:
> > > is unpinned if the associated item has been aborted and will require
> > > a simulated I/O failure. The hold is already required for the
> > > simulated I/O failure, so the ordering simply guarantees the unpin
> > > handler access to the buffer before it is unpinned and thus
> > > processed by the AIL. This particular ordering is required so long
> > > as the AIL does not acquire a reference on the bli, which is the
> > > long term solution to this problem.
> >
> > Are you working on that too, or are we just going to let that lie for
> > the time being? :)
>
> Wouldn't that be as simple as something like the untested patch below?
>

I actually think this is moderately less simple than the RFC I started
with (see the cover letter for a reference) because there's really no
need for a buffer hold per pin. I moved away from the RFC approach to
this to 1. isolate the hold/rele cycle to the scenario where it's
actually necessary (unpin abort) and 2. document the design flaw that
Dave had pointed out that contributes to this problem.

So point #1 means the explicit hold basically fills the gap that the bli
reference count fails to cover to preserve buffer access by (AIL
resident) log item processing code, and no more, whereas the RFC and the
patch below are a bit more convoluted (even though the code might look
simpler) in that they obscure that context.

Brian

>
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index fb69879e4b2b..07e08713ecd4 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -471,6 +471,7 @@ xfs_buf_item_pin(
>  	trace_xfs_buf_item_pin(bip);
>
>  	atomic_inc(&bip->bli_refcount);
> +	xfs_buf_hold(bip->bli_buf);
>  	atomic_inc(&bip->bli_buf->b_pin_count);
>  }
>
> @@ -552,14 +553,15 @@ xfs_buf_item_unpin(
>  		xfs_buf_relse(bp);
>  	} else if (freed && remove) {
>  		/*
> -		 * The buffer must be locked and held by the caller to simulate
> -		 * an async I/O failure.
> +		 * The buffer must be locked to simulate an async I/O failure.
> +		 * xfs_buf_ioend_fail will drop our buffer reference.
>  		 */
>  		xfs_buf_lock(bp);
> -		xfs_buf_hold(bp);
>  		bp->b_flags |= XBF_ASYNC;
>  		xfs_buf_ioend_fail(bp);
> +		return;
>  	}
> +	xfs_buf_rele(bp);
>  }
>
>  STATIC uint
>
* Re: [PATCH 1/2] xfs: hold buffer across unpin and potential shutdown processing
From: Brian Foster @ 2021-05-12 14:28 UTC
To: Darrick J. Wong; +Cc: linux-xfs

On Tue, May 11, 2021 at 06:52:44PM -0700, Darrick J. Wong wrote:
> On Tue, May 11, 2021 at 09:52:56AM -0400, Brian Foster wrote:
> > The special processing used to simulate a buffer I/O failure on fs
> > shutdown has a difficult to reproduce race that can result in a use
> > after free of the associated buffer. Consider a buffer that has been
> > committed to the on-disk log and thus is AIL resident. The buffer
> > lands on the writeback delwri queue, but is subsequently locked,
> > committed and pinned by another transaction before submitted for
> > I/O. At this point, the buffer is stuck on the delwri queue as it
> > cannot be submitted for I/O until it is unpinned. A log checkpoint
> > I/O failure occurs sometime later, which aborts the bli. The unpin
> > handler is called with the aborted log item, drops the bli reference
> > count, the pin count, and falls into the I/O failure simulation
> > path.
> >
> > The potential problem here is that once the pin count falls to zero
> > in ->iop_unpin(), xfsaild is free to retry delwri submission of the
> > buffer at any time, before the unpin handler even completes. If
> > delwri queue submission wins the race to the buffer lock, it
> > observes the shutdown state and simulates the I/O failure itself.
> > This releases both the bli and delwri queue holds and frees the
> > buffer while xfs_buf_item_unpin() sits on xfs_buf_lock() waiting to
> > run through the same failure sequence. This problem is rare and
> > requires many iterations of fstest generic/019 (which simulates disk
> > I/O failures) to reproduce.
> >
> > To avoid this problem, grab a hold on the buffer before the log item
> > is unpinned if the associated item has been aborted and will require
> > a simulated I/O failure. The hold is already required for the
> > simulated I/O failure, so the ordering simply guarantees the unpin
> > handler access to the buffer before it is unpinned and thus
> > processed by the AIL. This particular ordering is required so long
> > as the AIL does not acquire a reference on the bli, which is the
> > long term solution to this problem.
>
> Are you working on that too, or are we just going to let that lie for
> the time being? :)
>

It's on my todo list. I need to think about it some more to consider the
functional change to the unpin code and other potential
incompatibilities if the writeback completion code assumes the AIL has a
reference, etc. This patch is an extremely isolated bug fix whereas the
above is a bit broader of a rework to address a design flaw. I'd prefer
not to conflate the two things unless absolutely necessary.

> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_buf_item.c | 37 +++++++++++++++++++++----------------
> >  1 file changed, 21 insertions(+), 16 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> > index fb69879e4b2b..7ff31788512b 100644
> > --- a/fs/xfs/xfs_buf_item.c
> > +++ b/fs/xfs/xfs_buf_item.c
> > @@ -475,17 +475,8 @@ xfs_buf_item_pin(
> >  }
> >
> >  /*
> > - * This is called to unpin the buffer associated with the buf log
> > - * item which was previously pinned with a call to xfs_buf_item_pin().
> > - *
> > - * Also drop the reference to the buf item for the current transaction.
> > - * If the XFS_BLI_STALE flag is set and we are the last reference,
> > - * then free up the buf log item and unlock the buffer.
> > - *
> > - * If the remove flag is set we are called from uncommit in the
> > - * forced-shutdown path. If that is true and the reference count on
> > - * the log item is going to drop to zero we need to free the item's
> > - * descriptor in the transaction.
> > + * This is called to unpin the buffer associated with the buf log item which
> > + * was previously pinned with a call to xfs_buf_item_pin().
> >   */
> >  STATIC void
> >  xfs_buf_item_unpin(
> > @@ -502,12 +493,26 @@ xfs_buf_item_unpin(
> >
> >  	trace_xfs_buf_item_unpin(bip);
> >
> > +	/*
> > +	 * Drop the bli ref associated with the pin and grab the hold required
> > +	 * for the I/O simulation failure in the abort case. We have to do this
> > +	 * before the pin count drops because the AIL doesn't acquire a bli
> > +	 * reference. Therefore if the refcount drops to zero, the bli could
> > +	 * still be AIL resident and the buffer submitted for I/O (and freed on
> > +	 * completion) at any point before we return. This can be removed once
> > +	 * the AIL properly holds a reference on the bli.
> > +	 */
> >  	freed = atomic_dec_and_test(&bip->bli_refcount);
> > -
> > +	if (freed && !stale && remove)
> > +		xfs_buf_hold(bp);
> >  	if (atomic_dec_and_test(&bp->b_pin_count))
> >  		wake_up_all(&bp->b_waiters);
> >
> > -	if (freed && stale) {
> > +	/* nothing to do but drop the pin count if the bli is active */
> > +	if (!freed)
> > +		return;
>
> Hmm, this all seems convoluted as promised, but if I'm reading the code
> correctly, you're moving the buffer hold above where we wake the
> pincount waiters, because the AIL could be in xfs_buf_wait_unpin,
> holding the only reference? So if we wake it and the write is quick,
> the AIL's ioend will nuke the buffer before this thread (which is trying
> to kill a transaction and shut down the system?) gets a chance to
> free the buffer via _buf_ioend_fail?
>

Mostly.. this code isn't trying to kill a transaction, it just needs to
process the buffer in the event that logging it failed. The non-failure
case here is that the final bli reference drops in this unpin code, but
the bli reference count does not historically govern the life cycle of
the bli object. Instead, the item stays around in the AIL with refcount
== 0 until the buffer is eventually written back. This can only occur
when xfsaild locks an unpinned buffer, so sort of by proxy (because a
pin elevates bli_refcount) this allows writeback completion to
explicitly free the bli.

IOW, I suspect yet another potential solution to this particular problem
is to check whether the item is in the AIL in the event of an unpin
abort and use that to decide who actually is responsible for the
bli/buffer. I've tested something along those lines in the past as well,
but it's pretty much logically equivalent to this patch so I'm not sure
it's worth exploring further.

Brian

> If I got that right,
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
>
> --D
>
> > +
> > +	if (stale) {
> >  		ASSERT(bip->bli_flags & XFS_BLI_STALE);
> >  		ASSERT(xfs_buf_islocked(bp));
> >  		ASSERT(bp->b_flags & XBF_STALE);
> > @@ -550,13 +555,13 @@ xfs_buf_item_unpin(
> >  			ASSERT(bp->b_log_item == NULL);
> >  		}
> >  		xfs_buf_relse(bp);
> > -	} else if (freed && remove) {
> > +	} else if (remove) {
> >  		/*
> >  		 * The buffer must be locked and held by the caller to simulate
> > -		 * an async I/O failure.
> > +		 * an async I/O failure. We acquired the hold for this case
> > +		 * before the buffer was unpinned.
> >  		 */
> >  		xfs_buf_lock(bp);
> > -		xfs_buf_hold(bp);
> >  		bp->b_flags |= XBF_ASYNC;
> >  		xfs_buf_ioend_fail(bp);
> >  	}
> > --
> > 2.26.3
> >
>
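The lifecycle point made in the reply above (the final reference dropping in unpin does not free the item; writeback completion does) may be easier to see as a toy model. The sketch below is userspace C with invented names (`toy_bli`, `toy_bli_unpin()`, `toy_bli_writeback_done()`); it is an assumption-laden illustration of the described behaviour only, not a rendering of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer log item; not the kernel structure. */
struct toy_bli {
	int refcount;	/* models bli_refcount */
	bool in_ail;	/* item is still tracked by the AIL */
	bool freed;	/* set only by writeback completion in this model */
};

/*
 * Unpin drops a reference and reports whether it was the last one.
 * Reaching zero does not free the item in this model.
 */
bool toy_bli_unpin(struct toy_bli *bli)
{
	assert(bli->refcount > 0);
	return --bli->refcount == 0;
}

/* Writeback completion removes the item from the AIL and frees it. */
void toy_bli_writeback_done(struct toy_bli *bli)
{
	assert(bli->in_ail && bli->refcount == 0);
	bli->in_ail = false;
	bli->freed = true;
}
```

In this model an item can sit in the AIL with a refcount of zero for an arbitrary time, which is the window in which other code can reach the buffer without the unpin path holding anything.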
* [PATCH 2/2] xfs: remove dead stale buf unpin handling code
From: Brian Foster @ 2021-05-11 13:52 UTC
To: linux-xfs

This code goes back to a time when transaction commits wrote
directly to iclogs. The associated log items were pinned, written to
the log, and then "uncommitted" if some part of the log write had
failed. This uncommit sequence called an ->iop_unpin_remove()
handler that was eventually folded into ->iop_unpin() via the remove
parameter. The log subsystem has since changed significantly in that
transactions commit to the CIL instead of direct to iclogs, though
log items must still be aborted in the event of an eventual log I/O
error. However, the context for a log item abort is now asynchronous
from transaction commit, which means the committing transaction has
been freed by this point in time and the transaction uncommit
sequence of events is no longer relevant.

Further, since stale buffers remain locked at transaction commit
through unpin, we can be certain that the buffer is not associated
with any transaction when the unpin callback executes. Remove this
unused hunk of code and replace it with an assertion that the buffer
is disassociated from transaction context.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_buf_item.c | 20 +-------------------
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 7ff31788512b..634abf30b5bc 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -517,28 +517,10 @@ xfs_buf_item_unpin(
 		ASSERT(xfs_buf_islocked(bp));
 		ASSERT(bp->b_flags & XBF_STALE);
 		ASSERT(bip->__bli_format.blf_flags & XFS_BLF_CANCEL);
+		ASSERT(list_empty(&lip->li_trans) && !bp->b_transp);
 
 		trace_xfs_buf_item_unpin_stale(bip);
 
-		if (remove) {
-			/*
-			 * If we are in a transaction context, we have to
-			 * remove the log item from the transaction as we are
-			 * about to release our reference to the buffer. If we
-			 * don't, the unlock that occurs later in
-			 * xfs_trans_uncommit() will try to reference the
-			 * buffer which we no longer have a hold on.
-			 */
-			if (!list_empty(&lip->li_trans))
-				xfs_trans_del_item(lip);
-
-			/*
-			 * Since the transaction no longer refers to the buffer,
-			 * the buffer should no longer refer to the transaction.
-			 */
-			bp->b_transp = NULL;
-		}
-
 		/*
 		 * If we get called here because of an IO error, we may or may
 		 * not have the item on the AIL. xfs_trans_ail_delete() will
--
2.26.3
* Re: [PATCH 2/2] xfs: remove dead stale buf unpin handling code
From: Darrick J. Wong @ 2021-05-12 1:55 UTC
To: Brian Foster; +Cc: linux-xfs

On Tue, May 11, 2021 at 09:52:57AM -0400, Brian Foster wrote:
> This code goes back to a time when transaction commits wrote
> directly to iclogs. The associated log items were pinned, written to
> the log, and then "uncommitted" if some part of the log write had
> failed. This uncommit sequence called an ->iop_unpin_remove()
> handler that was eventually folded into ->iop_unpin() via the remove
> parameter. The log subsystem has since changed significantly in that
> transactions commit to the CIL instead of direct to iclogs, though
> log items must still be aborted in the event of an eventual log I/O
> error. However, the context for a log item abort is now asynchronous
> from transaction commit, which means the committing transaction has
> been freed by this point in time and the transaction uncommit
> sequence of events is no longer relevant.
>
> Further, since stale buffers remain locked at transaction commit
> through unpin, we can be certain that the buffer is not associated
> with any transaction when the unpin callback executes. Remove this
> unused hunk of code and replace it with an assertion that the buffer
> is disassociated from transaction context.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>

<nod> my brain kinda hurts now, but I have a vague recollection of
wondering how you could get a stale buffer that was also being removed
and not being able to figure out how one might stumble into this chunk
of code. :)

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_buf_item.c | 20 +-------------------
>  1 file changed, 1 insertion(+), 19 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index 7ff31788512b..634abf30b5bc 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -517,28 +517,10 @@ xfs_buf_item_unpin(
>  		ASSERT(xfs_buf_islocked(bp));
>  		ASSERT(bp->b_flags & XBF_STALE);
>  		ASSERT(bip->__bli_format.blf_flags & XFS_BLF_CANCEL);
> +		ASSERT(list_empty(&lip->li_trans) && !bp->b_transp);
>
>  		trace_xfs_buf_item_unpin_stale(bip);
>
> -		if (remove) {
> -			/*
> -			 * If we are in a transaction context, we have to
> -			 * remove the log item from the transaction as we are
> -			 * about to release our reference to the buffer. If we
> -			 * don't, the unlock that occurs later in
> -			 * xfs_trans_uncommit() will try to reference the
> -			 * buffer which we no longer have a hold on.
> -			 */
> -			if (!list_empty(&lip->li_trans))
> -				xfs_trans_del_item(lip);
> -
> -			/*
> -			 * Since the transaction no longer refers to the buffer,
> -			 * the buffer should no longer refer to the transaction.
> -			 */
> -			bp->b_transp = NULL;
> -		}
> -
>  		/*
>  		 * If we get called here because of an IO error, we may or may
>  		 * not have the item on the AIL. xfs_trans_ail_delete() will
> --
> 2.26.3
>
* Re: [PATCH 2/2] xfs: remove dead stale buf unpin handling code
From: Christoph Hellwig @ 2021-05-12 12:00 UTC
To: Brian Foster; +Cc: linux-xfs

> +		ASSERT(list_empty(&lip->li_trans) && !bp->b_transp);

Nit: Two separate ASSERTS are generally better than one with two
conditions and a "&&", so that when the assert triggers it shows which
condition caused it.
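The nit above can be illustrated outside the kernel. The following is a small userspace sketch; the `CHECK` macro and both helper functions are hypothetical and exist only for this example. Unlike the kernel's ASSERT, `CHECK` records the text of the first failed expression instead of stopping, so the difference in diagnostic detail can be observed directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Text of the first failed CHECK expression, or NULL if none failed. */
static const char *first_failed;

/* Hypothetical macro: records the failing expression rather than aborting. */
#define CHECK(expr) \
	do { \
		if (!(expr) && !first_failed) \
			first_failed = #expr; \
	} while (0)

/* One combined check: a failure cannot say which half was false. */
const char *check_combined(int list_is_empty, const void *transp)
{
	first_failed = NULL;
	CHECK(list_is_empty && !transp);
	return first_failed;
}

/* Two separate checks: the reported expression pinpoints the cause. */
const char *check_split(int list_is_empty, const void *transp)
{
	first_failed = NULL;
	CHECK(list_is_empty);
	CHECK(!transp);
	return first_failed;
}
```

With an empty list but a non-NULL `transp`, the combined variant reports the whole compound expression, while the split variant reports only `!transp`.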
* Re: [PATCH 2/2] xfs: remove dead stale buf unpin handling code
From: Brian Foster @ 2021-05-12 14:29 UTC
To: Christoph Hellwig; +Cc: linux-xfs

On Wed, May 12, 2021 at 01:00:25PM +0100, Christoph Hellwig wrote:
> > +		ASSERT(list_empty(&lip->li_trans) && !bp->b_transp);
>
> Nit: Two separate ASSERTS are generally better than one with two
> conditions and a "&&", so that when the assert triggers it shows which
> condition caused it.
>

In this case both checks pretty much mean the same thing so I don't see
much added value, but I don't mind splitting it..

Brian
* [PATCH v1.1 2/2] xfs: remove dead stale buf unpin handling code
From: Brian Foster @ 2021-05-12 14:30 UTC
To: linux-xfs

This code goes back to a time when transaction commits wrote
directly to iclogs. The associated log items were pinned, written to
the log, and then "uncommitted" if some part of the log write had
failed. This uncommit sequence called an ->iop_unpin_remove()
handler that was eventually folded into ->iop_unpin() via the remove
parameter.

The log subsystem has since changed significantly in that
transactions commit to the CIL instead of direct to iclogs, though
log items must still be aborted in the event of an eventual log I/O
error. However, the context for a log item abort is now asynchronous
from transaction commit, which means the committing transaction has
been freed by this point in time and the transaction uncommit
sequence of events is no longer relevant. Further, since stale
buffers remain locked at transaction commit through unpin, we can be
certain that the buffer is not associated with any transaction when
the unpin callback executes.

Remove this unused hunk of code and replace it with an assertion
that the buffer is disassociated from transaction context.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_buf_item.c | 21 ++-------------------
 1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 7ff31788512b..672112064dac 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -517,28 +517,11 @@ xfs_buf_item_unpin(
 		ASSERT(xfs_buf_islocked(bp));
 		ASSERT(bp->b_flags & XBF_STALE);
 		ASSERT(bip->__bli_format.blf_flags & XFS_BLF_CANCEL);
+		ASSERT(list_empty(&lip->li_trans));
+		ASSERT(!bp->b_transp);
 
 		trace_xfs_buf_item_unpin_stale(bip);
 
-		if (remove) {
-			/*
-			 * If we are in a transaction context, we have to
-			 * remove the log item from the transaction as we are
-			 * about to release our reference to the buffer. If we
-			 * don't, the unlock that occurs later in
-			 * xfs_trans_uncommit() will try to reference the
-			 * buffer which we no longer have a hold on.
-			 */
-			if (!list_empty(&lip->li_trans))
-				xfs_trans_del_item(lip);
-
-			/*
-			 * Since the transaction no longer refers to the buffer,
-			 * the buffer should no longer refer to the transaction.
-			 */
-			bp->b_transp = NULL;
-		}
-
 		/*
 		 * If we get called here because of an IO error, we may or may
 		 * not have the item on the AIL. xfs_trans_ail_delete() will
-- 
2.26.3