From: Dave Chinner <david@fromorbit.com>
To: Joe Thornber <thornber@redhat.com>
Cc: Brian Foster <bfoster@redhat.com>,
	Mike Snitzer <snitzer@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@infradead.org>,
	Theodore Ts'o <tytso@mit.edu>,
	Sarthak Kukreti <sarthakkukreti@chromium.org>,
	dm-devel@redhat.com, "Michael S. Tsirkin" <mst@redhat.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Jason Wang <jasowang@redhat.com>,
	Bart Van Assche <bvanassche@google.com>,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	Joe Thornber <ejt@redhat.com>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Alasdair Kergon <agk@redhat.com>
Subject: Re: [PATCH v7 0/5] Introduce provisioning primitives
Date: Sat, 27 May 2023 09:45:02 +1000
Message-ID: <ZHFEfngPyUOqlthr@dread.disaster.area>
In-Reply-To: <CAJ0trDbspRaDKzTzTjFdPHdB9n0Q9unfu1cEk8giTWoNu3jP8g@mail.gmail.com>

On Fri, May 26, 2023 at 12:04:02PM +0100, Joe Thornber wrote:
> Here's my take:
> 
> I don't see why the filesystem cares if thinp is doing a reservation or
> provisioning under the hood.  All that matters is that a future write
> to that region will be honoured (barring device failure etc.).
> 
> I agree that the reservation/force mapped status needs to be inherited
> by snapshots.
> 
> 
> One of the few strengths of thinp is the performance of taking a snapshot.
> Most snapshots created are never activated.  Many other snapshots are
> only alive for a brief period, and used read-only.  eg, blk-archive
> (https://github.com/jthornber/blk-archive) uses snapshots to do very
> fast incremental backups.  As such I'm strongly against any scheme that
> requires provisioning as part of the snapshot operation.
> 
> Hank and I are in the middle of the range tree work which requires a
> metadata change.  So now is a convenient time to piggyback other
> metadata changes to support reservations.
> 
> 
> Given the above this is what I suggest:
> 
> 1) We have an api (ioctl, bio flag, whatever) that lets you
> reserve/guarantee a region:
> 
>   int reserve_region(dev, sector_t begin, sector_t end);

A C-based interface is not sufficient because the layer that must do
provisioning is not guaranteed to be directly under the filesystem.
We must be able to propagate the request down to the layers that
need to provision storage, and that includes hardware devices.

e.g. dm-thin would have to issue REQ_PROVISION on the LBA ranges it
allocates in its backing device to guarantee that the provisioned
LBA range it allocates is also fully provisioned by the storage
below it....
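
To illustrate the propagation requirement, here is a minimal userspace
sketch. It is a toy model, not kernel code: struct layer,
can_provision() and provision() are hypothetical names, sector counts
are assumed to map 1:1 between layers, and -1 stands in for -ENOSPC.
The point is only that a request succeeds when every thin layer in the
stack can back it, however many pass-through layers sit in between.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one layer in a storage stack. */
struct layer {
	const char *name;
	struct layer *below;    /* next layer down, NULL at the bottom */
	bool thin;              /* true if this layer allocates on demand */
	uint64_t free_sectors;  /* unallocated space (thin layers only) */
	uint64_t provisioned;   /* sectors guaranteed so far */
};

/* Check that every thin layer from 'l' downwards can back n sectors. */
static bool can_provision(const struct layer *l, uint64_t n)
{
	for (; l; l = l->below)
		if (l->thin && l->free_sectors < n)
			return false;
	return true;
}

/*
 * Provision n sectors at 'top' and at every layer below it.
 * Nothing is allocated unless the whole stack can honour the request,
 * so a failure leaves all layers unchanged.
 * Returns 0 on success, -1 (standing in for -ENOSPC) otherwise.
 */
static int provision(struct layer *top, uint64_t n)
{
	if (!can_provision(top, n))
		return -1;

	for (struct layer *l = top; l; l = l->below) {
		if (l->thin)
			l->free_sectors -= n;
		l->provisioned += n;
	}
	return 0;
}
```

With a stack such as a pass-through layer on top of dm-thin on top of
a thin hardware LUN, provision() on the top layer only succeeds if
both dm-thin and the LUN have the space, which is the guarantee a flat
reserve_region() call against the topmost device cannot give.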

>   This api should be used minimally, eg, critical FS metadata only.

Keep in mind that "critical FS metadata" in this context is any
metadata which could cause the filesystem to hang or enter a global
error state if an unexpected ENOSPC error occurs during a metadata
write IO.

Which, in pretty much every journalling filesystem, equates to all
metadata in the filesystem. For a typical root filesystem, that
might be in the range of 1-200MB (depending on journal size).
For larger filesystems with lots of files in them, it will be in the
range of GBs of space.

Plan for having to support tens of GBs of provisioned space in
filesystems, not tens of MBs....

[snip]

> Now this is a lot of work.  As well as the kernel changes we'll need to
> update the userland tools: thin_check, thin_ls, thin_metadata_unpack,
> thin_rmap, thin_delta, thin_metadata_pack, thin_repair, thin_trim,
> thin_dump, thin_metadata_size, thin_restore.  Are we confident that we
> have buy in from the FS teams that this will be widely adopted?  Are users
> asking for this?  I really don't want to do 6 months of work for nothing.

I think there's 2-3 solid days of coding to fully implement
REQ_PROVISION support in XFS, including userspace tool support.
Maybe a couple of weeks more to flush the bugs out before it's
largely ready to go.

So if there's buy in from the block layer and DM people for
REQ_PROVISION as described, then I'll definitely have XFS support
ready for you to test whenever dm-thinp is ready to go.

I can't speak for other filesystems, but I suspect the only one we
care about is ext4.  btrfs and f2fs don't need dm-thinp and there
aren't any other filesystems that are used in production on top of
dm-thinp, so I think only XFS and ext4 matter at this point in time.

I suspect that ext4 would be fairly easy to add support for as well.
ext4 has a lot more fixed-place metadata than XFS, so much more
of its metadata is covered by mkfs-time provisioning. Limiting
dynamic metadata to specific fully provisioned block groups and
provisioning new block groups for metadata when they are near full
would be equivalent to how I plan to provision metadata space in
XFS. Hence the implementation for ext4 looks to be broadly similar
in scope and complexity to XFS....
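
The "provision new block groups when the current ones are near full"
scheme can be sketched roughly as below. This is a userspace
illustration only, not ext4 or XFS code: struct meta_pool,
meta_alloc(), budget_provision() and both constants are hypothetical,
and the provision_group callback stands in for whatever issues the
provision request to the block device. The idea it shows is that
ENOSPC surfaces at allocation time, before any metadata write IO is
issued against unprovisioned space.

```c
#include <stdint.h>

#define GROUP_BLOCKS  32768u  /* illustrative block group size, in blocks */
#define LOW_WATERMARK  1024u  /* illustrative "near full" threshold */

struct meta_pool {
	uint64_t provisioned_free;  /* free blocks in provisioned metadata groups */
	uint64_t groups;            /* provisioned metadata groups so far */
	int (*provision_group)(void *ctx);  /* stand-in for a provision request */
	void *ctx;
};

/*
 * Allocate nr metadata blocks from provisioned space only.
 * Tops the pool up with whole groups while it is below the watermark.
 * Returns 0 on success, -1 (standing in for -ENOSPC) if the pool
 * cannot cover the allocation even after trying to provision more.
 */
static int meta_alloc(struct meta_pool *p, uint64_t nr)
{
	while (p->provisioned_free < nr + LOW_WATERMARK) {
		if (p->provision_group(p->ctx) != 0)
			break;  /* lower layers are out of space */
		p->groups++;
		p->provisioned_free += GROUP_BLOCKS;
	}

	if (p->provisioned_free < nr)
		return -1;

	p->provisioned_free -= nr;
	return 0;
}

/* Example backend: a pool that can hand out *ctx more groups. */
static int budget_provision(void *ctx)
{
	unsigned *groups_left = ctx;

	if (*groups_left == 0)
		return -1;
	(*groups_left)--;
	return 0;
}
```

Note that when the backend refuses a new group, allocations keep
succeeding out of the already-provisioned reserve until it is
exhausted, and only then fail.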

-Dave.
-- 
Dave Chinner
david@fromorbit.com
