From: Stefan Hajnoczi <stefanha@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	Alasdair Kergon <agk@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	dm-devel@lists.linux.dev, David Teigland <teigland@redhat.com>,
	Mike Snitzer <snitzer@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@lst.de>, Joe Thornber <ejt@redhat.com>
Subject: Re: [RFC 0/9] block: add llseek(SEEK_HOLE/SEEK_DATA) support
Date: Thu, 28 Mar 2024 19:09:52 -0400
Message-ID: <20240328230952.GB2373362@fedora>
In-Reply-To: <e2lcp3n5gpf7zmlpyn4nj7wsr36sffn23z5bmzlsghu6oapi5u@sdkcbpimi5is>

On Thu, Mar 28, 2024 at 05:16:54PM -0500, Eric Blake wrote:
> On Thu, Mar 28, 2024 at 04:39:01PM -0400, Stefan Hajnoczi wrote:
> > This can speed up the process by reducing the amount of data read and it
> > preserves sparseness when writing to the output file.
> > 
> > This patch series is an initial attempt at implementing
> > llseek(SEEK_HOLE/SEEK_DATA) for block devices. I'm looking for feedback on this
> > approach and suggestions for resolving the open issues.
> 
> One of your open issues was whether adjusting the offset of the block
> device itself should also adjust the file offset of the underlying
> file (at least in the case of loopback and dm-linear).  What would the

Only the loop block driver has this issue. The dm-linear driver uses
blkdev_seek_hole_data(), which does not update the file offset because
it operates on a struct block_device instead of a struct file.

> 
> > 
> > In the block device world there are similar concepts to holes:
> > - SCSI has Logical Block Provisioning where the "mapped" state would be
> >   considered data and other states would be considered holes.
> 
> BIG caveat here: the SCSI spec does not necessarily guarantee that
> unmapped regions read as all zeroes; compare the difference between
> FALLOC_FL_ZERO_RANGE and FALLOC_FL_PUNCH_HOLE.  While lseek(SEEK_HOLE)
> on a regular file guarantees that future read() in that hole will see
> NUL bytes, I'm not sure whether we want to make that guarantee for
> block devices.  This may be yet another case where we might want to
> add new SEEK_* constants to the *seek() family of functions that lets
> the caller indicate whether they want offsets that are guaranteed to
> read as zero, vs. merely offsets that are not allocated but may or may
> not read as zero.  Skipping unallocated portions, even when you don't
> know if the contents reliably read as zero, is still a useful goal in
> some userspace programs.

SCSI initiators can check the Logical Block Provisioning Read Zeroes
(LBPRZ) field to determine whether or not zeroes are guaranteed. The sd
driver would only rely on the device when LBPRZ indicates that zeroes
will be read. Otherwise the driver would report that the device is
filled with data.
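
As a rough illustration of the rule described above (not code from the
patch series), the decision could be sketched as follows. The names
ProvisioningState and classify are hypothetical, and LBPRZ is modelled
as a plain boolean, which is a simplification of the real field:

```python
from enum import Enum


class ProvisioningState(Enum):
    """Hypothetical model of per-LBA provisioning states."""
    MAPPED = "mapped"
    DEALLOCATED = "deallocated"
    ANCHORED = "anchored"


def classify(state: ProvisioningState, lbprz: bool) -> str:
    """Return "data" or "hole" for a block range.

    Assumption: lbprz is True only when the device guarantees that
    unmapped blocks read back as zeroes.
    """
    if not lbprz:
        # Zeroes are not guaranteed, so never report a hole.
        return "data"
    # Only the "mapped" state counts as data; other states are holes.
    return "data" if state is ProvisioningState.MAPPED else "hole"
```

With lbprz false, every range is reported as data, which matches the
"device is filled with data" fallback.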

> 
> > - NBD has NBD_CMD_BLOCK_STATUS for querying whether blocks are present.
> 
> However, utilizing it in nbd.ko would require teaching the kernel to
> handle structured or extended headers (right now, that is an extension
> only supported in user-space implementations of the NBD protocol).  I
> can see why you did not tackle that in this RFC series, even though
> you mention it in the cover letter.

Yes, I'm mostly interested in dm-thin. The loop block driver and
dm-linear are useful for testing so I modified them. I didn't try SCSI
or NBD.
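
For readers who want to poke at a loop or dm-linear device from
userspace, a minimal sketch of walking the data extents might look like
the following. It is not taken from the selftests in this series; the
helper name data_extents is hypothetical. It works on regular files
today, and on block devices only under the assumption that a kernel with
this RFC applied is running:

```python
import errno
import os


def data_extents(path):
    """Yield (start, end) byte ranges that lseek() reports as data."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    # No data at or after this offset.
                    break
                raise
            # SEEK_HOLE stops at the next hole, or at the end of the file.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            yield (start, end)
            offset = end
    finally:
        os.close(fd)
```

A copy tool would read only the yielded ranges and skip the rest, which
is where the reduced reads and preserved sparseness come from. If the
underlying file system or driver has no hole support, the whole range is
simply reported as one data extent.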

Thanks,
Stefan
