* [LSF/MM TOPIC] Improving DISCARD support
@ 2016-01-27 17:48 Bart Van Assche
From: Bart Van Assche @ 2016-01-27 17:48 UTC
To: lsf-pc
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
Linux-fsdevel
Increasing the discard granularity of an SSD helps reduce its
production cost. This makes it likely that in the future even more SSDs
than today will have a discard granularity that is larger than the
block size. Kernel drivers like DRBD and OCFS2 need a way to
efficiently erase a data range. This is why the function
blkdev_issue_zeroout() has been introduced in the block layer. This
function uses the DISCARD operation for those devices for which
discarded ranges read back as zero (discard_zeroes_data == 1). However,
support in the block layer for devices with a discard granularity that
is larger than the block size is suboptimal. Specifically, callers of
blkdev_issue_zeroout() have to ensure that the start and end of the
range to be zeroed are aligned with discard boundaries.
This forces driver writers to write wrapper functions that work around
this limitation. An example can be found in [1]. My proposal is to
provide this functionality in the block layer so that driver authors do
not each have to reimplement it. A candidate implementation is
available in [2].
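
To illustrate the kind of range splitting such a wrapper has to do, here is a
minimal userspace sketch. It is not the code from [1] or [2]. The names
struct range, struct zero_plan, round_up_to() and plan_zeroout() are
hypothetical, and the sketch assumes that ranges are expressed in sectors and
that discard boundaries fall on multiples of the granularity (a discard
alignment offset of zero). The unaligned head and tail would have to be zeroed
by writing zeroes; only the aligned middle part is eligible for DISCARD.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper types; these names do not exist in the kernel. */
struct range {
    uint64_t start; /* first sector of the range */
    uint64_t len;   /* number of sectors; 0 means the range is empty */
};

struct zero_plan {
    struct range head;    /* unaligned prefix: zero by writing zeroes */
    struct range discard; /* aligned middle: may be zeroed via DISCARD */
    struct range tail;    /* unaligned suffix: zero by writing zeroes */
};

/* Round x up to the next multiple of g (assumes g > 0). */
static uint64_t round_up_to(uint64_t x, uint64_t g)
{
    uint64_t rem = x % g;

    return rem ? x + (g - rem) : x;
}

/*
 * Split [start, start + len) into head, discardable middle and tail.
 * Assumption: discard boundaries are multiples of 'granularity'.
 */
static struct zero_plan plan_zeroout(uint64_t start, uint64_t len,
                                     uint64_t granularity)
{
    struct zero_plan p;
    uint64_t end = start + len;
    uint64_t first, last;

    assert(granularity > 0);

    first = round_up_to(start, granularity);  /* first aligned sector */
    last = end / granularity * granularity;   /* end rounded down */

    if (first >= last) {
        /* No whole discard unit fits: everything must be written. */
        p.head = (struct range){ start, len };
        p.discard = (struct range){ end, 0 };
        p.tail = (struct range){ end, 0 };
        return p;
    }

    p.head = (struct range){ start, first - start };
    p.discard = (struct range){ first, last - first };
    p.tail = (struct range){ last, end - last };
    return p;
}
```

With a granularity of 8 sectors, for instance, zeroing sectors 5 through 24
would mean writing zeroes to sectors 5-7 and to sector 24 and discarding
sectors 8-23. Doing this once in the block layer would spare each driver from
carrying its own copy of this logic.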
[1] Lars Ellenberg, drbd: when receiving P_TRIM, zero-out partial
unaligned chunks, October 2015
(http://git.drbd.org/drbd-8.4.git/commitdiff/5204727f9dfa695fc57915c4edeb2a9f186e5fba).
[2] Bart Van Assche, Make blkdev_issue_discard() submit aligned discard
requests, December 2015
(http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/23801).