Linux-Fsdevel Archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Eric Biggers <ebiggers@kernel.org>
Cc: aalbersh@redhat.com, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, fsverity@lists.linux.dev
Subject: Re: [PATCH 10/13] fsverity: pass the zero-hash value to the implementation
Date: Wed, 24 Apr 2024 13:23:48 -0700	[thread overview]
Message-ID: <20240424202348.GN360919@frogsfrogsfrogs> (raw)
In-Reply-To: <20240424191950.GA749176@google.com>

On Wed, Apr 24, 2024 at 07:19:50PM +0000, Eric Biggers wrote:
> On Wed, Apr 24, 2024 at 12:02:46PM -0700, Darrick J. Wong wrote:
> > On Thu, Apr 04, 2024 at 10:57:50PM -0400, Eric Biggers wrote:
> > > On Fri, Mar 29, 2024 at 05:35:17PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <djwong@kernel.org>
> > > > 
> > > > Compute the hash of a data block full of zeros, and then supply this to
> > > > the merkle tree read and write methods.  A subsequent xfs patch will use
> > > 
> > > This should say "hash of a block", not "hash of a data block".  What you
> > > actually care about is the hash of a Merkle tree block, not the hash of a data
> > > block.  Yet, there is no difference in how the hashes are calculated for the two
> > > types of blocks, so we should simply write "hash of a block".
> > 
> > I think I could go further with the precision of the description --
> > 
> > "Compute the hash of one filesystem block's worth of zeroes.  Any merkle
> > tree block containing only this hash can be elided at write time, and
> > its contents synthesized at read time."
> > 
> > I don't think this is going to happen very often above the leaf levels
> > of the merkle tree, but as written there's nothing to prevent the
> > elision of internal nodes.  Also note that the elision can happen for
> > internal nodes even when merkle tree blocksize != i_blocksize.
> > 
> > > > diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
> > > > index de8798f141d4a..195a92f203bba 100644
> > > > --- a/fs/verity/fsverity_private.h
> > > > +++ b/fs/verity/fsverity_private.h
> > > > @@ -47,6 +47,8 @@ struct merkle_tree_params {
> > > >  	u64 tree_size;			/* Merkle tree size in bytes */
> > > >  	unsigned long tree_pages;	/* Merkle tree size in pages */
> > > >  
> > > > +	u8 zero_digest[FS_VERITY_MAX_DIGEST_SIZE]; /* hash of zeroed data block */
> > > 
> > > Similarly, "block" instead of "data block".
> > 
> > How about "the hash of an i_blocksize-sized buffer of zeroes" for all
> > three?
> 
> It's the Merkle tree block size, not the filesystem block size.  Or did you
> actually intend for this to use the filesystem block size?

I actually did intend for this to be the fs block size, not the merkle
tree block size.  It's the bottom level that I care about shrinking.
Let's say that data[0-B] are the data blocks:

root
 +-internal0
 |   +-leaf0
 |   |   +-data0
 |   |   +-data1
 |   |   `-data2
 |   `-leaf1
 |       +-data3
 |       +-data4
 |       `-data5
 `-internal1
     +-leaf2
     |   +-data6
     |   +-data7
     |   `-data8
     `-leaf3
         +-data9
         +-dataA
         `-dataB

(thanks to https://arthursonzogni.com/Diagon/#Tree )

If data[3-5] are completely zeroes (unwritten blocks, sparse holes,
etc.) then I want to skip writing leaf1 of the merkle tree to disk.

If it happens that the hashes of leaf[0-1] match hash(data3) then it's
frosting on top (as it were) that we can also skip internal0.  However,
the merkle tree has a high fanout factor (4096/32==128 in the common
case), so I care /much/ less about eliding those levels.

> In struct merkle_tree_params, the "block size" is always the Merkle tree block
> size, so the type of block size seems clear in that context.  My complaint was
> just that it used the term "data block" to mean a block that is not necessarily
> a file contents block (which is what "data block" means elsewhere).

Hm.  Given the confusion, would it help if I said that zero_digest
should only be used to elide leaf nodes of the merkle tree that hash the
contents of file content blocks?  Or is "the hash of an
i_blocksize-sized buffer of zeroes" sufficient?

What do you think of the commit message saying:

"Compute the hash of one filesystem block's worth of zeroes.  Any merkle
tree leaf block containing only this hash can be elided at write time,
and its contents synthesized at read time.

"Let's pretend that there's a file containing six data blocks and whose
merkle tree looks roughly like this:

root
 +--leaf0
 |   +--data0
 |   +--data1
 |   `--data2
 `--leaf1
     +--data3
     +--data4
     `--data5

"If data[0-2] are sparse holes, then leaf0 will contain a repeating
sequence of @zero_digest.  Therefore, leaf0 need not be written to disk
because its contents can be synthesized."

--D

> 
> - Eric
> 

  reply	other threads:[~2024-04-24 20:23 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-30  0:30 [PATCHBOMB v5.5] fs-verity support for XFS Darrick J. Wong
2024-03-30  0:32 ` [PATCHSET v5.5 1/2] fs-verity: support merkle tree access by blocks Darrick J. Wong
2024-03-30  0:32   ` [PATCH 01/13] fs: add FS_XFLAG_VERITY for verity files Darrick J. Wong
2024-03-30  0:33   ` [PATCH 02/13] fsverity: pass tree_blocksize to end_enable_verity() Darrick J. Wong
2024-03-30  0:33   ` [PATCH 03/13] fsverity: support block-based Merkle tree caching Darrick J. Wong
2024-04-05  2:31     ` Eric Biggers
2024-04-24 21:25       ` Darrick J. Wong
2024-04-24 22:08         ` Eric Biggers
2024-04-25  0:27           ` Darrick J. Wong
2024-04-25  0:46             ` Eric Biggers
2024-04-25  0:53               ` Darrick J. Wong
2024-03-30  0:33   ` [PATCH 04/13] fsverity: add per-sb workqueue for post read processing Darrick J. Wong
2024-04-05  2:39     ` Eric Biggers
2024-04-24 21:33       ` Darrick J. Wong
2024-03-30  0:33   ` [PATCH 05/13] fsverity: add tracepoints Darrick J. Wong
2024-03-30  0:34   ` [PATCH 06/13] fsverity: send the level of the merkle tree block to ->read_merkle_tree_block Darrick J. Wong
2024-04-05  2:42     ` Eric Biggers
2024-04-25  0:30       ` Darrick J. Wong
2024-03-30  0:34   ` [PATCH 07/13] fsverity: pass the new tree size and block size to ->begin_enable_verity Darrick J. Wong
2024-04-05  2:46     ` Eric Biggers
2024-04-24 21:36       ` Darrick J. Wong
2024-03-30  0:34   ` [PATCH 08/13] fsverity: expose merkle tree geometry to callers Darrick J. Wong
2024-04-05  2:50     ` Eric Biggers
2024-04-25  0:45       ` Darrick J. Wong
2024-04-25  0:49         ` Eric Biggers
2024-04-25  1:01           ` Darrick J. Wong
2024-04-25  1:04             ` Eric Biggers
2024-03-30  0:35   ` [PATCH 09/13] fsverity: box up the write_merkle_tree_block parameters too Darrick J. Wong
2024-04-05  2:52     ` Eric Biggers
2024-04-25  0:46       ` Darrick J. Wong
2024-03-30  0:35   ` [PATCH 10/13] fsverity: pass the zero-hash value to the implementation Darrick J. Wong
2024-04-05  2:57     ` Eric Biggers
2024-04-24 19:02       ` Darrick J. Wong
2024-04-24 19:19         ` Eric Biggers
2024-04-24 20:23           ` Darrick J. Wong [this message]
2024-04-24 20:59             ` Eric Biggers
2024-04-24 21:43               ` Darrick J. Wong
2024-03-30  0:35   ` [PATCH 11/13] fsverity: report validation errors back to the filesystem Darrick J. Wong
2024-04-05  3:09     ` Eric Biggers
2024-04-24 18:18       ` Darrick J. Wong
2024-04-24 18:52         ` Eric Biggers
2024-04-24 19:03           ` Darrick J. Wong
2024-03-30  0:35   ` [PATCH 12/13] fsverity: remove system-wide workqueue Darrick J. Wong
2024-04-05  3:14     ` Eric Biggers
2024-04-24 18:05       ` Darrick J. Wong
2024-04-24 18:41         ` Eric Biggers
2024-04-29 10:15         ` Andrey Albershteyn
2024-04-29 16:35           ` Darrick J. Wong
2024-03-30  0:36   ` [PATCH 13/13] iomap: integrate fs-verity verification into iomap's read path Darrick J. Wong
2024-03-30  0:32 ` [PATCHSET v5.5 2/2] xfs: fs-verity support Darrick J. Wong
2024-03-30  0:36   ` [PATCH 01/29] xfs: use unsigned ints for non-negative quantities in xfs_attr_remote.c Darrick J. Wong
2024-04-02  9:51     ` Andrey Albershteyn
2024-04-02 16:25       ` Darrick J. Wong
2024-03-30  0:36   ` [PATCH 02/29] xfs: turn XFS_ATTR3_RMT_BUF_SPACE into a function Darrick J. Wong
2024-04-02 10:09     ` Andrey Albershteyn
2024-03-30  0:36   ` [PATCH 03/29] xfs: create a helper to compute the blockcount of a max sized remote value Darrick J. Wong
2024-04-02 10:09     ` Andrey Albershteyn
2024-03-30  0:37   ` [PATCH 04/29] xfs: minor cleanups of xfs_attr3_rmt_blocks Darrick J. Wong
2024-04-02 10:11     ` Andrey Albershteyn
2024-03-30  0:37   ` [PATCH 05/29] xfs: add attribute type for fs-verity Darrick J. Wong
2024-03-30  0:37   ` [PATCH 06/29] xfs: do not use xfs_attr3_rmt_hdr for remote verity value blocks Darrick J. Wong
2024-03-30  0:37   ` [PATCH 07/29] xfs: add fs-verity ro-compat flag Darrick J. Wong
2024-03-30  0:38   ` [PATCH 08/29] xfs: add inode on-disk VERITY flag Darrick J. Wong
2024-03-30  0:38   ` [PATCH 09/29] xfs: initialize fs-verity on file open and cleanup on inode destruction Darrick J. Wong
2024-03-30  0:38   ` [PATCH 10/29] xfs: don't allow to enable DAX on fs-verity sealed inode Darrick J. Wong
2024-03-30  0:38   ` [PATCH 11/29] xfs: disable direct read path for fs-verity files Darrick J. Wong
2024-03-30  0:39   ` [PATCH 12/29] xfs: widen flags argument to the xfs_iflags_* helpers Darrick J. Wong
2024-04-02 12:37     ` Andrey Albershteyn
2024-04-02 16:27       ` Darrick J. Wong
2024-03-30  0:39   ` [PATCH 13/29] xfs: add fs-verity support Darrick J. Wong
2024-04-02  8:42     ` Andrey Albershteyn
2024-04-02 16:34       ` Darrick J. Wong
2024-04-25  1:14         ` Darrick J. Wong
2024-03-30  0:39   ` [PATCH 14/29] xfs: create a per-mount shrinker for verity inodes merkle tree blocks Darrick J. Wong
2024-04-05  3:16     ` Eric Biggers
2024-04-24 17:39       ` Darrick J. Wong
2024-03-30  0:39   ` [PATCH 15/29] xfs: create an icache tag for files with cached " Darrick J. Wong
2024-03-30  0:40   ` [PATCH 16/29] xfs: shrink verity blob cache Darrick J. Wong
2024-03-30  0:40   ` [PATCH 17/29] xfs: only allow the verity iflag for regular files Darrick J. Wong
2024-04-02 12:52     ` Andrey Albershteyn
2024-03-30  0:40   ` [PATCH 18/29] xfs: don't store trailing zeroes of merkle tree blocks Darrick J. Wong
2024-03-30  0:41   ` [PATCH 19/29] xfs: use merkle tree offset as attr hash Darrick J. Wong
2024-03-30  0:41   ` [PATCH 20/29] xfs: don't bother storing merkle tree blocks for zeroed data blocks Darrick J. Wong
2024-03-30  0:41   ` [PATCH 21/29] xfs: add fs-verity ioctls Darrick J. Wong
2024-03-30  0:41   ` [PATCH 22/29] xfs: advertise fs-verity being available on filesystem Darrick J. Wong
2024-04-02 13:44     ` Andrey Albershteyn
2024-03-30  0:42   ` [PATCH 23/29] xfs: make scrub aware of verity dinode flag Darrick J. Wong
2024-03-30  0:42   ` [PATCH 24/29] xfs: teach online repair to evaluate fsverity xattrs Darrick J. Wong
2024-04-02 15:42     ` Andrey Albershteyn
2024-04-02 16:42       ` Darrick J. Wong
2024-03-30  0:42   ` [PATCH 25/29] xfs: report verity failures through the health system Darrick J. Wong
2024-04-02 16:16     ` Andrey Albershteyn
2024-03-30  0:42   ` [PATCH 26/29] xfs: clear the verity iflag when not appropriate Darrick J. Wong
2024-04-02 16:26     ` Andrey Albershteyn
2024-03-30  0:43   ` [PATCH 27/29] xfs: make it possible to disable fsverity Darrick J. Wong
2024-04-02 17:15     ` Andrey Albershteyn
2024-04-02 23:25     ` Eric Biggers
2024-04-03  1:26       ` Darrick J. Wong
2024-03-30  0:43   ` [PATCH 28/29] xfs: allow verity files to be opened even if the fsverity metadata is damaged Darrick J. Wong
2024-04-02 18:04     ` Andrey Albershteyn
2024-04-02 20:00     ` Colin Walters
2024-04-02 22:52       ` Darrick J. Wong
2024-04-02 23:45         ` Eric Biggers
2024-04-03  1:34           ` Darrick J. Wong
2024-04-03  0:10         ` Colin Walters
2024-04-03  1:39           ` Darrick J. Wong
2024-04-03  1:59             ` Dave Chinner
2024-04-03  3:19               ` Darrick J. Wong
2024-04-03 22:22                 ` Dave Chinner
2024-04-03  8:35           ` Alexander Larsson
2024-03-30  0:43   ` [PATCH 29/29] xfs: enable ro-compat fs-verity flag Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240424202348.GN360919@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=aalbersh@redhat.com \
    --cc=ebiggers@kernel.org \
    --cc=fsverity@lists.linux.dev \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).