fsverity.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Andrey Albershteyn <aalbersh@redhat.com>
To: Eric Biggers <ebiggers@kernel.org>
Cc: fsverity@lists.linux.dev
Subject: Re: [PATCH v2] fsverity: remove hash page spin lock
Date: Thu, 1 Feb 2024 11:25:18 +0100	[thread overview]
Message-ID: <kkgu2qxuuhrqctuvnttfakojfkumrl6zvdarqpqrw7xqlvqjp4@qhwox5kqi7se> (raw)
In-Reply-To: <20240201052813.68380-1-ebiggers@kernel.org>

On 2024-01-31 21:28:13, Eric Biggers wrote:
> From: Andrey Albershteyn <aalbersh@redhat.com>
> 
> The spin lock is not necessary here as it can be replaced with
> memory barrier which should be better performance-wise.
> 
> When Merkle tree block size differs from page size, in
> is_hash_block_verified() two things are modified during check - a
> bitmap and PG_checked flag of the page.
> 
> Each bit in the bitmap represent verification status of the Merkle
> tree blocks. PG_checked flag tells if page was just re-instantiated
> or was in pagecache. Both of this states are shared between
> verification threads. Page which was re-instantiated can not have
> already verified blocks (bit set in bitmap).
> 
> The spin lock was used to allow only one thread to modify both of
> these states and keep order of operations. The only requirement here
> is that PG_Checked is set strictly after bitmap is updated.
> This way other threads which see that PG_Checked=1 (page cached)
> knows that bitmap is up-to-date. Otherwise, if PG_Checked is set
> before bitmap is cleared, other threads can see bit=1 and therefore
> will not perform verification of that Merkle tree block.
> 
> However, there's still the case when one thread is setting a bit in
> verify_data_block() and other thread is clearing it in
> is_hash_block_verified(). This can happen if two threads get to
> !PageChecked branch and one of the threads is rescheduled before
> resetting the bitmap. This is fine as at worst blocks are
> re-verified in each thread.
> 
> Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> [ebiggers: improved the comment and removed the 'verified' variable]
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  fs/verity/fsverity_private.h |  1 -
>  fs/verity/open.c             |  1 -
>  fs/verity/verify.c           | 48 ++++++++++++++++++------------------
>  3 files changed, 24 insertions(+), 26 deletions(-)
> 
> diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
> index a6a6b27492414..b3506f56e180b 100644
> --- a/fs/verity/fsverity_private.h
> +++ b/fs/verity/fsverity_private.h
> @@ -62,21 +62,20 @@ struct merkle_tree_params {
>   * caches information about the Merkle tree that's needed to efficiently verify
>   * data read from the file.  It also caches the file digest.  The Merkle tree
>   * pages themselves are not cached here, but the filesystem may cache them.
>   */
>  struct fsverity_info {
>  	struct merkle_tree_params tree_params;
>  	u8 root_hash[FS_VERITY_MAX_DIGEST_SIZE];
>  	u8 file_digest[FS_VERITY_MAX_DIGEST_SIZE];
>  	const struct inode *inode;
>  	unsigned long *hash_block_verified;
> -	spinlock_t hash_page_init_lock;
>  };
>  
>  #define FS_VERITY_MAX_SIGNATURE_SIZE	(FS_VERITY_MAX_DESCRIPTOR_SIZE - \
>  					 sizeof(struct fsverity_descriptor))
>  
>  /* hash_algs.c */
>  
>  extern struct fsverity_hash_alg fsverity_hash_algs[];
>  
>  const struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
> diff --git a/fs/verity/open.c b/fs/verity/open.c
> index 6c31a871b84bc..fdeb95eca3af3 100644
> --- a/fs/verity/open.c
> +++ b/fs/verity/open.c
> @@ -232,21 +232,20 @@ struct fsverity_info *fsverity_create_info(const struct inode *inode,
>  			vi->tree_params.tree_pages <<
>  			vi->tree_params.log_blocks_per_page;
>  
>  		vi->hash_block_verified = kvcalloc(BITS_TO_LONGS(num_bits),
>  						   sizeof(unsigned long),
>  						   GFP_KERNEL);
>  		if (!vi->hash_block_verified) {
>  			err = -ENOMEM;
>  			goto fail;
>  		}
> -		spin_lock_init(&vi->hash_page_init_lock);
>  	}
>  
>  	return vi;
>  
>  fail:
>  	fsverity_free_info(vi);
>  	return ERR_PTR(err);
>  }
>  
>  void fsverity_set_info(struct inode *inode, struct fsverity_info *vi)
> diff --git a/fs/verity/verify.c b/fs/verity/verify.c
> index 904ccd7e8e162..4fcad0825a120 100644
> --- a/fs/verity/verify.c
> +++ b/fs/verity/verify.c
> @@ -12,21 +12,20 @@
>  
>  static struct workqueue_struct *fsverity_read_workqueue;
>  
>  /*
>   * Returns true if the hash block with index @hblock_idx in the tree, located in
>   * @hpage, has already been verified.
>   */
>  static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
>  				   unsigned long hblock_idx)
>  {
> -	bool verified;
>  	unsigned int blocks_per_page;
>  	unsigned int i;
>  
>  	/*
>  	 * When the Merkle tree block size and page size are the same, then the
>  	 * ->hash_block_verified bitmap isn't allocated, and we use PG_checked
>  	 * to directly indicate whether the page's block has been verified.
>  	 *
>  	 * Using PG_checked also guarantees that we re-verify hash pages that
>  	 * get evicted and re-instantiated from the backing storage, as new
> @@ -36,53 +35,54 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
>  		return PageChecked(hpage);
>  
>  	/*
>  	 * When the Merkle tree block size and page size differ, we use a bitmap
>  	 * to indicate whether each hash block has been verified.
>  	 *
>  	 * However, we still need to ensure that hash pages that get evicted and
>  	 * re-instantiated from the backing storage are re-verified.  To do
>  	 * this, we use PG_checked again, but now it doesn't really mean
>  	 * "checked".  Instead, now it just serves as an indicator for whether
> -	 * the hash page is newly instantiated or not.
> +	 * the hash page is newly instantiated or not.  If the page is new, as
> +	 * indicated by PG_checked=0, we clear the bitmap bits for the page's
> +	 * blocks since they are untrustworthy, then set PG_checked=1.
> +	 * Otherwise we return the bitmap bit for the requested block.
>  	 *
> -	 * The first thread that sees PG_checked=0 must clear the corresponding
> -	 * bitmap bits, then set PG_checked=1.  This requires a spinlock.  To
> -	 * avoid having to take this spinlock in the common case of
> -	 * PG_checked=1, we start with an opportunistic lockless read.
> +	 * Multiple threads may execute this code concurrently on the same page.
> +	 * This is safe because we use memory barriers to ensure that if a
> +	 * thread sees PG_checked=1, then it also sees the associated bitmap
> +	 * clearing to have occurred.  Also, all writes and their corresponding
> +	 * reads are atomic, and all writes are safe to repeat in the event that
> +	 * multiple threads get into the PG_checked=0 section.  (Clearing a
> +	 * bitmap bit again at worst causes a hash block to be verified
> +	 * redundantly.  That event should be very rare, so it's not worth using
> +	 * a lock to avoid.  Setting PG_checked again has no effect.)

Thanks! That's much more clear now. LGTM
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>

-- 
- Andrey


      reply	other threads:[~2024-02-01 10:25 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-01  5:28 [PATCH v2] fsverity: remove hash page spin lock Eric Biggers
2024-02-01 10:25 ` Andrey Albershteyn [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=kkgu2qxuuhrqctuvnttfakojfkumrl6zvdarqpqrw7xqlvqjp4@qhwox5kqi7se \
    --to=aalbersh@redhat.com \
    --cc=ebiggers@kernel.org \
    --cc=fsverity@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).