All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: Christian Brauner <christian@brauner.io>,
	Matthew Wilcox <willy@infradead.org>
Cc: dhowells@redhat.com, Jeff Layton <jlayton@kernel.org>,
	Gao Xiang <hsiangkao@linux.alibaba.com>,
	Dominique Martinet <asmadeus@codewreck.org>,
	Steve French <smfrench@gmail.com>,
	Marc Dionne <marc.dionne@auristor.com>,
	Paulo Alcantara <pc@manguebit.com>,
	Shyam Prasad N <sprasad@microsoft.com>,
	Tom Talpey <tom@talpey.com>,
	Eric Van Hensbergen <ericvh@kernel.org>,
	Ilya Dryomov <idryomov@gmail.com>,
	netfs@lists.linux.dev, linux-cachefs@redhat.com,
	linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org,
	linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org,
	v9fs@lists.linux.dev, linux-erofs@lists.ozlabs.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Latchesar Ionkov <lucho@ionkov.net>,
	Christian Schoenebeck <linux_oss@crudebyte.com>
Subject: Re: [PATCH 23/26] netfs: Cut over to using new writeback code
Date: Mon, 08 Apr 2024 16:53:17 +0100	[thread overview]
Message-ID: <877902.1712591597@warthog.procyon.org.uk> (raw)
In-Reply-To: <20240328163424.2781320-24-dhowells@redhat.com>

David Howells <dhowells@redhat.com> wrote:

> +		/* Wait for writeback to complete.  The writeback engine owns
> +		 * the info in folio->private and may change it until it
> +		 * removes the WB mark.
> +		 */
> +		if (folio_wait_writeback_killable(folio)) {
> +			ret = written ? -EINTR : -ERESTARTSYS;
> +			goto error_folio_unlock;
> +		}
> +

It turns out that this really kills performance with fio with as many jobs as
cpus.  It's taking up to around 8x longer to complete a pwrite() on average
and perf shows a 30% of the CPU cycles are being spent in contention on the
i_rwsem.

The reason this was added here is that writeback cannot take the folio lock in
order to clean up folio->private without risking deadlock vs the truncation
routines (IIRC).

I can mitigate this by skipping the wait if folio->private is not set and if
we're not going to attach anything there (see attached).  Note that if
writeout is ongoing and there is nothing attached to ->private, then we should
not be engaging write-streaming mode and attaching a new netfs_folio (and if
we did, we'd flush the page and wait for it anyway).

The other possibility is if we have a writeback group to set.  This only
applies to ceph for the moment and is something that will need dealing with
if/when ceph is made to use this code.

David
---

diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 1eff9413eb1b..279b296f8014 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -255,7 +255,8 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
 		 * the info in folio->private and may change it until it
 		 * removes the WB mark.
 		 */
-		if (folio_wait_writeback_killable(folio)) {
+		if (folio_get_private(folio) &&
+		    folio_wait_writeback_killable(folio)) {
 			ret = written ? -EINTR : -ERESTARTSYS;
 			goto error_folio_unlock;
 		}


WARNING: multiple messages have this Message-ID (diff)
From: David Howells <dhowells@redhat.com>
To: Christian Brauner <christian@brauner.io>,
	Matthew Wilcox <willy@infradead.org>
Cc: Latchesar Ionkov <lucho@ionkov.net>,
	Dominique Martinet <asmadeus@codewreck.org>,
	Christian Schoenebeck <linux_oss@crudebyte.com>,
	dhowells@redhat.com, linux-mm@kvack.org,
	Marc Dionne <marc.dionne@auristor.com>,
	linux-afs@lists.infradead.org, Paulo Alcantara <pc@manguebit.com>,
	linux-cifs@vger.kernel.org, Steve French <smfrench@gmail.com>,
	linux-cachefs@redhat.com, Gao Xiang <hsiangkao@linux.alibaba.com>,
	Ilya Dryomov <idryomov@gmail.com>,
	Shyam Prasad N <sprasad@microsoft.com>,
	Tom Talpey <tom@talpey.com>,
	ceph-devel@vger.kernel.org,
	Eric Van Hensbergen <ericvh@kernel.org>,
	linux-nfs@vger.kernel.org, netdev@vger.kernel.org,
	v9fs@lists.linux.dev, Jeff Layton <jlayton@kernel.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	netfs@lists.linux.dev, linux-erofs@lists.ozlabs.org
Subject: Re: [PATCH 23/26] netfs: Cut over to using new writeback code
Date: Mon, 08 Apr 2024 16:53:17 +0100	[thread overview]
Message-ID: <877902.1712591597@warthog.procyon.org.uk> (raw)
In-Reply-To: <20240328163424.2781320-24-dhowells@redhat.com>

David Howells <dhowells@redhat.com> wrote:

> +		/* Wait for writeback to complete.  The writeback engine owns
> +		 * the info in folio->private and may change it until it
> +		 * removes the WB mark.
> +		 */
> +		if (folio_wait_writeback_killable(folio)) {
> +			ret = written ? -EINTR : -ERESTARTSYS;
> +			goto error_folio_unlock;
> +		}
> +

It turns out that this really kills performance with fio with as many jobs as
cpus.  It's taking up to around 8x longer to complete a pwrite() on average
and perf shows a 30% of the CPU cycles are being spent in contention on the
i_rwsem.

The reason this was added here is that writeback cannot take the folio lock in
order to clean up folio->private without risking deadlock vs the truncation
routines (IIRC).

I can mitigate this by skipping the wait if folio->private is not set and if
we're not going to attach anything there (see attached).  Note that if
writeout is ongoing and there is nothing attached to ->private, then we should
not be engaging write-streaming mode and attaching a new netfs_folio (and if
we did, we'd flush the page and wait for it anyway).

The other possibility is if we have a writeback group to set.  This only
applies to ceph for the moment and is something that will need dealing with
if/when ceph is made to use this code.

David
---

diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 1eff9413eb1b..279b296f8014 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -255,7 +255,8 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
 		 * the info in folio->private and may change it until it
 		 * removes the WB mark.
 		 */
-		if (folio_wait_writeback_killable(folio)) {
+		if (folio_get_private(folio) &&
+		    folio_wait_writeback_killable(folio)) {
 			ret = written ? -EINTR : -ERESTARTSYS;
 			goto error_folio_unlock;
 		}


  parent reply	other threads:[~2024-04-08 15:53 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-28 16:33 [PATCH 00/26] netfs, afs, 9p, cifs: Rework netfs to use ->writepages() to copy to cache David Howells
2024-03-28 16:33 ` David Howells
2024-03-28 16:33 ` [PATCH 01/26] cifs: Fix duplicate fscache cookie warnings David Howells
2024-03-28 16:33   ` David Howells
2024-04-15 11:25   ` Jeff Layton
2024-04-15 11:25     ` Jeff Layton
2024-04-15 13:03   ` David Howells
2024-04-15 13:03     ` David Howells
2024-04-15 22:51     ` Steve French
2024-04-15 22:51       ` Steve French
2024-04-16 22:40     ` David Howells
2024-04-16 22:40       ` David Howells
2024-03-28 16:33 ` [PATCH 02/26] 9p: Clean up some kdoc and unused var warnings David Howells
2024-03-28 16:33   ` David Howells
2024-03-28 16:33 ` [PATCH 03/26] netfs: Update i_blocks when write committed to pagecache David Howells
2024-03-28 16:33   ` David Howells
2024-04-15 11:28   ` Jeff Layton
2024-04-15 11:28     ` Jeff Layton
2024-04-16 22:47   ` David Howells
2024-04-16 22:47     ` David Howells
2024-03-28 16:33 ` [PATCH 04/26] netfs: Replace PG_fscache by setting folio->private and marking dirty David Howells
2024-03-28 16:33   ` David Howells
2024-03-28 16:33 ` [PATCH 05/26] mm: Remove the PG_fscache alias for PG_private_2 David Howells
2024-03-28 16:33   ` David Howells
2024-03-28 16:33 ` [PATCH 06/26] netfs: Remove deprecated use of PG_private_2 as a second writeback flag David Howells
2024-03-28 16:33   ` David Howells
2024-03-28 16:33 ` [PATCH 07/26] netfs: Make netfs_io_request::subreq_counter an atomic_t David Howells
2024-03-28 16:33   ` David Howells
2024-03-28 16:34 ` [PATCH 08/26] netfs: Use subreq_counter to allocate subreq debug_index values David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 09/26] mm: Provide a means of invalidation without using launder_folio David Howells
2024-03-28 16:34   ` David Howells
2024-04-15 11:41   ` Jeff Layton
2024-04-15 11:41     ` Jeff Layton
2024-04-17  9:02   ` David Howells
2024-03-28 16:34 ` [PATCH 10/26] cifs: Use alternative invalidation to " David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 11/26] 9p: " David Howells
2024-03-28 16:34   ` David Howells
2024-04-15 11:43   ` Jeff Layton
2024-04-15 11:43     ` Jeff Layton
2024-04-16 23:03   ` David Howells
2024-04-16 23:03     ` David Howells
2024-03-28 16:34 ` [PATCH 12/26] afs: " David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 13/26] netfs: Remove ->launder_folio() support David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 14/26] netfs: Use mempools for allocating requests and subrequests David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 15/26] mm: Export writeback_iter() David Howells
2024-03-28 16:34   ` David Howells
2024-04-03  8:59   ` Christoph Hellwig
2024-04-03  8:59     ` Christoph Hellwig
2024-04-03 10:10   ` David Howells
2024-04-03 10:10     ` David Howells
2024-04-03 10:14     ` Christoph Hellwig
2024-04-03 10:14       ` Christoph Hellwig
2024-04-03 10:55     ` David Howells
2024-04-03 10:55       ` David Howells
2024-04-03 12:41       ` Christoph Hellwig
2024-04-03 12:41         ` Christoph Hellwig
2024-04-03 12:58       ` David Howells
2024-04-03 12:58         ` David Howells
2024-04-05  6:53         ` Christoph Hellwig
2024-04-05  6:53           ` Christoph Hellwig
2024-04-05 10:15         ` Christian Brauner
2024-04-05 10:15           ` Christian Brauner
2024-03-28 16:34 ` [PATCH 16/26] netfs: Switch to using unsigned long long rather than loff_t David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 17/26] netfs: Fix writethrough-mode error handling David Howells
2024-03-28 16:34   ` David Howells
2024-04-15 12:40   ` Jeff Layton
2024-04-15 12:40     ` Jeff Layton
2024-04-17  9:04   ` David Howells
2024-04-17  9:04     ` David Howells
2024-03-28 16:34 ` [PATCH 18/26] netfs: Add some write-side stats and clean up some stat names David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 19/26] netfs: New writeback implementation David Howells
2024-03-28 16:34   ` David Howells
2024-03-29 10:34   ` Naveen Mamindlapalli
2024-03-29 10:34     ` Naveen Mamindlapalli
2024-03-30  1:06     ` Vadim Fedorenko
2024-03-30  1:06       ` Vadim Fedorenko
2024-03-30  1:06       ` Vadim Fedorenko
2024-03-30  1:06       ` Vadim Fedorenko
2024-03-30  1:03   ` Vadim Fedorenko
2024-03-30  1:03     ` Vadim Fedorenko
2024-03-28 16:34 ` [PATCH 20/26] netfs, afs: Implement helpers for new write code David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 21/26] netfs, 9p: " David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 22/26] netfs, cachefiles: " David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 23/26] netfs: Cut over to using new writeback code David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 24/26] netfs: Remove the old " David Howells
2024-03-28 16:34   ` David Howells
2024-04-15 12:20   ` Jeff Layton
2024-04-15 12:20     ` Jeff Layton
2024-04-17 10:36   ` David Howells
2024-04-17 10:36     ` David Howells
2024-03-28 16:34 ` [PATCH 25/26] netfs: Miscellaneous tidy ups David Howells
2024-03-28 16:34   ` David Howells
2024-03-28 16:34 ` [PATCH 26/26] netfs, afs: Use writeback retry to deal with alternate keys David Howells
2024-03-28 16:34   ` David Howells
2024-04-01 13:53   ` Simon Horman
2024-04-01 13:53     ` Simon Horman
2024-04-02  8:32   ` David Howells
2024-04-02  8:32     ` David Howells
2024-04-10 17:38     ` Simon Horman
2024-04-10 17:38       ` Simon Horman
2024-04-11  7:09     ` David Howells
2024-04-11  7:09       ` David Howells
2024-04-02  8:46 ` [PATCH 19/26] netfs: New writeback implementation David Howells
2024-04-02  8:46   ` David Howells
2024-04-02 10:48 ` [PATCH 00/26] netfs, afs, 9p, cifs: Rework netfs to use ->writepages() to copy to cache Christian Brauner
2024-04-02 10:48   ` Christian Brauner
2024-04-04  7:51 ` [PATCH 21/26] netfs, 9p: Implement helpers for new write code David Howells
2024-04-04  7:51   ` David Howells
2024-04-04  8:01 ` David Howells
2024-04-04  8:01   ` David Howells
2024-04-08 15:53 ` David Howells [this message]
2024-04-08 15:53   ` [PATCH 23/26] netfs: Cut over to using new writeback code David Howells
2024-04-15 12:49 ` [PATCH 00/26] netfs, afs, 9p, cifs: Rework netfs to use ->writepages() to copy to cache Jeff Layton
2024-04-15 12:49   ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877902.1712591597@warthog.procyon.org.uk \
    --to=dhowells@redhat.com \
    --cc=asmadeus@codewreck.org \
    --cc=ceph-devel@vger.kernel.org \
    --cc=christian@brauner.io \
    --cc=ericvh@kernel.org \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=idryomov@gmail.com \
    --cc=jlayton@kernel.org \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-cachefs@redhat.com \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-erofs@lists.ozlabs.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux_oss@crudebyte.com \
    --cc=lucho@ionkov.net \
    --cc=marc.dionne@auristor.com \
    --cc=netdev@vger.kernel.org \
    --cc=netfs@lists.linux.dev \
    --cc=pc@manguebit.com \
    --cc=smfrench@gmail.com \
    --cc=sprasad@microsoft.com \
    --cc=tom@talpey.com \
    --cc=v9fs@lists.linux.dev \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.