From: Ian Kent <raven@themaw.net>
To: Fox Chen <foxhlchen@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Tejun Heo <tj@kernel.org>, Al Viro <viro@zeniv.linux.org.uk>,
	Eric Sandeen <sandeen@sandeen.net>,
	Brice Goglin <brice.goglin@gmail.com>,
	Rick Lindsley <ricklind@linux.vnet.ibm.com>,
	David Howells <dhowells@redhat.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 0/5] kernfs: proposed locking and concurrency improvement
Date: Mon, 17 May 2021 09:32:03 +0800
Message-ID: <da58dcefb59d2b51d95d1dfc012ba058bc77f23b.camel@themaw.net>
In-Reply-To: <CAC2o3DL1VwbLgajSYSR_UPL-53cjHDp+X63CerQsZ8tgNgO=-A@mail.gmail.com>

On Fri, 2021-05-14 at 10:34 +0800, Fox Chen wrote:
> On Fri, May 14, 2021 at 9:34 AM Ian Kent <raven@themaw.net> wrote:
> > 
> > On Thu, 2021-05-13 at 23:37 +0800, Fox Chen wrote:
> > > Hi Ian
> > > 
> > > On Thu, May 13, 2021 at 10:10 PM Ian Kent <raven@themaw.net> wrote:
> > > > 
> > > > On Wed, 2021-05-12 at 16:54 +0800, Fox Chen wrote:
> > > > > On Wed, May 12, 2021 at 4:47 PM Fox Chen <foxhlchen@gmail.com> wrote:
> > > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > I ran it on my benchmark (
> > > > > > https://github.com/foxhlchen/sysfs_benchmark).
> > > > > > 
> > > > > > machine: aws c5 (Intel Xeon with 96 logical cores)
> > > > > > kernel: v5.12
> > > > > > > benchmark: create 96 threads and bind them to each core,
> > > > > > > then run open+read+close on a sysfs file simultaneously
> > > > > > > for 1000 times.
> > > > > > result:
> > > > > > > Without the patchset, an open+read+close operation takes
> > > > > > > 550-570 us, and perf shows significant time (>40%) spent
> > > > > > > on mutex_lock.
> > > > > > > After applying it, it takes 410-440 us for that operation
> > > > > > > and perf shows only ~4% time on mutex_lock.
> > > > > > 
> > > > > > > It's weird, I don't see a huge performance boost compared
> > > > > > > to v2, even
> > > > > 
> > > > > > I meant I don't see a huge performance boost here and it's
> > > > > > way worse than v2.
> > > > > IIRC, for v2 fastest one only takes 40us
> > > > 
> > > > Thanks Fox,
> > > > 
> > > > I'll have a look at those reports but this is puzzling.
> > > > 
> > > > Perhaps the added overhead of the check if an update is
> > > > needed is taking more than expected and more than just
> > > > taking the lock and being done with it. Then there's
> > > > the v2 series ... I'll see if I can dig out your reports
> > > > on those too.
> > > 
> > > Apologies, I was mistaken, it's compared to V3, not V2. The
> > > previous benchmark report is here.
> > > https://lore.kernel.org/linux-fsdevel/CAC2o3DKNc=sL2n8291Dpiyb0bRHaX=nd33ogvO_LkJqpBj-YmA@mail.gmail.com/
> > 
> > Are all these tests using a single file name in the open/read/close
> > loop?
> 
> Yes, because it's easy to implement yet enough to trigger the
> mutex_lock.
> 
> And you are right, it's not a real-life pattern, but on the bright
> side, it proves there is no original mutex_lock problem anymore. :)

I've been looking at your reports and they are quite interesting.

> 
> > That being the case the per-object inode lock will behave like a
> > mutex and once contention occurs any speed benefits of a spinlock
> > over a mutex (or rwsem) will disappear.
> > 
> > In this case changing from a write lock to a read lock in those
> > functions and adding the inode mutex will do nothing but add the
> > overhead of taking the read lock. And similarly adding the update
> > check function also just adds overhead and, as we see, once
> > contention starts it has a cumulative effect that's often not
> > linear.
> > 
> > The whole idea of a read lock/per-object spin lock was to reduce
> > the possibility of contention for paths other than the same path
> > while not impacting same path accesses too much for an overall
> > gain. Based on this I'm thinking the update check function is
> > probably not worth keeping, it just adds unnecessary churn and
> > has a negative impact for same file contention access patterns.

The reports indicate (to me anyway) that the slowdown isn't
due to kernfs. It looks more like kernfs is now putting pressure
on the VFS, mostly on the file table lock but it looks like
there's a mild amount of contention on a few other locks as well
now.

That's a whole different problem. Those file table handling
functions don't appear to have any obvious problems, so they are
just doing what they have to do and that can't be avoided.

That's definitely out of scope for these changes.

And, as you'd expect, once any appreciable amount of contention
happens our measurements go out the window, certainly with
respect to kernfs.

It also doesn't change my opinion that checking if an inode
attribute update is needed in kernfs isn't useful since, IIUC,
that file table lock contention would result even if you were
using different paths.

So I'll drop that patch from the series.

Ian
> > 
> > I think that using multiple paths, at least one per test process
> > (so if you are running 16 processes use at least 16 different
> > files, the same in each process), and selecting one at random
> > for each loop of the open would better simulate real world
> > access patterns.
> > 
> > 
> > Ian
> > 
> 
> 
> thanks,
> fox
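
For reference, the access pattern suggested in the quoted text above
(several files, one picked at random on each pass of the
open+read+close loop) could be sketched roughly as below. This is
only an illustration and not the code from the sysfs_benchmark
repository: the names worker() and run_benchmark() are made up
here, the workers are not pinned to cores, and Python threads will
not load the kernel the way a native benchmark does.

```python
import os
import random
import threading
import time


def worker(paths, iterations, results, index, seed):
    """Time `iterations` open+read+close cycles, one random path per cycle."""
    rng = random.Random(seed)  # per-thread generator, seeded for repeatability
    start = time.perf_counter()
    for _ in range(iterations):
        path = rng.choice(paths)  # pick a different file on each pass
        fd = os.open(path, os.O_RDONLY)
        try:
            os.read(fd, 4096)
        finally:
            os.close(fd)
    elapsed = time.perf_counter() - start
    # Mean time per open+read+close, in microseconds.
    results[index] = elapsed / iterations * 1e6


def run_benchmark(paths, n_threads, iterations=1000):
    """Run `n_threads` workers concurrently; return each worker's mean us/op."""
    results = [0.0] * n_threads
    threads = [
        threading.Thread(target=worker,
                         args=(paths, iterations, results, i, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Passing a list of at least as many sysfs attribute files as there
are workers would follow the "at least one per test process"
suggestion; passing a one-element list reproduces the single-file
case used in the reports so far.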


