LKML Archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Jason Baron <jbaron@akamai.com>
Cc: Davide Libenzi <davidel@xmailserver.org>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	Eric Wong <normalperson@yhbt.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Ingo Molnar <mingo@redhat.com>,
	Linux API <linux-api@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH v2 2/2] epoll: introduce EPOLLEXCLUSIVE and EPOLLROUNDROBIN
Date: Wed, 18 Feb 2015 15:12:43 -0800	[thread overview]
Message-ID: <CALCETrUfs8R-CKwwq2x9xc9z=6aqKzOOaf2ts+TOs-H=W1sfug@mail.gmail.com> (raw)
In-Reply-To: <54E4CE14.5010708@akamai.com>

On Feb 18, 2015 9:38 AM, "Jason Baron" <jbaron@akamai.com> wrote:
>
> On 02/18/2015 11:33 AM, Ingo Molnar wrote:
> > * Jason Baron <jbaron@akamai.com> wrote:
> >
> >>> This has two main advantages: firstly it solves the
> >>> O(N) (micro-)problem, but it also more evenly
> >>> distributes events both between task-lists and within
> >>> epoll groups as tasks as well.
> >> Its solving 2 issues - spurious wakeups, and more even
> >> loading of threads. The event distribution is more even
> >> between 'epoll groups' with this patch, however, if
> >> multiple threads are blocking on a single 'epoll group',
> >> this patch does not affect the the event distribution
> >> there. [...]
> > Regarding your last point, are you sure about that?
> >
> > If we have say 16 epoll threads registered, and if the list
> > is static (no register/unregister activity), then the
> > wakeup pattern is in strict order of the list: threads
> > closer to the list head will be woken more frequently, in a
> > wake-once fashion. So if threads do just quick work and go
> > back to sleep quickly, then typically only the first 2-3
> > threads will get any runtime in practice - the wakeup
> > iteration never gets 'deep' into the list.
> >
> > With the round-robin shuffling of the list, the threads get
> > shuffled to the tail on wakeup, which distributes events
> > evenly: all 16 epoll threads will accumulate an even
> > distribution of runtime, statistically.
> >
> > Have I misunderstood this somehow?
> >
> >
>
> So in the case of multiple threads per epoll set, we currently
> add to the head of wakeup queue exclusively in 'epoll_wait()',
> and then subsequently remove from the queue once
> 'epoll_wait()' returns. So I don't think this patch addresses
> balancing on a per epoll set basis.
>
> I think we could address the case you describe by simply doing
> __add_wait_queue_tail_exclusive() instead of
> __add_wait_queue_exclusive() in epoll_wait(). However, I think
> the userspace API change is less clear since epoll_wait() doesn't
> currently have an 'input' events argument as epoll_ctl() does.

FWIW there's currently discussion about adding a new epoll API for
batch epoll_ctl.  It could be with coordinating with that effort if
some variant could address both use cases.

I'm still nervous about changing the per-fd wakeup stuff to do
anything other than waking everything.  After all, epoll and poll can
be used concurrently.

What about a slightly different approach: could an epoll fd support
multiple contexts?  For example, an fd could be set (with epoll_ctl or
the new batch stuff) to wake an any epoll waiter, one specific epoll
waiter, an epoll waiter preferably on the waking cpu, etc.

This would have the benefit of keeping the wakeup changes localized to
the epoll code.

--Andy

>
> Thanks,
>
> -Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-02-18 23:13 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-17 19:33 [PATCH v2 0/2] Add epoll round robin wakeup mode Jason Baron
2015-02-17 19:33 ` [PATCH v2 1/2] sched/wait: add " Jason Baron
2015-02-17 19:33 ` [PATCH v2 2/2] epoll: introduce EPOLLEXCLUSIVE and EPOLLROUNDROBIN Jason Baron
2015-02-18  8:07   ` Ingo Molnar
2015-02-18 15:42     ` Jason Baron
2015-02-18 16:33       ` Ingo Molnar
2015-02-18 17:38         ` Jason Baron
2015-02-18 17:45           ` Ingo Molnar
2015-02-18 17:51             ` Ingo Molnar
2015-02-18 22:18               ` Eric Wong
2015-02-19  3:26               ` Jason Baron
2015-02-22  0:24                 ` Eric Wong
2015-02-25 15:48                   ` Jason Baron
2015-02-18 23:12           ` Andy Lutomirski [this message]
     [not found]   ` <CAPh34mcPNQELwZCDTHej+HK=bpWgJ=jb1LeCtKoUHVgoDJOJoQ@mail.gmail.com>
2015-02-27 22:24     ` Jason Baron
2015-02-17 19:46 ` [PATCH v2 0/2] Add epoll round robin wakeup mode Andy Lutomirski
2015-02-17 20:33   ` Jason Baron
2015-02-17 21:09     ` Andy Lutomirski
2015-02-18  3:15       ` Jason Baron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrUfs8R-CKwwq2x9xc9z=6aqKzOOaf2ts+TOs-H=W1sfug@mail.gmail.com' \
    --to=luto@amacapital.net \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=davidel@xmailserver.org \
    --cc=jbaron@akamai.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mtk.manpages@gmail.com \
    --cc=normalperson@yhbt.net \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).