From: Jason Baron <jbaron@akamai.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Al Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
Eric Wong <normalperson@yhbt.net>,
Davide Libenzi <davidel@xmailserver.org>,
Michael Kerrisk-manpages <mtk.manpages@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linux FS Devel <linux-fsdevel@vger.kernel.org>,
Linux API <linux-api@vger.kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
edumazet@google.com
Subject: Re: [PATCH v2 0/2] Add epoll round robin wakeup mode
Date: Tue, 17 Feb 2015 22:15:51 -0500 [thread overview]
Message-ID: <54E403E7.2060209@akamai.com> (raw)
In-Reply-To: <CALCETrWg9sdyoKg0-BkwKQgyANvJybQ_wqjTfvYEGW1+S1J5Bw@mail.gmail.com>
On 02/17/2015 04:09 PM, Andy Lutomirski wrote:
> On Tue, Feb 17, 2015 at 12:33 PM, Jason Baron <jbaron@akamai.com> wrote:
>> On 02/17/2015 02:46 PM, Andy Lutomirski wrote:
>>> On Tue, Feb 17, 2015 at 11:33 AM, Jason Baron <jbaron@akamai.com> wrote:
>>>> When we are sharing a wakeup source among multiple epoll fds, we end up with
>>>> thundering herd wakeups, since there is currently no way to add to the
>>>> wakeup source exclusively. This series introduces 2 new epoll flags,
>>>> EPOLLEXCLUSIVE for adding to a wakeup source exclusively. And EPOLLROUNDROBIN
>>>> which is to be used in conjunction to EPOLLEXCLUSIVE to evenly
>>>> distribute the wakeups. This patch was originally motivated by a desire to
>>>> improve wakeup balance and cpu usage for a listen socket() shared amongst
>>>> multiple epoll fd sets.
>>>>
>>>> See: http://lwn.net/Articles/632590/ for previous test program and testing
>>>> resutls.
>>>>
>>>> Epoll manpage text:
>>>>
>>>> EPOLLEXCLUSIVE
>>>> Provides exclusive wakeups when attaching multiple epoll fds to a
>>>> shared wakeup source. Must be specified with an EPOLL_CTL_ADD operation.
>>>>
>>>> EPOLLROUNDROBIN
>>>> Provides balancing for exclusive wakeups when attaching multiple epoll
>>>> fds to a shared wakeup soruce. Depends on EPOLLEXCLUSIVE being set and
>>>> must be specified with an EPOLL_CTL_ADD operation.
>>>>
>>>> Thanks,
>>> What permissions do you need on the file descriptor to do this? This
>>> will be the first case where a poll-like operation has side effects,
>>> and that's rather weird IMO.
>>>
>> So in the case where you have both non-exclusive and exclusive
>> waiters, all of the non-exclusive waiters will continue to get woken
>> up. However, I think you're getting at having multiple exclusive
>> waiters and potentially 'starving' out other exclusive waiters.
>>
>> In general, I think wait queues are associated with a 'struct file',
>> so I think unless you are sharing your fd table, this isn't an issue.
>> However, there may be cases where this is not true? In which
>> case, perhaps, we could limit this to CAP_SYS_ADMIN...
> There's also SCM_RIGHTS, which can be used in conjunction with file
> sealing and such.
>
> In general, I feel like this patch series solves a problem that isn't
> well understood and does it by adding a rather strange new mechanism.
> Is there really a problem that can't be addressed by more normal epoll
> features?
>
> --Andy
hmm....so I dug through some of the Linux archives a bit and this
problem seems to crop up every so often without resolution.
So I do believe that its an issue that ppl are more generally
interested in.
See:
http://lkml.iu.edu/hypermail/linux/kernel/1201.1/02620.html
http://marc.info/?l=linux-kernel&m=128638781921073&w=2
In the latter thread, Linus suggests adding it to the "requested events"
field to poll: http://marc.info/?l=linux-kernel&m=128639416832335&w=2
So, I think that this series at least moves in that suggested direction.
Thanks,
-Jason
prev parent reply other threads:[~2015-02-18 3:15 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-02-17 19:33 [PATCH v2 0/2] Add epoll round robin wakeup mode Jason Baron
2015-02-17 19:33 ` [PATCH v2 1/2] sched/wait: add " Jason Baron
2015-02-17 19:33 ` [PATCH v2 2/2] epoll: introduce EPOLLEXCLUSIVE and EPOLLROUNDROBIN Jason Baron
2015-02-18 8:07 ` Ingo Molnar
2015-02-18 15:42 ` Jason Baron
2015-02-18 16:33 ` Ingo Molnar
2015-02-18 17:38 ` Jason Baron
2015-02-18 17:45 ` Ingo Molnar
2015-02-18 17:51 ` Ingo Molnar
2015-02-18 22:18 ` Eric Wong
2015-02-19 3:26 ` Jason Baron
2015-02-22 0:24 ` Eric Wong
2015-02-25 15:48 ` Jason Baron
2015-02-18 23:12 ` Andy Lutomirski
[not found] ` <CAPh34mcPNQELwZCDTHej+HK=bpWgJ=jb1LeCtKoUHVgoDJOJoQ@mail.gmail.com>
2015-02-27 22:24 ` Jason Baron
2015-02-17 19:46 ` [PATCH v2 0/2] Add epoll round robin wakeup mode Andy Lutomirski
2015-02-17 20:33 ` Jason Baron
2015-02-17 21:09 ` Andy Lutomirski
2015-02-18 3:15 ` Jason Baron [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54E403E7.2060209@akamai.com \
--to=jbaron@akamai.com \
--cc=akpm@linux-foundation.org \
--cc=davidel@xmailserver.org \
--cc=edumazet@google.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@redhat.com \
--cc=mtk.manpages@gmail.com \
--cc=normalperson@yhbt.net \
--cc=peterz@infradead.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).