From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752889AbbBQVJc (ORCPT ); Tue, 17 Feb 2015 16:09:32 -0500
Received: from mail-lb0-f178.google.com ([209.85.217.178]:32994 "EHLO
	mail-lb0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751758AbbBQVJa (ORCPT ); Tue, 17 Feb 2015 16:09:30 -0500
MIME-Version: 1.0
In-Reply-To: <54E3A591.2050806@akamai.com>
References: <54E3A591.2050806@akamai.com>
From: Andy Lutomirski
Date: Tue, 17 Feb 2015 13:09:08 -0800
Message-ID:
Subject: Re: [PATCH v2 0/2] Add epoll round robin wakeup mode
To: Jason Baron
Cc: Peter Zijlstra, Ingo Molnar, Al Viro, Andrew Morton, Eric Wong,
	Davide Libenzi, Michael Kerrisk-manpages,
	"linux-kernel@vger.kernel.org", Linux FS Devel, Linux API
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 17, 2015 at 12:33 PM, Jason Baron wrote:
> On 02/17/2015 02:46 PM, Andy Lutomirski wrote:
>> On Tue, Feb 17, 2015 at 11:33 AM, Jason Baron wrote:
>>> When we are sharing a wakeup source among multiple epoll fds, we end up with
>>> thundering herd wakeups, since there is currently no way to add to the
>>> wakeup source exclusively. This series introduces 2 new epoll flags,
>>> EPOLLEXCLUSIVE for adding to a wakeup source exclusively. And EPOLLROUNDROBIN
>>> which is to be used in conjunction with EPOLLEXCLUSIVE to evenly
>>> distribute the wakeups. This patch was originally motivated by a desire to
>>> improve wakeup balance and cpu usage for a listen socket() shared amongst
>>> multiple epoll fd sets.
>>>
>>> See: http://lwn.net/Articles/632590/ for previous test program and testing
>>> results.
>>>
>>> Epoll manpage text:
>>>
>>> EPOLLEXCLUSIVE
>>> Provides exclusive wakeups when attaching multiple epoll fds to a
>>> shared wakeup source. Must be specified with an EPOLL_CTL_ADD operation.
>>>
>>> EPOLLROUNDROBIN
>>> Provides balancing for exclusive wakeups when attaching multiple epoll
>>> fds to a shared wakeup source. Depends on EPOLLEXCLUSIVE being set and
>>> must be specified with an EPOLL_CTL_ADD operation.
>>>
>>> Thanks,
>> What permissions do you need on the file descriptor to do this? This
>> will be the first case where a poll-like operation has side effects,
>> and that's rather weird IMO.
>>
>
> So in the case where you have both non-exclusive and exclusive
> waiters, all of the non-exclusive waiters will continue to get woken
> up. However, I think you're getting at having multiple exclusive
> waiters and potentially 'starving' out other exclusive waiters.
>
> In general, I think wait queues are associated with a 'struct file',
> so I think unless you are sharing your fd table, this isn't an issue.
> However, there may be cases where this is not true? In which
> case, perhaps, we could limit this to CAP_SYS_ADMIN...

There's also SCM_RIGHTS, which can be used in conjunction with file
sealing and such.

In general, I feel like this patch series solves a problem that isn't
well understood and does it by adding a rather strange new mechanism.
Is there really a problem that can't be addressed by more normal epoll
features?

--Andy

>
> Thanks,
>
> -Jason
>

-- 
Andy Lutomirski
AMA Capital Management, LLC
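
For readers following the thread, the three wakeup behaviours under
discussion can be sketched with a toy model. This is not kernel code and
is not taken from the patch; the function name wake() and the
(name, exclusive) tuples are made up for illustration. It assumes that
every non-exclusive waiter is woken on each event, that at most one
exclusive waiter is woken, and that the round-robin mode works by moving
the woken exclusive waiter to the tail of the queue (one plausible
reading of "evenly distribute the wakeups").

```python
from collections import deque


def wake(queue, round_robin=False):
    """Model a single wakeup event on a shared wait queue (toy model).

    queue is a deque of (name, exclusive) pairs, one per epoll fd attached
    to the shared wakeup source. Returns the names that were woken.
    """
    woken = []
    chosen = None  # the one exclusive waiter picked for this event
    for entry in queue:
        name, exclusive = entry
        if not exclusive:
            # Non-exclusive waiters are always woken.
            woken.append(name)
        elif chosen is None:
            # Only the first exclusive waiter in queue order is woken.
            chosen = entry
            woken.append(name)
    if round_robin and chosen is not None:
        # Assumed round-robin policy: rotate the woken waiter to the tail
        # so the next event picks a different exclusive waiter.
        queue.remove(chosen)
        queue.append(chosen)
    return woken


# Thundering herd: three non-exclusive waiters, all woken by one event.
herd = deque([("ep0", False), ("ep1", False), ("ep2", False)])
print(wake(herd))  # ['ep0', 'ep1', 'ep2']

# Exclusive only: the same waiter is picked every time (the starvation
# concern raised in the thread).
excl = deque([("ep0", True), ("ep1", True), ("ep2", True)])
print([wake(excl) for _ in range(3)])  # [['ep0'], ['ep0'], ['ep0']]

# Exclusive plus the assumed round-robin rotation: wakeups are spread out.
rr = deque([("ep0", True), ("ep1", True), ("ep2", True)])
print([wake(rr, round_robin=True) for _ in range(4)])
# [['ep0'], ['ep1'], ['ep2'], ['ep0']]
```

In this model the rotation changes the shared queue itself, which is the
kind of side effect on a poll-like operation that the permissions
question above is about.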