From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752255AbbBQUdZ (ORCPT ); Tue, 17 Feb 2015 15:33:25 -0500
Received: from prod-mail-xrelay07.akamai.com ([72.246.2.115]:64275 "EHLO
	prod-mail-xrelay07.akamai.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1750873AbbBQUdX (ORCPT );
	Tue, 17 Feb 2015 15:33:23 -0500
Message-ID: <54E3A591.2050806@akamai.com>
Date: Tue, 17 Feb 2015 15:33:21 -0500
From: Jason Baron
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: Andy Lutomirski
CC: Peter Zijlstra, Ingo Molnar, Al Viro, Andrew Morton, Eric Wong,
	Davide Libenzi, Michael Kerrisk-manpages,
	"linux-kernel@vger.kernel.org", Linux FS Devel, Linux API
Subject: Re: [PATCH v2 0/2] Add epoll round robin wakeup mode
References:
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/17/2015 02:46 PM, Andy Lutomirski wrote:
> On Tue, Feb 17, 2015 at 11:33 AM, Jason Baron wrote:
>> When we are sharing a wakeup source among multiple epoll fds, we end up with
>> thundering herd wakeups, since there is currently no way to add to the
>> wakeup source exclusively. This series introduces 2 new epoll flags,
>> EPOLLEXCLUSIVE for adding to a wakeup source exclusively. And EPOLLROUNDROBIN
>> which is to be used in conjunction to EPOLLEXCLUSIVE to evenly
>> distribute the wakeups. This patch was originally motivated by a desire to
>> improve wakeup balance and cpu usage for a listen socket() shared amongst
>> multiple epoll fd sets.
>>
>> See: http://lwn.net/Articles/632590/ for previous test program and testing
>> results.
>>
>> Epoll manpage text:
>>
>> EPOLLEXCLUSIVE
>>     Provides exclusive wakeups when attaching multiple epoll fds to a
>>     shared wakeup source. Must be specified with an EPOLL_CTL_ADD operation.
>>
>> EPOLLROUNDROBIN
>>     Provides balancing for exclusive wakeups when attaching multiple epoll
>>     fds to a shared wakeup source. Depends on EPOLLEXCLUSIVE being set and
>>     must be specified with an EPOLL_CTL_ADD operation.
>>
>> Thanks,
> What permissions do you need on the file descriptor to do this? This
> will be the first case where a poll-like operation has side effects,
> and that's rather weird IMO.
>

So in the case where you have both non-exclusive and exclusive
waiters, all of the non-exclusive waiters will continue to get woken
up. However, I think you're getting at having multiple exclusive
waiters and potentially 'starving' out other exclusive waiters.

In general, I think wait queues are associated with a 'struct file',
so I think unless you are sharing your fd table, this isn't an issue.
However, there may be cases where this is not true? In which case,
perhaps, we could limit this to CAP_SYS_ADMIN...

Thanks,

-Jason