From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754028AbcCJTrV (ORCPT ); Thu, 10 Mar 2016 14:47:21 -0500
Received: from mail-pa0-f42.google.com ([209.85.220.42]:36191 "EHLO
        mail-pa0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752736AbcCJTrM (ORCPT );
        Thu, 10 Mar 2016 14:47:12 -0500
Subject: Re: [PATCH] epoll: add exclusive wakeups flag
To: Daniel Borkmann , akpm@linux-foundation.org
References: <56A9C03B.7020104@gmail.com> <56AA56A2.3000700@akamai.com>
        <56AB1F6C.7000609@gmail.com> <56E1C2B5.2040905@akamai.com>
Cc: mtk.manpages@gmail.com, mingo@kernel.org, peterz@infradead.org,
        viro@ftp.linux.org.uk, normalperson@yhbt.net, m@silodev.com,
        corbet@lwn.net, luto@amacapital.net, torvalds@linux-foundation.org,
        hagen@jauu.net, linux-kernel@vger.kernel.org,
        linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org
From: "Michael Kerrisk (man-pages)"
Message-ID: <56E1CF35.2070005@gmail.com>
Date: Thu, 10 Mar 2016 20:47:01 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
        Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <56E1C2B5.2040905@akamai.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Jason,

> Ok, here's some updated text:
>
> EPOLLEXCLUSIVE
>
> Sets an exclusive wakeup mode for the epfd file descriptor that is being
> attached to the target file descriptor, fd. When a wakeup event occurs
> and multiple epfd file descriptors are attached to the same target file
> using EPOLLEXCLUSIVE, one or more epfds will receive an event with
> epoll_wait(2). The default in this scenario (when EPOLLEXCLUSIVE is not
> set) is for all epfds to receive an event.
>
> The events supported by EPOLLEXCLUSIVE are: EPOLLIN, EPOLLOUT, EPOLLERR,
> EPOLLHUP, EPOLLWAKEUP, and EPOLLET.
> epoll_wait(2) will always wait for
> EPOLLERR and EPOLLHUP; it is not necessary to set it in events. If
> EPOLLEXCLUSIVE is set using epoll_ctl(2), then a subsequent
> EPOLL_CTL_MOD on the same epfd, fd pair will return -EINVAL. An
> epoll_ctl(2) that specifies EPOLLEXCLUSIVE in events and specifies the
> target file descriptor fd as an epoll instance will return -EINVAL
> as well.

So, I worked that up into the following text:

    EPOLLEXCLUSIVE (since Linux 4.5)
        Sets an exclusive wakeup mode for the epoll file
        descriptor that is being attached to the target file
        descriptor, fd. When a wakeup event occurs and multiple
        epoll file descriptors are attached to the same target
        file using EPOLLEXCLUSIVE, one or more of the epoll file
        descriptors will receive an event with epoll_wait(2).
        The default in this scenario (when EPOLLEXCLUSIVE is not
        set) is for all epoll file descriptors to receive an
        event. EPOLLEXCLUSIVE is thus useful for avoiding
        thundering herd problems in certain scenarios.

        If the same file descriptor is in multiple epoll
        instances, some with the EPOLLEXCLUSIVE flag, and others
        without, then events will be provided to all epoll
        instances that did not specify EPOLLEXCLUSIVE, and at
        least one of the epoll instances that did specify
        EPOLLEXCLUSIVE.

        The following values may be specified in conjunction
        with EPOLLEXCLUSIVE: EPOLLIN, EPOLLOUT, EPOLLWAKEUP, and
        EPOLLET. EPOLLHUP and EPOLLERR can also be specified,
        but are ignored (as usual). Attempts to specify other
        values in events yield an error. EPOLLEXCLUSIVE may be
        used only in an EPOLL_CTL_ADD operation; attempts to
        employ it with EPOLL_CTL_MOD yield an error. If
        EPOLLEXCLUSIVE has been set using epoll_ctl(2), then a
        subsequent EPOLL_CTL_MOD on the same epfd, fd pair
        yields an error. An epoll_ctl(2) that specifies
        EPOLLEXCLUSIVE in events and specifies the target file
        descriptor fd as an epoll instance will likewise fail.
        The error in all of these cases is EINVAL.
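For anyone who wants to check the wording above against a running
kernel, here is a minimal sketch in C. It is not part of the proposed
man-page text. The helper names ctl_errno(), count_ready() and
einval_mismatches() are hypothetical, made up for this illustration.
The fallback definition of EPOLLEXCLUSIVE assumes the value used in the
kernel's UAPI header and is only there for toolchains whose
<sys/epoll.h> predates the flag. Error paths leak descriptors, which is
tolerable in a throwaway test but not in real code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE              /* assumption: older libc headers */
#define EPOLLEXCLUSIVE (1u << 28)
#endif

#define MAX_EP 16

/* Hypothetical helper: call epoll_ctl(); return 0 or the errno value. */
static int ctl_errno(int epfd, int op, int fd, uint32_t events)
{
    struct epoll_event ev;

    ev.events = events;
    ev.data.fd = fd;
    return epoll_ctl(epfd, op, fd, &ev) == 0 ? 0 : errno;
}

/*
 * Attach n epoll instances to the read end of one pipe using "events",
 * make the pipe readable, and return how many instances then report an
 * event from a non-blocking epoll_wait(). Returns -1 on setup failure.
 */
int count_ready(int n, uint32_t events)
{
    int pfd[2], epfd[MAX_EP], i, ready = 0;
    struct epoll_event ev;

    assert(n >= 1 && n <= MAX_EP);
    if (pipe(pfd) == -1)
        return -1;
    for (i = 0; i < n; i++) {
        epfd[i] = epoll_create1(0);
        if (epfd[i] == -1 ||
            ctl_errno(epfd[i], EPOLL_CTL_ADD, pfd[0], events) != 0)
            return -1;
    }
    if (write(pfd[1], "x", 1) != 1)     /* the single wakeup event */
        return -1;
    for (i = 0; i < n; i++) {
        if (epoll_wait(epfd[i], &ev, 1, 0) == 1)
            ready++;
        close(epfd[i]);
    }
    close(pfd[0]);
    close(pfd[1]);
    return ready;
}

/*
 * Try the four failure cases from the draft and return how many did
 * NOT fail with EINVAL (0 means the kernel agrees with the text).
 * Returns -1 on setup failure.
 */
int einval_mismatches(void)
{
    int pfd[2], ep_excl, ep_plain, ep_target, bad = 0;

    if (pipe(pfd) == -1)
        return -1;
    ep_excl = epoll_create1(0);
    ep_plain = epoll_create1(0);
    ep_target = epoll_create1(0);
    if (ep_excl == -1 || ep_plain == -1 || ep_target == -1)
        return -1;
    if (ctl_errno(ep_excl, EPOLL_CTL_ADD, pfd[0],
                  EPOLLIN | EPOLLEXCLUSIVE) != 0 ||
        ctl_errno(ep_plain, EPOLL_CTL_ADD, pfd[0], EPOLLIN) != 0)
        return -1;

    /* A value outside the permitted list (here EPOLLONESHOT). */
    bad += ctl_errno(ep_plain, EPOLL_CTL_ADD, pfd[1],
                     EPOLLOUT | EPOLLONESHOT | EPOLLEXCLUSIVE) != EINVAL;
    /* EPOLL_CTL_MOD with EPOLLEXCLUSIVE in events. */
    bad += ctl_errno(ep_plain, EPOLL_CTL_MOD, pfd[0],
                     EPOLLIN | EPOLLEXCLUSIVE) != EINVAL;
    /* EPOLL_CTL_MOD on a pair that was added with EPOLLEXCLUSIVE. */
    bad += ctl_errno(ep_excl, EPOLL_CTL_MOD, pfd[0], EPOLLIN) != EINVAL;
    /* EPOLLEXCLUSIVE where the target fd is itself an epoll instance. */
    bad += ctl_errno(ep_plain, EPOLL_CTL_ADD, ep_target,
                     EPOLLIN | EPOLLEXCLUSIVE) != EINVAL;

    close(ep_excl);
    close(ep_plain);
    close(ep_target);
    close(pfd[0]);
    close(pfd[1]);
    return bad;
}
```

Called from a main(), count_ready(4, EPOLLIN) should report all four
instances, while count_ready(4, EPOLLIN | EPOLLEXCLUSIVE) may report
anything from one to four, consistent with the "one or more" wording.
Since no thread is blocked in epoll_wait(2) when the write happens,
the exact count in the exclusive case depends on the kernel version,
so the sketch deliberately does not assume a single value.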
    ERRORS
        EINVAL An invalid event type was specified along with
               EPOLLEXCLUSIVE in events.

        EINVAL op was EPOLL_CTL_MOD and events included
               EPOLLEXCLUSIVE.

        EINVAL op was EPOLL_CTL_MOD and the EPOLLEXCLUSIVE flag
               has previously been applied to this epfd, fd pair.

        EINVAL EPOLLEXCLUSIVE was specified in event and fd
               refers to an epoll instance.

Is there anything that needs to be fixed in the above text?

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/