From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Jason Baron <jbaron@akamai.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: mtk.manpages@gmail.com, mingo@kernel.org, peterz@infradead.org,
	viro@ftp.linux.org.uk, normalperson@yhbt.net, m@silodev.com,
	corbet@lwn.net, luto@amacapital.net,
	torvalds@linux-foundation.org, hagen@jauu.net,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org
Subject: Re: [PATCH] epoll: add exclusive wakeups flag
Date: Tue, 15 Mar 2016 06:47:45 +1300
Message-ID: <56E6F941.9040307@gmail.com>
In-Reply-To: <56E6D0ED.20609@akamai.com>

[Restoring CC, which I see I accidentally dropped, one iteration back.]

Hi Jason,

Thanks for the review. I've tweaked one piece to respond to your
feedback. But I also have another new question below.

On 03/15/2016 03:55 AM, Jason Baron wrote:
> On 03/11/2016 06:25 PM, Michael Kerrisk (man-pages) wrote:
>> On 03/11/2016 09:51 PM, Jason Baron wrote:
>>> On 03/11/2016 03:30 PM, Michael Kerrisk (man-pages) wrote:

[...]

> Hi Michael,
> 
> Looks good. One comment below.
> 
> Thanks,
> 
>>        EPOLLEXCLUSIVE (since Linux 4.5)
>>               Sets  an  exclusive  wakeup  mode  for  the  epoll  file
>>               descriptor  that  is  being  attached to the target file
>>               descriptor, fd.  When a wakeup event occurs and multiple
>>               epoll  file  descriptors are attached to the same target
>>               file using EPOLLEXCLUSIVE, one or more of the epoll file
>>               descriptors  will  receive  an event with epoll_wait(2).
>>               The default in this scenario (when EPOLLEXCLUSIVE is not
>>               set)  is  for  all  epoll file descriptors to receive an
>>               event.  EPOLLEXCLUSIVE is thus useful for avoiding thun‐
>>               dering herd problems in certain scenarios.
>>
>>               If  the  same  file  descriptor  is  in  multiple  epoll
>>               instances, some with the EPOLLEXCLUSIVE flag, and others
>>               without,  then  events  will  be  provided  to all epoll
>>               instances that did not specify  EPOLLEXCLUSIVE,  and  at
>>               least  one  of  the  epoll  instances  that  did specify
>>               EPOLLEXCLUSIVE.
>>
>>               The following values may  be  specified  in  conjunction
>>               with EPOLLEXCLUSIVE: EPOLLIN, EPOLLOUT, EPOLLWAKEUP, and
>>               EPOLLET.  EPOLLHUP and EPOLLERR can also  be  specified,
>>               but  are  ignored (as usual).  Attempts to specify other
> 
> I'm not sure 'ignored' is the right wording here. 'EPOLLHUP' and
> 'EPOLLERR' are always included in the set of events when something is
> added as EPOLLEXCLUSIVE. This is consistent with the non-EPOLLEXCLUSIVE
> add case. 

Yes.

> So 'EPOLLHUP' and 'EPOLLERR' may be specified but will be
> included in the set of events on an add, whether they are specified or not.

Yes. I understand your discomfort with the word "ignored", but the
problem was that, because it made special mention of EPOLLHUP and EPOLLERR,
your proposed text made it sound as though EPOLLEXCLUSIVE somehow was
special with respect to these two flags. I wanted to clarify that it is not.
How about this:

              The following values may  be  specified  in  conjunction
              with EPOLLEXCLUSIVE: EPOLLIN, EPOLLOUT, EPOLLWAKEUP, and
              EPOLLET.  EPOLLHUP and EPOLLERR can also  be  specified,
              but  this  is  not  required: as usual, these events are
              always reported if they  occur,  regardless  of  whether
              they are specified in events.
?
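
To make the proposed wording concrete, here is a minimal sketch, assuming
Linux 4.5 or later. It registers the read end of a pipe with only
EPOLLIN | EPOLLEXCLUSIVE, closes the write end, and returns whatever
event mask epoll_wait(2) reports. The function name
events_after_writer_closes() and the one-second timeout are illustrative
choices, not anything from the patch. The fallback #define is for C
libraries whose headers predate the flag.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)   /* value used by the Linux UAPI headers */
#endif

/*
 * Hypothetical helper: add the read end of a pipe with only
 * EPOLLIN | EPOLLEXCLUSIVE (EPOLLHUP is NOT requested), close the
 * write end, and return the event mask that epoll_wait(2) reports.
 * Returns 0 on any failure or on timeout.
 */
uint32_t events_after_writer_closes(void)
{
    int pfd[2];
    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE };
    struct epoll_event out = { 0 };
    uint32_t result = 0;

    if (pipe(pfd) == -1)
        return 0;

    int epfd = epoll_create1(0);
    if (epfd != -1 &&
        epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev) == 0) {
        close(pfd[1]);              /* hang up: no writers remain */
        pfd[1] = -1;
        if (epoll_wait(epfd, &out, 1, 1000) == 1)
            result = out.events;
    }

    if (epfd != -1)
        close(epfd);
    close(pfd[0]);
    if (pfd[1] != -1)
        close(pfd[1]);
    return result;
}
```

If the wording above is right, the returned mask should include EPOLLHUP
even though only EPOLLIN was asked for.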

>>               values in events yield an error.  EPOLLEXCLUSIVE may  be
>>               used  only  in  an  EPOLL_CTL_ADD operation; attempts to
>>               employ  it  with  EPOLL_CTL_MOD  yield  an  error.    If
>>               EPOLLEXCLUSIVE has been set using epoll_ctl(2),  then  a
>>               subsequent EPOLL_CTL_MOD on the same epfd, fd pair yields
>>               an error.  An epoll_ctl(2) that specifies EPOLLEXCLUSIVE in
>>               events and specifies the target file descriptor fd as an
>>               epoll  instance will likewise fail.  The error in all of
>>               these cases is EINVAL.
>>
>>    ERRORS
>>        EINVAL An invalid event type was specified along with  EPOLLEX‐
>>               CLUSIVE in events.
>>
>>        EINVAL op was EPOLL_CTL_MOD and events included EPOLLEXCLUSIVE.
>>
>>        EINVAL op  was  EPOLL_CTL_MOD  and  the EPOLLEXCLUSIVE flag has
>>               previously been applied to this epfd, fd pair.
>>
>>        EINVAL EPOLLEXCLUSIVE was specified in event and fd refers
>>               to an epoll instance.
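
The four EINVAL cases listed above can be probed with a short sketch
like the following, assuming Linux 4.5 or later. The names try_ctl(),
struct excl_errors and probe_exclusive_errors() are made up for this
illustration, and EPOLLONESHOT is used simply as one example of an event
type that is not in the permitted list.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)   /* value used by the Linux UAPI headers */
#endif

/* Call epoll_ctl(2); return 0 on success or the errno value on failure. */
static int try_ctl(int epfd, int op, int fd, uint32_t events)
{
    struct epoll_event ev = { .events = events };
    return epoll_ctl(epfd, op, fd, &ev) == 0 ? 0 : errno;
}

/* One result slot per EINVAL case in the draft text (hypothetical type). */
struct excl_errors {
    int bad_event;      /* ADD with EPOLLONESHOT | EPOLLEXCLUSIVE */
    int mod_with_flag;  /* MOD whose events include EPOLLEXCLUSIVE */
    int mod_after_add;  /* MOD on a pair added with EPOLLEXCLUSIVE */
    int epoll_target;   /* ADD of an epoll fd with EPOLLEXCLUSIVE */
};

/* Fill *r with the observed errno values; return 0, or -1 if setup fails. */
int probe_exclusive_errors(struct excl_errors *r)
{
    int pfd[2], rc = -1;

    if (pipe(pfd) == -1)
        return -1;
    int epfd = epoll_create1(0);
    int inner = epoll_create1(0);   /* used only as a target fd */
    if (epfd == -1 || inner == -1)
        goto out;

    r->bad_event = try_ctl(epfd, EPOLL_CTL_ADD, pfd[0],
                           EPOLLIN | EPOLLONESHOT | EPOLLEXCLUSIVE);

    /* A plain registration, so that EPOLL_CTL_MOD has an entry to modify. */
    if (try_ctl(epfd, EPOLL_CTL_ADD, pfd[1], EPOLLOUT) != 0)
        goto out;
    r->mod_with_flag = try_ctl(epfd, EPOLL_CTL_MOD, pfd[1],
                               EPOLLOUT | EPOLLEXCLUSIVE);

    /* An exclusive registration, followed by an ordinary MOD. */
    if (try_ctl(epfd, EPOLL_CTL_ADD, pfd[0], EPOLLIN | EPOLLEXCLUSIVE) != 0)
        goto out;
    r->mod_after_add = try_ctl(epfd, EPOLL_CTL_MOD, pfd[0], EPOLLIN);

    r->epoll_target = try_ctl(epfd, EPOLL_CTL_ADD, inner,
                              EPOLLIN | EPOLLEXCLUSIVE);
    rc = 0;
out:
    if (epfd != -1)
        close(epfd);
    if (inner != -1)
        close(inner);
    close(pfd[0]);
    close(pfd[1]);
    return rc;
}
```

If the text above is accurate, all four fields should come back as
EINVAL.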

Returning to the second sentence in this description:

              When a wakeup event occurs and multiple epoll file descrip‐
              tors are attached to the same target file using EPOLLEXCLU‐
              SIVE, one or  more  of  the  epoll  file  descriptors  will
              receive  an  event with epoll_wait(2).

There is a point that is unclear to me: what does "target file" refer to?
Is it an open file description (aka open file table entry) or an inode?
I suspect the former, but it was not clear in your original text.

To make this point even clearer, here are two scenarios I'm thinking of.
In each case, we're talking of monitoring the read end of a FIFO.

===

Scenario 1:

We have three processes, each of which:
1. Creates an epoll instance
2. Opens the read end of the FIFO
3. Adds the read end of the FIFO to the epoll instance, specifying
   EPOLLEXCLUSIVE

When input becomes available on the FIFO, how many processes
get a wakeup?
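
For concreteness, Scenario 1 could be set up roughly as in the sketch
below. It does not answer the question; it only counts how many children
see an event on a given kernel. Everything named here is an assumption
made for illustration: the function names, the /tmp path, and the timing
values (200 ms before the write, a 500 ms epoll_wait(2) timeout, a
one-second linger). The writer is opened with O_RDWR, which is a Linux
convenience for opening a FIFO without blocking.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)   /* value used by the Linux UAPI headers */
#endif

#define NCHILD 3

/* Child body. Exit status: 0 = got an event, 1 = timed out, 2 = setup error. */
static void child_wait_on_fifo(const char *path)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE };
    struct epoll_event out;

    int epfd = epoll_create1(0);                        /* step 1 */
    int rfd = open(path, O_RDONLY | O_NONBLOCK);        /* step 2 */
    if (epfd == -1 || rfd == -1 ||
        epoll_ctl(epfd, EPOLL_CTL_ADD, rfd, &ev) == -1) /* step 3 */
        _exit(2);

    int n = epoll_wait(epfd, &out, 1, 500);
    sleep(1);   /* linger so that closing our fds at exit happens only
                   after the other children have finished waiting */
    _exit(n == 1 ? 0 : 1);
}

/* Returns the number of children that saw an event, or -1 on error. */
int scenario1_count_wakeups(void)
{
    char path[64];
    snprintf(path, sizeof(path), "/tmp/excl-fifo-%ld", (long) getpid());
    if (mkfifo(path, 0600) == -1)
        return -1;

    int wfd = open(path, O_RDWR);   /* Linux: does not block on a FIFO */
    if (wfd == -1) {
        unlink(path);
        return -1;
    }

    int forked = 0;
    for (int i = 0; i < NCHILD; i++) {
        pid_t pid = fork();
        if (pid == 0)
            child_wait_on_fifo(path);   /* does not return */
        if (pid > 0)
            forked++;
    }

    /* Give the children time to block in epoll_wait(2), then write. */
    struct timespec pause = { 0, 200 * 1000 * 1000 };
    nanosleep(&pause, NULL);
    int wrote = (write(wfd, "x", 1) == 1);

    int woken = 0, status;
    for (int i = 0; i < forked; i++)
        if (wait(&status) > 0 && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0)
            woken++;

    close(wfd);
    unlink(path);
    return (wrote && forked == NCHILD) ? woken : -1;
}
```

The returned count is what the question above is asking about; the
sketch makes no claim about what it should be beyond "at least one".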

===

Scenario 2:

A parent process opens the read end of a FIFO and then calls
fork() three times to create three children. Each child then:

1. Creates an epoll instance
2. Adds the read end of the FIFO to the epoll instance, specifying
   EPOLLEXCLUSIVE

When input becomes available on the FIFO, how many processes
get a wakeup?
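
Scenario 2 differs only in that the read end is opened once, before the
fork() calls, so all three children share one open file description. A
rough sketch follows; as with any such test, it only counts the children
that see an event and does not settle the question. The function names,
the /tmp path, and the timing values (200 ms before the write, a 500 ms
epoll_wait(2) timeout, a one-second linger) are all assumptions made for
illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)   /* value used by the Linux UAPI headers */
#endif

#define NCHILD 3

/* Child body. Exit status: 0 = got an event, 1 = timed out, 2 = setup error. */
static void child_wait_on_inherited_fd(int rfd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE };
    struct epoll_event out;

    int epfd = epoll_create1(0);                        /* step 1 */
    if (epfd == -1 ||
        epoll_ctl(epfd, EPOLL_CTL_ADD, rfd, &ev) == -1) /* step 2 */
        _exit(2);

    int n = epoll_wait(epfd, &out, 1, 500);
    sleep(1);   /* linger so that closing our fds at exit happens only
                   after the other children have finished waiting */
    _exit(n == 1 ? 0 : 1);
}

/* Returns the number of children that saw an event, or -1 on error. */
int scenario2_count_wakeups(void)
{
    char path[64];
    snprintf(path, sizeof(path), "/tmp/excl-fifo2-%ld", (long) getpid());
    if (mkfifo(path, 0600) == -1)
        return -1;

    /* The parent opens the read end once; the children inherit it. */
    int rfd = open(path, O_RDONLY | O_NONBLOCK);
    /* A reader now exists, so this O_WRONLY open does not block. */
    int wfd = (rfd == -1) ? -1 : open(path, O_WRONLY);
    if (rfd == -1 || wfd == -1) {
        if (rfd != -1)
            close(rfd);
        unlink(path);
        return -1;
    }

    int forked = 0;
    for (int i = 0; i < NCHILD; i++) {
        pid_t pid = fork();
        if (pid == 0)
            child_wait_on_inherited_fd(rfd);    /* does not return */
        if (pid > 0)
            forked++;
    }

    /* Give the children time to block in epoll_wait(2), then write. */
    struct timespec pause = { 0, 200 * 1000 * 1000 };
    nanosleep(&pause, NULL);
    int wrote = (write(wfd, "x", 1) == 1);

    int woken = 0, status;
    for (int i = 0; i < forked; i++)
        if (wait(&status) > 0 && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0)
            woken++;

    close(rfd);
    close(wfd);
    unlink(path);
    return (wrote && forked == NCHILD) ? woken : -1;
}
```

Comparing the count from this sketch with the count from a Scenario 1
setup on the same kernel is one way to see whether "target file" behaves
like an open file description or like an inode.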

===

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
