From: Hagen Paul Pfeifer <hagen@jauu.net>
To: torvalds@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, Hagen Paul Pfeifer <hagen@jauu.net>
Subject: [PATCH Resend] epoll: add EPOLLEXCLUSIVE support
Date: Wed, 28 Mar 2012 15:57:40 +0200
Message-ID: <1332943060-18374-1-git-send-email-hagen@jauu.net>

High-performance servers sometimes create one listening socket (e.g. on
port 80), create an epoll file descriptor and add the socket to it. They
then create _SC_NPROCESSORS_ONLN threads that all wait for events. This
often results in a thundering herd problem because every waiter is woken
and all CPUs are scheduled.

This patch adds a flag for epoll_ctl(2) called EPOLLEXCLUSIVE. If a
descriptor is added with this flag, only one CPU is scheduled in.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
---
Dave rejected the patch, saying it is not network specific. Because there
is no epoll maintainer, I am sending it directly this time.
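
For illustration, the server pattern from the description might look
roughly like this from userspace. This is only a sketch under stated
assumptions: PATCH_EPOLLEXCLUSIVE is a private name that mirrors the bit
value defined by this patch (userspace headers do not carry it), and the
helper names listener_event(), setup_epoll(), wait_one() and
worker_count() are hypothetical. A kernel without this patch will not
interpret the bit as an exclusive wakeup.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Assumption: bit value copied from this patch's eventpoll.h hunk. */
#define PATCH_EPOLLEXCLUSIVE (1u << 29)

/* Build the event used to register the shared listening socket. */
static struct epoll_event listener_event(int listen_fd, int exclusive)
{
	struct epoll_event ev;

	assert(listen_fd >= 0);
	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	if (exclusive)
		ev.events |= PATCH_EPOLLEXCLUSIVE;
	ev.data.fd = listen_fd;
	return ev;
}

/* Create an epoll instance and add listen_fd; returns epfd or -1. */
static int setup_epoll(int listen_fd, int exclusive)
{
	struct epoll_event ev = listener_event(listen_fd, exclusive);
	int epfd = epoll_create1(0);

	if (epfd < 0)
		return -1;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev) < 0) {
		close(epfd);
		return -1;
	}
	return epfd;
}

/* Body of one worker: every thread blocks here on the same epfd. */
static int wait_one(int epfd, struct epoll_event *out, int timeout_ms)
{
	return epoll_wait(epfd, out, 1, timeout_ms);
}

/* Number of worker threads: one per online CPU, at least one. */
static long worker_count(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN);

	return n > 0 ? n : 1;
}
```

A server would call setup_epoll(listen_fd, 1) once and then start
worker_count() threads that each loop on wait_one() followed by
accept(2). Without the flag, every one of those threads is woken for a
single incoming connection.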

 fs/eventpoll.c            |    7 +++++--
 include/linux/eventpoll.h |    3 +++
 2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 629e9ed..16d787f 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -88,7 +88,7 @@
  */
 
 /* Epoll private bits inside the event mask */
-#define EP_PRIVATE_BITS (EPOLLONESHOT | EPOLLET)
+#define EP_PRIVATE_BITS (EPOLLONESHOT | EPOLLET | EPOLLEXCLUSIVE)
 
 /* Maximum number of nesting allowed inside epoll sets */
 #define EP_MAX_NESTS 4
@@ -969,7 +969,10 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
 		init_waitqueue_func_entry(&pwq->wait, ep_poll_callback);
 		pwq->whead = whead;
 		pwq->base = epi;
-		add_wait_queue(whead, &pwq->wait);
+		if (unlikely(epi->event.events & EPOLLEXCLUSIVE))
+			add_wait_queue_exclusive(whead, &pwq->wait);
+		else
+			add_wait_queue(whead, &pwq->wait);
 		list_add_tail(&pwq->llink, &epi->pwqlist);
 		epi->nwait++;
 	} else {
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 657ab55..d334389 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -26,6 +26,9 @@
 #define EPOLL_CTL_DEL 2
 #define EPOLL_CTL_MOD 3
 
+/* Set Exclusive wake up behaviour for the target file descriptor */
+#define EPOLLEXCLUSIVE (1 << 29)
+
 /* Set the One Shot behaviour for the target file descriptor */
 #define EPOLLONESHOT (1 << 30)
 
-- 
1.7.9.1

