unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
* [PATCH 0/6] reduce thundering herds on Linux 4.5+
@ 2021-10-01  3:09 Eric Wong
  2021-10-01  3:09 ` [PATCH 1/6] extconf.rb: get rid of unnecessary checks Eric Wong
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Eric Wong @ 2021-10-01  3:09 UTC (permalink / raw)
  To: unicorn-public

1-5 are straightforward changes, with 4+5 are probably
being long overdue.

PATCH 6/6 is fairly significant (more detail in that message),
but doesn't benefit completely saturated instances, either.
I don't know if completely saturated instances are that common,
actually...

Yup, unicorn using epoll: WTF :P

Eric Wong (6):
  extconf.rb: get rid of unnecessary checks
  makefile: reduce unnecessary rebuilds
  HACKING: drop outdated information about pandoc
  http_server: get rid of Process.ppid check
  worker_loop: get rid of select() avoidance hack
  use EPOLLEXCLUSIVE on Linux 4.5+

 GNUmakefile                       |   4 +-
 HACKING                           |   7 --
 ext/unicorn_http/c_util.h         |  18 ++---
 ext/unicorn_http/epollexclusive.h | 125 ++++++++++++++++++++++++++++++
 ext/unicorn_http/ext_help.h       |  24 ------
 ext/unicorn_http/extconf.rb       |   6 +-
 ext/unicorn_http/httpdate.c       |   1 +
 ext/unicorn_http/unicorn_http.rl  |   3 +
 lib/unicorn/http_server.rb        |  45 +++++------
 lib/unicorn/select_waiter.rb      |   6 ++
 t/test-lib.sh                     |   3 +-
 test/unit/test_waiter.rb          |  34 ++++++++
 12 files changed, 199 insertions(+), 77 deletions(-)
 create mode 100644 ext/unicorn_http/epollexclusive.h
 create mode 100644 lib/unicorn/select_waiter.rb
 create mode 100644 test/unit/test_waiter.rb

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/6] extconf.rb: get rid of unnecessary checks
  2021-10-01  3:09 [PATCH 0/6] reduce thundering herds on Linux 4.5+ Eric Wong
@ 2021-10-01  3:09 ` Eric Wong
  2021-10-01  3:09 ` [PATCH 2/6] makefile: reduce unnecessary rebuilds Eric Wong
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-10-01  3:09 UTC (permalink / raw)
  To: unicorn-public

SIZEOF_*, *2NUM and NUM2* should all be defined by ruby.h and
dependencies it pulls in since Ruby 2.0 and possibly earlier.

INT_MAX and LLONG_MAX are in limits.h which is POSIX.

HAVE_GMTIME_R is already defined by ruby/config.h, so we
shouldn't have to check for it, either.

Combined, these changes speed up extconf.rb by several seconds.
---
 ext/unicorn_http/c_util.h   | 18 +++++-------------
 ext/unicorn_http/ext_help.h | 24 ------------------------
 ext/unicorn_http/extconf.rb |  5 -----
 ext/unicorn_http/httpdate.c |  1 +
 4 files changed, 6 insertions(+), 42 deletions(-)

diff --git a/ext/unicorn_http/c_util.h b/ext/unicorn_http/c_util.h
index ab1fc0e..5774615 100644
--- a/ext/unicorn_http/c_util.h
+++ b/ext/unicorn_http/c_util.h
@@ -8,23 +8,15 @@
 
 #include <unistd.h>
 #include <assert.h>
+#include <limits.h>
 
 #define MIN(a,b) (a < b ? a : b)
 #define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0]))
 
-#ifndef SIZEOF_OFF_T
-#  define SIZEOF_OFF_T 4
-#  warning SIZEOF_OFF_T not defined, guessing 4.  Did you run extconf.rb?
-#endif
-
-#if SIZEOF_OFF_T == 4
-#  define UH_OFF_T_MAX 0x7fffffff
-#elif SIZEOF_OFF_T == 8
-#  if SIZEOF_LONG == 4
-#    define UH_OFF_T_MAX 0x7fffffffffffffffLL
-#  else
-#    define UH_OFF_T_MAX 0x7fffffffffffffff
-#  endif
+#if SIZEOF_OFF_T == SIZEOF_INT
+#  define UH_OFF_T_MAX INT_MAX
+#elif SIZEOF_OFF_T == SIZEOF_LONG_LONG
+#  define UH_OFF_T_MAX LLONG_MAX
 #else
 #  error off_t size unknown for this platform!
 #endif /* SIZEOF_OFF_T check */
diff --git a/ext/unicorn_http/ext_help.h b/ext/unicorn_http/ext_help.h
index 747c36c..86a187e 100644
--- a/ext/unicorn_http/ext_help.h
+++ b/ext/unicorn_http/ext_help.h
@@ -8,30 +8,6 @@
 #  define assert_frozen(f) do {} while (0)
 #endif /* !defined(OBJ_FROZEN) */
 
-#if !defined(OFFT2NUM)
-#  if SIZEOF_OFF_T == SIZEOF_LONG
-#    define OFFT2NUM(n) LONG2NUM(n)
-#  else
-#    define OFFT2NUM(n) LL2NUM(n)
-#  endif
-#endif /* ! defined(OFFT2NUM) */
-
-#if !defined(SIZET2NUM)
-#  if SIZEOF_SIZE_T == SIZEOF_LONG
-#    define SIZET2NUM(n) ULONG2NUM(n)
-#  else
-#    define SIZET2NUM(n) ULL2NUM(n)
-#  endif
-#endif /* ! defined(SIZET2NUM) */
-
-#if !defined(NUM2SIZET)
-#  if SIZEOF_SIZE_T == SIZEOF_LONG
-#    define NUM2SIZET(n) ((size_t)NUM2ULONG(n))
-#  else
-#    define NUM2SIZET(n) ((size_t)NUM2ULL(n))
-#  endif
-#endif /* ! defined(NUM2SIZET) */
-
 static inline int str_cstr_eq(VALUE val, const char *ptr, long len)
 {
   return (RSTRING_LEN(val) == len && !memcmp(ptr, RSTRING_PTR(val), len));
diff --git a/ext/unicorn_http/extconf.rb b/ext/unicorn_http/extconf.rb
index 8bdc1c9..46070a7 100644
--- a/ext/unicorn_http/extconf.rb
+++ b/ext/unicorn_http/extconf.rb
@@ -6,12 +6,7 @@
        "It might not properly work with #{RUBY_VERSION}"
 end
 
-have_macro("SIZEOF_OFF_T", "ruby.h") or check_sizeof("off_t", "sys/types.h")
-have_macro("SIZEOF_SIZE_T", "ruby.h") or check_sizeof("size_t", "sys/types.h")
-have_macro("SIZEOF_LONG", "ruby.h") or check_sizeof("long", "sys/types.h")
-have_func("rb_str_set_len", "ruby.h") or abort 'Ruby 2.0+ required'
 have_func("rb_hash_clear", "ruby.h") or abort 'Ruby 2.0+ required'
-have_func("gmtime_r", "time.h")
 
 message('checking if String#-@ (str_uminus) dedupes... ')
 begin
diff --git a/ext/unicorn_http/httpdate.c b/ext/unicorn_http/httpdate.c
index b59d038..3f512dd 100644
--- a/ext/unicorn_http/httpdate.c
+++ b/ext/unicorn_http/httpdate.c
@@ -11,6 +11,7 @@ static const char months[] = "Jan\0Feb\0Mar\0Apr\0May\0Jun\0"
 
 /* for people on wonky systems only */
 #ifndef HAVE_GMTIME_R
+# warning using fake gmtime_r
 static struct tm * my_gmtime_r(time_t *now, struct tm *tm)
 {
 	struct tm *global = gmtime(now);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/6] makefile: reduce unnecessary rebuilds
  2021-10-01  3:09 [PATCH 0/6] reduce thundering herds on Linux 4.5+ Eric Wong
  2021-10-01  3:09 ` [PATCH 1/6] extconf.rb: get rid of unnecessary checks Eric Wong
@ 2021-10-01  3:09 ` Eric Wong
  2021-10-01  3:09 ` [PATCH 3/6] HACKING: drop outdated information about pandoc Eric Wong
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-10-01  3:09 UTC (permalink / raw)
  To: unicorn-public

For the most part, header changes shouldn't trigger extconf.rb
reruns.
---
 GNUmakefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/GNUmakefile b/GNUmakefile
index dd0a761..0e08ef0 100644
--- a/GNUmakefile
+++ b/GNUmakefile
@@ -42,13 +42,13 @@ ext_h := $(wildcard $(ext)/*/*.h $(ext)/*.h)
 ext_src := $(sort $(wildcard $(ext)/*.c) $(ext_h) $(ext)/unicorn_http.c)
 ext_pfx_src := $(addprefix $(ext_pfx)/,$(ext_src))
 ext_dir := $(ext_pfx)/$(ext)
-$(ext)/extconf.rb: $(wildcard $(ext)/*.h)
+$(ext)/extconf.rb:
 	@>>$@
 $(ext_dir) $(tmp_bin) man/man1 doc/man1 pkg t/trash:
 	@mkdir -p $@
 $(ext_pfx)/$(ext)/%: $(ext)/% | $(ext_dir)
 	$(INSTALL) -m 644 $< $@
-$(ext_pfx)/$(ext)/Makefile: $(ext)/extconf.rb $(ext_h) | $(ext_dir)
+$(ext_pfx)/$(ext)/Makefile: $(ext)/extconf.rb | $(ext_dir)
 	$(RM) -f $(@D)/*.o
 	cd $(@D) && $(RUBY) $(CURDIR)/$(ext)/extconf.rb $(EXTCONF_ARGS)
 ext_sfx := _ext.$(DLEXT)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 3/6] HACKING: drop outdated information about pandoc
  2021-10-01  3:09 [PATCH 0/6] reduce thundering herds on Linux 4.5+ Eric Wong
  2021-10-01  3:09 ` [PATCH 1/6] extconf.rb: get rid of unnecessary checks Eric Wong
  2021-10-01  3:09 ` [PATCH 2/6] makefile: reduce unnecessary rebuilds Eric Wong
@ 2021-10-01  3:09 ` Eric Wong
  2021-10-01  3:09 ` [PATCH 4/6] http_server: get rid of Process.ppid check Eric Wong
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-10-01  3:09 UTC (permalink / raw)
  To: unicorn-public

It's been outdated since d9b5943af26d5df5 (doc: replace
pandoc-"Markdown" with real manpages, 2019-12-15)
---
 HACKING | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/HACKING b/HACKING
index 7c41eba..020209e 100644
--- a/HACKING
+++ b/HACKING
@@ -50,9 +50,6 @@ programming experience will come in handy (or be learned) here.
 
 === Documentation
 
-Due to the lack of RDoc-to-manpage converters we know about, we're
-writing manpages in Markdown and converting to troff/HTML with Pandoc.
-
 Please wrap documentation at 72 characters-per-line or less (long URLs
 are exempt) so it is comfortably readable from terminals.
 
@@ -102,10 +99,6 @@ don't email the git mailing list or maintainer with Unicorn patches :)
 
 == Building a Gem
 
-In order to build the gem, you must install the following components:
-
- * pandoc
-
 You can build the Unicorn gem with the following command:
 
   gmake gem

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 4/6] http_server: get rid of Process.ppid check
  2021-10-01  3:09 [PATCH 0/6] reduce thundering herds on Linux 4.5+ Eric Wong
                   ` (2 preceding siblings ...)
  2021-10-01  3:09 ` [PATCH 3/6] HACKING: drop outdated information about pandoc Eric Wong
@ 2021-10-01  3:09 ` Eric Wong
  2021-10-01  3:09 ` [PATCH 5/6] worker_loop: get rid of select() avoidance hack Eric Wong
  2021-10-01  3:09 ` [PATCH 6/6] use EPOLLEXCLUSIVE on Linux 4.5+ Eric Wong
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-10-01  3:09 UTC (permalink / raw)
  To: unicorn-public

It's actually been unnecessary since commit 6f6e4115b4bb03e5
(rework master-to-worker signaling to use a pipe, 2013-12-09)
---
 lib/unicorn/http_server.rb | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index cd6e63b..09c5ec2 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -708,7 +708,6 @@ def reopen_worker_logs(worker_nr)
   # for connections and doesn't die until the parent dies (or is
   # given a INT, QUIT, or TERM signal)
   def worker_loop(worker)
-    ppid = @master_pid
     readers = init_worker_process(worker)
     nr = 0 # this becomes negative if we need to reopen logs
 
@@ -745,8 +744,6 @@ def worker_loop(worker)
         redo
       end
 
-      ppid == Process.ppid or return
-
       # timeout used so we can detect parent death:
       worker.tick = time_now.to_i
       ret = IO.select(readers, nil, nil, @timeout) and ready = ret[0]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 5/6] worker_loop: get rid of select() avoidance hack
  2021-10-01  3:09 [PATCH 0/6] reduce thundering herds on Linux 4.5+ Eric Wong
                   ` (3 preceding siblings ...)
  2021-10-01  3:09 ` [PATCH 4/6] http_server: get rid of Process.ppid check Eric Wong
@ 2021-10-01  3:09 ` Eric Wong
  2021-10-01  3:09 ` [PATCH 6/6] use EPOLLEXCLUSIVE on Linux 4.5+ Eric Wong
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-10-01  3:09 UTC (permalink / raw)
  To: unicorn-public

It doesn't seem to do anything since commit 221340c4ebc15666
(prevent single listener from monopolizing a worker, 2020-04-16).
---
 lib/unicorn/http_server.rb | 25 +++++++------------------
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 09c5ec2..7f33f98 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -699,6 +699,7 @@ def reopen_worker_logs(worker_nr)
     logger.info "worker=#{worker_nr} reopening logs..."
     Unicorn::Util.reopen_logs
     logger.info "worker=#{worker_nr} done reopening logs"
+    false
   rescue => e
     logger.error(e) rescue nil
     exit!(77) # EX_NOPERM in sysexits.h
@@ -709,19 +710,17 @@ def reopen_worker_logs(worker_nr)
   # given a INT, QUIT, or TERM signal)
   def worker_loop(worker)
     readers = init_worker_process(worker)
-    nr = 0 # this becomes negative if we need to reopen logs
+    reopen = false
 
     # this only works immediately if the master sent us the signal
     # (which is the normal case)
-    trap(:USR1) { nr = -65536 }
+    trap(:USR1) { reopen = true }
 
     ready = readers.dup
-    nr_listeners = readers.size
     @after_worker_ready.call(self, worker)
 
     begin
-      nr < 0 and reopen_worker_logs(worker.nr)
-      nr = 0
+      reopen = reopen_worker_logs(worker.nr) if reopen
       worker.tick = time_now.to_i
       tmp = ready.dup
       while sock = tmp.shift
@@ -729,26 +728,16 @@ def worker_loop(worker)
         # but that will return false
         if client = sock.kgio_tryaccept
           process_client(client)
-          nr += 1
           worker.tick = time_now.to_i
         end
-        break if nr < 0
+        break if reopen
       end
 
-      # make the following bet: if we accepted clients this round,
-      # we're probably reasonably busy, so avoid calling select()
-      # and do a speculative non-blocking accept() on ready listeners
-      # before we sleep again in select().
-      if nr == nr_listeners
-        tmp = ready.dup
-        redo
-      end
-
-      # timeout used so we can detect parent death:
+      # timeout so we can .tick and keep parent from SIGKILL-ing us
       worker.tick = time_now.to_i
       ret = IO.select(readers, nil, nil, @timeout) and ready = ret[0]
     rescue => e
-      redo if nr < 0 && readers[0]
+      redo if reopen && readers[0]
       Unicorn.log_error(@logger, "listen loop error", e) if readers[0]
     end while readers[0]
   end

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 6/6] use EPOLLEXCLUSIVE on Linux 4.5+
  2021-10-01  3:09 [PATCH 0/6] reduce thundering herds on Linux 4.5+ Eric Wong
                   ` (4 preceding siblings ...)
  2021-10-01  3:09 ` [PATCH 5/6] worker_loop: get rid of select() avoidance hack Eric Wong
@ 2021-10-01  3:09 ` Eric Wong
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-10-01  3:09 UTC (permalink / raw)
  To: unicorn-public

While the capabilities of epoll cannot be fully exploited given
our primitive design; avoiding thundering herd wakeups on larger
SMP machines while below 100% utilization is possible with
Linux 4.5+.

With this change, only one worker wakes up per-connect(2)
(instead of all of them via select(2)), avoiding the thundering
herd effect when the system is mostly idle.

Saturated instances should not notice the difference if they
rarely had multiple workers sleeping in select(2).  This change
benefits non-saturated instances.

With 2 parallel clients and 8 workers on a nominally (:P)
8-core CPU (AMD FX-8320), the uconnect.perl test script
invocation showed a reduction from ~3.4s to ~2.5s when
reading an 11-byte response body:

  echo worker_processes 8 >u.conf.rb
  bs=11 ruby -I lib -I test/ruby-2.5.5/ext/unicorn_http/ bin/unicorn \
    test/benchmark/dd.ru -E none -l /tmp/u.sock -c u.conf.rb
  time perl -I lib -w test/benchmark/uconnect.perl \
    -n 100000 -c 2 /tmp/u.sock

Times improve less as "-c" increases for uconnect.perl (system
noise and timings are inconsistent).  The benefit of this change
should be more noticeable on systems with more workers (and
more cores).

I wanted to use EPOLLET (Edge-Triggered) to further reduce
syscalls, here, (similar to the old select()-avoidance bet) but
that would've either added too much complexity to deduplicate
wakeup sources, or run into the same starvation problem we
solved in April 2020[1].

Since the kernel already has the complexity and deduplication
built-in for Level-Triggered epoll support, we'll just let the
kernel deal with it.

Note: do NOT take this as an example of how epoll should be used
in a sophisticated server.  unicorn is primitive by design and
cannot use threads nor handle multiple clients at once, thus it
it only uses epoll in this extremely limited manner.

Linux 4.5+ users will notice a regression of one extra epoll FD
per-worker and at least two epoll watches, so
/proc/sys/fs/epoll/max_user_watches may need to be changed along
with RLIMIT_NOFILE.

This change has also been tested on Linux 3.10.x (CentOS 7.x)
and FreeBSD 11.x to ensure compatibility with systems without
EPOLLEXCLUSIVE.

Various EPOLLEXCLUSIVE discussions over the years:
  https://yhbt.net/lore/lkml/?q=s:EPOLLEXCLUSIVE+d:..20211001&x=t&o=-1

[1] https://yhbt.net/unicorn-public/CAMBWrQ=Yh42MPtzJCEO7XryVknDNetRMuA87irWfqVuLdJmiBQ@mail.gmail.com/
---
 ext/unicorn_http/epollexclusive.h | 125 ++++++++++++++++++++++++++++++
 ext/unicorn_http/extconf.rb       |   1 +
 ext/unicorn_http/unicorn_http.rl  |   3 +
 lib/unicorn/http_server.rb        |  17 +++-
 lib/unicorn/select_waiter.rb      |   6 ++
 t/test-lib.sh                     |   3 +-
 test/unit/test_waiter.rb          |  34 ++++++++
 7 files changed, 184 insertions(+), 5 deletions(-)
 create mode 100644 ext/unicorn_http/epollexclusive.h
 create mode 100644 lib/unicorn/select_waiter.rb
 create mode 100644 test/unit/test_waiter.rb

diff --git a/ext/unicorn_http/epollexclusive.h b/ext/unicorn_http/epollexclusive.h
new file mode 100644
index 0000000..2d2a589
--- /dev/null
+++ b/ext/unicorn_http/epollexclusive.h
@@ -0,0 +1,125 @@
+/*
+ * This is only intended for use inside a unicorn worker, nowhere else.
+ * EPOLLEXCLUSIVE somewhat mitigates the thundering herd problem for
+ * mostly idle processes since we can't use blocking accept4.
+ * This is NOT intended for use with multi-threaded servers, nor
+ * single-threaded multi-client ("C10K") servers or anything advanced
+ * like that.  This use of epoll is only appropriate for a primitive,
+ * single-client, single-threaded servers like unicorn that need to
+ * support SIGKILL timeouts and parent death detection.
+ */
+#if defined(HAVE_EPOLL_CREATE1)
+#  include <sys/epoll.h>
+#  include <errno.h>
+#  include <ruby/io.h>
+#  include <ruby/thread.h>
+#endif /* __linux__ */
+
+#if defined(EPOLLEXCLUSIVE) && defined(HAVE_EPOLL_CREATE1)
+#  define USE_EPOLL (1)
+#else
+#  define USE_EPOLL (0)
+#endif
+
+#if USE_EPOLL
+/*
+ * :nodoc:
+ * returns IO object if EPOLLEXCLUSIVE works and arms readers
+ */
+static VALUE prep_readers(VALUE cls, VALUE readers)
+{
+	long i;
+	int epfd = epoll_create1(EPOLL_CLOEXEC);
+	VALUE epio;
+
+	if (epfd < 0) rb_sys_fail("epoll_create1");
+
+	epio = rb_funcall(cls, rb_intern("for_fd"), 1, INT2NUM(epfd));
+
+	Check_Type(readers, T_ARRAY);
+	for (i = 0; i < RARRAY_LEN(readers); i++) {
+		int rc;
+		struct epoll_event e;
+		rb_io_t *fptr;
+		VALUE io = rb_ary_entry(readers, i);
+
+		e.data.u64 = i; /* the reason readers shouldn't change */
+
+		/*
+		 * I wanted to use EPOLLET here, but maintaining our own
+		 * equivalent of ep->rdllist in Ruby-space doesn't fit
+		 * our design at all (and the kernel already has it's own
+		 * code path for doing it).  So let the kernel spend
+		 * cycles on maintaining level-triggering.
+		 */
+		e.events = EPOLLEXCLUSIVE | EPOLLIN;
+		io = rb_io_get_io(io);
+		GetOpenFile(io, fptr);
+		rc = epoll_ctl(epfd, EPOLL_CTL_ADD, fptr->fd, &e);
+		if (rc < 0) rb_sys_fail("epoll_ctl");
+	}
+	return epio;
+}
+#endif /* USE_EPOLL */
+
+#if USE_EPOLL
+struct ep_wait {
+	struct epoll_event *events;
+	rb_io_t *fptr;
+	int maxevents;
+	int timeout_msec;
+};
+
+static void *do_wait(void *ptr) /* runs w/o GVL */
+{
+	struct ep_wait *epw = ptr;
+
+	return (void *)(long)epoll_wait(epw->fptr->fd, epw->events,
+				epw->maxevents, epw->timeout_msec);
+}
+
+/* :nodoc: */
+/* readers must not change between prepare_readers and get_readers */
+static VALUE
+get_readers(VALUE epio, VALUE ready, VALUE readers, VALUE timeout_msec)
+{
+	struct ep_wait epw;
+	long i, n;
+	VALUE buf;
+
+	Check_Type(ready, T_ARRAY);
+	Check_Type(readers, T_ARRAY);
+	epw.maxevents = RARRAY_LENINT(readers);
+	buf = rb_str_buf_new(sizeof(struct epoll_event) * epw.maxevents);
+	epw.events = (struct epoll_event *)RSTRING_PTR(buf);
+	epio = rb_io_get_io(epio);
+	GetOpenFile(epio, epw.fptr);
+
+	epw.timeout_msec = NUM2INT(timeout_msec);
+	n = (long)rb_thread_call_without_gvl(do_wait, &epw, RUBY_UBF_IO, NULL);
+	if (n < 0) {
+		if (errno != EINTR) rb_sys_fail("epoll_wait");
+		n = 0;
+	}
+	/* Linux delivers events in order received */
+	for (i = 0; i < n; i++) {
+		struct epoll_event *ev = &epw.events[i];
+		VALUE obj = rb_ary_entry(readers, ev->data.u64);
+
+		if (RTEST(obj))
+			rb_ary_push(ready, obj);
+	}
+	rb_str_resize(buf, 0);
+	rb_gc_force_recycle(buf);
+	return Qfalse;
+}
+#endif /* USE_EPOLL */
+
+static void init_epollexclusive(VALUE mUnicorn)
+{
+#if USE_EPOLL
+	VALUE cWaiter = rb_define_class_under(mUnicorn, "Waiter", rb_cIO);
+	rb_define_singleton_method(cWaiter, "prep_readers", prep_readers, 1);
+	rb_define_method(cWaiter, "get_readers", get_readers, 3);
+#endif
+}
diff --git a/ext/unicorn_http/extconf.rb b/ext/unicorn_http/extconf.rb
index 46070a7..13dec41 100644
--- a/ext/unicorn_http/extconf.rb
+++ b/ext/unicorn_http/extconf.rb
@@ -38,4 +38,5 @@
   message("no, needs Ruby 2.6+\n")
 end
 
+have_func('epoll_create1', %w(sys/epoll.h))
 create_makefile("unicorn_http")
diff --git a/ext/unicorn_http/unicorn_http.rl b/ext/unicorn_http/unicorn_http.rl
index e934a32..605b23f 100644
--- a/ext/unicorn_http/unicorn_http.rl
+++ b/ext/unicorn_http/unicorn_http.rl
@@ -12,6 +12,7 @@
 #include "common_field_optimization.h"
 #include "global_variables.h"
 #include "c_util.h"
+#include "epollexclusive.h"
 
 void init_unicorn_httpdate(void);
 
@@ -1017,5 +1018,7 @@ void Init_unicorn_http(void)
   id_clear = rb_intern("clear");
 #endif
   id_is_chunked_p = rb_intern("is_chunked?");
+
+  init_epollexclusive(mUnicorn);
 }
 #undef SET_GLOBAL
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 7f33f98..21f2a05 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -685,7 +685,6 @@ def init_worker_process(worker)
     LISTENERS.each { |sock| sock.close_on_exec = true }
 
     worker.user(*user) if user.kind_of?(Array) && ! worker.switched
-    self.timeout /= 2.0 # halve it for select()
     @config = nil
     build_app! unless preload_app
     @after_fork = @listener_opts = @orig_app = nil
@@ -705,11 +704,22 @@ def reopen_worker_logs(worker_nr)
     exit!(77) # EX_NOPERM in sysexits.h
   end
 
+  def prep_readers(readers)
+    wtr = Unicorn::Waiter.prep_readers(readers)
+    @timeout *= 500 # to milliseconds for epoll, but halved
+    wtr
+  rescue
+    require_relative 'select_waiter'
+    @timeout /= 2.0 # halved for IO.select
+    Unicorn::SelectWaiter.new
+  end
+
   # runs inside each forked worker, this sits around and waits
   # for connections and doesn't die until the parent dies (or is
   # given a INT, QUIT, or TERM signal)
   def worker_loop(worker)
     readers = init_worker_process(worker)
+    waiter = prep_readers(readers)
     reopen = false
 
     # this only works immediately if the master sent us the signal
@@ -722,8 +732,7 @@ def worker_loop(worker)
     begin
       reopen = reopen_worker_logs(worker.nr) if reopen
       worker.tick = time_now.to_i
-      tmp = ready.dup
-      while sock = tmp.shift
+      while sock = ready.shift
         # Unicorn::Worker#kgio_tryaccept is not like accept(2) at all,
         # but that will return false
         if client = sock.kgio_tryaccept
@@ -735,7 +744,7 @@ def worker_loop(worker)
 
       # timeout so we can .tick and keep parent from SIGKILL-ing us
       worker.tick = time_now.to_i
-      ret = IO.select(readers, nil, nil, @timeout) and ready = ret[0]
+      waiter.get_readers(ready, readers, @timeout)
     rescue => e
       redo if reopen && readers[0]
       Unicorn.log_error(@logger, "listen loop error", e) if readers[0]
diff --git a/lib/unicorn/select_waiter.rb b/lib/unicorn/select_waiter.rb
new file mode 100644
index 0000000..cb84aab
--- /dev/null
+++ b/lib/unicorn/select_waiter.rb
@@ -0,0 +1,6 @@
+# fallback for non-Linux and Linux <4.5 systems w/o EPOLLEXCLUSIVE
+class Unicorn::SelectWaiter # :nodoc:
+  def get_readers(ready, readers, timeout) # :nodoc:
+    ret = IO.select(readers, nil, nil, timeout) and ready.replace(ret[0])
+  end
+end
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 7f97958..e70d0c6 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -94,7 +94,8 @@ check_stderr () {
 	set +u
 	_r_err=${1-${r_err}}
 	set -u
-	if grep -v $T $_r_err | grep -i Error
+	if grep -v $T $_r_err | grep -i Error | \
+		grep -v NameError.*Unicorn::Waiter
 	then
 		die "Errors found in $_r_err"
 	elif grep SIGKILL $_r_err
diff --git a/test/unit/test_waiter.rb b/test/unit/test_waiter.rb
new file mode 100644
index 0000000..0995de2
--- /dev/null
+++ b/test/unit/test_waiter.rb
@@ -0,0 +1,34 @@
+require 'test/unit'
+require 'unicorn'
+require 'unicorn/select_waiter'
+class TestSelectWaiter < Test::Unit::TestCase
+
+  def test_select_timeout # n.b. this is level-triggered
+    sw = Unicorn::SelectWaiter.new
+    IO.pipe do |r,w|
+      sw.get_readers(ready = [], [r], 0)
+      assert_equal [], ready
+      w.syswrite '.'
+      sw.get_readers(ready, [r], 1000)
+      assert_equal [r], ready
+      sw.get_readers(ready, [r], 0)
+      assert_equal [r], ready
+    end
+  end
+
+  def test_linux # ugh, also level-triggered, unlikely to change
+    IO.pipe do |r,w|
+      wtr = Unicorn::Waiter.prep_readers([r])
+      wtr.get_readers(ready = [], [r], 0)
+      assert_equal [], ready
+      w.syswrite '.'
+      wtr.get_readers(ready = [], [r], 1000)
+      assert_equal [r], ready
+      wtr.get_readers(ready = [], [r], 1000)
+      assert_equal [r], ready, 'still ready (level-triggered :<)'
+      assert_nil wtr.close
+    end
+  rescue SystemCallError => e
+    warn "#{e.message} (#{e.class})"
+  end if Unicorn.const_defined?(:Waiter)
+end

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-10-01  3:09 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-01  3:09 [PATCH 0/6] reduce thundering herds on Linux 4.5+ Eric Wong
2021-10-01  3:09 ` [PATCH 1/6] extconf.rb: get rid of unnecessary checks Eric Wong
2021-10-01  3:09 ` [PATCH 2/6] makefile: reduce unnecessary rebuilds Eric Wong
2021-10-01  3:09 ` [PATCH 3/6] HACKING: drop outdated information about pandoc Eric Wong
2021-10-01  3:09 ` [PATCH 4/6] http_server: get rid of Process.ppid check Eric Wong
2021-10-01  3:09 ` [PATCH 5/6] worker_loop: get rid of select() avoidance hack Eric Wong
2021-10-01  3:09 ` [PATCH 6/6] use EPOLLEXCLUSIVE on Linux 4.5+ Eric Wong

Code repositories for project(s) associated with this inbox:

	../../../unicorn.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).