From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 7530C1F9F4
	for ; Sat, 9 Oct 2021 02:24:47 +0000 (UTC)
From: Eric Wong
To: yahns-public@yhbt.net
Subject: [PATCH 2/3] server: workaround Linux v5.5..v5.13 epoll bug
Date: Sat, 9 Oct 2021 02:24:45 +0000
Message-Id: <20211009022446.705-3-e@80x24.org>
In-Reply-To: <20211009022446.705-1-e@80x24.org>
References: <20211009022446.705-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

epoll_wait() wakeups from QueueQuitter got lost during graceful
shutdown since there are multiple worker threads operating off the
same FD.  Work around the problem by re-arming the eventfd for
every worker thread reaped.

Link: https://yhbt.net/lore/lkml/20210405231025.33829-1-dave@stgolabs.net/
---
 lib/yahns/queue_epoll.rb |  4 ++++
 lib/yahns/server.rb      | 17 ++++++++++-------
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/lib/yahns/queue_epoll.rb b/lib/yahns/queue_epoll.rb
index 9e4271a..a198fbf 100644
--- a/lib/yahns/queue_epoll.rb
+++ b/lib/yahns/queue_epoll.rb
@@ -32,6 +32,10 @@ def queue_mod(io, flags)
     epoll_ctl(Epoll::CTL_MOD, io, flags)
   end
 
+  def queue_del(io)
+    epoll_ctl(Epoll::CTL_DEL, io, 0)
+  end
+
   def thr_init
     Thread.current[:yahns_rbuf] = ''.dup
     Thread.current[:yahns_fdmap] = @fdmap
diff --git a/lib/yahns/server.rb b/lib/yahns/server.rb
index 208b5ee..74eeb7e 100644
--- a/lib/yahns/server.rb
+++ b/lib/yahns/server.rb
@@ -438,25 +438,28 @@ def quit_enter(alive)
   # This just injects the QueueQuitter object which acts like a
   # monkey wrench thrown into a perfectly good engine :)
   def quit_finish
-    quitter = Yahns::QueueQuitter.new
+    # we must not let quitters get GC-ed if we have any worker threads leftover
+    @quitter = Yahns::QueueQuitter.new
 
     # throw the monkey wrench into the worker threads
-    @queues.each { |q| q.queue_add(quitter, Yahns::Queue::QEV_QUIT) }
+    @queues.each { |q| q.queue_add(@quitter, Yahns::Queue::QEV_QUIT) }
 
     # watch the monkey wrench destroy all the threads!
     # Ugh, this may fail if we have dedicated threads trickling
     # response bodies out (e.g. "tail -F")  Oh well, have a timeout
     begin
       @wthr.delete_if { |t| t.join(0.01) }
+      # Workaround Linux 5.5+ bug (fixed in 5.13+)
+      # https://yhbt.net/lore/lkml/20210405231025.33829-1-dave@stgolabs.net/
+      @wthr[0] && @queues[0].respond_to?(:queue_del) and @queues.each do |q|
+        q.queue_del(@quitter)
+        q.queue_add(@quitter, Yahns::Queue::QEV_QUIT)
+      end
     end while @wthr[0] && Yahns.now <= @shutdown_expire
 
     # cleanup, our job is done
     @queues.each(&:close).clear
-
-    # we must not let quitter get GC-ed if we have any worker threads leftover
-    @quitter = quitter
-
-    quitter.close
+    @quitter.close # keep object around in case @wthr isn't empty
   rescue => e
     Yahns::Log.exception(@logger, "quit finish", e)
   ensure
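
For readers following the quit_finish change, the re-arm step can be
sketched on its own, outside of yahns.  This is only an illustration
under stated assumptions: FakeQueue, rearm_quitter and the QEV_QUIT
placeholder below are hypothetical names, not part of yahns, and
FakeQueue merely records calls instead of talking to epoll.

```ruby
# Hypothetical stand-in for an epoll-backed queue; it only records
# the calls it receives so the ordering can be inspected.
class FakeQueue
  attr_reader :log

  def initialize
    @log = []
  end

  def queue_add(io, flags)
    @log << [:add, io, flags]
  end

  def queue_del(io)
    @log << [:del, io]
  end
end

QEV_QUIT = :quit # placeholder for Yahns::Queue::QEV_QUIT (assumption)

# Remove and re-register the quitter on every queue, but only while
# worker threads remain and the queue type supports queue_del.
def rearm_quitter(queues, quitter, workers_alive)
  return unless workers_alive
  return unless queues[0].respond_to?(:queue_del)

  queues.each do |q|
    q.queue_del(quitter)
    q.queue_add(quitter, QEV_QUIT)
  end
end

queues = [FakeQueue.new, FakeQueue.new]
quitter = Object.new
rearm_quitter(queues, quitter, true)
```

The respond_to? guard mirrors the one in the patch, so a queue class
without queue_del is left untouched.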