From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00 shortcircuit=no autolearn=unavailable autolearn_force=no version=3.4.0 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id EB4C920092 for ; Tue, 7 Jun 2016 07:39:10 +0000 (UTC) From: Eric Wong To: yahns-public@yhbt.net Subject: [PATCH 3/5] cleanup graceful shutdown handling Date: Tue, 7 Jun 2016 07:39:06 +0000 Message-Id: <20160607073908.31035-4-e@80x24.org> In-Reply-To: <20160607073908.31035-1-e@80x24.org> References: <20160607073908.31035-1-e@80x24.org> List-Id: Using a 10ms tick was too little, use 100ms instead to avoid burning CPU. Ideally, we would not tick at all during shutdown (we're normally tickless); but the common case could be slightly more expensive; and shutdowns are rare (I hope). Then, change our process title to indicate we're shutting down, and finally, cut down on repeated log spew during shutdown and only log dropping changes. This mean we could potentially take 90ms longer to notice when we can do a graceful shutdown, but oh well... While we're at it, add a test to ensure graceful shutdowns work as intended with multiple processes. --- lib/yahns/fdmap.rb | 11 +++++++---- lib/yahns/server.rb | 3 ++- lib/yahns/server_mp.rb | 2 +- test/test_server.rb | 35 +++++++++++++++++++++++++++++++++++ 4 files changed, 45 insertions(+), 6 deletions(-) diff --git a/lib/yahns/fdmap.rb b/lib/yahns/fdmap.rb index fab5d36..0aaf360 100644 --- a/lib/yahns/fdmap.rb +++ b/lib/yahns/fdmap.rb @@ -89,10 +89,10 @@ def forget(io) # We should not be calling this too frequently, it is expensive # This is called while @fdmap_mtx is held def __expire(timeout) - return if @count == 0 + return 0 if @count == 0 nr = 0 now = Yahns.now - (now - @last_expire) >= 1.0 or return # don't expire too frequently + (now - @last_expire) >= 1.0 or return @count # don't expire too frequently # @fdmap_ary may be huge, so always expire a bunch at once to # avoid getting to this method too frequently @@ -104,8 +104,11 @@ def __expire(timeout) end @last_expire = Yahns.now - msg = timeout ? "timeout=#{timeout}" : "client_timeout" - @logger.info("dropping #{nr} of #@count clients for #{msg}") + if nr != 0 + msg = timeout ? "timeout=#{timeout}" : "client_timeout" + @logger.info("dropping #{nr} of #@count clients for #{msg}") + end + @count end # used for graceful shutdown diff --git a/lib/yahns/server.rb b/lib/yahns/server.rb index ba2066b..00e5f15 100644 --- a/lib/yahns/server.rb +++ b/lib/yahns/server.rb @@ -496,7 +496,8 @@ def sp_sig_handle(alive) def dropping(fdmap) if drop_acceptors[0] || fdmap.size > 0 timeout = @shutdown_expire < Yahns.now ? -1 : @shutdown_timeout - fdmap.desperate_expire(timeout) + n = fdmap.desperate_expire(timeout) + $0 = "yahns quitting, #{n} FD(s) remain" true else false diff --git a/lib/yahns/server_mp.rb b/lib/yahns/server_mp.rb index fa12a0c..c9cd207 100644 --- a/lib/yahns/server_mp.rb +++ b/lib/yahns/server_mp.rb @@ -159,7 +159,7 @@ def run_mp_worker(worker) def mp_sig_handle(watch, alive) # not performance critical watch.delete_if { |io| io.to_io.closed? } - if r = IO.select(watch, nil, nil, alive ? nil : 0.01) + if r = IO.select(watch, nil, nil, alive ? nil : 0.1) r[0].each(&:yahns_step) end case @sig_queue.shift diff --git a/test/test_server.rb b/test/test_server.rb index 9f33b42..c6d70cb 100644 --- a/test/test_server.rb +++ b/test/test_server.rb @@ -725,6 +725,41 @@ def _persistent_shutdown(nr_workers) assert_nil c.read(666) end + def test_slow_shutdown_timeout; _slow_shutdown(nil); end + def test_slow_shutdown_timeout_mp; _slow_shutdown(1); end + + def _slow_shutdown(nr_workers) + err, cfg, host, port = @err, Yahns::Config.new, @srv.addr[3], @srv.addr[1] + pid = mkserver(cfg) do + ru = lambda { |e| [ 200, {'Content-Length'=>'2'}, %w(OK) ] } + cfg.instance_eval do + app(:rack, ru) { listen "#{host}:#{port}" } + stderr_path err.path + worker_processes(nr_workers) if nr_workers + end + end + c = get_tcp_client(host, port) + c.write 'G' + 100000.times { Thread.pass } + Process.kill(:QUIT, pid) + "ET / HTTP/1.1\r\nHost: example.com\r\n\r\n".each_byte do |x| + Thread.pass + c.write(x.chr) + Thread.pass + end + assert_equal c, c.wait(30) + buf = ''.dup + re = /\r\n\r\nOK\z/ + Timeout.timeout(30) do + begin + buf << c.readpartial(666) + end until re =~ buf + end + c.close + _, status = Timeout.timeout(5) { Process.waitpid2(pid) } + assert status.success?, status.inspect + end + def test_before_exec err, cfg, host, port = @err, Yahns::Config.new, @srv.addr[3], @srv.addr[1] ru = lambda { |e| [ 200, {'Content-Length'=>'2' }, %w(OK) ] }