From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS3215 2.6.0.0/16 X-Spam-Status: No, score=-3.2 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from mail-io1-xd43.google.com (mail-io1-xd43.google.com [IPv6:2607:f8b0:4864:20::d43]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id 0FD6B1F751; Thu, 16 Apr 2020 05:47:11 +0000 (UTC) Received: by mail-io1-xd43.google.com with SMTP id i3so19745772ioo.13; Wed, 15 Apr 2020 22:47:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=LxavVuB9E7Fvd+rrhujAOIn4hrg6OLnrLKC3bgaeN30=; b=cCCVj/kXXnEX+PiiozM4aDAigly4LUMU0/3pNQ9AeBp2uZPPiVHLF52FrillVaR11L EVrQro4kM/vj54BRjnydVg4hyaBimPUjkejYZ2/Ocm20ka7TTHwGPF7WZONqjU1PsLbT 8GGIGQGwFcauSiZVNQkoX9x37yq6pug+4ypGCAtAhnue1y8Y+EXkEDiWpzwwVEDDIK1M vER+12GeBTt0UlKhI3SDQGzCvWUE10DOLrnBWKj2D8VwXhFzQSGi9GlJ9rNQZyKwZxK/ xkFHFnXXbVvQLHs4jwjOkcDAty2VoQLLQGBT9Zh3Fknjl02SOWW9HA/830M3fPW3x6H0 tDfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=LxavVuB9E7Fvd+rrhujAOIn4hrg6OLnrLKC3bgaeN30=; b=XXqFWOQnZ75msH4Q4FCSPfOPsZDSu/UGA28pcuiVY0dX5poT2Kg4pUuMwhYoArX9VU pP9gDdBt+70rDx+fnEOQs5395EfARbrukGmbq6vos9Buh6x7vAjHSORRIctqAIBn1FQO ob2TSDRxvujf3PZael8h/bRf0WtUWMjf7Vi0lECgOmbilIStpg9y+6Y9tYqd9nsnahjq y4taPaDT+sf6QPIENyH4/XCMaf2KZraql75Nnx48ZKWc0zIKUjHc+FQqoL2M+MZQcWny +MUzOXXZRm20hlXTBF79NTEas76KEFvRFGCe5xLNHnM1wyCezFYbz1zXN7EcJ6fO9Ulw tP6Q== X-Gm-Message-State: AGi0Pub3Qu2/I9pAIn7GKlRgEVm94xfXaG/93wD95PtkzhvMQ+N3+pRk KM51WuArReM97bpMcGnOz+nKEdMi5NbBe4g+XHRs++HfoXM= X-Google-Smtp-Source: APiQypKrn28Eq6QkIbUX/FpaxsQmq5V78xGnh15CPhc6YQCCfVwxFfesuDWm07q7CC4W+pLQvHre4nrW7zMmIhlXObk= X-Received: by 2002:a02:84ee:: with SMTP id f101mr13423984jai.95.1587016029222; Wed, 15 Apr 2020 22:47:09 -0700 (PDT) MIME-Version: 1.0 References: <20200415052623.GA24409@dcvr> In-Reply-To: <20200415052623.GA24409@dcvr> From: Stan Hu Date: Wed, 15 Apr 2020 22:46:58 -0700 Message-ID: Subject: Re: Sustained queuing on one listener can block requests from other listeners To: Eric Wong Cc: unicorn-public@yhbt.net Content-Type: text/plain; charset="UTF-8" List-Id: Thanks, Eric. That patch didn't work; it spun the CPU. I think this worked? diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb index a52931a..aaa4955 100644 --- a/lib/unicorn/http_server.rb +++ b/lib/unicorn/http_server.rb @@ -708,7 +708,7 @@ def worker_loop(worker) # we're probably reasonably busy, so avoid calling select() # and do a speculative non-blocking accept() on ready listeners # before we sleep again in select(). - unless nr == 0 + if nr == readers.size tmp = ready.dup redo end On Tue, Apr 14, 2020 at 10:26 PM Eric Wong wrote: > > Stan Hu wrote: > > My unicorn.rb has two listeners: > > > > listen "127.0.0.1:8080", :tcp_nopush => false > > listen "/var/run/unicorn.socket", :backlog => 1024 > > Fwiw, lowering :backlog may make sense if you got other > hosts/instances. More below.. > > > We found that because of the greedy attempt to accept new connections > > before calling select() in > > https://github.com/defunkt/unicorn/blob/981f561a726bb4307d01e4a09a308edba8d69fe3/lib/unicorn/http_server.rb#L707-L714, > > listeners on another socket stall out until the first listener is > > drained. We would expect Unicorn to round-robin between the two > > listeners, but that doesn't happen as long as there is work to be done > > for the first listener. We've verified that deleting that `redo` block > > fixes the problem. > > > > What do you think about the various options? > > > > 1. Only running that redo block if there is one listener > > That seems reasonable, or if ready.size == nr_listeners > (proposed patch below) > > > 2. Removing the redo block entirely > > From what I recall ages ago, select() entry cost is pretty high > and I remember that redo helping a fair bit even in 2009 with > simple apps. Syscall cost is even higher now with CPU > vulnerability mitigations, and Ruby 1.9+ GVL release+reacquire > is also a penalty I didn't have when developing this on 1.8. > > Do you have time+hardware to benchmark either approach on a > simple app? I no longer have stable/reliable hardware for > benchmarking. Thanks. > > Totally untested patch to try approach #1 > > diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb > index a52931a..69f1f60 100644 > --- a/lib/unicorn/http_server.rb > +++ b/lib/unicorn/http_server.rb > @@ -686,6 +686,7 @@ def worker_loop(worker) > trap(:USR1) { nr = -65536 } > > ready = readers.dup > + nr_listeners = readers.size > @after_worker_ready.call(self, worker) > > begin > @@ -698,7 +699,6 @@ def worker_loop(worker) > # but that will return false > if client = sock.kgio_tryaccept > process_client(client) > - nr += 1 > worker.tick = time_now.to_i > end > break if nr < 0 > @@ -708,7 +708,7 @@ def worker_loop(worker) > # we're probably reasonably busy, so avoid calling select() > # and do a speculative non-blocking accept() on ready listeners > # before we sleep again in select(). > - unless nr == 0 > + if ready.size == nr_listeners > tmp = ready.dup > redo > end > > > > And `nr' can probably just be a boolean `reopen' flag if we're > not overloading it as a counter.