Date: Thu, 5 Mar 2015 21:12:13 +0000
From: Eric Wong
To: Sarkis Varozian
Cc: Bráulio Bhavamitra, Michael Fischer, unicorn-public
Subject: Re: Request Queueing after deploy + USR2 restart
Message-ID: <20150305211213.GA21611@dcvr.yhbt.net>
References: <20150304203514.GA17826@dcvr.yhbt.net>

Sarkis Varozian wrote:
> Braulio,
>
> Are you referring to the vertical grey line? That is the deployment event.
> The part that spikes in the first graph is request queue which is a bit
> different on newrelic:
> http://blog.newrelic.com/2013/01/22/understanding-new-relic-queuing/

I'm not about to open images/graphs, but managed to read that.  Now I'm
still unsure if they are actually using raindrops or not to measure
your stats, but at least they mention it in that post.

Setting the timestamp header in nginx is a good idea, but you need to be
completely certain clocks are synchronized between machines for accuracy
(no using monotonic clock between multiple hosts, either, must be
real-time).

Have you tried using raindrops standalone to confirm queueing in the
kernel?  raindrops inspects the listen queue in the kernel directly, so
it's as accurate as possible as far as the local machine is concerned.
(it will not measure internal network latency).

I recommend checking raindrops (or inspecting /proc/net/{unix,tcp} or
running "ss -lx" / "ss -lt" to check listen queues).
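
For the /proc/net/tcp route, here is a rough sketch of what reading the
listen queue by hand might look like.  This is not how raindrops does
it, just an illustration.  The method name listen_queues is made up,
and it rests on two assumptions: that the host is little-endian (so the
IPv4 address bytes are printed reversed), and that the kernel reports
the current accept backlog in the rx_queue field for sockets in the
LISTEN state (state 0A), which is my understanding of Linux behavior.

```ruby
# Sketch only: report queued (not yet accepted) connections for
# listening TCP sockets from /proc/net/tcp-formatted text (Linux).

LISTEN_STATE = '0A' # TCP_LISTEN as printed in the "st" column

# Hypothetical helper: returns one hash per listening socket.
def listen_queues(proc_net_tcp)
  # The first line is the column header, so skip it.
  proc_net_tcp.each_line.drop(1).map do |line|
    f = line.split
    next if f.size < 5 || f[3] != LISTEN_STATE

    addr_hex, port_hex = f[1].split(':')
    # Assumption: little-endian host, IPv4 bytes appear in reverse order.
    addr = addr_hex.scan(/../).reverse.map(&:hex).join('.')
    # Field 5 is "tx_queue:rx_queue"; rx_queue is assumed to hold the
    # accept backlog for sockets in the LISTEN state.
    _tx, rx = f[4].split(':')
    { addr: addr, port: port_hex.hex, queued: rx.hex }
  end.compact
end
```

On a Linux box, something like
listen_queues(File.read('/proc/net/tcp')).each { |l| p l }
should print one entry per listener; a "queued" value that stays above
zero would suggest clients are waiting in the kernel for accept.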

You can also simulate TCP socket queueing in a standalone Ruby script
by doing something like:

-----------------------------8<---------------------------
require 'socket'
host = '127.0.0.1'
port = 1234
re = Regexp.escape("#{host}:#{port}")
check = lambda do |desc|
  puts desc
  # use "ss -lxn" instead for UNIXServer/UNIXSocket
  # (-n keeps the port numeric so the regexp below can match it)
  puts `ss -ltn`.split(/\n/).grep(/LISTEN\s.*\b#{re}\b/io)
  puts
end

puts "Creating new server"
s = TCPServer.new(host, port)
check.call "2nd column should initially be zero:"

puts "Queueing up one client:"
c1 = TCPSocket.new(host, port)
check.call "2nd column should be one, since accept is not yet called:"

puts "Accepting one client to clear the queue"
a1 = s.accept
check.call "2nd column should be back to zero after calling accept:"

puts "Queueing up two clients:"
c2 = TCPSocket.new(host, port)
c3 = TCPSocket.new(host, port)
check.call "2nd column should show two queued clients"

a2 = s.accept
check.call "2nd column should be down to one after calling accept:"
-----------------------------8<---------------------------

Disclaimer: I'm a Free Software extremist and would not touch
New Relic with a ten-foot pole...

> We are using HAProxy to load balance (round robin) to 4 physical hosts
> running unicorn with 6 workers.

I assume there's nginx somewhere?  Where is it?  If not, you're not
protected from slow uploads with giant request bodies.  I'm not
up-to-date about current haproxy versions, but AFAIK only nginx
buffers request bodies in full.

With nginx, I'm not sure what the point of haproxy is if you're just
going to do round-robin; nginx already does round-robin.  I'd only use
haproxy for a "smarter" load balancing scheme.