From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS33070 50.56.128.0/17 X-Spam-Status: No, score=0.5 required=5.0 tests=AWL,RDNS_NONE shortcircuit=no autolearn=no version=3.3.2 X-Original-To: archivist@yhbt.net Delivered-To: archivist@dcvr.yhbt.net Received: from rubyforge.org (unknown [50.56.192.79]) by dcvr.yhbt.net (Postfix) with ESMTP id C4D0F1F703 for ; Tue, 20 Aug 2013 20:42:48 +0000 (UTC) Received: from localhost.localdomain (localhost [127.0.0.1]) by rubyforge.org (Postfix) with ESMTP id 9C2482E192; Tue, 20 Aug 2013 20:42:49 +0000 (UTC) X-Original-To: mongrel-unicorn@rubyforge.org Delivered-To: mongrel-unicorn@rubyforge.org Received: from dcvr.yhbt.net (dcvr.yhbt.net [64.71.152.64]) by rubyforge.org (Postfix) with ESMTP id B3A482E17D for ; Tue, 20 Aug 2013 20:42:46 +0000 (UTC) Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id 60F561F464; Tue, 20 Aug 2013 20:42:44 +0000 (UTC) Date: Tue, 20 Aug 2013 20:42:44 +0000 From: Eric Wong To: unicorn list Subject: Re: A barrage of unexplained timeouts Message-ID: <20130820204244.GA3846@dcvr.yhbt.net> References: <1377010076.87748476@apps.rackspace.com> <20130820163745.GA22586@dcvr.yhbt.net> <1377019663.33997248@apps.rackspace.com> <20130820174018.GA25675@dcvr.yhbt.net> <1377022272.215912135@apps.rackspace.com> <20130820184902.GA31845@dcvr.yhbt.net> <1377029010.22657225@apps.rackspace.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <1377029010.22657225@apps.rackspace.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: mongrel-unicorn@rubyforge.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: mongrel-unicorn-bounces@rubyforge.org Errors-To: mongrel-unicorn-bounces@rubyforge.org nick@auger.net wrote: > "Eric Wong" said: > > This is really strange. This was only really bad for a 7s period? > > It was a 7 minute period. All of the workers would become busy and > exceed their >120s timeout. Master would kill and re-spawn them, > they'd start to respond to a handful of requests (anywhere from 5-50) > after which they'd become "busy" again, and get force killed by > master. This pattern happened 3 times over a 7 minute period. > > > Has it happened again? > > No > > > Anything else going on with the system at that time? Swapping, > > particularly... > > No swap activity or high load. Our munin graphs indicate a peak of > web/app server disk latency around that time, although our graphs show > many other similar peaks, without incident. I'm stumped :< Do you have any background threads running that could be hanging the workers? This is Ruby 1.8, after all, so there's more likely to be some blocking call hanging the entire process. AFAIK, some monitoring software runs a background thread in the unicorn worker and maybe the OpenSSL extension doesn't work as well if it encountered network problems under Ruby 1.8 Otherwise, I don't know... _______________________________________________ Unicorn mailing list - mongrel-unicorn@rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-unicorn Do not quote signatures (like this one) or top post when replying