Rainbows! Rack HTTP server user/dev discussion
From: Samuel Kadolph <samuel.kadolph-BqItboTaHx1BDgjK7y7TUQ@public.gmane.org>
To: "Rainbows! list" <rainbows-talk-GrnCvJ7WPxnNLxjTenLetw@public.gmane.org>
Cc: Cody Fauser <cody.fauser-BqItboTaHx1BDgjK7y7TUQ@public.gmane.org>,
	ops <ops-BqItboTaHx1BDgjK7y7TUQ@public.gmane.org>,
	Harry Brundage
	<harry.brundage-BqItboTaHx1BDgjK7y7TUQ@public.gmane.org>,
	Jonathan Rudenberg
	<jonathan.rudenberg-BqItboTaHx1BDgjK7y7TUQ@public.gmane.org>
Subject: Re: Unicorn is killing our rainbows workers
Date: Thu, 19 Jul 2012 16:57:31 -0400
Message-ID: <CAFFC5+NiPhu3oyEZ8woDdmH1zdPDDy9-fK3FhWPqv-6u=yFxgg@mail.gmail.com>
In-Reply-To: <20120719201633.GA8203-yBiyF41qdooeIZ0/mPfg9Q@public.gmane.org>

On Thu, Jul 19, 2012 at 4:16 PM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
>
> Samuel Kadolph <samuel.kadolph-BqItboTaHx1BDgjK7y7TUQ@public.gmane.org> wrote:
> > On Wed, Jul 18, 2012 at 8:26 PM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> > > Samuel Kadolph <samuel.kadolph-BqItboTaHx1BDgjK7y7TUQ@public.gmane.org> wrote:
> > >> On Wed, Jul 18, 2012 at 5:52 PM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> > >> > Samuel Kadolph <samuel.kadolph-/3HedJEncLlQ0OI7PeSoCw@public.gmane.org> wrote:
> > >> >> https://gist.github.com/9ec96922e55a59753997. Any insight into why
> > >> >> unicorn is killing our ThreadPool workers would help us greatly. If
> > >> >> you require additional info I would be happy to provide it.
> > >
> > > Also, are you using "preload_app true" ?
> >
> > Yes we are using preload_app true.
> >
> > > I'm a bit curious how these messages are happening, too:
> > > D, [2012-07-18T15:12:43.185808 #17213] DEBUG -- : waiting 151.5s after
> > > suspend/hibernation
> >
> > They are strange. My current hunch is that the killing and that message
> > are symptoms of the same issue, since the message always follows a killing.
>
> I wonder if there's some background thread one of your gems spawns on
> load that causes the master to stall.  I'm not seeing how else unicorn
> could think it was in suspend/hibernation.
>
> > > Can you tell (from Rails logs) if the to-be-killed workers are still
> > > processing requests/responses during the 300s before the unicorn
> > > timeout hits them?  AFAIK, Rails logs the PID of each worker
> > > processing the request.
> >
> > Rails doesn't log the PID, but it seems that after upgrading to
> > mysql2 0.2.18 unicorn is no longer killing workers that are busy with
> > requests.
>
> Oops, I think I've been spoiled into thinking the Hodel3000CompliantLogger
> is the default Rails logger :)
>
> > > If anything, I'd lower the unicorn timeout to something low (maybe
> > > 5-10s) since that detects hard lockups at the VM level.  Individual
> > > requests in Rainbows! _are_ allowed to take longer than the unicorn
> > > timeout.
> >
> > We lowered the unicorn timeout to 5 seconds. That did not stop the
> > killings, but they seem to be happening less often. I have some of our
> > stderr logs from after setting the timeout to 5 seconds at
> > https://gist.github.com/3144250.
>
> Thanks for trying that!
>
> > > Newer versions of mysql2 should avoid potential issues with
> > > ThreadTimeout/Timeout (or anything that hits Thread#kill).  I think
> > > mysql2 0.2.9 fixed a fairly important bug, and 0.2.18 fixed a very rare
> > > (but possibly related to your issue) bug,
> >
> > Upgrading mysql2 seems to have stopped unicorn from killing workers
> > that are currently busy. We were stress testing it last night and
> > after we upgraded to 0.2.18 we had no more 502s from the app, but this
> > could be a coincidence since the killings are still happening.
>
> Alright, good to know 0.2.18 solved your problems.  Btw, have you
> noticed any general connectivity issues to your MySQL server?
> There were quite a few bugfixes from 0.2.6..0.2.18, though.
>
> Anyways, I'm happy your problem seems to be fixed with the mysql2
> upgrade :)

Unfortunately, that didn't fix the problem. We had a large sale today
and had two 502s. We're going to try p194 next week and I'll let you
know if that fixes it.

> > Our ops guys say we had this problem before we were using ThreadTimeout.
>
> OK.  That's somewhat reassuring to know (especially since the culprit
> seems to be an old mysql2 gem).  I've had other users (privately) report
> issues with recursive locking because of ensure clauses (e.g.
> Mutex#synchronize) that I forgot to document.

We're going to try going without ThreadTimeout again to make sure
that's not the issue.
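
For reference, a minimal sketch of the settings discussed in this
thread, assuming a standard Rainbows! config file. The worker_processes
and worker_connections values are placeholders and are not stated
anywhere in the thread; preload_app, the 5-second timeout, and the
ThreadPool model are the ones mentioned above.

```ruby
# Hypothetical Rainbows! config file (e.g. loaded with `rainbows -c`).
# Values marked "assumption" are illustrative placeholders only.

worker_processes 4   # assumption: process count is not given in the thread
preload_app true     # the thread confirms preload_app is enabled
timeout 5            # unicorn timeout lowered to 5s as suggested above

Rainbows! do
  use :ThreadPool          # concurrency model named in the thread
  worker_connections 50    # assumption: per-worker connection limit
end
```

As noted above, the unicorn timeout here is meant to detect hard
lockups at the VM level; individual requests in Rainbows! are allowed
to take longer than it.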


