unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
* Unicorn_rails ignores USR2 signal
@ 2012-03-09 21:48 Yeung, Jeffrey
  2012-03-09 22:24 ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Yeung, Jeffrey @ 2012-03-09 21:48 UTC (permalink / raw)
  To: mongrel-unicorn@rubyforge.org

Hi,

I have a problem with one of my Rails servers where the daemonized unicorn_rails master seems to completely ignores the USR2 signal (only).

First, about the environment.  I've searched for solutions to this problem for several day now, and my situation appears to differ from most in that my environment is not using RVM, bundler, Capistrano, or other sandboxing tools.  It's a fairly plain jane Rails deployment.  The OS is 64-bit Ubuntu 10.04 server.

 $ uname -a
 Linux 2.6.32-37-server #81-Ubuntu SMP Fri Dec 2 20:49:12 UTC 2011 x86_64 GNU/Linux
 $ lsb_release -a
 No LSB modules are available.
 Distributor ID: Ubuntu
 Description:    Ubuntu 10.04.4 LTS
 Release:        10.04
 Codename:       lucid
 $ ruby -v
 ruby 1.9.1p376 (2009-12-07 revision 26041) [x86_64-linux]

The Rails gem version is 2.3.14.  Unicorn gem version 4.2.0.

The Unicorn config includes "preload_app true".  We use the same Unicorn configs on similar servers and for some yet unknown reason, this one is seeing a problem with USR2.

I have done my best to debug it using strace and the stderr logs.  The unicorn_rails master process handles HUP, QUIT, and USR1 signals as expected, no problems.  However, the USR2 signal is being completely ignored.

I added a before_exec block in the Unicorn conf with some puts statements, but this hook never gets called (unlike the before_fork and after_fork hooks, which are working fine for me).

 before_exec do |server|
   $stderr.puts("DEBUG before_exec")
   $stderr.puts(ENV.inspect)
   $stderr.puts(Unicorn::HttpServer::START_CTX.inspect)
 end

While monitoring the master with strace, there are no calls observed when sending a kill -USR2.  Nothing is logged to STDERR when sending this signal, either.

This has me completely stymied with the lack of logging info.  Does anyone have any clues that can point me in the right direction?

-Jeff
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Unicorn_rails ignores USR2 signal
  2012-03-09 21:48 Unicorn_rails ignores USR2 signal Yeung, Jeffrey
@ 2012-03-09 22:24 ` Eric Wong
  2012-03-09 22:39   ` Yeung, Jeffrey
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Wong @ 2012-03-09 22:24 UTC (permalink / raw)
  To: unicorn list; +Cc: Yeung, Jeffrey

"Yeung, Jeffrey" <Jeffrey.Yeung@polycom.com> wrote:
>  $ ruby -v
>  ruby 1.9.1p376 (2009-12-07 revision 26041) [x86_64-linux]
> 
> The Rails gem version is 2.3.14.  Unicorn gem version 4.2.0.

> I have done my best to debug it using strace and the stderr logs.  The
> unicorn_rails master process handles HUP, QUIT, and USR1 signals as
> expected, no problems.  However, the USR2 signal is being completely
> ignored.

> While monitoring the master with strace, there are no calls observed
> when sending a kill -USR2.  Nothing is logged to STDERR when sending
> this signal, either.

Really strange, especially since other signals seem to work...

Since you're on Ruby (>= 1.9.0 && < 1.9.3), are you stracing with -f?

If you're not already, try using: strace -f -e '!futex'

1.9.x versions use a dedicated timer thread to accept signals, so you
won't see system signal handlers in the VM without "strace -f".
The "-e '!futex'" filters out the noise from the polling wakeup in
<1.9.3 versions of Ruby.

Can you try Ruby 1.9.3?  The signal handling got completely reworked
(for the better) and you won't need "-e '!futex'"
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Unicorn_rails ignores USR2 signal
  2012-03-09 22:24 ` Eric Wong
@ 2012-03-09 22:39   ` Yeung, Jeffrey
  2012-03-10  0:02     ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Yeung, Jeffrey @ 2012-03-09 22:39 UTC (permalink / raw)
  To: Eric Wong, unicorn list

Thanks Eric.  New strace capture as follows:

 $ sudo strace -f -e '!futex' -p 14255
 Process 14255 attached with 12 threads - interrupt to quit
 [pid 14322] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
 [pid 14271] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
 [pid 14267] accept(12,  <unfinished ...>
 [pid 14264] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
 [pid 14255] select(6, [5], NULL, NULL, {42, 86504} <unfinished ...>
 [pid 14322] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
 [pid 14271] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
 [pid 14264] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
 [pid 14262] --- SIGUSR2 (User defined signal 2) @ 0 (0) ---
 [pid 14262] rt_sigreturn(0x20)          = 202

The last two lines do not say much for the event.  :(


-----Original Message-----
From: Eric Wong [mailto:normalperson@yhbt.net] 
Sent: Friday, March 09, 2012 2:24 PM
To: unicorn list
Cc: Yeung, Jeffrey
Subject: Re: Unicorn_rails ignores USR2 signal

"Yeung, Jeffrey" <Jeffrey.Yeung@polycom.com> wrote:
>  $ ruby -v
>  ruby 1.9.1p376 (2009-12-07 revision 26041) [x86_64-linux]
> 
> The Rails gem version is 2.3.14.  Unicorn gem version 4.2.0.

> I have done my best to debug it using strace and the stderr logs.  The 
> unicorn_rails master process handles HUP, QUIT, and USR1 signals as 
> expected, no problems.  However, the USR2 signal is being completely 
> ignored.

> While monitoring the master with strace, there are no calls observed 
> when sending a kill -USR2.  Nothing is logged to STDERR when sending 
> this signal, either.

Really strange, especially since other signals seem to work...

Since you're on Ruby (>= 1.9.0 && < 1.9.3), are you stracing with -f?

If you're not already, try using: strace -f -e '!futex'

1.9.x versions use a dedicated timer thread to accept signals, so you won't see system signal handlers in the VM without "strace -f".
The "-e '!futex'" filters out the noise from the polling wakeup in
<1.9.3 versions of Ruby.

Can you try Ruby 1.9.3?  The signal handling got completely reworked (for the better) and you won't need "-e '!futex'"
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Unicorn_rails ignores USR2 signal
  2012-03-09 22:39   ` Yeung, Jeffrey
@ 2012-03-10  0:02     ` Eric Wong
  2012-03-10  1:07       ` Yeung, Jeffrey
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Wong @ 2012-03-10  0:02 UTC (permalink / raw)
  To: unicorn list; +Cc: Yeung, Jeffrey

"Yeung, Jeffrey" <Jeffrey.Yeung@polycom.com> wrote:
> Thanks Eric.  New strace capture as follows:
> 
>  $ sudo strace -f -e '!futex' -p 14255
>  Process 14255 attached with 12 threads - interrupt to quit

>From that, you have 5 worker processes?

For debugging this, it can cut down on noise to only use one worker
process, too.  You can check if SIGTTOU works, too :)

Also, can you reproduce this on a freshly-started master?  Or has
the master been running and handling other signals for a while?

The most common cause of USR2 failures is due to an executable or
library being moved/replaced/upgraded away (sometimes due to
Capistrano), but that should definitely get logged and doesn't seem to
be the case for you.

>  [pid 14322] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
>  [pid 14271] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
>  [pid 14267] accept(12,  <unfinished ...>
>  [pid 14264] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
>  [pid 14255] select(6, [5], NULL, NULL, {42, 86504} <unfinished ...>
>  [pid 14322] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
>  [pid 14271] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
>  [pid 14264] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
>  [pid 14262] --- SIGUSR2 (User defined signal 2) @ 0 (0) ---
>  [pid 14262] rt_sigreturn(0x20)          = 202
> 
> The last two lines do not say much for the event.  :(

Anything more after that?  What happens when you send a previously
working signal (USR1, HUP) after sending a failed USR2 to that process?
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Unicorn_rails ignores USR2 signal
  2012-03-10  0:02     ` Eric Wong
@ 2012-03-10  1:07       ` Yeung, Jeffrey
  2012-03-10  1:30         ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Yeung, Jeffrey @ 2012-03-10  1:07 UTC (permalink / raw)
  To: Eric Wong, unicorn list

I have it configured for 4 workers.  I just turned it down to 1 worker and tested both TTIN and TTOU.  They both work, creating and killing workers, respectively.  The following is the strace capture from an TTOU signal, followed by a USR2 signal.  Yes, that is the only output from the USR2 signal.  Behavior doesn't change if the master is freshly started, or has been running for a while.  One other thing I did not mention earlier is that the <pid>.oldbin file never gets created on the USR2 signal, but that's probably obvious already.

[pid 14463] --- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
[pid 14463] rt_sigreturn(0x16)          = -1 EINTR (Interrupted system call)
[pid 14483] rt_sigprocmask(SIG_SETMASK, ~[SEGV VTALRM RTMIN RT_1], NULL, 8) = 0
[pid 14483] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 14483] tgkill(14456, 14456, SIGVTALRM) = 0
[pid 14456] <... select resumed> )      = ? ERESTARTNOHAND (To be restarted)
[pid 14456] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 14456] rt_sigreturn(0x1a)          = -1 EINTR (Interrupted system call)
[pid 14456] fcntl(6, F_GETFL)           = 0x801 (flags O_WRONLY|O_NONBLOCK)
[pid 14456] write(6, ".", 1)            = 1
[pid 14456] select(6, [5], NULL, NULL, {30, 755188}) = 1 (in [5], left {30, 755184})
[pid 14456] fcntl(5, F_GETFL)           = 0x800 (flags O_RDONLY|O_NONBLOCK)
[pid 14456] read(5, ".", 11)            = 1
[pid 14456] wait4(-1, 0x7fffcb5560ec, WNOHANG, NULL) = 0
[pid 14456] wait4(-1, 0x7fffcb5560ec, WNOHANG, NULL) = 0
[pid 14456] kill(14482, SIGQUIT <unfinished ...>
[pid 14482] <... select resumed> )      = ? ERESTARTNOHAND (To be restarted)
[pid 14456] <... kill resumed> )        = 0
[pid 14482] --- SIGQUIT (Quit) @ 0 (0) ---
[pid 14456] select(6, [5], NULL, NULL, {83, 0} <unfinished ...>
[pid 14482] rt_sigreturn(0x3)           = -1 EINTR (Interrupted system call)
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield()               = 0
[pid 14482] sched_yield( <unfinished ...>
[pid 14485] rt_sigprocmask(SIG_SETMASK, ~[SEGV VTALRM RTMIN RT_1], NULL, 8) = 0
[pid 14485] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 14482] <... sched_yield resumed> ) = 0
[pid 14482] close(3)                    = 0
[pid 14482] select(6, [3], NULL, [4 5], {22, 883208}) = -1 EBADF (Bad file descriptor)
[pid 14482] rt_sigprocmask(SIG_SETMASK, ~[HUP INT QUIT KILL USR1 SEGV USR2 TERM CHLD STOP TTIN TTOU VTALRM WINCH RTMIN RT_1], NULL, 8) = 0
[pid 14482] rt_sigprocmask(SIG_SETMASK, ~[HUP INT QUIT KILL USR1 SEGV USR2 TERM CHLD STOP TTIN TTOU VTALRM WINCH RTMIN RT_1], NULL, 8) = 0
[pid 14482] rt_sigprocmask(SIG_SETMASK, ~[HUP INT QUIT KILL USR1 SEGV USR2 TERM CHLD STOP TTIN TTOU VTALRM WINCH RTMIN RT_1], NULL, 8) = 0
[pid 14482] getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
[pid 14485] _exit(0)                    = ?
Process 14485 detached
[pid 14482] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER|SA_SIGINFO, 0x7f10515428f0}, {0x48fc00, [], SA_RESTORER|SA_SIGINFO, 0x7f10515428f0}, 8) = 0
[pid 14482] rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER|SA_SIGINFO, 0x7f10515428f0}, {SIG_IGN, [], SA_RESTORER|SA_SIGINFO, 0x7f10515428f0}, 8) = 0
[pid 14482] munmap(0x7f10517b4000, 4096) = 0
[pid 14482] close(4)                    = 0
[pid 14482] close(5)                    = 0
[pid 14482] close(7)                    = 0
[pid 14482] close(12)                   = 0
[pid 14482] unlink("/tmp/.java_pid14456") = -1 EPERM (Operation not permitted)
[pid 14482] exit_group(0)               = ?
Process 14482 detached
[pid 14463] --- SIGCHLD (Child exited) @ 0 (0) ---
[pid 14463] rt_sigreturn(0x11)          = -1 EINTR (Interrupted system call)
[pid 14483] rt_sigprocmask(SIG_SETMASK, ~[SEGV VTALRM RTMIN RT_1], NULL, 8) = 0
[pid 14483] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 14483] tgkill(14456, 14456, SIGVTALRM) = 0
[pid 14456] <... select resumed> )      = ? ERESTARTNOHAND (To be restarted)
[pid 14456] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 14456] rt_sigreturn(0x1a)          = -1 EINTR (Interrupted system call)
[pid 14456] fcntl(6, F_GETFL)           = 0x801 (flags O_WRONLY|O_NONBLOCK)
[pid 14456] write(6, ".", 1)            = 1
[pid 14456] select(6, [5], NULL, NULL, {82, 908414}) = 1 (in [5], left {82, 908410})
[pid 14456] fcntl(5, F_GETFL)           = 0x800 (flags O_RDONLY|O_NONBLOCK)
[pid 14456] read(5, ".", 11)            = 1
[pid 14456] wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG, NULL) = 14482
[pid 14456] write(2, "reaped #<Process::Status: pid 14"..., 53) = 53
[pid 14456] wait4(-1, 0x7fffcb5560ec, WNOHANG, NULL) = 0
[pid 14456] select(6, [5], NULL, NULL, {91, 0}


 <unfinished ...>
[pid 14463] --- SIGUSR2 (User defined signal 2) @ 0 (0) ---
[pid 14463] rt_sigreturn(0x20)          = 202


-----Original Message-----
From: Eric Wong [mailto:normalperson@yhbt.net] 
Sent: Friday, March 09, 2012 4:03 PM
To: unicorn list
Cc: Yeung, Jeffrey
Subject: Re: Unicorn_rails ignores USR2 signal

"Yeung, Jeffrey" <Jeffrey.Yeung@polycom.com> wrote:
> Thanks Eric.  New strace capture as follows:
> 
>  $ sudo strace -f -e '!futex' -p 14255  Process 14255 attached with 12 
> threads - interrupt to quit

>From that, you have 5 worker processes?

For debugging this, it can cut down on noise to only use one worker process, too.  You can check if SIGTTOU works, too :)

Also, can you reproduce this on a freshly-started master?  Or has the master been running and handling other signals for a while?

The most common cause of USR2 failures is due to an executable or library being moved/replaced/upgraded away (sometimes due to Capistrano), but that should definitely get logged and doesn't seem to be the case for you.

>  [pid 14322] restart_syscall(<... resuming interrupted call ...> 
> <unfinished ...>  [pid 14271] restart_syscall(<... resuming 
> interrupted call ...> <unfinished ...>  [pid 14267] accept(12,  
> <unfinished ...>  [pid 14264] restart_syscall(<... resuming 
> interrupted call ...> <unfinished ...>  [pid 14255] select(6, [5], 
> NULL, NULL, {42, 86504} <unfinished ...>  [pid 14322] <... 
> restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)  [pid 
> 14271] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection 
> timed out)  [pid 14264] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)  [pid 14262] --- SIGUSR2 (User defined signal 2) @ 0 (0) ---
>  [pid 14262] rt_sigreturn(0x20)          = 202
> 
> The last two lines do not say much for the event.  :(

Anything more after that?  What happens when you send a previously working signal (USR1, HUP) after sending a failed USR2 to that process?
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Unicorn_rails ignores USR2 signal
  2012-03-10  1:07       ` Yeung, Jeffrey
@ 2012-03-10  1:30         ` Eric Wong
  2012-03-12 21:21           ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Wong @ 2012-03-10  1:30 UTC (permalink / raw)
  To: unicorn list; +Cc: Yeung, Jeffrey

"Yeung, Jeffrey" <Jeffrey.Yeung@polycom.com> wrote:
> I have it configured for 4 workers.  I just turned it down to 1 worker
> and tested both TTIN and TTOU.  They both work, creating and killing
> workers, respectively.  The following is the strace capture from an
> TTOU signal, followed by a USR2 signal.  Yes, that is the only output
> from the USR2 signal.  Behavior doesn't change if the master is
> freshly started, or has been running for a while.  One other thing I
> did not mention earlier is that the <pid>.oldbin file never gets
> created on the USR2 signal, but that's probably obvious already.

I also asked for sending another signal after USR2, can you try that?

However, there's another possibility I hadn't considered, what if you
disable preload_app?  Your app or some libs it uses may be intercepting
USR2 for something it does.

Maybe this patch can work, but this may also silently break your
application/lib, too...

diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 7d2c623..1b9d693 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -128,8 +128,7 @@ class Unicorn::HttpServer
     # setup signal handlers before writing pid file in case people get
     # trigger happy and send signals as soon as the pid file exists.
     # Note that signals don't actually get handled until the #join method
-    QUEUE_SIGS.each { |sig| trap(sig) { SIG_QUEUE << sig; awaken_master } }
-    trap(:CHLD) { awaken_master }
+    master_siginit
     self.pid = config[:pid]
 
     self.master_pid = $$
@@ -689,6 +688,9 @@ class Unicorn::HttpServer
         Gem.refresh
       end
       self.app = app.call
+
+      # override signal handlers the app may have set
+      master_siginit if preload_app
     end
   end
 
@@ -736,4 +738,9 @@ class Unicorn::HttpServer
     config_listeners.each { |addr| listen(addr) }
     raise ArgumentError, "no listeners" if LISTENERS.empty?
   end
+
+  def master_siginit
+    QUEUE_SIGS.each { |sig| trap(sig) { SIG_QUEUE << sig; awaken_master } }
+    trap(:CHLD) { awaken_master }
+  end
 end
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: Unicorn_rails ignores USR2 signal
  2012-03-10  1:30         ` Eric Wong
@ 2012-03-12 21:21           ` Eric Wong
  2012-03-12 22:39             ` Yeung, Jeffrey
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Wong @ 2012-03-12 21:21 UTC (permalink / raw)
  To: unicorn list; +Cc: Yeung, Jeffrey

Eric Wong <normalperson@yhbt.net> wrote:
> However, there's another possibility I hadn't considered, what if you
> disable preload_app?  Your app or some libs it uses may be intercepting
> USR2 for something it does.

Ping?  Was this it?
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Unicorn_rails ignores USR2 signal
  2012-03-12 21:21           ` Eric Wong
@ 2012-03-12 22:39             ` Yeung, Jeffrey
  2012-03-12 22:44               ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Yeung, Jeffrey @ 2012-03-12 22:39 UTC (permalink / raw)
  To: Eric Wong, unicorn list

Hi Eric,

Sorry for the delay.  It looks like disabling preload_app did the trick.  A new master was created after sending the USR2.  Now the $$$ question is, what in the world is intercepting the signal?  :S

And this is a little late now, but here's the strace of a USR2 signal followed by a TTOU signal, with preload_app true (latter signal working okay in this case):

[pid 14542] --- SIGUSR2 (User defined signal 2) @ 0 (0) ---
[pid 14542] rt_sigreturn(0x20)          = 202
[pid 14535] <... select resumed> )      = 0 (Timeout)
[pid 14535] wait4(-1, 0x7fff6bcccb9c, WNOHANG, NULL) = 0
[pid 14535] select(6, [5], NULL, NULL, {60, 0} <unfinished ...>
[pid 14542] --- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
[pid 14542] rt_sigreturn(0x16)          = -1 EINTR (Interrupted system call)
[pid 14553] rt_sigprocmask(SIG_SETMASK, ~[SEGV VTALRM RTMIN RT_1], NULL, 8) = 0
[pid 14553] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 14553] tgkill(14535, 14535, SIGVTALRM) = 0
[pid 14535] <... select resumed> )      = ? ERESTARTNOHAND (To be restarted)
[pid 14535] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 14535] rt_sigreturn(0x1a)          = -1 EINTR (Interrupted system call)
[pid 14535] fcntl(6, F_GETFL)           = 0x1 (flags O_WRONLY)
[pid 14535] fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0
[pid 14535] write(6, ".", 1)            = 1
[pid 14535] select(6, [5], NULL, NULL, {59, 939700}) = 1 (in [5], left {59, 939696})
[pid 14535] fcntl(5, F_GETFL)           = 0 (flags O_RDONLY)
[pid 14535] fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
[pid 14535] read(5, ".", 11)            = 1
[pid 14535] wait4(-1, 0x7fff6bcccb9c, WNOHANG, NULL) = 0
[pid 14535] wait4(-1, 0x7fff6bcccb9c, WNOHANG, NULL) = 0
[pid 14535] kill(14552, SIGQUIT)        = 0
[pid 14535] select(6, [5], NULL, NULL, {60, 0} <unfinished ...>
[pid 14542] --- SIGCHLD (Child exited) @ 0 (0) ---
[pid 14542] rt_sigreturn(0x11)          = -1 EINTR (Interrupted system call)
[pid 14553] rt_sigprocmask(SIG_SETMASK, ~[SEGV VTALRM RTMIN RT_1], NULL, 8) = 0
[pid 14553] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 14553] tgkill(14535, 14535, SIGVTALRM) = 0
[pid 14535] <... select resumed> )      = ? ERESTARTNOHAND (To be restarted)
[pid 14535] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 14535] rt_sigreturn(0x1a)          = -1 EINTR (Interrupted system call)
[pid 14535] fcntl(6, F_GETFL)           = 0x801 (flags O_WRONLY|O_NONBLOCK)
[pid 14535] write(6, ".", 1)            = 1
[pid 14535] select(6, [5], NULL, NULL, {59, 880004}) = 1 (in [5], left {59, 880001})
[pid 14535] fcntl(5, F_GETFL)           = 0x800 (flags O_RDONLY|O_NONBLOCK)
[pid 14535] read(5, ".", 11)            = 1
[pid 14535] wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG, NULL) = 14552
[pid 14535] write(2, "reaped #<Process::Status: pid 14"..., 53) = 53
[pid 14535] wait4(-1, 0x7fff6bcccb9c, WNOHANG, NULL) = -1 ECHILD (No child processes)
[pid 14535] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 14535] select(6, [5], NULL, NULL, {119, 0}

-----Original Message-----
From: Eric Wong [mailto:normalperson@yhbt.net] 
Sent: Monday, March 12, 2012 2:21 PM
To: unicorn list
Cc: Yeung, Jeffrey
Subject: Re: Unicorn_rails ignores USR2 signal

Eric Wong <normalperson@yhbt.net> wrote:
> However, there's another possibility I hadn't considered, what if you 
> disable preload_app?  Your app or some libs it uses may be 
> intercepting
> USR2 for something it does.

Ping?  Was this it?
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Unicorn_rails ignores USR2 signal
  2012-03-12 22:39             ` Yeung, Jeffrey
@ 2012-03-12 22:44               ` Eric Wong
  2012-03-20 19:57                 ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Wong @ 2012-03-12 22:44 UTC (permalink / raw)
  To: unicorn list; +Cc: Yeung, Jeffrey

"Yeung, Jeffrey" <Jeffrey.Yeung@polycom.com> wrote:
> Sorry for the delay.  It looks like disabling preload_app did the
> trick.  A new master was created after sending the USR2.  Now the $$$
> question is, what in the world is intercepting the signal?  :S

Good to know, I'd just grep the installation directories for all your
Ruby libs + gems for USR2.  I haven't seen this problem before, but it'd
be good to document the conflict, at least.
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Unicorn_rails ignores USR2 signal
  2012-03-12 22:44               ` Eric Wong
@ 2012-03-20 19:57                 ` Eric Wong
  2012-03-20 23:09                   ` Yeung, Jeffrey
  2012-03-30 22:16                   ` Yeung, Jeffrey
  0 siblings, 2 replies; 15+ messages in thread
From: Eric Wong @ 2012-03-20 19:57 UTC (permalink / raw)
  To: unicorn list; +Cc: Yeung, Jeffrey

Eric Wong <normalperson@yhbt.net> wrote:
> "Yeung, Jeffrey" <Jeffrey.Yeung@polycom.com> wrote:
> > Sorry for the delay.  It looks like disabling preload_app did the
> > trick.  A new master was created after sending the USR2.  Now the $$$
> > question is, what in the world is intercepting the signal?  :S
> 
> Good to know, I'd just grep the installation directories for all your
> Ruby libs + gems for USR2.  I haven't seen this problem before, but it'd
> be good to document the conflict, at least.

Btw, did you ever figure out what was causing the conflict?

Pushing this out to git://bogomips.org/unicorn.git

>From 1e13ffee3469997286e65e0563b6433e7744388a Mon Sep 17 00:00:00 2001
From: Eric Wong <normalperson@yhbt.net>
Date: Tue, 20 Mar 2012 19:51:35 +0000
Subject: [PATCH] KNOWN_ISSUES: document signal conflicts in libs/apps

Jeffrey Yeung confirmed this issue on the mailing list.

ref: <E8D9E7CCC2621343A0A3BB45E8DEDFA91C682DD23D@CRPMBOXPRD04.polycom.com>
---
 KNOWN_ISSUES |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/KNOWN_ISSUES b/KNOWN_ISSUES
index f323c68..38263e7 100644
--- a/KNOWN_ISSUES
+++ b/KNOWN_ISSUES
@@ -3,6 +3,11 @@
 Occasionally odd {issues}[link:ISSUES.html] arise without a transparent or
 acceptable solution.  Those issues are documented here.
 
+* Some libraries/applications may install signal handlers which conflict
+  with signal handlers unicorn uses.  Leaving "preload_app false"
+  (the default) will allow unicorn to always override existing signal
+  handlers.
+
 * Issues with FreeBSD jails can be worked around as documented by Tatsuya Ono:
   http://mid.gmane.org/CAHBuKRj09FdxAgzsefJWotexw-7JYZGJMtgUp_dhjPz9VbKD6Q@mail.gmail.com
 
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* RE: Unicorn_rails ignores USR2 signal
  2012-03-20 19:57                 ` Eric Wong
@ 2012-03-20 23:09                   ` Yeung, Jeffrey
  2012-03-21  2:27                     ` Devin Ben-Hur
  2012-03-30 22:16                   ` Yeung, Jeffrey
  1 sibling, 1 reply; 15+ messages in thread
From: Yeung, Jeffrey @ 2012-03-20 23:09 UTC (permalink / raw)
  To: Eric Wong, unicorn list

Eric,

I have been unable to narrow down the cause of the conflict so far.  The list of Ruby gems (and gem versions) on the affected deployment are identical to the ones on another deployment where Unicorn is upgrading just fine (with preloaded app).  Grep'ing for USR2 in the gem installations did not reveal anything,  unfortunately.  Since then, I haven't been able to spend further time investigating.  Not sure where else to look, really, but I'm open to further suggestions.

-Jeff

-----Original Message-----
From: Eric Wong [mailto:normalperson@yhbt.net] 
Sent: Tuesday, March 20, 2012 12:58 PM
To: unicorn list
Cc: Yeung, Jeffrey
Subject: Re: Unicorn_rails ignores USR2 signal

Eric Wong <normalperson@yhbt.net> wrote:
> "Yeung, Jeffrey" <Jeffrey.Yeung@polycom.com> wrote:
> > Sorry for the delay.  It looks like disabling preload_app did the 
> > trick.  A new master was created after sending the USR2.  Now the 
> > $$$ question is, what in the world is intercepting the signal?  :S
> 
> Good to know, I'd just grep the installation directories for all your 
> Ruby libs + gems for USR2.  I haven't seen this problem before, but 
> it'd be good to document the conflict, at least.

Btw, did you ever figure out what was causing the conflict?

Pushing this out to git://bogomips.org/unicorn.git

>From 1e13ffee3469997286e65e0563b6433e7744388a Mon Sep 17 00:00:00 2001
From: Eric Wong <normalperson@yhbt.net>
Date: Tue, 20 Mar 2012 19:51:35 +0000
Subject: [PATCH] KNOWN_ISSUES: document signal conflicts in libs/apps

Jeffrey Yeung confirmed this issue on the mailing list.

ref: <E8D9E7CCC2621343A0A3BB45E8DEDFA91C682DD23D@CRPMBOXPRD04.polycom.com>
---
 KNOWN_ISSUES |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/KNOWN_ISSUES b/KNOWN_ISSUES index f323c68..38263e7 100644
--- a/KNOWN_ISSUES
+++ b/KNOWN_ISSUES
@@ -3,6 +3,11 @@
 Occasionally odd {issues}[link:ISSUES.html] arise without a transparent or  acceptable solution.  Those issues are documented here.
 
+* Some libraries/applications may install signal handlers which 
+conflict
+  with signal handlers unicorn uses.  Leaving "preload_app false"
+  (the default) will allow unicorn to always override existing signal
+  handlers.
+
 * Issues with FreeBSD jails can be worked around as documented by Tatsuya Ono:
   http://mid.gmane.org/CAHBuKRj09FdxAgzsefJWotexw-7JYZGJMtgUp_dhjPz9VbKD6Q@mail.gmail.com
 
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Unicorn_rails ignores USR2 signal
  2012-03-20 23:09                   ` Yeung, Jeffrey
@ 2012-03-21  2:27                     ` Devin Ben-Hur
  0 siblings, 0 replies; 15+ messages in thread
From: Devin Ben-Hur @ 2012-03-21  2:27 UTC (permalink / raw)
  To: unicorn list; +Cc: Yeung, Jeffrey

On 3/20/12 4:09 PM, Yeung, Jeffrey wrote:
> I have been unable to narrow down the cause of the conflict so far.  The list of Ruby gems (and gem versions) on the affected deployment are identical to the ones on another deployment where Unicorn is upgrading just fine (with preloaded app).  Grep'ing for USR2 in the gem installations did not reveal anything,  unfortunately.  Since then, I haven't been able to spend further time investigating.  Not sure where else to look, really, but I'm open to further suggestions.

Jeffery,

To uncover the culprit, you might try monkey-patching Kernel.trap and 
Signal.trap so it logs the last few entries in caller when it's called 
with USR2.

Put something like this really early in you app bootstrap:

[Kernel,Signal].each |klass|
   class << klass
     alias :orig_trap :trap
     def trap *args, &block
       if args.first.to_s =~ /USR2$/i || args.first.to_i == 31
         $stderr.puts "Caught someone trapping USR2 caller is:", 
caller.last(2)
       end
       orig_trap *args, &block
     end
   end
end
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Unicorn_rails ignores USR2 signal
  2012-03-20 19:57                 ` Eric Wong
  2012-03-20 23:09                   ` Yeung, Jeffrey
@ 2012-03-30 22:16                   ` Yeung, Jeffrey
  2012-03-30 22:51                     ` Alex Sharp
  1 sibling, 1 reply; 15+ messages in thread
From: Yeung, Jeffrey @ 2012-03-30 22:16 UTC (permalink / raw)
  To: Eric Wong, unicorn list

I have little to report, unfortunately.  I still have not identified what is trapping USR2 in my application, if something is indeed doing that.

I've attempted to do a bit more investigation work in my spare time this week, and the results are still rather mysterious.

First, I applied your patch to http_server.rb that you suggested in this thread back on March 9th (quoted below), which re-overrides the signal handlers in build_app!().  The patch did work.  Okay, that seems to reconfirm that the app is doing something with the USR2 signal handler...

Then I modified the master_siginit() method a bit further to provide more info:
  def master_siginit
    QUEUE_SIGS.each { |sig|
      old_handler = trap(sig) { SIG_QUEUE << sig; awaken_master }
      logger.info("old handler for #{sig.to_s}: #{old_handler.inspect}")
    }
    trap(:CHLD) { awaken_master }
  end

And the output was a bit surprising.  Here is the Unicorn stderr log when starting up the service.  master_siginit() is called twice, as expected...

I, [2012-03-30T14:57:51.301839 #15828]  INFO -- : listening on addr=0.0.0.0:8080 fd=3
I, [2012-03-30T14:57:51.302341 #15828]  INFO -- : old handler for WINCH: nil
I, [2012-03-30T14:57:51.302435 #15828]  INFO -- : old handler for QUIT: "DEFAULT"
I, [2012-03-30T14:57:51.302505 #15828]  INFO -- : old handler for INT: "DEFAULT"
I, [2012-03-30T14:57:51.302573 #15828]  INFO -- : old handler for TERM: "DEFAULT"
I, [2012-03-30T14:57:51.302641 #15828]  INFO -- : old handler for USR1: "DEFAULT"
I, [2012-03-30T14:57:51.302709 #15828]  INFO -- : old handler for USR2: "DEFAULT"
I, [2012-03-30T14:57:51.302786 #15828]  INFO -- : old handler for HUP: "DEFAULT"
I, [2012-03-30T14:57:51.302853 #15828]  INFO -- : old handler for TTIN: nil
I, [2012-03-30T14:57:51.302917 #15828]  INFO -- : old handler for TTOU: nil
I, [2012-03-30T14:57:51.303502 #15828]  INFO -- : Refreshing Gem list
old handler for WINCH: #<Proc:0x00000002d97488@/usr/local/lib/ruby/gems/1.9.1/gems/unicorn-4.2.0/lib/unicorn/http_server.rb:745>
old handler for QUIT: #<Proc:0x00000002d97098@/usr/local/lib/ruby/gems/1.9.1/gems/unicorn-4.2.0/lib/unicorn/http_server.rb:745>
old handler for INT: #<Proc:0x00000002d96c70@/usr/local/lib/ruby/gems/1.9.1/gems/unicorn-4.2.0/lib/unicorn/http_server.rb:745>
old handler for TERM: #<Proc:0x00000002d96848@/usr/local/lib/ruby/gems/1.9.1/gems/unicorn-4.2.0/lib/unicorn/http_server.rb:745>
old handler for USR1: #<Proc:0x00000002d96420@/usr/local/lib/ruby/gems/1.9.1/gems/unicorn-4.2.0/lib/unicorn/http_server.rb:745>
old handler for USR2: #<Proc:0x00000002d95ff8@/usr/local/lib/ruby/gems/1.9.1/gems/unicorn-4.2.0/lib/unicorn/http_server.rb:745>
old handler for HUP: #<Proc:0x00000002d95bd0@/usr/local/lib/ruby/gems/1.9.1/gems/unicorn-4.2.0/lib/unicorn/http_server.rb:745>
old handler for TTIN: #<Proc:0x00000002d957a8@/usr/local/lib/ruby/gems/1.9.1/gems/unicorn-4.2.0/lib/unicorn/http_server.rb:745>
old handler for TTOU: #<Proc:0x00000002d953b8@/usr/local/lib/ruby/gems/1.9.1/gems/unicorn-4.2.0/lib/unicorn/http_server.rb:745>
master process ready
worker=0 ready

Which to me, looks like the USR2 handler was unchanged in the second master_siginit() call.  Yet, if I remove that second master_siginit() call from there, USR2 ceases to work.  :S

-Jeff

-----Original Message-----
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb index 7d2c623..1b9d693 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -128,8 +128,7 @@ class Unicorn::HttpServer
     # setup signal handlers before writing pid file in case people get
     # trigger happy and send signals as soon as the pid file exists.
     # Note that signals don't actually get handled until the #join method
-    QUEUE_SIGS.each { |sig| trap(sig) { SIG_QUEUE << sig; awaken_master } }
-    trap(:CHLD) { awaken_master }
+    master_siginit
     self.pid = config[:pid]
 
     self.master_pid = $$
@@ -689,6 +688,9 @@ class Unicorn::HttpServer
         Gem.refresh
       end
       self.app = app.call
+
+      # override signal handlers the app may have set
+      master_siginit if preload_app
     end
   end
 
@@ -736,4 +738,9 @@ class Unicorn::HttpServer
     config_listeners.each { |addr| listen(addr) }
     raise ArgumentError, "no listeners" if LISTENERS.empty?
   end
+
+  def master_siginit
+    QUEUE_SIGS.each { |sig| trap(sig) { SIG_QUEUE << sig; awaken_master } }
+    trap(:CHLD) { awaken_master }
+  end
 end
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Unicorn_rails ignores USR2 signal
  2012-03-30 22:16                   ` Yeung, Jeffrey
@ 2012-03-30 22:51                     ` Alex Sharp
  2013-05-08 18:15                       ` Ryan Jones (RyanonRails)
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Sharp @ 2012-03-30 22:51 UTC (permalink / raw)
  To: unicorn list

Hey Jeffrey, we had a similar problem about 8 months ago. What version of ruby (including patch number) are you using? I think you said you're not using anything special in your deployment environment, right? You're not using god, by chance to do the restarts, are you?

Also, how are you doing your deployments, if not with cap?

-- 
Alex Sharp
Zaarly, Inc | @ajsharp | github.com/ajsharp | alexjsharp.com

_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Unicorn_rails ignores USR2 signal
  2012-03-30 22:51                     ` Alex Sharp
@ 2013-05-08 18:15                       ` Ryan Jones (RyanonRails)
  0 siblings, 0 replies; 15+ messages in thread
From: Ryan Jones (RyanonRails) @ 2013-05-08 18:15 UTC (permalink / raw)
  To: mongrel-unicorn

We've run into the exact same problem. We spent a few good solid hours trying to
fix the problem with no avail (but we do have a 'solution').

Server specs:
Ubuntu 12.04.2 LTS (GNU/Linux 3.5.0-27-generic x86_64)

Rails & Ruby:
Rails 4.0.0.beta1
Ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
Unicorn 4.6.2 - preload_app true
Capistrano

The first thing we tried was removing all of the gems (foreman, sidekiq, etc.)
and that didn't work. We tried preload_app false, and that worked (but this is
not the situation we want). We also tried searching for any gems that were
capturing USR2 (or any signals in general).

We tried monkey patching Kernel & Trap to see if we could pinpoint the problem,
but that didn't seem to turn up much. We then jumped in unicorn's http_server.rb
and threw in some logging. The only thing that seemed to show us was that USR2
was unable to get through to unicorn. Signals HUP, QUIT, etc. all worked fine
(and made their way through as per norm).

We then fired up strace with unicorn to see if we could 'see' the USR2 signal.
It looks almost identical to Jeffery's strace. It looks like the USR2 signal is
received by the process, but then dropped (or ignored). If we pass it a QUIT
signal it works just fine and we can see that in the strace.

So here's the chain of events that we think is occuring:

1. We start unicorn (a unicorn blessing)
2. before_fork is triggered (Signal trapping works here)
3. A unicorn is born (Signal trapping does not work here)
4. The after fork is triggered

In our situation. Once a unicorn is born, it can no longer received a USR2 (for
whatever reason). We never actually checked to see if the after fork received a
USR2.

After we managed to figure out the path of execution, we figured out a quick
fix. We actually call reexec method directly on the unicorn server in the
before_fork block in unicorn.rb. Here's our unicorn.rb
https://gist.github.com/RyanonRails/5541902

----- snip -----

before_fork do |server, worker|
  Signal.trap 'USR2' do
    puts 'Since a black hole is lurking and eating USR2s we will hit
http_server#reexec ourselves'
    server.send(:reexec)
  end
end
----- snip -----

This way we can actually force the USR2 directly on the server (since USR2 is
available inside of the before_fork).

This seems to be working well for us now. We're still frustrated that we weren't
able to find the cause of the problem, but we least we can move forward (with
minor disruption to the code/code base).

Lastly, here are our current theories as to why this could be occuring:
1. When unicorn is building it's native extensions something weird is happening
in regards to the USR2 signal
2. A gem is somehow gobbling up the signal (we were quite thorough, but we
might've missed something)

Hopefully this helps!

Thanks,
Ryan & Mike


_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2013-05-08 18:39 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-09 21:48 Unicorn_rails ignores USR2 signal Yeung, Jeffrey
2012-03-09 22:24 ` Eric Wong
2012-03-09 22:39   ` Yeung, Jeffrey
2012-03-10  0:02     ` Eric Wong
2012-03-10  1:07       ` Yeung, Jeffrey
2012-03-10  1:30         ` Eric Wong
2012-03-12 21:21           ` Eric Wong
2012-03-12 22:39             ` Yeung, Jeffrey
2012-03-12 22:44               ` Eric Wong
2012-03-20 19:57                 ` Eric Wong
2012-03-20 23:09                   ` Yeung, Jeffrey
2012-03-21  2:27                     ` Devin Ben-Hur
2012-03-30 22:16                   ` Yeung, Jeffrey
2012-03-30 22:51                     ` Alex Sharp
2013-05-08 18:15                       ` Ryan Jones (RyanonRails)

Code repositories for project(s) associated with this public inbox

	https://yhbt.net/unicorn.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).