unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
* Issues with PID file renaming
@ 2013-11-26  1:00 Jimmy Soho
  2013-11-26  1:20 ` Eric Wong
  2013-11-26  1:42 ` Michael Fischer
  0 siblings, 2 replies; 12+ messages in thread
From: Jimmy Soho @ 2013-11-26  1:00 UTC (permalink / raw)
  To: mongrel-unicorn

Hi Eric,

Since we upgraded from 4.6.2 to 4.7.0 we're regularly having issues
where one or more unicorn master processes would not upgrade
correctly, resulting in an (old) master process.

What I notice is the following: when upgrading with 4.6.2 the file
unicorn.pid is copied to unicorn.pid.oldbin and the unicorn.pid file
is updated (the (new) master process). After a while the worker pid
files are updated and the unicorn.pid.oldbin file disappears, and all
is fine.

When upgrading with 4.7.0 though the file unicorn.pid.oldbin is
created, but there is no unicorn.pid file. After a while (when the new
master process has finished initialising, and is ready to fork the
workers) the worker pid files are updated and a unicorn.pid file
appears, and then the unicorn.pid.oldbin file disappears.

So there is a relatively long period where there is no unicorn.pid file.

I think the problem for us is caused by monit, our process monitor,
which monitors the unicorn.pid file:

check process unicorn with pidfile
/srv/app.itrp-staging.com/shared/pids/unicorn.pid
  start program = "/etc/init.d/unicorn start"
  stop program = "/etc/init.d/unicorn stop"
  ...

Because the unicorn.pid file is absent for a relatively long period
monit will try to start unicorn, while an upgrade is in progress.
(which might be another issue: the start action should recognise a
start or upgrade process is already underway; sadly this check too
relies on the existence of the unicorn.pid file)

I think the way 4.6.2 worked is better. There should be a pid file for
the new master process the moment it's created.

What do you think?

-- 
Jim
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-11-26  1:00 Issues with PID file renaming Jimmy Soho
@ 2013-11-26  1:20 ` Eric Wong
  2013-11-26  1:40   ` Michael Fischer
  2013-12-10 12:46   ` Petteri Räty
  2013-11-26  1:42 ` Michael Fischer
  1 sibling, 2 replies; 12+ messages in thread
From: Eric Wong @ 2013-11-26  1:20 UTC (permalink / raw)
  To: unicorn list

Jimmy Soho <jimmy.soho@gmail.com> wrote:
> Hi Eric,
> 
> Since we upgraded from 4.6.2 to 4.7.0 we're regularly having issues
> where one or more unicorn master processes would not upgrade
> correctly, resulting in an (old) master process.
> 
> What I notice is the following: when upgrading with 4.6.2 the file
> unicorn.pid is copied to unicorn.pid.oldbin and the unicorn.pid file
> is updated (the (new) master process). After a while the worker pid
> files are updated and the unicorn.pid.oldbin file disappears, and all
> is fine.

Ugh.  This is what I feared...  Slow startup time of most Ruby web apps
doesn't help.

> When upgrading with 4.7.0 though the file unicorn.pid.oldbin is
> created, but there is no unicorn.pid file. After a while (when the new
> master process has finished initialising, and is ready to fork the
> workers) the worker pid files are updated and a unicorn.pid file
> appears, and then the unicorn.pid.oldbin file disappears.
> 
> So there is a relatively long period where there is no unicorn.pid file.
> 
> I think the problem for us is caused by monit, our process monitor,
> which monitors the unicorn.pid file:
> 
> check process unicorn with pidfile
> /srv/app.itrp-staging.com/shared/pids/unicorn.pid
>   start program = "/etc/init.d/unicorn start"
>   stop program = "/etc/init.d/unicorn stop"
>   ...

Is there an alternate way to monitor unicorn with monit?
But I'd like to keep your case working if possible.

> Because the unicorn.pid file is absent for a relatively long period
> monit will try to start unicorn, while an upgrade is in progress.
> (which might be another issue: the start action should recognise a
> start or upgrade process is already underway; sadly this check too
> relies on the existence of the unicorn.pid file)
> 
> I think the way 4.6.2 worked is better. There should be a pid file for
> the new master process the moment it's created.

> What do you think?

How about having the old process create a hard link to .oldbin,
and having the new one override the pid if Process.ppid == pid file?
The check is still racy, but that's what pid files are :<
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-11-26  1:20 ` Eric Wong
@ 2013-11-26  1:40   ` Michael Fischer
  2013-11-26  2:07     ` Eric Wong
  2013-12-10 12:46   ` Petteri Räty
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Fischer @ 2013-11-26  1:40 UTC (permalink / raw)
  To: unicorn list

On Mon, Nov 25, 2013 at 5:20 PM, Eric Wong <normalperson@yhbt.net> wrote:

>> What I notice is the following: when upgrading with 4.6.2 the file
>> unicorn.pid is copied to unicorn.pid.oldbin and the unicorn.pid file
>> is updated (the (new) master process). After a while the worker pid
>> files are updated and the unicorn.pid.oldbin file disappears, and all
>> is fine.
>
> Ugh.  This is what I feared...  Slow startup time of most Ruby web apps
> doesn't help.

Why doesn't the new master write its pid file immediately upon
forking?  It shouldn't have to wait for everything else to load,
should it?

> How about having the old process create a hard link to .oldbin,
> and having the new one override the pid if Process.ppid == pid file?
> The check is still racy, but that's what pid files are :<

If you decide to implement this, please make it optional.  The current
behavior is correct IMO, and it will work much better if the above
issue is addressed.

--Michael
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-11-26  1:00 Issues with PID file renaming Jimmy Soho
  2013-11-26  1:20 ` Eric Wong
@ 2013-11-26  1:42 ` Michael Fischer
  2013-11-26  4:55   ` Jimmy Soho
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Fischer @ 2013-11-26  1:42 UTC (permalink / raw)
  To: unicorn list

On Mon, Nov 25, 2013 at 5:00 PM, Jimmy Soho <jimmy.soho@gmail.com> wrote:

> I think the problem for us is caused by monit, our process monitor,
> which monitors the unicorn.pid file:
>
> check process unicorn with pidfile
> /srv/app.itrp-staging.com/shared/pids/unicorn.pid
>   start program = "/etc/init.d/unicorn start"
>   stop program = "/etc/init.d/unicorn stop"
>   …

I'd suggest that you monitor Unicorn by issuing a test request to it
via its listening socket instead.  Ultimately, you're more likely
concerned about whether Unicorn is serving requests, not whether its
pid file exists.  (Such a check can also lead to false positives;
consider what might happen if an admin or the Linux OOM killer sends
it a SIGKILL, leaving the pid file intact.)

Best regards,

--Michael
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-11-26  1:40   ` Michael Fischer
@ 2013-11-26  2:07     ` Eric Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2013-11-26  2:07 UTC (permalink / raw)
  To: unicorn list

Michael Fischer <mfischer@zendesk.com> wrote:
> On Mon, Nov 25, 2013 at 5:20 PM, Eric Wong <normalperson@yhbt.net> wrote:
> 
> >> What I notice is the following: when upgrading with 4.6.2 the file
> >> unicorn.pid is copied to unicorn.pid.oldbin and the unicorn.pid file
> >> is updated (the (new) master process). After a while the worker pid
> >> files are updated and the unicorn.pid.oldbin file disappears, and all
> >> is fine.
> >
> > Ugh.  This is what I feared...  Slow startup time of most Ruby web apps
> > doesn't help.
> 
> Why doesn't the new master write its pid file immediately upon
> forking?  It shouldn't have to wait for everything else to load,
> should it?

It used to do this in 4.6, but I wanted to support dropping the PID
later in case of config failures.  I'll revert that part of the change.

> > How about having the old process create a hard link to .oldbin,
> > and having the new one override the pid if Process.ppid == pid file?
> > The check is still racy, but that's what pid files are :<
> 
> If you decide to implement this, please make it optional.  The current
> behavior is correct IMO, and it will work much better if the above
> issue is addressed.

I've decided against that :)

Just pushed this out to git://bogomips.org/unicorn
as 795c3527337ff4f03ae6db08c5df01141565ed96

---------------------------- 8< -----------------------------
Subject: [PATCH] always write PID file early for compatibility

This reduces the window for a non-existent PID for folks who monitor
PIDs (not a great idea anyways).  Unfortunately, this change also brings
us back to the case where having a PID later (for other process monitors)
is beneficial but more unicorn releases exist where we write the PID
early.

Thanks to Jimmy Soho for reporting this issue.
ref: <CAHStS5gFYcPBDxkVizAHrOeDKAkjT69kruFdgaY0CbB+vLbK8Q@mail.gmail.com>

This partially reverts 7d6ac0c17eb29a00a5b74099dbb3d4d015999f27

Folks: please monitor your app with HTTP requests rather than checking
processes, a stuck/wedged Ruby VM is still a running one.
---
 lib/unicorn/http_server.rb | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 402f133..f15c8a7 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -136,20 +136,15 @@ class Unicorn::HttpServer
     trap(:CHLD) { awaken_master }
 
     # write pid early for Mongrel compatibility if we're not inheriting sockets
-    # This was needed for compatibility with some health checker a long time
-    # ago.  This unfortunately has the side effect of clobbering valid PID
-    # files.
-    self.pid = config[:pid] unless ENV["UNICORN_FD"]
+    # This is needed for compatibility some Monit setups at least.
+    # This unfortunately has the side effect of clobbering valid PID if
+    # we upgrade and the upgrade breaks during preload_app==true && build_app!
+    self.pid = config[:pid]
 
     self.master_pid = $$
     build_app! if preload_app
     bind_new_listeners!
 
-    # Assuming preload_app==false, we drop the pid file after the app is ready
-    # to process requests.  If binding or build_app! fails with
-    # preload_app==true, we'll never get here and the parent will recover
-    self.pid = config[:pid] if ENV["UNICORN_FD"]
-
     spawn_missing_workers
     self
   end
-- 
EW

_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-11-26  1:42 ` Michael Fischer
@ 2013-11-26  4:55   ` Jimmy Soho
  0 siblings, 0 replies; 12+ messages in thread
From: Jimmy Soho @ 2013-11-26  4:55 UTC (permalink / raw)
  To: unicorn list

On Tue, Nov 26, 2013 at 12:42 PM, Michael Fischer <mfischer@zendesk.com> wrote:

> I'd suggest that you monitor Unicorn by issuing a test request to it
> via its listening socket instead.  Ultimately, you're more likely
> concerned about whether Unicorn is serving requests, not whether its
> pid file exists.  (Such a check can also lead to false positives;
> consider what might happen if an admin or the Linux OOM killer sends
> it a SIGKILL, leaving the pid file intact.)

Great suggestion, thanks. I've added that, as shown below, in case
others are interested. Though perhaps not necessary, am still checking
the PID file as well so it will start asap on absence of the pid file.

@Eric: Thanks for that change.

check process unicorn
  with pidfile /srv/app.example.com/shared/pids/unicorn.pid
  group web
  start program = "/etc/init.d/unicorn start"
  stop program = "/etc/init.d/unicorn stop"
  if mem > 300.0 MB for 1 cycles then restart
  if cpu > 50% for 2 cycles then alert
  if cpu > 80% for 3 cycles then restart
  if failed unixsocket /tmp/.unicorn_sock type tcp protocol http
    and request '/ping' hostheader 'app.example.com'
    with timeout 20 seconds for 2 cycles
    then restart


--
Jimmy
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-11-26  1:20 ` Eric Wong
  2013-11-26  1:40   ` Michael Fischer
@ 2013-12-10 12:46   ` Petteri Räty
  2013-12-10 19:52     ` Eric Wong
  1 sibling, 1 reply; 12+ messages in thread
From: Petteri Räty @ 2013-12-10 12:46 UTC (permalink / raw)
  To: mongrel-unicorn

On 26.11.2013 3.20, Eric Wong wrote:

>>
>> I think the way 4.6.2 worked is better. There should be a pid file for
>> the new master process the moment it's created.
> 
>> What do you think?
> 
> How about having the old process create a hard link to .oldbin,
> and having the new one override the pid if Process.ppid == pid file?
> The check is still racy, but that's what pid files are :<
> 

Isn't it possible to always keep a valid pid file by using the fact that
mv is atomic? Basically the new process writes the pid first to a temp
file and then moves it over the old pid file after having hard linked
the file to .oldbin?

$ echo "1" > foo.pid
$ ln foo.pid foo.oldpid
$ echo "2" > tmp
$ mv tmp foo.pid
$ cat *pid
1
2

Regards,
Petteri

_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-12-10 12:46   ` Petteri Räty
@ 2013-12-10 19:52     ` Eric Wong
  2013-12-10 22:44       ` Petteri Räty
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Wong @ 2013-12-10 19:52 UTC (permalink / raw)
  To: unicorn list

Petteri Räty <betelgeuse@gentoo.org> wrote:
> On 26.11.2013 3.20, Eric Wong wrote:
> > How about having the old process create a hard link to .oldbin,
> > and having the new one override the pid if Process.ppid == pid file?
> > The check is still racy, but that's what pid files are :<
> 
> Isn't it possible to always keep a valid pid file by using the fact that
> mv is atomic? Basically the new process writes the pid first to a temp
> file and then moves it over the old pid file after having hard linked
> the file to .oldbin?
> 
> $ echo "1" > foo.pid
> $ ln foo.pid foo.oldpid
> $ echo "2" > tmp
> $ mv tmp foo.pid
> $ cat *pid
> 1
> 2

It's possible, but is it worth it?  Having both pid files point to the
same pid is still wrong.
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-12-10 19:52     ` Eric Wong
@ 2013-12-10 22:44       ` Petteri Räty
  2013-12-11 13:54         ` Michael Fischer
  0 siblings, 1 reply; 12+ messages in thread
From: Petteri Räty @ 2013-12-10 22:44 UTC (permalink / raw)
  To: mongrel-unicorn

On 10.12.2013 21.52, Eric Wong wrote:
> Petteri Räty <betelgeuse@gentoo.org> wrote:
>> On 26.11.2013 3.20, Eric Wong wrote:
>>> How about having the old process create a hard link to .oldbin,
>>> and having the new one override the pid if Process.ppid == pid file?
>>> The check is still racy, but that's what pid files are :<
>>
>> Isn't it possible to always keep a valid pid file by using the fact that
>> mv is atomic? Basically the new process writes the pid first to a temp
>> file and then moves it over the old pid file after having hard linked
>> the file to .oldbin?
>>
>> $ echo "1" > foo.pid
>> $ ln foo.pid foo.oldpid
>> $ echo "2" > tmp
>> $ mv tmp foo.pid
>> $ cat *pid
>> 1
>> 2
> 
> It's possible, but is it worth it?  Having both pid files point to the
> same pid is still wrong.
> 

At least for pid based monitoring tools it is (I do agree with others
that you should also be monitoring http though). For example monit
requires that you give it a pid file. Why is it wrong for them to point
to the same pid? Seems ok to me especially since they share the inode.
Any way I think this is better than there being no pid file. After all
we are doing a zero downtime deploy which means there's always a valid
process.

Regards,
Petteri
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-12-10 22:44       ` Petteri Räty
@ 2013-12-11 13:54         ` Michael Fischer
  2013-12-12 19:15           ` Petteri Räty
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Fischer @ 2013-12-11 13:54 UTC (permalink / raw)
  To: unicorn list

On Tue, Dec 10, 2013 at 10:44 PM, Petteri Räty <betelgeuse@gentoo.org> wrote:

> At least for pid based monitoring tools it is (I do agree with others
> that you should also be monitoring http though). For example monit
> requires that you give it a pid file. Why is it wrong for them to point
> to the same pid?

Monit doesn't require a pid, never has:

http://mmonit.com/monit/documentation/monit.html#connection_testing

To answer your question, though, the reason the pid files must contain
different PIDs is that the two processes (previous and
current-generation masters) have different PIDs.  And some of us do
care that they differ :)

--Michael
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-12-11 13:54         ` Michael Fischer
@ 2013-12-12 19:15           ` Petteri Räty
  2013-12-12 20:51             ` Eric Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Petteri Räty @ 2013-12-12 19:15 UTC (permalink / raw)
  To: mongrel-unicorn

On 11.12.2013 15.54, Michael Fischer wrote:
> On Tue, Dec 10, 2013 at 10:44 PM, Petteri Räty <betelgeuse@gentoo.org> wrote:
> 
>> At least for pid based monitoring tools it is (I do agree with others
>> that you should also be monitoring http though). For example monit
>> requires that you give it a pid file. Why is it wrong for them to point
>> to the same pid?
> 
> Monit doesn't require a pid, never has:
> 
> http://mmonit.com/monit/documentation/monit.html#connection_testing
> 

If you only want to do connection testing. What if you want to monitor
the memory usage of unicorn processes?

> To answer your question, though, the reason the pid files must contain
> different PIDs is that the two processes (previous and
> current-generation masters) have different PIDs.  And some of us do
> care that they differ :)
> 

Eric's original comment and my response was about the pid files having
the same content (albeit shortly).

Regards,
Petteri

_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Issues with PID file renaming
  2013-12-12 19:15           ` Petteri Räty
@ 2013-12-12 20:51             ` Eric Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2013-12-12 20:51 UTC (permalink / raw)
  To: unicorn list

Petteri Räty <betelgeuse@gentoo.org> wrote:
> On 11.12.2013 15.54, Michael Fischer wrote:
> > On Tue, Dec 10, 2013 at 10:44 PM, Petteri Räty <betelgeuse@gentoo.org> wrote:
> > 
> >> At least for pid based monitoring tools it is (I do agree with others
> >> that you should also be monitoring http though). For example monit
> >> requires that you give it a pid file. Why is it wrong for them to point
> >> to the same pid?
> > 
> > Monit doesn't require a pid, never has:
> > 
> > http://mmonit.com/monit/documentation/monit.html#connection_testing
> > 
> 
> If you only want to do connection testing. What if you want to monitor
> the memory usage of unicorn processes?

Your monitoring has to adapt to the new processes anyways.  Can it
really not deal with a PID file being temporarily absent?

There's a plethora of tools in the wild which deal with PID files.
Consider this case:

It's likely a tool will see foo.pid.oldbin existing, and wait for
foo.pid to become available.  If foo.pid is already available from a
hardlink, it'll be pointing to the old PID, and any tool which backs out
of the upgrade by intending to kill the new PID will end up hitting the
old one.
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-12-12 20:51 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-26  1:00 Issues with PID file renaming Jimmy Soho
2013-11-26  1:20 ` Eric Wong
2013-11-26  1:40   ` Michael Fischer
2013-11-26  2:07     ` Eric Wong
2013-12-10 12:46   ` Petteri Räty
2013-12-10 19:52     ` Eric Wong
2013-12-10 22:44       ` Petteri Räty
2013-12-11 13:54         ` Michael Fischer
2013-12-12 19:15           ` Petteri Räty
2013-12-12 20:51             ` Eric Wong
2013-11-26  1:42 ` Michael Fischer
2013-11-26  4:55   ` Jimmy Soho

Code repositories for project(s) associated with this inbox:

	../../../unicorn.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).