Forking off the unicorn master process to create a background worker

unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help

* Forking off the unicorn master process to create a background worker
@ 2010-05-25 18:53 Russell Branca
  2010-05-26 21:05 ` Eric Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Russell Branca @ 2010-05-25 18:53 UTC (permalink / raw)
  To: mongrel-unicorn

Hello,

I'm trying to find an efficient way to create a new instance of a
rails application to perform some background tasks without having to
load up the entire rails stack every time, so I figured forking off
the master process would be a good way to go. Now I can easily just
increment the worker count and then send a web request in, but the new
worker would be part of the main worker pool, so in the time between
spawning a new worker and sending the request, another request could
have come in and snagged that worker. Is it possible to create a new
worker and not have it enter the main worker pool so I could access it
directly?

I know this is not your typical use case for unicorn, and you're
probably thinking there are much better ways to do this. However, I
currently have a rails framework that powers a handful of standalone
applications on a server with limited resources, and I'm trying to
make a centralized queue that all the applications use, so the queue
needs to be able to spawn a new worker for each of the applications
efficiently, and incrementing/decrementing worker counts in unicorn is
the most efficient way I've found to spawn a new rails instance.

Any help, suggestions or insight into this would be greatly appreciated.

-Russell
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying


* Re: Forking off the unicorn master process to create a background worker
  2010-05-25 18:53 Forking off the unicorn master process to create a background worker Russell Branca
@ 2010-05-26 21:05 ` Eric Wong
  2010-06-15 17:55   ` Russell Branca
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wong @ 2010-05-26 21:05 UTC (permalink / raw)
  To: unicorn list; +Cc: Russell Branca

Russell Branca <chewbranca@gmail.com> wrote:
> Hello,
> 
> I'm trying to find an efficient way to create a new instance of a
> rails application to perform some background tasks without having to
> load up the entire rails stack every time, so I figured forking off
> the master process would be a good way to go. Now I can easily just
> increment the worker count and then send a web request in, but the new
> worker would be part of the main worker pool, so in the time between
> spawning a new worker and sending the request, another request could
> have come in and snagged that worker. Is it possible to create a new
> worker and not have it enter the main worker pool so I could access it
> directly?

Hi Russell,

You could try having an endpoint in your webapp (with authentication, or
have it reject env['REMOTE_ADDR'] != '127.0.0.1') that runs the
background task for you.  Since it's a background app, you should
probably fork + Process.setsid + fork (or Process.daemon in 1.9), and
return an HTTP response immediately so your app can serve other
requests.

The following example should be enough to get you started (totally
untested)

------------ config.ru -------------
require 'rack/lobster'

map "/.seekrit_endpoint" do
  use Rack::ContentLength
  use Rack::ContentType, 'text/plain'
  run(lambda { |env|
    return [ 403, {}, [] ] if env['REMOTE_ADDR'] != '127.0.0.1'
    pid = fork
    if pid
      Process.waitpid(pid)

      # cheap way to avoid unintentional fd sharing with our children,
      # this causes the current Unicorn worker to exit after sending
      # the response:
      # Otherwise you'd have to be careful to disconnect+reconnect
      # databases/memcached/redis/whatever (in both the parent and
      # child) to avoid unintentional sharing that'll lead to headaches
      Process.kill(:QUIT, $$)

      [ 200, {}, [ "started background process\n" ] ]
    else
      # child, daemonize it so the unicorn master won't need to
      # reap it (that's the job of init)
      Process.setsid
      exit if fork

      begin
        # run your background code here instead of sleeping
        sleep 5
        env["rack.logger"].info "done sleeping"
      rescue => e
        env["rack.logger"].error(e.inspect)
      end
      # make sure we don't enter the normal response cycle back in the
      # worker...
      exit!(0)
    end
  })
end

map "/" do
  run Rack::Lobster.new
end
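
For experimenting outside of unicorn, the fork + Process.setsid + fork
sequence from the child branch above can be pulled into a small
standalone helper. This is only a sketch under stated assumptions:
`spawn_detached` is a made-up name (not a unicorn or Ruby API), and the
pipe exists only so the caller can learn the pid of the detached
process.

```ruby
# Hypothetical helper (not part of unicorn): run the given block in a
# process detached from the caller via fork + setsid + fork.
# Returns the pid of the detached (grandchild) process.
def spawn_detached
  rd, wr = IO.pipe
  pid = fork do
    rd.close
    Process.setsid            # start a new session, drop the controlling tty
    exit!(0) if fork          # intermediate child exits; init adopts the grandchild
    wr.write(Process.pid.to_s) # report our pid back to the original caller
    wr.close
    begin
      yield                   # the background work itself
    rescue => e
      warn "background job failed: #{e.inspect}"
    ensure
      exit!(0)                # never fall back into the caller's code path
    end
  end
  wr.close
  Process.waitpid(pid)        # reap only the short-lived intermediate child
  grandchild_pid = rd.read.to_i
  rd.close
  grandchild_pid
end
```

Something like `spawn_detached { sleep 5 }` returns almost immediately,
since the caller only waits for the intermediate child. Connections
inherited from the caller are still shared with the detached process,
so the fd-sharing caveat in the comments above applies here too.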

> I know this is not your typical use case for unicorn, and you're
> probably thinking there are much better ways to do this. However, I
> currently have a rails framework that powers a handful of standalone
> applications on a server with limited resources, and I'm trying to
> make a centralized queue that all the applications use, so the queue
> needs to be able to spawn a new worker for each of the applications
> efficiently, and incrementing/decrementing worker counts in unicorn is
> the most efficient way I've found to spawn a new rails instance.

Yeah, it's definitely an odd case and there are ways to shoot yourself
in the foot with it (especially with unintentional fd sharing), but Ruby
exposes all the Unix process management goodies better than most
languages (probably better than anything else I've used).
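
To make the fd sharing problem concrete, here is a small sketch that
uses a plain temp file instead of a database or Redis connection (that
substitution is an assumption made to keep the example self-contained).
A forked child shares the parent's open file description, so a read in
the child moves the parent's offset unless the child opens its own
descriptor first. `parent_read_after_fork` is a hypothetical name.

```ruby
require "tempfile"

# Returns what the parent reads after a forked child has read 3 bytes.
# reopen_in_child: when true, the child opens its own descriptor instead
# of reusing the one inherited from the parent.
def parent_read_after_fork(reopen_in_child)
  file = Tempfile.new("fd-sharing")
  file.write("abcdef")
  file.flush
  file.rewind
  pid = fork do
    io = reopen_in_child ? File.open(file.path) : file
    io.sysread(3)   # unbuffered read, advances the descriptor's offset
    exit!(0)        # skip finalizers so the child does not unlink the file
  end
  Process.waitpid(pid)
  file.sysread(3)
ensure
  file.close! if file
end

puts parent_read_after_fork(false) # inherited descriptor: parent sees "def"
puts parent_read_after_fork(true)  # reopened in child: parent sees "abc"
```

Sockets to databases behave in a similar way, which is why either
side has to disconnect and reconnect after the fork, or the worker has
to go away as in the example config.ru.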

> Any help, suggestions or insight into this would be greatly appreciated.

Let us know how it goes :)

-- 
Eric Wong

* Re: Forking off the unicorn master process to create a background worker
  2010-05-26 21:05 ` Eric Wong
@ 2010-06-15 17:55   ` Russell Branca
  2010-06-15 22:14     ` Eric Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Russell Branca @ 2010-06-15 17:55 UTC (permalink / raw)
  To: Eric Wong; +Cc: unicorn list

Hello Eric,


Sorry for the delayed response; between being sick and heading out of
town for a while, this project got put on the back burner. I really
appreciate your response and think it's a clean solution for what I'm
trying to do. I've started back in on getting the job queue working
this week, and will hopefully have a working solution in the next day
or two.

A little more information about what I'm doing: I'm trying to create a
centralized resque job queue server that each of the different
applications can queue work into, so I'll be using redis behind resque
for storing jobs and whatnot. That brings me to an area where I'm not
sure of the best approach. When we hit the job queue endpoint in the
rack app, it spawns the new worker and then immediately returns the
"200 OK, started background job" message, which cuts off communication
back to the job queue. My plan is to save a status message with the
result of the background task into redis, and have resque check that
to verify the task was successful. Is there a better approach for
returning the resulting status code with unicorn, or is this a
reasonable approach? Thanks again for your help.


-Russell


* Re: Forking off the unicorn master process to create a background worker
  2010-06-15 17:55   ` Russell Branca
@ 2010-06-15 22:14     ` Eric Wong
  2010-06-15 22:51       ` Russell Branca
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wong @ 2010-06-15 22:14 UTC (permalink / raw)
  To: Russell Branca; +Cc: unicorn list

Russell Branca <chewbranca@gmail.com> wrote:
> Hello Eric,
> 
> Sorry for the delayed response; between being sick and heading out of
> town for a while, this project got put on the back burner. I really
> appreciate your response and think it's a clean solution for what I'm
> trying to do. I've started back in on getting the job queue working
> this week, and will hopefully have a working solution in the next day
> or two.
>
> A little more information about what I'm doing: I'm trying to create a
> centralized resque job queue server that each of the different
> applications can queue work into, so I'll be using redis behind resque
> for storing jobs and whatnot. That brings me to an area where I'm not
> sure of the best approach. When we hit the job queue endpoint in the
> rack app, it spawns the new worker and then immediately returns the
> "200 OK, started background job" message, which cuts off communication
> back to the job queue. My plan is to save a status message with the
> result of the background task into redis, and have resque check that
> to verify the task was successful. Is there a better approach for
> returning the resulting status code with unicorn, or is this a
> reasonable approach? Thanks again for your help.

Hi Russell, please don't top post, thanks.

If you already have a queue server (and presumably a standalone app
processing the queue), I would probably forgo the background Unicorn
worker entirely.

Based on my ancient (mid-2000s) knowledge of user-facing web
applications: the application should queue the job, return 200, and have
HTML meta refresh to constantly reload the page every few seconds.

Hitting the reload endpoint would check the database (Redis in this
case) for completion, and return a new HTML page to stop the meta
refresh loop.
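
A rough sketch of such a reload endpoint as a bare Rack app follows.
Everything named here is an assumption for illustration: `STATUS` is an
in-process Hash standing in for Redis (with several unicorn workers a
store shared between processes would be needed), and the `job=` query
parameter and the 5 second interval are arbitrary.

```ruby
# STATUS stands in for Redis here (assumption); it maps job id => state.
STATUS = {}

# Hypothetical status endpoint: keeps the meta refresh going until the
# job is marked "done", then returns a page without it.
STATUS_APP = lambda do |env|
  job_id = env["QUERY_STRING"].to_s[/\Ajob=(\w+)\z/, 1]
  state  = STATUS[job_id]
  unless state
    return [404, { "content-type" => "text/plain" }, ["unknown job\n"]]
  end

  # Only pages for unfinished jobs ask the browser to reload.
  refresh = state == "done" ? "" : %(<meta http-equiv="refresh" content="5">)
  body = "<html><head>#{refresh}</head>" \
         "<body>job #{job_id}: #{state}</body></html>\n"
  [200, { "content-type" => "text/html" }, [body]]
end
```

In a config.ru this could be mounted with something like
`map("/status") { run STATUS_APP }`, with the background job writing
"done" into the shared store when it finishes.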

This means you're no longer keeping a single Unicorn worker idle and
wasting it.  Nowadays you could do it with long-polling on
Rainbows!/Thin/Zbatery, too, but long-polling is less reliable for
people switching between WiFi access points.  The meta refresh method
can be a waste of power/bandwidth on the client side if the background
job takes a long time, though.

I'm not familiar at all with Resque or Redis, but I suspect other folks
on this mailing list should be able to help you flesh out the details.

-- 
Eric Wong

* Re: Forking off the unicorn master process to create a background worker
  2010-06-15 22:14     ` Eric Wong
@ 2010-06-15 22:51       ` Russell Branca
  2010-06-16  0:06         ` Eric Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Russell Branca @ 2010-06-15 22:51 UTC (permalink / raw)
  To: Eric Wong; +Cc: unicorn list


Hi Eric,

I have a queue server, but I don't have a standalone app processing
the jobs, because I have a large number of standalone applications on
a single server. Right now I've got 12 separate apps running, so if I
wanted to have a standalone app for each, that would be 12 additional
applications in memory for handling background jobs. The whole reason
I want to go with the unicorn worker approach for handling background
jobs is so I can fork off the master process as needed, avoid the
spawning time for a normal rails instance, and only use workers as
needed. This way I can have just a few workers running at any given
time, rather than 1 worker for each app. The number of apps is only
going to increase, but I want to keep the worker pool constant. I'll
probably just record completion status in redis. These jobs won't
be run by users; this is all background stuff like sending
notifications, data analysis, feed parsing, etc., so I'm planning
on just having resque initiate a request directly, and then use
unicorn to process the task in the background.

I didn't exactly follow what you meant when you were talking about a
unicorn worker being idle. From the example config.ru you responded
with earlier, it looks like I can just spawn a new worker that will
be outside of the normal worker pool to handle the job. I'm pretty
sure this will work. I was curious about the best approach for
returning completion status, but I think just having the worker record
its status and exit is better than having long-polling connections
open between the job queue and the new unicorn worker.

-Russell

* Re: Forking off the unicorn master process to create a background worker
  2010-06-15 22:51       ` Russell Branca
@ 2010-06-16  0:06         ` Eric Wong
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2010-06-16  0:06 UTC (permalink / raw)
  To: Russell Branca; +Cc: unicorn list

Russell Branca <chewbranca@gmail.com> wrote:
> Hi Eric,
> 
> I have a queue server, but I don't have a standalone app processing
> the jobs, because I have a large number of standalone applications on
> a single server. Right now I've got 12 separate apps running, so if I
> wanted to have a standalone app for each, that would be 12 additional
> applications in memory for handling background jobs. The whole reason
> I want to go with the unicorn worker approach for handling background
> jobs is so I can fork off the master process as needed, avoid the
> spawning time for a normal rails instance, and only use workers as
> needed. This way I can have just a few workers running at any given
> time, rather than 1 worker for each app. The number of apps is only
> going to increase, but I want to keep the worker pool constant. I'll
> probably just record completion status in redis. These jobs won't
> be run by users; this is all background stuff like sending
> notifications, data analysis, feed parsing, etc., so I'm planning
> on just having resque initiate a request directly, and then use
> unicorn to process the task in the background.

Ah, so I guess it's a single queue server but multiple queues?  I
guess that's where I got confused with your description.

> I didn't exactly follow what you meant when you were talking about a
> unicorn worker being idle. From the example config.ru you responded
> with earlier, it looks like I can just spawn a new worker that will
> be outside of the normal worker pool to handle the job. I'm pretty
> sure this will work. I was curious about the best approach for
> returning completion status, but I think just having the worker record
> its status and exit is better than having long-polling connections
> open between the job queue and the new unicorn worker.

Yes, having the fork as I made in the example should work.  I haven't
tested it, of course :)  My instincts tell me recording the status and
exiting ASAP is better because it uses less memory.
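
As a sketch of "record the status and exit", the helper below runs a
job in a forked child, writes the outcome to a status file, and exits
without returning into the caller's code. The assumptions are loud
ones: `run_and_record` is a made-up name, the JSON file stands in for a
Redis key, and `Process.detach` is used for brevity in place of the
double fork from the earlier config.ru example.

```ruby
require "json"

# Hypothetical helper: run the block in a forked child, record the
# outcome in status_path (a stand-in for a Redis key), then exit.
# Returns the child's pid to the caller right away.
def run_and_record(status_path)
  pid = fork do
    status = begin
      yield
      { "state" => "ok" }
    rescue => e
      { "state" => "failed", "error" => e.message }
    end
    # Write to a temp name and rename so readers never see a partial file.
    tmp = "#{status_path}.#{Process.pid}.tmp"
    File.write(tmp, JSON.generate(status))
    File.rename(tmp, status_path)
    exit!(0)            # exit as soon as the status is recorded
  end
  Process.detach(pid)   # reap the child in the background, no zombie left
  pid
end
```

The job queue side would then only need to read the recorded status
later, rather than hold a connection open while the job runs.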

You should test and experiment with it either way.  You know your apps,
requirements, and Redis/Resque far better than I do :)  Consider
software an evolutionary process: whatever the "best approach" may be
today, another one can usurp it eventually, and it may be completely
wrong in a slightly different setting :)

-- 
Eric Wong

Code repositories for project(s) associated with this public inbox

	https://yhbt.net/unicorn.git/
