unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
* workers not utilizing multiple CPUs
@ 2011-05-31  9:02 Nate Clark
  2011-05-31 12:10 ` Lawrence Pit
  2011-05-31 15:27 ` Eric Wong
  0 siblings, 2 replies; 8+ messages in thread
From: Nate Clark @ 2011-05-31  9:02 UTC (permalink / raw)
  To: mongrel-unicorn

We're using Unicorn to serve a Rails app on a few app servers built on
Amazon EC2 instances. Each of the xlarge EC2 instances has the
equivalent of 8 CPUs, but it seems like our Unicorn master and 8
workers are only utilizing the first CPU. We've been watching the CPU
graphs from collectd data when the website is under load, and only
cpu-0 shows any activity ... the others seem to be idle, or minimally
used by other services.

I had assumed that the OS would automatically allocate the Unicorn
worker processes to use multiple CPUs, but now I'm not sure. I
couldn't find anything about this in the Unicorn docs (except for the
mention of the worker_processes configuration, which seems to imply
that multiple CPUs would be used). Is there something that I'm not
doing?

Our EC2 instances are running Ubuntu 10.04 LTS with Linux kernel 2.6.32.

Thanks in advance for any insights or suggestions.

Nate Clark
Pivotal Labs Singapore
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying



* Re: workers not utilizing multiple CPUs
  2011-05-31  9:02 workers not utilizing multiple CPUs Nate Clark
@ 2011-05-31 12:10 ` Lawrence Pit
  2011-05-31 12:20   ` Nate Clark
  2011-05-31 15:27 ` Eric Wong
  1 sibling, 1 reply; 8+ messages in thread
From: Lawrence Pit @ 2011-05-31 12:10 UTC (permalink / raw)
  To: unicorn list; +Cc: Nate Clark

Hi Nate,
> We've been watching the CPU
> graphs from collectd data when the website is under load, and only
> cpu-0 shows any activity ... the others seem to be idle, or minimally
> used by other services.

I don't think you can rely on the numbers collectd (or top) gives you
when measuring from within the hypervisor powering your EC2 instance. 
The only reliable source of CPU utilization is Cloudwatch, as that 
measures outside your instances.

I've used an array of xlarge instances myself, each running 17 unicorn 
workers serving a rails app, consuming 4GB, leaving 3GB, no swap. Worked 
well for us under high load. It couldn't have handled that if all 17 
unicorn workers would've been served by 1 of those 8 virtual cores.


Cheers,
Lawrence


* Re: workers not utilizing multiple CPUs
  2011-05-31 12:10 ` Lawrence Pit
@ 2011-05-31 12:20   ` Nate Clark
  2011-05-31 14:07     ` Clifton King
  0 siblings, 1 reply; 8+ messages in thread
From: Nate Clark @ 2011-05-31 12:20 UTC (permalink / raw)
  To: Lawrence Pit; +Cc: unicorn list

Lawrence,

I've suspected that it may be a monitoring problem and not a Unicorn
problem, but I'm not yet convinced either way. Our monitoring via
collectd is done through Rightscale. They have a lot of experience
with EC2, so I'd assume that it is monitoring properly. Also, our
other services (mysql, for example) are showing activity on multiple
cores under load, so that leads me to believe that the monitoring is
working in at least some cases.

I wasn't aware of the Cloudwatch service until now, that looks
interesting ... I'll check it out.

Anyone else experience a problem like this?

Nate


* Re: workers not utilizing multiple CPUs
  2011-05-31 12:20   ` Nate Clark
@ 2011-05-31 14:07     ` Clifton King
  2011-05-31 15:48       ` Eric Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Clifton King @ 2011-05-31 14:07 UTC (permalink / raw)
  To: unicorn list; +Cc: unicorn list, Lawrence Pit

We experience the same problem. I believe the problem has more to do
with the kernel CPU scheduler than anything else. If you figure out a
reliable way to spread the load, I'd like to hear it.

Clifton King
Development
clifton@orgsync.com
512-940-7744

Sent from my phone. 


* Re: workers not utilizing multiple CPUs
  2011-05-31  9:02 workers not utilizing multiple CPUs Nate Clark
  2011-05-31 12:10 ` Lawrence Pit
@ 2011-05-31 15:27 ` Eric Wong
  1 sibling, 0 replies; 8+ messages in thread
From: Eric Wong @ 2011-05-31 15:27 UTC (permalink / raw)
  To: unicorn list

Nate Clark <nate@pivotallabs.com> wrote:
> We're using Unicorn to serve a Rails app on a few app servers built on
> Amazon EC2 instances. Each of the xlarge EC2 instances has the
> equivalent of 8 CPUs, but it seems like our Unicorn master and 8
> workers are only utilizing the first CPU. We've been watching the CPU
> graphs from collectd data when the website is under load, and only
> cpu-0 shows any activity ... the others seem to be idle, or minimally
> used by other services.

What is your request rate and average response time for the application?

If requests come in more quickly than one worker can respond, /then/ the
kernel may start using more workers.  However, it looks like your
application is just responding faster and can keep up with requests
coming in.
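
As a rough way to reason about the question above: by Little's law, the
average number of requests in service equals the request rate multiplied
by the average response time. The sketch below is only arithmetic; the
function names and the example figures are hypothetical and are not
measurements from this thread.

```python
def average_busy_workers(requests_per_second, avg_response_seconds):
    """Little's law: mean requests in service = arrival rate * mean service time."""
    return requests_per_second * avg_response_seconds


def fraction_of_pool_busy(requests_per_second, avg_response_seconds, workers):
    """Average share of the worker pool that is busy, capped at 1.0 (saturated)."""
    busy = average_busy_workers(requests_per_second, avg_response_seconds)
    return min(1.0, busy / workers)


if __name__ == "__main__":
    # Hypothetical load: 20 requests/s at 40 ms each, with 8 workers.
    print(average_busy_workers(20, 0.040))
    print(fraction_of_pool_busy(20, 0.040, 8))
```

If the first number stays below 1, a single worker can on average keep
up, so seeing activity on only one CPU would not be surprising.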

> I had assumed that the OS would automatically allocate the Unicorn
> worker processes to use multiple CPUs, but now I'm not sure.

The kernel does all the work for balancing.
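
One way to see where the kernel has placed each worker, independent of
any graphing tool, is to read the "processor" field (field 39, the CPU
the task last ran on) from /proc/<pid>/stat on Linux. This is a minimal
sketch; the helper names are made up for illustration, and the list of
worker PIDs has to come from somewhere else (ps, a pid file, and so on).

```python
def last_cpu_from_stat(stat_line):
    """Return the CPU number a task last ran on, from a /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last ')' first.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state), so field 39 sits at index 39 - 3 = 36.
    return int(after_comm[36])


def last_cpu_of_pid(pid):
    """Read /proc/<pid>/stat (Linux only) and return the last-used CPU."""
    with open("/proc/%d/stat" % pid) as stat_file:
        return last_cpu_from_stat(stat_file.read())
```

Sampling this for every worker a few times under load should show
whether the workers really are all running on the same CPU.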

-- 
Eric Wong

* Re: workers not utilizing multiple CPUs
  2011-05-31 14:07     ` Clifton King
@ 2011-05-31 15:48       ` Eric Wong
  2011-05-31 15:55         ` Clifton King
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Wong @ 2011-05-31 15:48 UTC (permalink / raw)
  To: unicorn list

Clifton King <cliftonk@gmail.com> wrote:
> We experience the same problem. I believe the problem has more to do
> with the kernel CPU scheduler than anything else. If you figure out a
> reliable way to spread the load, I'd like to hear it.

Load not being spread is /not/ a problem unless there are requests that
get stuck in the listen queue.

If no requests are actually stuck in the queue (light load), the kernel
is right to put requests into the most recently used worker since it can
get better CPU cache behavior this way.
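
To check whether requests are actually waiting in the listen queue for
a TCP listener, one option on Linux is to parse /proc/net/tcp. The
sketch below assumes (as I understand the Linux behaviour) that for
sockets in the LISTEN state the rx_queue column reports the number of
connections waiting to be accepted. It covers IPv4 TCP listeners only,
not UNIX domain sockets, and the function name is hypothetical.

```python
TCP_LISTEN = "0A"  # hex code for the LISTEN state in /proc/net/tcp


def listen_queue_depths(proc_net_tcp_text):
    """Map local port -> queued connections for each listening IPv4 TCP socket."""
    depths = {}
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 5 or fields[3] != TCP_LISTEN:
            continue
        # local_address looks like "0100007F:1F90" (hex IP : hex port)
        port = int(fields[1].rsplit(":", 1)[1], 16)
        # fields[4] is "tx_queue:rx_queue", both in hex
        rx_queue = int(fields[4].split(":")[1], 16)
        depths[port] = rx_queue
    return depths


if __name__ == "__main__":
    with open("/proc/net/tcp") as proc_file:
        for port, queued in sorted(listen_queue_depths(proc_file.read()).items()):
            print("port %d: %d queued" % (port, queued))
```

A depth that stays at or near zero for the application's port suggests
the workers are keeping up, whichever CPUs they happen to run on.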


== The real problem

Under high loads (many cores, fast responses), Unicorn currently uses
more resources because of non-blocking accept() + select().  This isn't
a noticeable problem for most machines (1-16 cores).

Future versions of Unicorn may take advantage of /blocking/ accept()
optimizations under Linux.  Rainbows! already lets you take advantage
of this behavior if you meet the following requirements:

* Ruby 1.9.x under Linux
* only one listen socket (if worker_connections == 1 under Rainbows!)
* use ThreadPool|XEpollThreadPool|XEpollThreadSpawn|XEpoll

I haven't had a chance to benchmark any of this on very big machines so
I have no idea how well it actually works compared to Unicorn, only how
well it works in theory :)


Blocking accept() under Ruby 1.9.x + Linux should distribute load evenly
across workers in all situations, even in the non-busy cases where load
distribution doesn't matter (your case :).

[1] - http://rainbows.rubyforge.org/Rainbows/XEpollThreadPool.html
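
To illustrate the blocking accept() model described above, here is a
minimal pre-forking sketch in Python. It is not Unicorn or Rainbows!
code: the function name, the tiny "reply with my pid" protocol and the
default worker and request counts are all made up for illustration. It
assumes a Unix-like system with os.fork(). How connections end up
distributed across workers depends on the kernel, so the sketch only
counts them and makes no claim about the split.

```python
import os
import signal
import socket
from collections import Counter


def run_prefork_demo(num_workers=4, num_requests=40):
    """Fork workers that block in accept() on one shared listen socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(128)
    address = listener.getsockname()

    worker_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Worker: sleep in a blocking accept(); the kernel decides
            # which sleeping worker is woken for each new connection.
            try:
                while True:
                    conn, _peer = listener.accept()
                    with conn:
                        conn.sendall(str(os.getpid()).encode())
            finally:
                os._exit(0)
        worker_pids.append(pid)

    served_by = Counter()
    try:
        for _ in range(num_requests):
            with socket.create_connection(address, timeout=5) as client:
                chunks = []
                while True:
                    data = client.recv(64)
                    if not data:
                        break
                    chunks.append(data)
                served_by[int(b"".join(chunks))] += 1
    finally:
        # Stop and reap the workers so none are left behind.
        for pid in worker_pids:
            os.kill(pid, signal.SIGTERM)
            os.waitpid(pid, 0)
        listener.close()
    return served_by, worker_pids
```

Calling run_prefork_demo() and printing the returned counter shows how
many of the sequential test connections each worker pid answered on a
given machine.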

-- 
Eric Wong

* Re: workers not utilizing multiple CPUs
  2011-05-31 15:48       ` Eric Wong
@ 2011-05-31 15:55         ` Clifton King
       [not found]           ` <BANLkTikC5D+0pUDRDgvQbzu=dpwmdKNY=A@mail.gmail.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Clifton King @ 2011-05-31 15:55 UTC (permalink / raw)
  To: unicorn list

Thanks Eric, I had expected that to be the case (we are under light
load as of now).


* Re: workers not utilizing multiple CPUs
       [not found]           ` <BANLkTikC5D+0pUDRDgvQbzu=dpwmdKNY=A@mail.gmail.com>
@ 2011-06-01  6:51             ` Nate Clark
  0 siblings, 0 replies; 8+ messages in thread
From: Nate Clark @ 2011-06-01  6:51 UTC (permalink / raw)
  To: unicorn list

Thanks for the responses, all.

Eric, you were right, our load was not enough. We had just started on
load testing our app, and I think we started with too many app servers
and not enough load. Once we cranked up the load and used fewer
instances, we're now definitely seeing all CPU cores being utilized. I
was not aware that the kernel would optimize like you described.

Once we did start seeing heavier load, our collectd data and htop were
reporting usage on the virtual cores correctly.

Thanks again, very happy with the results so far,

Nate


Code repositories for project(s) associated with this public inbox

	https://yhbt.net/unicorn.git/
