Rainbows! Rack HTTP server user/dev discussion
* EventMachine with thread pool and thread spawn
From: Lin Jen-Shin (godfat) @ 2013-08-23 21:22 UTC
  To: Rainbows! list

Hi,

So instead of trying to write a complete patch for Rainbows,
I decided to go with a gem first and see how it goes.
Here it is: https://github.com/godfat/rainbows-emtp
(It uses EM.defer internally; we could change that later.)

This way, it might be easier for me to "rebase" on master.
However, I guess it might be time to give up this approach.
All tests pass except for t0106-rack-input-keepalive.sh,
and I can't find a good way to make it pass.

The key might be that pause/resume don't seem to work in
EventMachine, and using a tempfile to buffer the request
might not be realistic.

Another way would be... to simply mark this model as not
suitable for large-chunk pipelined requests. At least it seems
to be working fine on our production site.

What do you think?

Thanks for listening.

(Maybe it's really time to move on to celluloid-io; I'm not
sure I'll get the chance to work on it and finish it, though.)
_______________________________________________
Rainbows! mailing list - rainbows-talk-GrnCvJ7WPxnNLxjTenLetw@public.gmane.org
http://rubyforge.org/mailman/listinfo/rainbows-talk
Do not quote signatures (like this one) or top post when replying



* Re: EventMachine with thread pool and thread spawn
From: Eric Wong @ 2013-08-23 22:51 UTC
  To: Rainbows! list

"Lin Jen-Shin (godfat)" <godfat-hOE/xeEBYYIdnm+yROfE0A@public.gmane.org> wrote:
> This way, it might be easier for me to "rebase" on master.
> However, I guess it might be time to give up this approach.
> All tests passed except for t0106-rack-input-keepalive.sh,
> but I don't find a good way to make it pass.

> The key might be that pause/resume don't seem to work in
> EventMachine? And using tempfile to buffer the request
> might not be realistic.

Yeah, t0106 is a tough one given the EM interface.

Btw, has anybody sent a patch to the EM guys to allow this?
(I don't do C++)

> Another way would be... simply mark this model as not
> suitable for large chunk pipelined requests. At least it seems
> working fine on our production site.
> 
> What do you think?

That's probably fine.

> Thanks for all your listening.
> 
> (Maybe it's really time to move forward to celluloid-io,
> not sure if I would get the chance to work on and finish it though..)

Fwiw, my personal take these days is to avoid EM/libev*-style wrapper
libraries unless they expose (or only have :P) a oneshot interface
(EPOLLONESHOT/EV_ONESHOT).  This is *especially* the case (for me) when
mixing epoll/kqueue with threads.

Standard edge-triggered and level-triggered interfaces are both too
confusing to me.  Maybe it's just me, though :x

* Re: EventMachine with thread pool and thread spawn
From: Lin Jen-Shin (godfat) @ 2013-08-25 12:34 UTC
  To: Rainbows! list

On Sat, Aug 24, 2013 at 6:51 AM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> "Lin Jen-Shin (godfat)" <godfat-hOE/xeEBYYIdnm+yROfE0A@public.gmane.org> wrote:
>> This way, it might be easier for me to "rebase" on master.
>> However, I guess it might be time to give up this approach.
>> All tests passed except for t0106-rack-input-keepalive.sh,
>> but I don't find a good way to make it pass.
>
>> The key might be that pause/resume don't seem to work in
>> EventMachine? And using tempfile to buffer the request
>> might not be realistic.
>
> Yeah, t0106 is a tough one given the EM interface.
>
> Btw, has anybody sent a patch to the EM guys to allow this?
> (I don't do C++)

I just searched on GitHub and didn't see a relevant patch.
There are some related tickets, and it looks like pause/resume
should just work. I guess there might be a bug somewhere.

I know some C++, but little about I/O and systems
programming. A naive patch that returns early in the event
dispatcher when it detects that the connection has been paused
didn't make t0106 pass.

Since pause/resume in EventMachine is implemented as merely
a flag indicating whether the connection is paused, I believe
this kind of implementation could be quite brittle, as every code
path would need to check this flag to implement pause/resume
properly... I wouldn't be too surprised if it didn't work well.

EM is also not actively maintained. A bunch of patches have
gone unreviewed (neither merged nor rejected) and have just sat
there for several years... This also discourages me from putting
more effort into it.

>> Another way would be... simply mark this model as not
>> suitable for large chunk pipelined requests. At least it seems
>> working fine on our production site.
>>
>> What do you think?
>
> That's probably fine.

Great! Then I'll try to make some patches for this.

>> Thanks for all your listening.
>>
>> (Maybe it's really time to move forward to celluloid-io,
>> not sure if I would get the chance to work on and finish it though..)
>
> Fwiw, nowadays my personal take these days is to avoid EM/libev*-style
> wrapper library unless they expose (or only have :P) a oneshot interface
> (EPOLLONESHOT/EV_ONESHOT).  This is *especially* the case (for me) when
> mixing epoll/kqueue with threads.
>
> Standard event-triggered and level-triggered interfaces are both too
> confusing to me.  Maybe it's just me, though :x

Since I don't know much about I/O, I don't know what they are :x
I wonder if I should just go with XEpollThreadPool then? My concern
is that, since Heroku does not buffer the entire request and response,
we need something that does this for us. I'm not sure whether
XEpollThreadPool would be sufficient.

Another issue is that most of us develop on a Mac, so Mac support
would be desirable. Because of this I looked into sleepy_penguin,
realizing that it now has kqueue support. I just sent a patch to
its mailing list, trying to make it work on Mac. At least all
tests passed on my computer with that patch.

* Re: EventMachine with thread pool and thread spawn
From: Eric Wong @ 2013-08-25 21:57 UTC
  To: Rainbows! list

"Lin Jen-Shin (godfat)" <godfat-hOE/xeEBYYIdnm+yROfE0A@public.gmane.org> wrote:
> On Sat, Aug 24, 2013 at 6:51 AM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> > "Lin Jen-Shin (godfat)" <godfat-hOE/xeEBYYIdnm+yROfE0A@public.gmane.org> wrote:
 
> EM is also not actively maintained. A bunch of patches were
> not reviewed (or rejected) and just sit there for several years...
> This also discourages me putting more effort on it.

Yeah.  I get that feeling, too.  (And also what I've said about being
hooked on ONESHOT nowadays :)

> > Fwiw, nowadays my personal take these days is to avoid EM/libev*-style
> > wrapper library unless they expose (or only have :P) a oneshot interface
> > (EPOLLONESHOT/EV_ONESHOT).  This is *especially* the case (for me) when
> > mixing epoll/kqueue with threads.
> >
> > Standard event-triggered and level-triggered interfaces are both too
> > confusing to me.  Maybe it's just me, though :x
> 
> Since I don't know much about I/O, I don't know what they are :x

kqueue is a queue, it's in the name.  I haven't looked too hard at the
internal details, but the name says much about it :)

I've studied epoll internals a bit and know that epoll is also a queue,
it's just not in the name.

Oneshot behavior basically makes epoll/kqueue behave like a traditional
queue; once you pull a file/socket off the queue, it won't magically
reappear unless you ask epoll/kqueue to watch for it again.
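
A minimal sketch of the oneshot behaviour described above, assuming Linux
and Python's standard `select.epoll` rather than Ruby or sleepy_penguin.
The helper name `ready_counts` is made up for illustration. It polls one
readable socket three times without reading it, then re-arms the
registration and polls once more:

```python
import select
import socket


def ready_counts(oneshot):
    """Return how many events each of four polls reports for one socket."""
    a, b = socket.socketpair()
    ep = select.epoll()
    flags = select.EPOLLIN | (select.EPOLLONESHOT if oneshot else 0)
    ep.register(a.fileno(), flags)

    b.send(b"x")  # make `a` readable; the data is deliberately never read

    # Three polls in a row without touching the socket.
    counts = [len(ep.poll(0.1)) for _ in range(3)]

    # Ask epoll to watch the socket again (the "re-arm" step for oneshot).
    ep.modify(a.fileno(), flags)
    counts.append(len(ep.poll(0.1)))

    ep.close()
    a.close()
    b.close()
    return counts


if __name__ == "__main__":
    print("level-triggered:", ready_counts(oneshot=False))
    print("oneshot:        ", ready_counts(oneshot=True))
```

With level-triggered registration the unread socket should be reported on
every poll; with EPOLLONESHOT it should be reported once, stay silent, and
only come back after the explicit `modify()` call.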

This is significant for multithreaded servers, because it allows the
kernel to handle synchronization (just like accept()).

Level/edge-triggered epoll/kqueue is kind of like a tracklist in a music
player with repeat=on + random=on.  It's fine if you're playing one
track at a time (like a single-threaded HTTP server), but if you try to
play multiple tracks in different threads, you could end up playing the
_same_ track in two or more threads.

A multi-threaded HTTP server must implement its own synchronization
on top of ET/LT epoll/kqueue to prevent serving the same client
from multiple threads.
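
The duplicate-dispatch problem can be sketched the same way. This is a
hedged illustration only, again assuming Linux and Python's `select.epoll`;
`threads_notified` is a hypothetical helper, not anything from Rainbows!.
Several threads poll one shared epoll object while a single client socket
has unread data, and the function counts how many threads were handed that
client:

```python
import select
import socket
import threading


def threads_notified(oneshot, n_threads=2):
    """Count how many polling threads receive the same readable socket."""
    a, b = socket.socketpair()
    ep = select.epoll()
    flags = select.EPOLLIN | (select.EPOLLONESHOT if oneshot else 0)
    ep.register(a.fileno(), flags)
    b.send(b"request")  # one pending "client request", never read here

    hits = []
    lock = threading.Lock()

    def worker():
        # Each worker pulls from the shared epoll set, as a thread pool would.
        if ep.poll(0.2):
            with lock:
                hits.append(threading.get_ident())

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    ep.close()
    a.close()
    b.close()
    return len(hits)


if __name__ == "__main__":
    print("level-triggered:", threads_notified(oneshot=False))
    print("oneshot:        ", threads_notified(oneshot=True))
```

Without oneshot, every worker sees the same client, so the server would
need its own locking to avoid serving it twice. With EPOLLONESHOT the
kernel hands the client to exactly one worker.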

> I wonder if I should just go with XEpollThreadPool then? My concern
> is that as Heroku does not buffer the entire request and response,
> we need something which would do this for us. Not sure if XEpollThreadPool
> would be sufficient?

Using XEpollThread* is probably a good choice, but neither of them
buffers (you get more I/O concurrency anyway, so maybe it's not so bad).

Plain XEpoll does do full buffering, and you can probably add a
TryDefer-like middleware on top of that, even...

Anyways, it's still infuriating to me that Heroku cannot do full
buffering and still claims to "support" unicorn.

> Another issue is that most of us develop on a Mac, so Mac support
> would be desirable.

I can't officially support non-Free systems, but I can accept
non-intrusive patches.  I'm not nearly as experienced with kqueue,
so maybe you really found a bug in my kqueue API.

Anyways, I've considered unifying a X{Epoll,Kqueue}Thread* interface,
which is why I added kqueue support to sleepy_penguin before I forgot
about it :x

> Because of this I looked into sleepy_penguin,
> realizing that it now has kqueue support. I just sent a patch to
> its mailing list, trying to make it work on Mac. At least all
> tests passed on my computer with that patch.

I didn't get it on the list and can't find it in the archives, either.
http://librelist.com/browser/sleepy.penguin/
Are you sure it went out?  Feel free to Bcc me or send it here.

(Librelist doesn't like plain Cc :<)

It could be a librelist problem, too, but I just sent out a message
considering a move away from librelist and it went through fine:
http://mid.gmane.org/20130825212946.GA32430-yBiyF41qdooeIZ0/mPfg9Q@public.gmane.org

* Re: EventMachine with thread pool and thread spawn
From: Lin Jen-Shin (godfat) @ 2013-09-05 19:59 UTC
  To: Rainbows! list

On Mon, Aug 26, 2013 at 5:57 AM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> "Lin Jen-Shin (godfat)" <godfat-hOE/xeEBYYIdnm+yROfE0A@public.gmane.org> wrote:
[...]
> kqueue is a queue, it's in the name.  I haven't looked too hard at the
> internal details, but the name says much about it :)
>
> I've studied epoll internals a bit and know that epoll is also a queue,
> it's just not in the name.
>
> Oneshot behavior basically makes epoll/kqueue behave like a traditional
> queue; once you pull a file/socket off the queue, it won't magically
> reappear unless you ask epoll/kqueue to watch for it again.
>
> This is significant for multithreaded servers, because it allows the
> kernel to handle synchronization (just like accept()).
>
> Level/edge-triggered epoll/kqueue is kind of like a tracklist in a music
> player with repeat=on + random=on.  It's fine if you're playing one
> track-a-time (like a single-threaded HTTP server), but if you try to
> play multiple tracks in different threads, you could end up playing the
> _same_ track in two or more threads.
>
> A multi-threaded HTTP server must implement its own synchronization
> on top of ET/LT epoll/kqueue to prevent serving the same client
> from multiple threads.

I searched and read a bit. I am not sure I understand correctly,
but I have a feeling that neither LT nor ET epoll is designed to
be used in a multithreaded server (i.e. handling I/O in epoll and
dispatching tasks to threaded handlers), or at least that it would be
quite hard to get right.

That is, oneshot would be the way to go in this scenario?
I also heard that libev is merely a thin wrapper around epoll...

>> I wonder if I should just go with XEpollThreadPool then? My concern
>> is that as Heroku does not buffer the entire request and response,
>> we need something which would do this for us. Not sure if XEpollThreadPool
>> would be sufficient?
>
> Probably using XEpollThread* is a good choice, but neither buffer (but
> you get more I/O concurrency anyways, so maybe it's not so bad)

After benchmarking against Heroku, I feel it does have some kind of
buffering. It's hard to tell, though... but yeah, maybe we don't really
need full buffering at the application server level.

> Plain XEpoll does do full buffering, and you can probably add a
> TryDefer-like middleware on top of that, even...

Could you please explain to me why there's a difference?
Is it because it might not make too much sense to buffer in
XEpollThread*, or implementation-wise, it's easier to do it this way?

> Anyways, it's still infuriating to me that Heroku cannot do full
> buffering and still claims to "support" unicorn.

They do provide a way to customize the deployment, so we could
build nginx while deploying and run it for each unicorn process.
They do not officially support this, but it could work... things might
break easily if they changed something, though.

The reason they don't fully buffer is latency concerns, as far as
I know. But I feel that most web apps don't really do realtime
stuff.

And the most frustrating thing to me is that my co-workers don't really
trust me. They still prefer unicorn because Heroku officially recommends
it, even though I showed them benchmarks and some people at Heroku
admitted it could be an issue.

I am so tired of this.

> I can't officially support non-Free systems, but I can accept
> non-intrusive patches.  I'm not nearly as experienced with kqueue,
> so maybe you really found a bug in my kqueue API.
>
> Anyways, I've considered unifying a X{Epoll,Kqueue}Thread* interface,
> which is why I added kqueue support to sleepy_penguin before I forgot
> about it :x

Thanks for reading my patch. I hope one day I can switch to Linux
for daily use as well :) I'll look into how clogger does the fallback.
I guess the idea is similar to duck "platforming" :P
This would also have the advantage that maybe one day the
platform will catch up and provide the same functions...

> I didn't get it on the list and can't find it on the archives, either.
> http://librelist.com/browser/sleepy.penguin/
> Are you sure it went out?  Feel free to Bcc me or send it here.
>
> (Librelist doesn't like plain Cc :<)
>
> It could be a librelist problem, too, but I just sent out a message
> considering a move away from librelist and it went through fine:
> http://mid.gmane.org/20130825212946.GA32430-yBiyF41qdooeIZ0/mPfg9Q@public.gmane.org

What happened was that I sent it to librelist and received an
email from librelist asking me to confirm my subscription. I replied
to the confirmation, wondering whether my previous mail would show
up or not.

I was worried about sending it twice, so I didn't send it again.
It seems that after the confirmation everything works normally.
Not sure if librelist is a good choice, but at least it seems
to be working for me now.

* Re: EventMachine with thread pool and thread spawn
From: Eric Wong @ 2013-09-05 23:03 UTC
  To: Rainbows! list

"Lin Jen-Shin (godfat)" <godfat-hOE/xeEBYYIdnm+yROfE0A@public.gmane.org> wrote:
> On Mon, Aug 26, 2013 at 5:57 AM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> > A multi-threaded HTTP server must implement its own synchronization
> > on top of ET/LT epoll/kqueue to prevent serving the same client
> > from multiple threads.
> 
> I searched and read a bit. I am not sure if I understand correctly,
> but I have a feeling that both LT and ET epoll are not designed to
> be used in a multithreaded server (i.e. handle I/O in epoll and dispatch
> tasks to threaded handlers), or say it would be quite hard to get it right.

I came to the same conclusion as you a while back.

The epoll_ctl() API isn't very efficient for single-threaded servers
(this is probably the biggest complaint of existing single-threaded
epoll users).  The kevent() API is better for single-threaded operation,
but I'm not sure if it's optimal for multithreaded.

I might propose a new epoll_xchg syscall which behaves closer to kevent.
It would only be a tiny improvement for multi-threaded epoll, but
possibly big for single-threaded epoll (but would require rewriting
existing single-threaded apps).  I don't think it's worth it since
there's lower hanging fruit for improving MT epoll performance.

> That is, oneshot would be the way to go in this scenario?
> I also heard that libev is merely a thin wrapper around epoll...

Yes.  Fwiw, I've been working on yet another Ruby server :)
It's not Rack-specific and the design is based on cmogstored
(a oneshot + multithreaded storage server for MogileFS).

> >> I wonder if I should just go with XEpollThreadPool then? My concern
> >> is that as Heroku does not buffer the entire request and response,
> >> we need something which would do this for us. Not sure if XEpollThreadPool
> >> would be sufficient?
> >
> > Probably using XEpollThread* is a good choice, but neither buffer (but
> > you get more I/O concurrency anyways, so maybe it's not so bad)
> 
> After benching mark against Heroku, I feel it does have some kind of
> buffers. It's hard to tell though........ but yeah, maybe we don't really
> need full buffering at application server level.

If your responses are all small, you'll never need/hit output buffering
in userspace (the kernel always buffers some).  I think any proxy
(Heroku included) does full header buffering, it's just that few proxies
can do full request body (PUT/POST) buffering like nginx.
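
As a rough illustration of the kernel buffering mentioned above, here is a
sketch in Python using a local socket pair whose reader never reads. The
function name `kernel_buffer_demo`, the sample response, and the chunk
size are all assumptions made for the example. A small response is
accepted in full by a non-blocking `send()`; only after much more data has
been written does the kernel refuse, which is the point where a server
would need its own userspace buffer:

```python
import socket


def kernel_buffer_demo(small=b"HTTP/1.1 200 OK\r\n\r\nhi", chunk=64 * 1024):
    """Write to a peer that never reads; report what the kernel accepted."""
    server, client = socket.socketpair()
    server.setblocking(False)

    # A small response fits in the kernel's socket buffer in one call.
    small_fits = server.send(small) == len(small)

    accepted = len(small)
    blocked = False
    payload = b"x" * chunk
    # Keep writing until the kernel buffer is full (capped for safety).
    while accepted < 64 * 1024 * 1024:
        try:
            accepted += server.send(payload)
        except BlockingIOError:
            # From here on, a server must buffer in userspace or wait.
            blocked = True
            break

    server.close()
    client.close()
    return small_fits, accepted, blocked


if __name__ == "__main__":
    fits, accepted, blocked = kernel_buffer_demo()
    print("small response accepted in full:", fits)
    print("bytes accepted before blocking: ", accepted)
    print("write eventually blocked:       ", blocked)
```

The exact number of bytes accepted depends on the platform's socket buffer
sizes, so it is printed rather than assumed.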

> > Plain XEpoll does do full buffering, and you can probably add a
> > TryDefer-like middleware on top of that, even...
> 
> Could you please explain to me why there's difference?

XEpoll is similar to the EventMachine use in Rainbows!.  It just won't
disconnect if a client goes overboard with pipelining, and there's no EM
API.  It reads all of the input before processing, and will buffer if
writes are blocked.

> Is it because it might not make too much sense to buffer in
> XEpollThread*, or implementation-wise, it's easier to do it this way?

Yeah, it was easier.  (Lazy) buffering with multiple threads helps,
too, but not as much as buffering with a single thread.  Threads
are fairly cheap nowadays.

> > It could be a librelist problem, too, but I just sent out a message
> > considering a move away from librelist and it went through fine:
> > http://mid.gmane.org/20130825212946.GA32430-yBiyF41qdooeIZ0/mPfg9Q@public.gmane.org
> 
> What's happening was that I sent it to librelist, and received an
> email from librelist for confirming if I am subscribing it. I replied
> the confirmation, wondering if my previous mail would show up
> or not?
> 
> I was worried about sending it twice, thus didn't send again.

Ah, the message says to send it again, I think.  Anyways, I don't
mind receiving too much email as long as it's plain-text.

> It seems after the confirmation everything works normally.
> Not sure if librelist is a good choice, but at least it seems
> working for me now.

Yeah, I don't think I'll use librelist for new projects (especially
since I favor reply-all more nowadays).  Not sure if it's worth the
migration hassle for small projects like s.p, though...

I'll probably go with Rubyforge or savannah.nongnu.org for new projects
for now.  Maybe using something completely decentralized and friendly to
command-line users will be suitable in the future.

* Re: EventMachine with thread pool and thread spawn
From: Eric Wong @ 2013-09-06  6:52 UTC
  To: Rainbows! list

"Lin Jen-Shin (godfat)" <godfat-hOE/xeEBYYIdnm+yROfE0A@public.gmane.org> wrote:
> On Mon, Aug 26, 2013 at 5:57 AM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> > Anyways, it's still infuriating to me that Heroku cannot do full
> > buffering and still claims to "support" unicorn.
> 
> They do provide a way to customize the deployment, so that we could
> build an Nginx while deploying and run it for each unicorn process though.
> They do not officially support this, but it could work... things might break
> easily if they changed something though.
> 
> The reason why they don't fully buffer is because of latency concern
> as far as I know. But I feel in most of the web apps, we don't really do
> realtime stuffs.

(part 2, sorry got distracted during my first reply)

Latency for slow streaming?  Sure, but unicorn is completely wrong for slow
streaming in any direction.

For fast requests/responses, any latency introduced by nginx is dwarfed
by network latency to the client.

> And the most frustrating thing to me is that my co-workers don't really
> trust me. They still prefer unicorn because Heroku recommends it
> officially even when I showed benchmarks and some people at Heroku
> did admit it could be an issue.

Heh, I don't get it, either.  Especially given _my_ disapproval of
the default Heroku setup with unicorn.  Oh well, I get the feeling
some unicorn users only use it because of buzz from a handful of
blogs, not because they read the documentation and understand it.
That probably goes for most tech :<

* Re: EventMachine with thread pool and thread spawn
From: Lin Jen-Shin (godfat) @ 2013-09-26 17:59 UTC
  To: Rainbows! list

On Fri, Sep 6, 2013 at 7:03 AM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> "Lin Jen-Shin (godfat)" <godfat-hOE/xeEBYYIdnm+yROfE0A@public.gmane.org> wrote:
>> On Mon, Aug 26, 2013 at 5:57 AM, Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
[...]
>> That is, oneshot would be the way to go in this scenario?
>> I also heard that libev is merely a thin wrapper around epoll...
>
> Yes.  Fwiw, I've been working on yet another Ruby server :)
> It's not Rack-specific and the design is based on cmogstored
> (a oneshot + multithreaded storage server for MogileFS).

Great news! Last I knew, there was only a Perl server
for MogileFS, though part of it could be replaced by nginx.
I haven't used MogileFS for several years; it would be
cool to revisit it. I still remember my super naive patch for
mogilefs-client... I've learned so much more since then.

>> After benching mark against Heroku, I feel it does have some kind of
>> buffers. It's hard to tell though........ but yeah, maybe we don't really
>> need full buffering at application server level.
>
> If your responses are all small, you'll never need/hit output buffering
> in userspace (the kernel always buffers some).  I think any proxy
> (Heroku included) does full header buffering, it's just few proxies can
> do full request body (PUT/POST) buffering like nginx.

I just checked Heroku's documentation again. It seems the header
buffer size is 8K, and the response has a 1M buffer.
https://devcenter.heroku.com/articles/http-routing#request-buffering
The request body is still not buffered at all, if the documentation
is up to date. This is another reason I am quite disappointed:
their stuff is a black box, so all we can do is check the
documentation frequently and hope it's up to date (which it hasn't
always been).

Our responses should be far less than 1M though.

>> Is it because it might not make too much sense to buffer in
>> XEpollThread*, or implementation-wise, it's easier to do it this way?
>
> Yeah, it was easier.  (Lazy) buffering with multiple threads helps,
> too, but not as much as buffering with a single thread.  Threads
> are fairly cheap nowadays.

However, we'll need to reduce the number of threads to avoid
exhausting PostgreSQL connections :(
We only have roughly 400~450 connections available,
and connections cannot be shared among threads.

I guess one way to escape this constraint would be to use
PgBouncer, which is a PostgreSQL connection pooler. With it,
we could have more threads than PostgreSQL connections.
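
The general idea of running more threads than connections can be sketched
in-process. This is only an analogy: PgBouncer itself is a separate proxy
in front of PostgreSQL, and the `ConnectionPool` class, the `serve`
helper, and the string "connections" below are hypothetical stand-ins, not
a real database driver. Each thread borrows a connection only for the
duration of its query and blocks while none is free:

```python
import queue
import threading
import time


class ConnectionPool:
    """Hand out a fixed set of connections to any number of threads."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")  # stand-in for a real DB connection
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak = 0  # highest number of connections checked out at once

    def run(self, work):
        conn = self._free.get()  # blocks while every connection is busy
        with self._lock:
            self.in_use += 1
            self.peak = max(self.peak, self.in_use)
        try:
            return work(conn)
        finally:
            with self._lock:
                self.in_use -= 1
            self._free.put(conn)


def serve(n_threads, n_conns):
    """Run n_threads handlers over n_conns connections; return (done, peak)."""
    pool = ConnectionPool(n_conns)
    done = []

    def handler(i):
        pool.run(lambda conn: time.sleep(0.01))  # pretend to run a query
        done.append(i)

    threads = [threading.Thread(target=handler, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(done), pool.peak


if __name__ == "__main__":
    print(serve(n_threads=20, n_conns=3))
```

The trade-off is that a handler may wait for a free connection, so
latency grows once all connections are busy.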

However, this is not a standard Heroku addon that we could use;
we would need some custom scripts to set it up, and I am quite
worried that they might change something one day and break it.

I don't know why PostgreSQL connections are so heavy...

>> The reason why they don't fully buffer is because of latency concern
>> as far as I know. But I feel in most of the web apps, we don't really do
>> realtime stuffs.
>
> Latency for slow streaming?  Sure, but unicorn is completely wrong for slow
> streaming in any direction.

Interestingly, I also tried Thin, which is built on EventMachine.
Unicorn still easily outperforms Thin in my simple benchmarks, unless
I exploit the fact that Unicorn does not buffer by sending a lot of
very slow, large POST requests.

So overall, if no one is attacking like this, Unicorn still performs
better than Thin. But I still think that leaving this kind of
vulnerability open for people to exploit, and only fixing it once we
run into it, is not good, especially when we could make an attack much
harder simply by adding some buffering.

> For fast requests/responses, any latency introduced by nginx is dwarfed
> by network latency to the client.

Yeah, right...

> Heh, I don't get it, either.  Especially given _my_ disapproval of
> the default Heroku setup with unicorn.  Oh well, I get the feeling
> some unicorn users only use it because of buzz from a handful of
> blogs, not because they read the documentation and understand it.

Exactly. I don't understand why sending links to this mailing list's
archives didn't really work... They seem to trust fancy blog posts more
than this list. I even doubt they read the documentation... which is so
clear and inspiring. I learned so much from reading it.

> That probably goes for most tech :<

If this is true, I probably haven't gotten used to it, and I doubt I
ever will. Sometimes I feel that the more I understand, the more cynical
I become. Not sure if that's a good thing...
Code repositories for project(s) associated with this public inbox

	https://yhbt.net/rainbows.git/
