mirror of mongrel-development@rubyforge.org (inactive)
* Pure Ruby HTTP parser
@ 2008-04-24  0:50 Tony
       [not found] ` <c7e6b2b00804231750o1c86d377i25c83d5e99f41dd1-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Tony @ 2008-04-24  0:50 UTC (permalink / raw)
  To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw


Before anything else, let me state this: Of course it's going to be
PAINFULLY slow on MRI.  That's not the point :)

I thought I'd try writing a Ruby version of the parser for the
purposes of Rubinius.  For those of you who aren't aware, Ragel supports a
goto-driven FSM on Rubinius by injecting assembly directly, and Rubinius head
honcho guy Evan Phoenix is working on a patch for Ragel to update it to the
new compiler semantics.  So really, there is a purpose for trying this out.

Anyway, here's my initial hack.  It's nasty, and presently jams the entire
FSM into instance-specific data.  Aieee!  But it more or less seems to
generate similar (albeit not identical) output to the C one:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=508f9bd42b4aad322f357637d52576f780707a2f;hb=868732662abbf4aa571bf2f3d598152467f6f4da

I've thought about having a Mongrel::HttpParser::FSM module to store the
actual Ragel-generated state machine, and pass all ivars from the
Mongrel::HttpParser to an execute method then recapture them as return
values, or something to that effect.
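
A minimal sketch of that idea, assuming a toy machine in place of the
Ragel-generated one.  The ParserSketch namespace, the method signatures and
the "find the blank line" logic are all hypothetical; only the shape (state
passed in, state returned) reflects the proposal above:

```ruby
# Toy stand-in for the Ragel-generated machine: it only looks for the blank
# line that ends the request headers. All names here are hypothetical.
module ParserSketch
  module FSM
    TERMINATOR = "\r\n\r\n"

    # Stateless: takes the parser's state as arguments and returns the
    # updated state, so no FSM data lives on the parser instance.
    def self.execute(finished, nread, data, offset)
      return [finished, nread] if finished

      idx = data.index(TERMINATOR, offset)
      if idx
        [true, idx + TERMINATOR.length]
      else
        [false, data.length]
      end
    end
  end

  class HttpParser
    attr_reader :nread

    def initialize
      @finished = false
      @nread = 0
    end

    def finished?
      @finished
    end

    # Pass the ivars in, then recapture them from the return values.
    def execute(data, offset = 0)
      @finished, @nread = FSM.execute(@finished, @nread, data, offset)
      @nread
    end
  end
end

parser = ParserSketch::HttpParser.new
parser.execute("GET / HTTP/1.1\r\nHost: x\r\n\r\n")
puts parser.finished?
```

The FSM tables would then be built once, at module level, rather than once
per parser instance.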

Thoughts?  Suggestions?  Complete rewrites?  I'd appreciate them all.

-- 
Tony Arcieri
medioh.com

_______________________________________________
Mongrel-development mailing list
Mongrel-development-GrnCvJ7WPxnNLxjTenLetw@public.gmane.org
http://rubyforge.org/mailman/listinfo/mongrel-development


* Re: Pure Ruby HTTP parser
       [not found] ` <c7e6b2b00804231750o1c86d377i25c83d5e99f41dd1-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-04-24 11:38   ` ry dahl
       [not found]     ` <3ae7f4480804240438g62ed7190if2a84ff08dd2fe34-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: ry dahl @ 2008-04-24 11:38 UTC (permalink / raw)
  To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

One could replace http11's parser with some regular expressions and
out-of-bounds checking rather easily. I think Kirk Haines did this (?)
and said it was rather comparable in speed to the C/Ragel state
machine. I guess that wasn't really the point of your exercise, but
it's worth noting, if anyone actually wants a pure ruby http parser.
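
For illustration only, a regex-plus-bounds-check parser might look roughly
like the sketch below.  This is not Kirk Haines's code; the size limit, the
patterns and the parse_request_head name are assumptions made up for the
example:

```ruby
# Hypothetical limit and patterns, not taken from Mongrel or any real server.
MAX_HEADER_BYTES = 112 * 1024
REQUEST_LINE = /\A([A-Z]+) (\S+) HTTP\/(\d)\.(\d)\r\n/
HEADER_NAME  = /\A[A-Za-z0-9-]+\z/

# Returns a hash for a complete request head, nil if more data is needed,
# and raises ArgumentError on oversized or malformed input.
def parse_request_head(data)
  raise ArgumentError, "header too large" if data.bytesize > MAX_HEADER_BYTES

  head_end = data.index("\r\n\r\n")
  return nil unless head_end

  m = REQUEST_LINE.match(data)
  raise ArgumentError, "malformed request line" unless m

  headers = {}
  data[m.end(0)...head_end].split("\r\n").each do |line|
    name, value = line.split(":", 2)
    raise ArgumentError, "malformed header" unless value && name =~ HEADER_NAME
    headers[name.downcase] = value.strip
  end

  { method: m[1], uri: m[2], version: "#{m[3]}.#{m[4]}", headers: headers }
end

request = parse_request_head("GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n")
puts request[:uri]
```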

ry




* Re: Pure Ruby HTTP parser
       [not found]     ` <3ae7f4480804240438g62ed7190if2a84ff08dd2fe34-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-04-26 23:33       ` Tony
       [not found]         ` <c7e6b2b00804261633y2835329eqaa1b0190e19f59ac-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2008-05-06  6:27       ` Zed A. Shaw
  1 sibling, 1 reply; 5+ messages in thread
From: Tony @ 2008-04-26 23:33 UTC (permalink / raw)
  To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw


I pushed an updated version here:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=60c8f3d2519dc1673ef0b4107d40a9df9eca0662;hb=7d246b17efc0ac37db6c241729f6b0e298f49950

It's now confirmed working with Mongrel::HttpServer on Rubinius with a
"Hello, world!" Mongrel::HttpHandler.

It can be used to generate a goto-driven FSM using Rubinius assembly:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11.rb;h=435f643ea105f7adc486dc06ab960392c3dfeab5;hb=7d246b17efc0ac37db6c241729f6b0e298f49950

Some performance figures:

MRI + C extension, parsing 10,000 requests:
  0.150000   0.000000   0.150000 (  0.152268)

Rubinius + Rubinius.asm parser, parsing 10,000 requests:
 20.500086   0.000000  20.500086 ( 20.500085)

So, presently ~135x slower than the C extension on MRI :)
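
The benchmark script itself isn't included in this message.  As a rough
sketch, figures in that format (user, system, total, real) can be produced
with Ruby's Benchmark module, and the ratio follows from the two real times
quoted above.  The sample request and the placeholder workload are
assumptions:

```ruby
require "benchmark"

# Hypothetical sample request; the original benchmark input is not shown.
REQUEST = "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"

# Times `count` runs of the given block and returns a Benchmark::Tms.
def time_parses(count)
  Benchmark.measure { count.times { yield REQUEST } }
end

# Placeholder workload; swap in the real parser call here.
tms = time_parses(10_000) { |req| req.index("\r\n\r\n") }
puts tms

# Ratio of the wall-clock figures quoted above.
mri_c    = 0.152268
rbx_asm  = 20.500085
slowdown = rbx_asm / mri_c
puts format("about %.0fx slower", slowdown)
```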

-- 
Tony Arcieri
medioh.com


* Re: Pure Ruby HTTP parser
       [not found]         ` <c7e6b2b00804261633y2835329eqaa1b0190e19f59ac-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-04-26 23:57           ` Luis Lavena
  0 siblings, 0 replies; 5+ messages in thread
From: Luis Lavena @ 2008-04-26 23:57 UTC (permalink / raw)
  To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

Hey Tony,

How does that compare with Rubinius + Subtend?

-- 
Luis Lavena
Multimedia systems
-
Human beings, who are almost unique in having the ability to learn from
the experience of others, are also remarkable for their apparent
disinclination to do so.
Douglas Adams


* Re: Pure Ruby HTTP parser
       [not found]     ` <3ae7f4480804240438g62ed7190if2a84ff08dd2fe34-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2008-04-26 23:33       ` Tony
@ 2008-05-06  6:27       ` Zed A. Shaw
  1 sibling, 0 replies; 5+ messages in thread
From: Zed A. Shaw @ 2008-05-06  6:27 UTC (permalink / raw)
  To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

On Thu, 24 Apr 2008 13:38:03 +0200
"ry dahl" <ry-Xek56AhD01PHviPkdFu9cA@public.gmane.org> wrote:

> One could replace http11's parser with some regular expressions and
> out-of-bounds checking rather easily. I think Kirk Haines did this (?)
> and said it was rather comparable in speed to the C/Ragel state
> machine. I guess that wasn't really the point of your exercise, but
> it's worth noting, if anyone actually wants a pure ruby http parser.

Yes, fast, but not correct.  The main difference between a generated
parser based on algorithms and a hand-crafted regex is that when the parser
blows up, it says:

"Syntax error at character #34 expecting BLAH, FOO, and BAR symbols."

Regexen do this:

"Hi, oh thanks, I *love* hacks like this.  You crafted this shellcode
really well so that it looks mildly like a payload.  Super awesome I'll
just pass this vaguely HTTP string right on to our app."

:-)
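
A toy contrast, assuming a deliberately tiny grammar (nowhere near Mongrel's
real one): the unanchored regex happily finds something request-like inside
junk, while the character-by-character check reports where the input stopped
being valid.  LENIENT, ParseError and strict_request_line are hypothetical
names:

```ruby
# Lenient: unanchored, so it matches a request-like substring anywhere.
LENIENT = /(GET|POST) (\S+) HTTP\/1\.[01]/

# Error carrying the offset at which the strict check gave up.
class ParseError < StandardError
  attr_reader :position

  def initialize(position, expected)
    @position = position
    super("Syntax error at character ##{position} expecting #{expected}")
  end
end

# Strict: walks the request line one character at a time and returns the
# number of characters consumed, or raises ParseError with the position.
def strict_request_line(data)
  pos = 0
  # Method: one or more uppercase letters.
  pos += 1 while pos < data.length && data[pos] =~ /[A-Z]/
  raise ParseError.new(pos, "uppercase method") if pos.zero?
  raise ParseError.new(pos, "space") unless data[pos] == " "

  pos += 1
  uri_start = pos
  # URI: one or more printable, non-space ASCII characters.
  pos += 1 while pos < data.length && data[pos] =~ /[!-~]/
  raise ParseError.new(pos, "URI character") if pos == uri_start

  " HTTP/1.1\r\n".each_char do |ch|
    raise ParseError.new(pos, ch.inspect) unless data[pos] == ch
    pos += 1
  end
  pos
end

suspicious = "xx GET /a HTTP/1.1 yy"
puts "lenient regex accepts it" if LENIENT.match(suspicious)
begin
  strict_request_line(suspicious)
rescue ParseError => e
  puts e.message
end
```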

-- 
Zed A. Shaw
- Hate: http://savingtheinternetwithhate.com/
- Good: http://www.zedshaw.com/
- Evil: http://yearofevil.com/

