* Pure Ruby HTTP parser
@ 2008-04-24 0:50 Tony
[not found] ` <c7e6b2b00804231750o1c86d377i25c83d5e99f41dd1-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Tony @ 2008-04-24 0:50 UTC (permalink / raw)
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw
Before anything else, let me state this: Of course it's going to be
PAINFULLY slow on MRI. That's not the point :)
I thought I'd try writing a Ruby version of the parser for the
purposes of Rubinius. For those of you who aren't aware, Ragel supports a
goto-driven FSM on Rubinius by injecting assembly directly, and Rubinius head
honcho guy Evan Phoenix is working on a patch for Ragel to update it to the
new compiler semantics. So really, there is a purpose for trying this out.
Anyway, here's my initial hack. It's nasty, and presently jams the entire
FSM into instance-specific data. Aieee! But it more or less seems to
generate similar (albeit not identical) output to the C one:
http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=508f9bd42b4aad322f357637d52576f780707a2f;hb=868732662abbf4aa571bf2f3d598152467f6f4da
I've thought about having a Mongrel::HttpParser::FSM module to store the
actual Ragel-generated state machine, and pass all ivars from the
Mongrel::HttpParser to an execute method then recapture them as return
values, or something to that effect.
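A minimal sketch of that idea, assuming a toy two-state machine that only
looks for the first CRLF in place of the Ragel-generated tables. The names
HttpParserSketch, FSM.execute, cs and nread are hypothetical and are not
taken from the linked source. The module keeps no state of its own: the
parser hands its ivars in as arguments and takes the updated values back.

```ruby
# Hypothetical sketch: the toy transitions below stand in for the
# Ragel-generated state machine; none of these names come from Mongrel.
module HttpParserSketch
  # Stateless holder for the state machine.
  module FSM
    START = 0
    DONE  = 2

    # Receives the parser state as arguments and returns the updated
    # state, so nothing is stored in the module itself.
    def self.execute(cs, nread, data)
      data.each_char do |ch|
        break if cs == DONE
        nread += 1
        case cs
        when 0 then cs = 1 if ch == "\r"
        when 1 then cs = ch == "\n" ? DONE : (ch == "\r" ? 1 : 0)
        end
      end
      [cs, nread]
    end
  end

  class Parser
    attr_reader :nread

    def initialize
      @cs = FSM::START
      @nread = 0
    end

    # Passes the ivars to the FSM and recaptures them from the return value.
    def execute(data)
      @cs, @nread = FSM.execute(@cs, @nread, data)
      @nread
    end

    def finished?
      @cs == FSM::DONE
    end
  end
end
```

Because the state travels through arguments and return values, a parser
built this way can be fed data in chunks and the FSM module can be shared
between instances.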
Thoughts? Suggestions? Complete rewrites? I'd appreciate them all.
--
Tony Arcieri
medioh.com
_______________________________________________
Mongrel-development mailing list
Mongrel-development-GrnCvJ7WPxnNLxjTenLetw@public.gmane.org
http://rubyforge.org/mailman/listinfo/mongrel-development
* Re: Pure Ruby HTTP parser
[not found] ` <c7e6b2b00804231750o1c86d377i25c83d5e99f41dd1-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-04-24 11:38 ` ry dahl
[not found] ` <3ae7f4480804240438g62ed7190if2a84ff08dd2fe34-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: ry dahl @ 2008-04-24 11:38 UTC (permalink / raw)
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw
One could replace http11's parser with some regular expressions and
out-of-bounds checking rather easily. I think Kirk Haines did this (?)
and said it was rather comparable in speed to the C/Ragel state
machine. I guess that wasn't really the point of your exercise, but
it's worth noting, if anyone actually wants a pure ruby http parser.
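A rough sketch of what a regex-plus-bounds-check approach could look like
for the request line alone. This is not Kirk Haines's code. The module name,
the 1024-byte limit and the pattern are assumptions made for illustration.

```ruby
# Hypothetical sketch: a regex-based request-line parser with a length check.
module RegexRequestLine
  MAX_LINE = 1024 # assumed upper bound on the request line, in characters

  REQUEST_LINE = /\A([A-Z]+) (\S+) HTTP\/(\d+)\.(\d+)\r\n/

  # Returns a hash of request fields, or nil when the line is missing,
  # too long, or does not match the pattern.
  def self.parse(data)
    eol = data.index("\r\n")
    return nil if eol.nil? || eol > MAX_LINE

    m = REQUEST_LINE.match(data)
    return nil unless m

    { method: m[1], uri: m[2], version: "#{m[3]}.#{m[4]}" }
  end
end
```

Calling RegexRequestLine.parse on a well-formed request returns the method,
URI and version; anything else returns nil with no indication of where the
input went wrong.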
ry
On Thu, Apr 24, 2008 at 2:50 AM, Tony <tony-INw5wk3xIkAIjDr1QQGPvw@public.gmane.org> wrote:
> Before anything else, let me state this: Of course it's going to be
> PAINFULLY slow on MRI. That's not the point :)
>
> I thought I'd try writing a Ruby version of the parser for the
> purposes of Rubinius. For those of you who aren't aware, Ragel supports a
> goto-driven FSM on Rubinius by injecting assembly directly, and Rubinius head
> honcho guy Evan Phoenix is working on a patch for Ragel to update it to the
> new compiler semantics. So really, there is a purpose for trying this out.
>
> Anyway, here's my initial hack. It's nasty, and presently jams the entire
> FSM into instance-specific data. Aieee! But it more or less seems to
> generate similar (albeit not identical) output to the C one:
>
> http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=508f9bd42b4aad322f357637d52576f780707a2f;hb=868732662abbf4aa571bf2f3d598152467f6f4da
>
> I've thought about having a Mongrel::HttpParser::FSM module to store the
> actual Ragel-generated state machine, and pass all ivars from the
> Mongrel::HttpParser to an execute method then recapture them as return
> values, or something to that effect.
>
> Thoughts? Suggestions? Complete rewrites? I'd appreciate them all.
>
> --
> Tony Arcieri
> medioh.com
* Re: Pure Ruby HTTP parser
[not found] ` <3ae7f4480804240438g62ed7190if2a84ff08dd2fe34-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-04-26 23:33 ` Tony
[not found] ` <c7e6b2b00804261633y2835329eqaa1b0190e19f59ac-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-05-06 6:27 ` Zed A. Shaw
1 sibling, 1 reply; 5+ messages in thread
From: Tony @ 2008-04-26 23:33 UTC (permalink / raw)
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw
I pushed an updated version here:
http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=60c8f3d2519dc1673ef0b4107d40a9df9eca0662;hb=7d246b17efc0ac37db6c241729f6b0e298f49950
It's now confirmed working with Mongrel::HttpServer on Rubinius with a
"Hello, world!" Mongrel::HttpHandler.
It can be used to generate a goto-driven FSM using Rubinius assembly:
http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11.rb;h=435f643ea105f7adc486dc06ab960392c3dfeab5;hb=7d246b17efc0ac37db6c241729f6b0e298f49950
Some performance figures:
MRI + C extension, parsing 10,000 requests:
0.150000 0.000000 0.150000 ( 0.152268)
Rubinius + Rubinius.asm parser, parsing 10,000 requests:
20.500086 0.000000 20.500086 ( 20.500085)
So, presently ~135x slower than the C extension on MRI :)
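The four columns above look like the output of Ruby's standard Benchmark
library (user, system, total and real time), though the post does not say
how they were collected. The sketch below checks the arithmetic using the
real (wall-clock) times quoted above. The module and method names are
hypothetical.

```ruby
# Hypothetical helper for reading the figures; only the two real-time
# values and the request count come from the post.
module ParserTimings
  MRI_C_SECONDS        = 0.152268
  RUBINIUS_ASM_SECONDS = 20.500085
  REQUESTS             = 10_000

  # How many times slower the Rubinius run was than the MRI + C run.
  def self.slowdown
    RUBINIUS_ASM_SECONDS / MRI_C_SECONDS
  end

  # Average cost of one request in milliseconds for a given total time.
  def self.ms_per_request(total_seconds)
    total_seconds / REQUESTS * 1000.0
  end
end

puts format("slowdown: %.1fx", ParserTimings.slowdown)
puts format("MRI + C:  %.4f ms/request",
            ParserTimings.ms_per_request(ParserTimings::MRI_C_SECONDS))
puts format("Rubinius: %.4f ms/request",
            ParserTimings.ms_per_request(ParserTimings::RUBINIUS_ASM_SECONDS))
```

The ratio works out to a little under 135, and the per-request cost is about
0.015 ms for the C extension against about 2.05 ms for the Rubinius parser.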
--
Tony Arcieri
medioh.com
* Re: Pure Ruby HTTP parser
[not found] ` <c7e6b2b00804261633y2835329eqaa1b0190e19f59ac-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-04-26 23:57 ` Luis Lavena
0 siblings, 0 replies; 5+ messages in thread
From: Luis Lavena @ 2008-04-26 23:57 UTC (permalink / raw)
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw
On Sat, Apr 26, 2008 at 8:33 PM, Tony <tony-INw5wk3xIkAIjDr1QQGPvw@public.gmane.org> wrote:
> I pushed an updated version here:
>
> http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=60c8f3d2519dc1673ef0b4107d40a9df9eca0662;hb=7d246b17efc0ac37db6c241729f6b0e298f49950
>
> It's now confirmed working with Mongrel::HttpServer on Rubinius with a
> "Hello, world!" Mongrel::HttpHandler.
>
> It can be used to generate a goto-driven FSM using Rubinius assembly:
>
> http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11.rb;h=435f643ea105f7adc486dc06ab960392c3dfeab5;hb=7d246b17efc0ac37db6c241729f6b0e298f49950
>
> Some performance figures:
>
> MRI + C extension, parsing 10,000 requests:
> 0.150000 0.000000 0.150000 ( 0.152268)
>
> Rubinius + Rubinius.asm parser, parsing 10,000 requests:
> 20.500086 0.000000 20.500086 ( 20.500085)
>
> So, presently ~135x slower than the C extension on MRI :)
>
Hey Tony,
How does that compare with Rubinius + subtend?
--
Luis Lavena
Multimedia systems
-
Human beings, who are almost unique in having the ability to learn from
the experience of others, are also remarkable for their apparent
disinclination to do so.
Douglas Adams
* Re: Pure Ruby HTTP parser
[not found] ` <3ae7f4480804240438g62ed7190if2a84ff08dd2fe34-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-04-26 23:33 ` Tony
@ 2008-05-06 6:27 ` Zed A. Shaw
1 sibling, 0 replies; 5+ messages in thread
From: Zed A. Shaw @ 2008-05-06 6:27 UTC (permalink / raw)
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw
On Thu, 24 Apr 2008 13:38:03 +0200
"ry dahl" <ry-Xek56AhD01PHviPkdFu9cA@public.gmane.org> wrote:
> One could replace http11's parser with some regular expressions and
> out-of-bounds checking rather easily. I think Kirk Haines did this (?)
> and said it was rather comparable in speed to the C/Ragel state
> machine. I guess that wasn't really the point of your exercise, but
> it's worth noting, if anyone actually wants a pure ruby http parser.
Yes, fast, but not correct. The main difference between a generated
parser based on algorithms and hand-crafted regexes is what happens on
bad input. When the parser blows up, it says:
"Syntax error at character #34 expecting BLAH, FOO, and BAR symbols."
Regexen do this:
"Hi, oh thanks, I *love* hacks like this. You crafted this shellcode
really well so that it looks mildly like a payload. Super awesome I'll
just pass this vaguely HTTP string right on to our app."
:-)
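To make the contrast concrete, here is a small sketch under stated
assumptions: a hand-written strict check of the request method only, which
stands in for a generated parser, next to an unanchored regex. Neither is
Mongrel's code, and StrictRequestLine, ParseError, LENIENT and check_method
are hypothetical names.

```ruby
# Hypothetical sketch: strict character-by-character checking versus a
# lenient, unanchored regex.
module StrictRequestLine
  class ParseError < StandardError; end

  # Unanchored pattern: happily matches in the middle of junk.
  LENIENT = /([A-Z]+) (\S+) HTTP/

  # Returns the index of the space that ends the method, or raises with
  # the offset of the first character that is not allowed.
  def self.check_method(data)
    data.each_char.with_index do |ch, i|
      return i if ch == " " && i > 0
      unless ch.match?(/[A-Z]/)
        raise ParseError,
              "Syntax error at character ##{i} expecting A-Z or SPACE, got #{ch.inspect}"
      end
    end
    raise ParseError,
          "Syntax error at character ##{data.length} expecting A-Z or SPACE, got end of input"
  end
end
```

Given input that starts with junk before "GET", LENIENT still matches, while
check_method raises a ParseError that names the offending character position.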
--
Zed A. Shaw
- Hate: http://savingtheinternetwithhate.com/
- Good: http://www.zedshaw.com/
- Evil: http://yearofevil.com/