* Pure Ruby HTTP parser
From: Tony @ 2008-04-24 0:50 UTC
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

Before anything else, let me state this: Of course it's going to be PAINFULLY slow on MRI. That's not the point :)

I thought I'd try writing a Ruby version of the parser for the purposes of Rubinius. For those of you who aren't aware, Ragel supports a goto-driven FSM on Rubinius by injecting assembly directly, and Rubinius head honcho guy Evan Phoenix is working on a patch for Ragel to update it to the new compiler semantics. So really, there is a purpose for trying this out.

Anyway, here's my initial hack. It's nasty, and presently jams the entire FSM into instance-specific data. Aieee! But it more or less seems to generate similar (albeit not identical) output to the C one:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=508f9bd42b4aad322f357637d52576f780707a2f;hb=868732662abbf4aa571bf2f3d598152467f6f4da

I've thought about having a Mongrel::HttpParser::FSM module to store the actual Ragel-generated state machine, and pass all ivars from the Mongrel::HttpParser to an execute method then recapture them as return values, or something to that effect.

Thoughts? Suggestions? Complete rewrites? I'd appreciate them all.

--
Tony Arcieri
medioh.com

_______________________________________________
Mongrel-development mailing list
Mongrel-development-GrnCvJ7WPxnNLxjTenLetw@public.gmane.org
http://rubyforge.org/mailman/listinfo/mongrel-development
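The split proposed in the message above (a stateless FSM module, with the parser passing its ivars in and recapturing them from the return value) might look roughly like the sketch below. This is not the Ragel-generated code from the linked file. ToyParser, its FSM module, the state constants and the tiny "read the request method" machine are all hypothetical stand-ins chosen to keep the example short.

```ruby
# Hypothetical sketch: the state machine lives in a stateless module and the
# parser object only carries the instance variables between calls.
class ToyParser
  module FSM
    START = 0   # still reading the request method
    DONE  = 1   # method fully read
    ERROR = -1  # unexpected character

    # Takes the parser's state as arguments and returns the updated values,
    # so nothing is stored in the module itself.
    def self.execute(cs, nread, method, data)
      while nread < data.length && cs == START
        ch = data[nread, 1]
        if ch == " "
          cs = method.empty? ? ERROR : DONE
        elsif ch =~ /\A[A-Z]\z/
          method += ch
        else
          cs = ERROR
        end
        # On error, nread stays on the offending character.
        nread += 1 unless cs == ERROR
      end
      [cs, nread, method]
    end
  end

  attr_reader :nread, :request_method

  def initialize
    @cs = FSM::START
    @nread = 0
    @request_method = ""
  end

  # Pass the ivars in, recapture them from the return value.
  def execute(data)
    @cs, @nread, @request_method =
      FSM.execute(@cs, @nread, @request_method, data)
    self
  end

  def finished?
    @cs == FSM::DONE
  end

  def error?
    @cs == FSM::ERROR
  end
end

parser = ToyParser.new
parser.execute("GE")                  # partial buffer, parser stays in START
parser.execute("GET / HTTP/1.1\r\n")  # same buffer grown, resumes at nread
puts parser.request_method
```

Because the module holds no state of its own, one copy of the machine can serve every parser instance, which is the point of moving it out of instance-specific data.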
* Re: Pure Ruby HTTP parser
From: ry dahl @ 2008-04-24 11:38 UTC
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

One could replace http11's parser with some regular expressions and out-of-bounds checking rather easily. I think Kirk Haines did this (?) and said it was rather comparable in speed to the C/Ragel state machine. I guess that wasn't really the point of your exercise, but it's worth noting, if anyone actually wants a pure ruby http parser.

ry

On Thu, Apr 24, 2008 at 2:50 AM, Tony <tony-INw5wk3xIkAIjDr1QQGPvw@public.gmane.org> wrote:
> Before anything else, let me state this: Of course it's going to be
> PAINFULLY slow on MRI. That's not the point :)
>
> I thought I'd try writing a Ruby version of the parser for the
> purposes of Rubinius. For those of you who aren't aware, Ragel supports a
> goto-driven FSM on Rubinius by injecting assembly directly, and Rubinius head
> honcho guy Evan Phoenix is working on a patch for Ragel to update it to the
> new compiler semantics. So really, there is a purpose for trying this out.
>
> Anyway, here's my initial hack. It's nasty, and presently jams the entire
> FSM into instance-specific data. Aieee! But it more or less seems to
> generate similar (albeit not identical) output to the C one:
>
> http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=508f9bd42b4aad322f357637d52576f780707a2f;hb=868732662abbf4aa571bf2f3d598152467f6f4da
>
> I've thought about having a Mongrel::HttpParser::FSM module to store the
> actual Ragel-generated state machine, and pass all ivars from the
> Mongrel::HttpParser to an execute method then recapture them as return
> values, or something to that effect.
>
> Thoughts? Suggestions? Complete rewrites? I'd appreciate them all.
>
> --
> Tony Arcieri
> medioh.com
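For readers wondering what "some regular expressions and out-of-bounds checking" could mean in practice, here is a minimal sketch for the request line only. It is not Kirk Haines's code, which the thread does not show. The pattern, the helper name parse_request_line and the MAX_URI_LENGTH limit are all assumptions made for illustration.

```ruby
# Assumed limit for illustration only; not a value taken from Mongrel.
MAX_URI_LENGTH = 2048

# Anchored at the start of the buffer: method, URI, HTTP version, CRLF.
REQUEST_LINE = /\A([A-Z]+) (\S+) HTTP\/(\d)\.(\d)\r\n/

# Returns a hash of request-line fields, or nil when the line does not
# match the pattern or the URI exceeds the assumed bound.
def parse_request_line(data)
  m = REQUEST_LINE.match(data)
  return nil unless m
  return nil if m[2].length > MAX_URI_LENGTH
  {
    "REQUEST_METHOD" => m[1],
    "REQUEST_URI"    => m[2],
    "HTTP_VERSION"   => "HTTP/#{m[3]}.#{m[4]}"
  }
end

req = parse_request_line("GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n")
puts req["REQUEST_URI"] if req
```

Note that a failed match here yields only nil, with no indication of where the input went wrong.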
* Re: Pure Ruby HTTP parser
From: Tony @ 2008-04-26 23:33 UTC
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

I pushed an updated version here:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=60c8f3d2519dc1673ef0b4107d40a9df9eca0662;hb=7d246b17efc0ac37db6c241729f6b0e298f49950

It's now confirmed working with Mongrel::HttpServer on Rubinius with a "Hello, world!" Mongrel::HttpHandler.

It can be used to generate a goto-driven FSM using Rubinius assembly:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11.rb;h=435f643ea105f7adc486dc06ab960392c3dfeab5;hb=7d246b17efc0ac37db6c241729f6b0e298f49950

Some performance figures:

MRI + C extension, parsing 10,000 requests:
  0.150000   0.000000   0.150000 (  0.152268)

Rubinius + Rubinius.asm parser, parsing 10,000 requests:
 20.500086   0.000000  20.500086 ( 20.500085)

So, presently ~135x slower than the C extension on MRI :)

--
Tony Arcieri
medioh.com
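The figures above are in the layout Ruby's standard Benchmark library prints: user, system, total and (real) seconds. The benchmark script itself is not shown in the thread, so the sketch below is only a guess at its general shape. The helper time_parses is hypothetical and the block passed to it is a stand-in, not the Mongrel parser. The second half reproduces the ~135x figure from the two wall-clock times quoted in the message.

```ruby
require 'benchmark'

# Times n invocations of the given block and returns a Benchmark::Tms,
# whose to_s prints user, system, total and (real) seconds.
def time_parses(n = 10_000)
  Benchmark.measure { n.times { yield } }
end

# Stand-in workload; a real run would call the parser under test here.
tms = time_parses { "GET / HTTP/1.1\r\n".split(" ") }
puts tms

# Slowdown computed from the wall-clock figures quoted in the message.
mri_real = 0.152268
rbx_real = 20.500085
ratio = rbx_real / mri_real
puts "%.1fx slower" % ratio   # roughly 134.6x, i.e. the "~135x" in the post
```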
* Re: Pure Ruby HTTP parser
From: Luis Lavena @ 2008-04-26 23:57 UTC
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

On Sat, Apr 26, 2008 at 8:33 PM, Tony <tony-INw5wk3xIkAIjDr1QQGPvw@public.gmane.org> wrote:
> I pushed an updated version here:
>
> http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=60c8f3d2519dc1673ef0b4107d40a9df9eca0662;hb=7d246b17efc0ac37db6c241729f6b0e298f49950
>
> It's now confirmed working with Mongrel::HttpServer on Rubinius with a
> "Hello, world!" Mongrel::HttpHandler.
>
> It can be used to generate a goto-driven FSM using Rubinius assembly:
>
> http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11.rb;h=435f643ea105f7adc486dc06ab960392c3dfeab5;hb=7d246b17efc0ac37db6c241729f6b0e298f49950
>
> Some performance figures:
>
> MRI + C extension, parsing 10,000 requests:
>   0.150000   0.000000   0.150000 (  0.152268)
>
> Rubinius + Rubinius.asm parser, parsing 10,000 requests:
>  20.500086   0.000000  20.500086 ( 20.500085)
>
> So, presently ~135x slower than the C extension on MRI :)
>

Hey Tony, how does that compare with Rubinius + subtend?

--
Luis Lavena
Multimedia systems
-
Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so.
Douglas Adams
* Re: Pure Ruby HTTP parser
From: Zed A. Shaw @ 2008-05-06 6:27 UTC
To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

On Thu, 24 Apr 2008 13:38:03 +0200 "ry dahl" <ry-Xek56AhD01PHviPkdFu9cA@public.gmane.org> wrote:
> One could replace http11's parser with some regular expressions and
> out-of-bounds checking rather easily. I think Kirk Haines did this (?)
> and said it was rather comparable in speed to the C/Ragel state
> machine. I guess that wasn't really the point of your exercise, but
> it's worth noting, if anyone actually wants a pure ruby http parser.

Yes, fast, but not correct. The main difference between a generated parser based on algorithms and hand crafted regex is when the parser blows up it says:

"Syntax error at character #34 expecting BLAH, FOO, and BAR symbols."

Regexen do this:

"Hi, oh thanks, I *love* hacks like this. You crafted this shellcode really well so that it looks mildly like a payload. Super awesome I'll just pass this vaguely HTTP string right on to our app."

:-)

--
Zed A. Shaw
- Hate: http://savingtheinternetwithhate.com/
- Good: http://www.zedshaw.com/
- Evil: http://yearofevil.com/
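The contrast drawn in the message above can be illustrated with a small sketch. Neither half is taken from Mongrel or from any parser mentioned in the thread. LOOSE_REQUEST is a deliberately sloppy, unanchored pattern, and check_method is a hypothetical character-by-character check that covers only the request method, so that it can name the offset at which the input stops fitting.

```ruby
class RequestSyntaxError < StandardError; end

# Hypothetical, deliberately sloppy pattern: it is unanchored, so any
# junk in front of something HTTP-looking is silently skipped.
LOOSE_REQUEST = /([A-Z]+) (\S+) HTTP\/1\.[01]/

# Walks the input one character at a time and fails at the first
# character that does not fit, naming the offset and what was expected.
def check_method(data)
  i = 0
  i += 1 while i < data.length && data[i, 1] =~ /[A-Z]/
  if i == 0
    raise RequestSyntaxError, "Syntax error at character ##{i} expecting A-Z"
  end
  unless data[i, 1] == " "
    raise RequestSyntaxError,
          "Syntax error at character ##{i} expecting A-Z or SPACE"
  end
  data[0, i]
end

junk = "\x01\x02GET /x HTTP/1.1\r\n"

# The loose pattern skips the two leading control bytes and reports a match.
puts LOOSE_REQUEST.match(junk) ? "regex: accepted" : "regex: rejected"

# The strict check stops at the very first byte and says where and why.
begin
  check_method(junk)
rescue RequestSyntaxError => e
  puts e.message
end
```

A carefully anchored regex would also reject this particular input, but it would still report only "no match" rather than a position and an expected set, which is the difference the message is pointing at.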