From: Tony
Newsgroups: gmane.comp.lang.ruby.mongrel.devel
Subject: Pure Ruby HTTP parser
Date: Wed, 23 Apr 2008 18:50:14 -0600

Before anything else, let me state this: Of course it's going to be PAINFULLY slow on MRI.  That's not the point :)

I thought I'd try writing a Ruby version of the parser for the purposes of Rubinius.  For those of you who aren't aware, Ragel supports a goto-driven FSM on Rubinius by injecting assembly directly, and Rubinius head honcho guy Evan Phoenix is working on a patch for Ragel to update it to the new compiler semantics.  So really, there is a purpose for trying this out.

Anyway, here's my initial hack.  It's nasty, and presently jams the entire FSM into instance-specific data.  Aieee!  But it more or less seems to generate similar (albeit not identical) output to the C one:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=508f9bd42b4aad322f357637d52576f780707a2f;hb=868732662abbf4aa571bf2f3d598152467f6f4da

I've thought about having a Mongrel::HttpParser::FSM module to store the actual Ragel-generated state machine, and pass all ivars from the Mongrel::HttpParser to an execute method then recapture them as return values, or something to that effect.
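
To make that idea concrete, here is a rough sketch of the shape it could take. Everything in it is an assumption for illustration: the state names, the argument order of FSM.execute, the params keys, and above all the state machine itself, which is a tiny hand-written one that only splits a request line. It stands in for the Ragel-generated code and is not taken from the linked file.

```ruby
module Mongrel
  class HttpParser
    # Hypothetical stateless module. In the real thing the body of execute
    # would be generated by Ragel; this toy machine is only a stand-in.
    module FSM
      IN_METHOD  = 0
      IN_URI     = 1
      IN_VERSION = 2
      DONE       = 3
      ERROR      = -1

      # Takes the parser's state as arguments and hands the updated state
      # back as return values, so the module itself stores nothing.
      def self.execute(cs, nread, mark, params, data)
        while nread < data.length && cs != DONE && cs != ERROR
          ch = data[nread, 1] # one-character string
          case cs
          when IN_METHOD
            if ch == " "
              params["REQUEST_METHOD"] = data[mark...nread]
              mark = nread + 1
              cs = IN_URI
            elsif ch !~ /[A-Z]/
              cs = ERROR
            end
          when IN_URI
            if ch == " "
              params["REQUEST_URI"] = data[mark...nread]
              mark = nread + 1
              cs = IN_VERSION
            elsif ch == "\r" || ch == "\n"
              cs = ERROR
            end
          when IN_VERSION
            if ch == "\n"
              params["HTTP_VERSION"] = data[mark...nread].chomp
              cs = DONE
            end
          end
          nread += 1
        end
        [cs, nread, mark, params]
      end
    end

    attr_reader :nread, :params

    def initialize
      reset
    end

    def reset
      @cs, @nread, @mark, @params = FSM::IN_METHOD, 0, 0, {}
    end

    # data is the whole buffer received so far; parsing resumes at @nread.
    # The ivars go in as arguments and are recaptured from the return value.
    def execute(data)
      @cs, @nread, @mark, @params =
        FSM.execute(@cs, @nread, @mark, @params, data)
      @nread
    end

    def finished?
      @cs == FSM::DONE
    end

    def error?
      @cs == FSM::ERROR
    end
  end
end

parser = Mongrel::HttpParser.new
parser.execute("GET /index.html HTTP/1.1\r\n")
p parser.params
```

The upside of this layout is that the generated code touches only locals, and the parser object keeps just a handful of small ivars. The cost is the extra array allocation and multiple assignment on every call to execute.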

Thoughts?  Suggestions?  Complete rewrites?  I'd appreciate them all.

--
Tony Arcieri
medioh.com