mirror of mongrel-development@rubyforge.org (inactive)
* [RFC Mongrel2] simpler response API + updated HTTP parser
@ 2009-09-12 23:57 Eric Wong
       [not found] ` <20090912235729.GA9370-yBiyF41qdooeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Wong @ 2009-09-12 23:57 UTC
  To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

Hi all,

I've pushed out some changes based on fauna/master[1] to
git://git.bogomips.org/ur-mongrel that include a good chunk of the
platform-independent stuff found in Unicorn.

The new HTTP parser is named "mongrel_http" to avoid loadtime conflicts
with the old one ("http11") but maintains the same class name
(Mongrel::HttpParser).  This one even supports HTTP/0.9, so "http11"
wasn't an appropriate name for it :)


Problems:

  I'm having some trouble with Rake+Echoe 3.2 with an "uninitialized
  constant Platform" error but everything seems to work by hand without
  Rake+Echoe.

  I'm also getting some test failures under 1.9.1-p243 with the
  semaphore/threading tests.  I haven't looked too hard at this current
  threading model, but my gut feeling is that it's too complicated and a
  "dumber" model in mongrel 1.x *or* a fixed number of worker threads
  doing accept() is sufficient...

  One thing that may be cool is to support multiple
  threading/concurrency models since 1.8/1.9/jruby/rubinius all
  implement threads differently and we can also get Actors with
  1.9/Rubinius.
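
  As an illustration of the "fixed number of worker threads doing
  accept()" idea, here is a minimal sketch. It is not code from the
  ur-mongrel tree: start_accept_workers, the worker count of 4, and the
  canned response are all hypothetical, and no request parsing is done.

```ruby
require 'socket'

# Hypothetical helper (not part of Mongrel): start +count+ threads that
# all block in accept() on the same listening socket.  The kernel hands
# each new connection to exactly one of the waiting threads.
def start_accept_workers(listener, count)
  Array.new(count) do
    Thread.new do
      loop do
        client = listener.accept
        begin
          # Consume the request line and headers up to the blank line.
          while (line = client.gets)
            break if line == "\r\n"
          end
          client.write("HTTP/1.0 200 OK\r\nContent-Length: 3\r\n\r\nok\n")
        ensure
          client.close
        end
      end
    end
  end
end

listener = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
workers = start_accept_workers(listener, 4)
```

  The number of threads never changes after startup, so there is no
  semaphore or per-connection thread bookkeeping to get wrong; the cost
  is that at most +count+ clients are served at once.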

shortlog and diffstat below:

Eric Wong (6):
      http_response: replace old API with simpler one
      http_response: drop old API compatibility
      remove HeaderOut class
      Add new HTTP/{0.9,1.0,1.1} parser
      Start using the new HTTP parser + TeeInput
      Remove unused Const::HTTP_STATUS_CODES hash

 Manifest                                     |   10 +-
 ext/mongrel_http/c_util.h                    |  107 ++++
 ext/mongrel_http/common_field_optimization.h |  111 ++++
 ext/mongrel_http/ext_help.h                  |   48 ++
 ext/mongrel_http/extconf.rb                  |    8 +
 ext/mongrel_http/global_variables.h          |   91 ++++
 ext/mongrel_http/mongrel_http.rl             |  708 ++++++++++++++++++++++++++
 ext/mongrel_http/mongrel_http_common.rl      |   74 +++
 lib/mongrel.rb                               |   64 +---
 lib/mongrel/const.rb                         |   46 +--
 lib/mongrel/header_out.rb                    |   34 --
 lib/mongrel/http_request.rb                  |  147 ++----
 lib/mongrel/http_response.rb                 |  202 ++------
 lib/mongrel/tee_input.rb                     |  144 ++++++
 test/unit/test_http_parser.rb                |  425 ++++++++++++++--
 test/unit/test_http_parser_ng.rb             |  307 +++++++++++
 test/unit/test_response.rb                   |   12 +-
 test/unit/test_server.rb                     |    3 +
 18 files changed, 2101 insertions(+), 440 deletions(-)

Full changelog:

commit 4e6ab7b7d608bd074107c6a1804401d8165062d4
Author: Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org>
Date:   Sat Sep 12 16:38:22 2009 -0700

    Remove unused Const::HTTP_STATUS_CODES hash
    
    It's no longer used when we generate responses, instead we just
    use the one found in Rack (which was originally "stolen" from
    us) so it's one less thing for us to maintain.

commit 46ca4a1c35b92109cedd59808908e7ad1d289abb
Author: Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org>
Date:   Sat Sep 12 10:40:30 2009 -0700

    Start using the new HTTP parser + TeeInput
    
    The new HTTP parser minimizes the amount of Ruby support code
    needed and the HttpRequest class has been changed to a single
    class method: HttpRequest.read
    
    As a result, this hooks up the TeeInput class into the request
    processing cycle.  TeeInput lets us read the request body off
    the socket while the Rack application is being called (instead
    of being buffered before-hand) while providing rewindable
    semantics that the Rack spec requires.
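
The tee idea described above can be sketched roughly as follows. MiniTee
is a hypothetical, much-simplified stand-in written for this note, not
the TeeInput class from the commit: it assumes a known Content-Length,
ignores chunked bodies, and only supports read(length) and rewind.

```ruby
require 'tempfile'

# Hypothetical, simplified illustration of the "tee" idea: bytes are
# pulled off the socket only when the application asks for them, and
# every byte is also copied to a temporary file so the body can be
# rewound and read again.
class MiniTee
  def initialize(socket, content_length)
    @socket = socket
    @remaining = content_length # bytes not yet read off the socket
    @tmp = Tempfile.new('mini_tee')
    @tmp.binmode
  end

  # Returns up to +length+ bytes, or nil once the whole body was consumed.
  def read(length)
    # Replay bytes that were already copied to the tempfile (after rewind).
    buffered = @tmp.read(length)
    return buffered if buffered && !buffered.empty?
    return nil if @remaining.zero?

    chunk = @socket.readpartial([length, @remaining].min)
    @remaining -= chunk.bytesize
    @tmp.write(chunk) # the tempfile position is at its end here
    chunk
  end

  def rewind
    @tmp.rewind
  end
end
```

An application can start consuming the body before the client has
finished sending it, and a later rewind still replays everything that
was read so far.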

commit c5a63522bc7e323c706609f7d99ed9f09fe9975d
Author: Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org>
Date:   Fri Sep 11 13:55:20 2009 -0700

    Add new HTTP/{0.9,1.0,1.1} parser
    
    This is descended from the Mongrel parser but modified to
    support:
    
      * chunked transfer-encoding
      * trailers after chunked request bodies
      * HTTP/0.9
      * absolute URI requests
      * multi-line headers with continuation lines
      * repeated headers (joined by commas)
      * #keepalive? boolean method
      * better integration with Rack
    
    This is not yet hooked into any existing parts of Mongrel,
    that is the next step.
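
To make two of the listed behaviors concrete (continuation lines and
repeated headers), here is a plain-Ruby sketch of the resulting header
hash. It is only an illustration: the real parser is written in Ragel/C,
merge_headers is a hypothetical name, and the exact separators used here
(a single space for continuation lines, "," with no space for repeated
headers) are assumptions.

```ruby
# Hypothetical illustration in plain Ruby; not the Ragel parser itself.
# Takes the raw header block (without the request line or the final
# blank line) and returns a Rack-style hash of HTTP_* keys.
def merge_headers(raw)
  env = {}
  last_key = nil
  raw.split("\r\n").each do |line|
    if line.start_with?(' ', "\t") && last_key
      # Continuation line: append to the previous header's value.
      env[last_key] << ' ' << line.strip
    else
      name, value = line.split(':', 2)
      key = 'HTTP_' + name.upcase.tr('-', '_')
      value = value.strip
      if env.key?(key)
        env[key] << ',' << value # repeated header: join with a comma
      else
        env[key] = value
      end
      last_key = key
    end
  end
  env
end

headers = merge_headers("X-Foo: a\r\nX-Foo: b\r\nX-Bar: one\r\n two\r\n")
```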

commit 8c1c7bdd3c1767708f8507d5aef8ded03b6f1796
Author: Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org>
Date:   Fri Sep 11 13:16:25 2009 -0700

    remove HeaderOut class
    
    HttpResponse has been rewritten to just iterate through the
    headers Rack gives us in a GC-friendly way so we have no need
    for this any longer.

commit 392ea08624e39faec8d5e10ba04b21dfd9ca19a1
Author: Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org>
Date:   Fri Sep 11 12:58:42 2009 -0700

    http_response: drop old API compatibility
    
    Avoid needless overhead in allocating a HttpResponse object and
    instead just use a class method.  This is alright with Rack
    applications since Rack specifies the response is already a
    tuple for writing.  Of course the headers and body of the
    response can both be generated iteratively with #each.
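
A rough sketch of what "just use a class method" on a Rack tuple can
look like. ResponseSketch, its REASONS table, and the HTTP/1.1 status
line are hypothetical stand-ins chosen for this example; they are not
the HttpResponse code from the commit, which takes its reason phrases
from Rack.

```ruby
require 'stringio'

# Hypothetical sketch: write a Rack response tuple to +io+ without
# allocating any per-response object.
module ResponseSketch
  # Tiny stand-in for a full status code table.
  REASONS = { 200 => 'OK', 404 => 'Not Found' }.freeze

  def self.write(io, rack_response)
    status, headers, body = rack_response
    io.write("HTTP/1.1 #{status} #{REASONS.fetch(status.to_i, '')}\r\n")
    # Both headers and body are walked with #each, as Rack specifies.
    headers.each { |key, value| io.write("#{key}: #{value}\r\n") }
    io.write("\r\n")
    body.each { |chunk| io.write(chunk) }
  ensure
    # Rack bodies may respond to #close; call it when they do.
    body.close if body.respond_to?(:close)
  end
end

out = StringIO.new
ResponseSketch.write(out, [200, { 'Content-Type' => 'text/plain' }, ['hello ', 'world']])
```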

commit 469a507133bd20034df485f03b6eb7b0e82080d6
Author: Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org>
Date:   Fri Sep 11 12:52:20 2009 -0700

    http_response: replace old API with simpler one
    
    The old API is completely dropped, a compatibility layer for the
    old one will be added as Rack middleware instead.  This allows
    newly-written applications to go through fewer layers of
    abstraction.

git: git://git.bogomips.org/ur-mongrel
cgit: http://git.bogomips.org/cgit/ur-mongrel.git

[1] 9f9a9d488ed32a2891dc3dd7d50a17a16357042d

-- 
Eric Wong


* Re: [RFC Mongrel2] simpler response API + updated HTTP parser
       [not found] ` <20090912235729.GA9370-yBiyF41qdooeIZ0/mPfg9Q@public.gmane.org>
@ 2009-10-08  1:35   ` Eric Wong
       [not found]     ` <20091008013542.GA12370-yBiyF41qdooeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Wong @ 2009-10-08  1:35 UTC
  To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> Hi all,
> 
> I've pushed out some changes based on fauna/master[1] to
> git://git.bogomips.org/ur-mongrel that include a good chunk of the
> platform-independent stuff found in Unicorn.

We hit one bug/weird interaction with Rails in Unicorn, so here's a fix I
put in.  Unfortunately the test cases for TeeInput in Unicorn currently
rely on fork() + pipe() (it was just more natural for me to write), but
if there's interest I could be persuaded to write a non-*nix version.

This issue could arguably be considered a bug in Rails:
  https://rails.lighthouseapp.com/projects/8994/tickets/3343

Just in case, I'm also asking for Rack to allow the readpartial method
into the "rack.input" spec here:
   http://groups.google.com/group/rack-devel/browse_thread/thread/3dfccb68172a6ed6

From 87254d37c519b63a1d39c938cd4a53b08e2a1065 Mon Sep 17 00:00:00 2001
From: Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org>
Date: Wed, 7 Oct 2009 18:24:27 -0700
Subject: [PATCH] more-compatible TeeInput#read for POSTs with Content-Length

There are existing applications and libraries that don't check
the return value of env['rack.input'].read(length) (like Rails
:x).  Those applications became broken under the IO#readpartial
semantics of TeeInput#read when handling larger request bodies.

We'll preserve the IO#readpartial semantics _only_ when handling
chunked requests (as long as Rack allows it, it's useful for
real-time processing of audio/video streaming uploads,
especially with Rainbows! and mobile clients) but use
read-in-full semantics for TeeInput#read on requests with a
known Content-Length.
---
 lib/mongrel/tee_input.rb |   43 +++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/lib/mongrel/tee_input.rb b/lib/mongrel/tee_input.rb
index 442c55a..3605e20 100644
--- a/lib/mongrel/tee_input.rb
+++ b/lib/mongrel/tee_input.rb
@@ -44,6 +44,26 @@ module Mongrel
       @size = tmp_size
     end
 
+    # call-seq:
+    #   ios = env['rack.input']
+    #   ios.read([length [, buffer ]]) => string, buffer, or nil
+    #
+    # Reads at most length bytes from the I/O stream, or to the end of
+    # file if length is omitted or is nil. length must be a non-negative
+    # integer or nil. If the optional buffer argument is present, it
+    # must reference a String, which will receive the data.
+    #
+    # At end of file, it returns nil or "" depend on length.
+    # ios.read() and ios.read(nil) returns "".
+    # ios.read(length [, buffer]) returns nil.
+    #
+    # If the Content-Length of the HTTP request is known (as is the common
+    # case for POST requests), then ios.read(length [, buffer]) will block
+    # until the specified length is read (or it is the last chunk).
+    # Otherwise, for uncommon "Transfer-Encoding: chunked" requests,
+    # ios.read(length [, buffer]) will return immediately if there is
+    # any data and only block when nothing is available (providing
+    # IO#readpartial semantics).
     def read(*args)
       socket or return @tmp.read(*args)
 
@@ -58,9 +78,9 @@ module Mongrel
         rv = args.shift || @buf2.dup
         diff = tmp_size - @tmp.pos
         if 0 == diff
-          tee(length, rv)
+          ensure_length(tee(length, rv), length)
         else
-          @tmp.read(diff > length ? length : diff, rv)
+          ensure_length(@tmp.read(diff > length ? length : diff, rv), length)
         end
       end
     end
@@ -140,5 +160,24 @@ module Mongrel
       tmp
     end
 
+    # tee()s into +buf+ until it is of +length+ bytes (or until
+    # we've reached the Content-Length of the request body).
+    # Returns +buf+ (the exact object, not a duplicate)
+    # To continue supporting applications that need near-real-time
+    # streaming input bodies, this is a no-op for
+    # "Transfer-Encoding: chunked" requests.
+    def ensure_length(buf, length)
+      # @size is nil for chunked bodies, so we can't ensure length for those
+      # since they could be streaming bidirectionally and we don't want to
+      # block the caller in that case.
+      return buf if buf.nil? || @size.nil?
+
+      while buf.size < length && @size != @tmp.pos
+        buf << tee(length - buf.size, @buf2)
+      end
+
+      buf
+    end
+
   end
 end
-- 
Eric Wong


* Re: [RFC Mongrel2] simpler response API + updated HTTP parser
       [not found]     ` <20091008013542.GA12370-yBiyF41qdooeIZ0/mPfg9Q@public.gmane.org>
@ 2009-10-27 21:59       ` Eric Wong
  0 siblings, 0 replies; 3+ messages in thread
From: Eric Wong @ 2009-10-27 21:59 UTC
  To: mongrel-development-GrnCvJ7WPxnNLxjTenLetw

Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org> wrote:
> > Hi all,
> > 
> > I've pushed out some changes based on fauna/master[1] to
> > git://git.bogomips.org/ur-mongrel that include a good chunk of the
> > platform-independent stuff found in Unicorn.

One more that I just pushed out to git://git.bogomips.org/ur-mongrel

From f1e493e98a76345b4a05b29e037826626138776b Mon Sep 17 00:00:00 2001
From: Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org>
Date: Tue, 27 Oct 2009 14:38:51 -0700
Subject: [PATCH] tee_input: avoid IO#sync=true to workaround BSD stdio issue

IO#sync = true causes bad things with Ruby 1.8.x and stdio in
*BSDs.  Since Mongrel 1.x originally didn't use IO#sync=true and
needs to work on slow clients and a wider number of OSes than
Unicorn, it may be better to just avoid IO#sync=true instead
of an explicit seek-after-write (like Unicorn does).

This issue was tracked (and fixed) in ruby-core:26300[1], but a
MRI 1.8 release may be a while off and people have a tendency to
upgrade MRI slowly.

[1] http://redmine.ruby-lang.org/issues/show/2267
---
 lib/mongrel/tee_input.rb |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/lib/mongrel/tee_input.rb b/lib/mongrel/tee_input.rb
index 3605e20..cf20613 100644
--- a/lib/mongrel/tee_input.rb
+++ b/lib/mongrel/tee_input.rb
@@ -134,6 +134,10 @@ module Mongrel
         begin
           if parser.filter_body(dst, socket.readpartial(length, buf)).nil?
             @tmp.write(dst)
+            # This seek is to workaround a BSD stdio + MRI 1.8.x issue,
+            # [ruby-core:26300] but currently not needed unless we've
+            # set @tmp.sync=true
+            # @tmp.seek(0, IO::SEEK_END) if @tmp.sync
             return dst
           end
         rescue EOFError
@@ -155,7 +159,6 @@ module Mongrel
 
     def tmpfile
       tmp = Tempfile.new(Const::MONGREL_TMP_BASE)
-      tmp.sync = true
       tmp.binmode
       tmp
     end
-- 
Eric Wong

