about summary refs log tree commit homepage
path: root/lib
diff options
context:
space:
mode:
authorEric Wong <normalperson@yhbt.net>2009-10-06 19:45:05 -0700
committerEric Wong <normalperson@yhbt.net>2009-10-06 19:45:05 -0700
commit438c99aec2d74489fa89b3a6c60d1fb41bb2f7e6 (patch)
tree08d263fa35a8d4adce7a81225773589abeffed55 /lib
parent89c786d64ddbe74905d37bc0d110771de5f79f49 (diff)
downloadunicorn-438c99aec2d74489fa89b3a6c60d1fb41bb2f7e6.tar.gz
There are existing applications and libraries that don't check
the return value of env['rack.input'].read(length) (like Rails
:x).  Those applications became broken under the IO#readpartial
semantics of TeeInput#read when handling larger request bodies.

We'll preserve the IO#readpartial semantics _only_ when handling
chunked requests (as long as Rack allows it, it's useful for
real-time processing of audio/video streaming uploads,
especially with Rainbows! and mobile clients) but use
read-in-full semantics for TeeInput#read on requests with a
known Content-Length.
Diffstat (limited to 'lib')
-rw-r--r--lib/unicorn/tee_input.rb43
1 files changed, 41 insertions, 2 deletions
diff --git a/lib/unicorn/tee_input.rb b/lib/unicorn/tee_input.rb
index 96a053a..188e2ea 100644
--- a/lib/unicorn/tee_input.rb
+++ b/lib/unicorn/tee_input.rb
@@ -41,6 +41,26 @@ module Unicorn
       @size = tmp_size
     end
 
+    # call-seq:
+    #   ios = env['rack.input']
+    #   ios.read([length [, buffer ]]) => string, buffer, or nil
+    #
+    # Reads at most length bytes from the I/O stream, or to the end of
+    # file if length is omitted or is nil. length must be a non-negative
+    # integer or nil. If the optional buffer argument is present, it
+    # must reference a String, which will receive the data.
+    #
+    # At end of file, it returns nil or "" depend on length.
+    # ios.read() and ios.read(nil) returns "".
+    # ios.read(length [, buffer]) returns nil.
+    #
+    # If the Content-Length of the HTTP request is known (as is the common
+    # case for POST requests), then ios.read(length [, buffer]) will block
+    # until the specified length is read (or it is the last chunk).
+    # Otherwise, for uncommon "Transfer-Encoding: chunked" requests,
+    # ios.read(length [, buffer]) will return immediately if there is
+    # any data and only block when nothing is available (providing
+    # IO#readpartial semantics).
     def read(*args)
       socket or return @tmp.read(*args)
 
@@ -55,9 +75,9 @@ module Unicorn
         rv = args.shift || @buf2.dup
         diff = tmp_size - @tmp.pos
         if 0 == diff
-          tee(length, rv)
+          ensure_length(tee(length, rv), length)
         else
-          @tmp.read(diff > length ? length : diff, rv)
+          ensure_length(@tmp.read(diff > length ? length : diff, rv), length)
         end
       end
     end
@@ -130,5 +150,24 @@ module Unicorn
       StringIO === @tmp ? @tmp.size : @tmp.stat.size
     end
 
+    # tee()s into +buf+ until it is of +length+ bytes (or until
+    # we've reached the Content-Length of the request body).
+    # Returns +buf+ (the exact object, not a duplicate)
+    # To continue supporting applications that need near-real-time
+    # streaming input bodies, this is a no-op for
+    # "Transfer-Encoding: chunked" requests.
+    def ensure_length(buf, length)
+      # @size is nil for chunked bodies, so we can't ensure length for those
+      # since they could be streaming bidirectionally and we don't want to
+      # block the caller in that case.
+      return buf if buf.nil? || @size.nil?
+
+      while buf.size < length && @size != @tmp.pos
+        buf << tee(length - buf.size, @buf2)
+      end
+
+      buf
+    end
+
   end
 end