From 6945342a1f0a4caaa918f2b0b1efef88824439e0 Mon Sep 17 00:00:00 2001 From: Eric Wong Date: Fri, 5 Jun 2009 18:03:46 -0700 Subject: Transfer-Encoding: chunked streaming input support This adds support for handling POST/PUT request bodies sent with chunked transfer encodings ("Transfer-Encoding: chunked"). Attention has been paid to ensure that a client cannot OOM us by sending an extremely large chunk. This implementation is pure Ruby as the Ragel-based implementation in rfuzz didn't offer a streaming interface. It should be reasonably close to RFC-compliant but please test it in an attempt to break it. The more interesting part is the ability to stream data to the hosted Rack application as it is being transferred to the server. This can be done regardless if the input is chunked or not, enabling the streaming of POST/PUT bodies can allow the hosted Rack application to process input as it receives it. See examples/echo.ru for an example echo server over HTTP. Enabling streaming also allows Rack applications to support upload progress monitoring previously supported by Mongrel handlers. Since Rack specifies that the input needs to be rewindable, this input is written to a temporary file (a la tee(1)) as it is streamed to the application the first time. Subsequent rewinded reads will read from the temporary file instead of the socket. Streaming input to the application is disabled by default since applications may not necessarily read the entire input body before returning. Since this is a completely new feature we've never seen in any Ruby HTTP application server before, we're taking the safe route by leaving it disabled by default. Enabling this can only be done globally by changing the Unicorn HttpRequest::DEFAULTS hash: Unicorn::HttpRequest::DEFAULTS["unicorn.stream_input"] = true Similarly, a Rack application can check if streaming input is enabled by checking the value of the "unicorn.stream_input" key in the environment hashed passed to it. All of this code has only been lightly tested and test coverage is lacking at the moment. [1] - http://tools.ietf.org/html/rfc2616#section-3.6.1 --- lib/unicorn/chunked_reader.rb | 96 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 96 insertions(+) create mode 100644 lib/unicorn/chunked_reader.rb (limited to 'lib/unicorn/chunked_reader.rb') diff --git a/lib/unicorn/chunked_reader.rb b/lib/unicorn/chunked_reader.rb new file mode 100644 index 0000000..b9178a9 --- /dev/null +++ b/lib/unicorn/chunked_reader.rb @@ -0,0 +1,96 @@ +module Unicorn; end + +module Unicorn + class ChunkedReader + + Z = '' + Z.force_encoding(Encoding::BINARY) if Z.respond_to?(:force_encoding) + + def initialize + @input = @buf = nil + @chunk_left = 0 + end + + def reopen(input, buf) + buf ||= Z.dup + buf.force_encoding(Encoding::BINARY) if buf.respond_to?(:force_encoding) + @input, @buf = input, buf + parse_chunk_header + self + end + + def readpartial(max, buf = Z.dup) + buf.force_encoding(Encoding::BINARY) if buf.respond_to?(:force_encoding) + + while @input && @chunk_left <= 0 && ! parse_chunk_header + @buf << @input.readpartial(Const::CHUNK_SIZE, buf) + end + + if @input + begin + @buf << @input.read_nonblock(Const::CHUNK_SIZE, buf) + rescue Errno::EAGAIN, Errno::EINTR + end + end + + max = @chunk_left if max > @chunk_left + buf.replace(last_block(max) || Z) + @chunk_left -= buf.size + (0 == buf.size && @input.nil?) and raise EOFError + buf + end + + def gets + line = nil + begin + line = readpartial(Const::CHUNK_SIZE) + begin + if line.sub!(%r{\A(.*?#{$/})}, Z) + @chunk_left += line.size + @buf = @buf ? (line << @buf) : line + return $1.dup + end + line << readpartial(Const::CHUNK_SIZE) + end while true + rescue EOFError + return line + end + end + + private + + def last_block(max = nil) + rv = @buf + if max && rv && max < rv.size + @buf = rv[max - rv.size, rv.size - max] + return rv[0, max] + end + @buf = Z.dup + rv + end + + def parse_chunk_header + buf = @buf + # ignoring chunk-extension info for now, I haven't seen any use for it + # (or any users, and TE:chunked sent by clients is rare already) + # if there was not enough data in buffer to parse length of the chunk + # then just return + if buf.sub!(/\A(?:\r\n)?([a-fA-F0-9]{1,8})[^\r]*?\r\n/, Z) + @chunk_left = $1.to_i(16) + if 0 == @chunk_left # EOF + buf.sub!(/\A\r\n(?:\r\n)?/, Z) # cleanup for future requests + @input = nil + end + return @chunk_left + end + + buf.size > 256 and + raise HttpParserError, + "malformed chunk, chunk-length not found in buffer: " \ + "#{buf.inspect}" + nil + end + + end + +end -- cgit v1.2.3-24-ge0c7