about summary refs log tree commit homepage
path: root/lib/unicorn/tee_input.rb
DateCommit message (Collapse)
2011-02-25tee_input: remove old *BSD stdio workaround
Ruby 1.8.* users should get the latest Ruby 1.8.7 anyways since they contain critical bugfixes. We don't keep workarounds forever since the root problem is fixed/worked-around in upstream and people have had more than a year to upgrade Ruby.
2011-02-02Fix Ruby 1.9.3dev warnings
for i in `git ls-files '*.rb'`; do ruby -w -c $i; done
2010-12-09allow client_buffer_body_size to be tuned
Since modern machines have more memory these days and clients are sending more data, avoiding potentially slow filesystem operations for larger uploads can be useful for some applications.
2010-12-09tee_input: fix accounting error on corked requests
In case a request sends the header and buffer as one packet, TeeInput relying on accounting info from StreamInput is harmful as StreamInput will buffer in memory outside of TeeInput's control. This bug is triggered by calling env["rack.input"].size or env["rack.input"].rewind before to read.
2010-11-13tee_input: restore read position after #size
It's possible for an application to call size after it has read a few bytes/lines, so do not screw up a user's read offset when consuming input.
2010-11-12*_input: make life easier for subclasses/modules
Avoid having specific knowledge of internals in TeeInput and instead move that to StreamInput when dealing with byte counts. This makes things easier for Rainbows! which will need to extends these classes.
2010-11-11add stream_input class and build tee_input on it
We will eventually expose a Unicorn::StreamInput object as "rack.input" for Rack 2.x applications. StreamInput allows applications to avoid buffering input to disk, removing the (potentially expensive) rewindability requirement of Rack 1.x. TeeInput is also rewritten to build off StreamInput for simplicity. The only regression is that TeeInput#rewind forces us to consume an unconsumed stream before returning, a negligible price to pay for decreased complexity.
2010-11-07tee_input: switch to simpler API for parsing trailers
Not that anybody uses trailers extensively, but it's good to know it's there.
2010-10-07start using more compact parser API
This should be easier for Rainbows! to use
2010-10-05tee_input: use kgio to avoid stack traces on EOF
TeeInput methods may be invoked deep in the stack, so avoid giving them more work to do if a client disconnects due to a bad upload.
2010-10-05Unicorn::Util.tmpio => Unicorn::TmpIO.new
This is slightly shorter and hopefully easier to find.
2010-10-04tee_input: update interface to use HttpRequest
This should ensure we have less typing to do.
2010-07-11tee_input: fix constant resolution for client EOF
Noticed while hacking on a Zbatery-using application
2010-07-08cleanup "stringio" require
"stringio" is part of the Ruby distro and we use it in multiple places, so avoid re-requiring it.
2010-07-08tee_input: safer record separator ($/) handling
Different threads may change $/ during execution, so cache it at function entry to a local variable for safety. $/ may also be of a non-binary encoding, so rely on Rack::Utils.bytesize to portably capture the correct size. Our string slicing is always safe from 1.9 encoding: both our socket and backing temporary file are opened in binary mode, so we'll always be dealing with binary strings in this class (in accordance to the Rack spec).
2010-06-24tee_input: undent, avoid (re)-declaring "module Unicorn"
It makes RDoc look better and cleaner, since we don't do anything in the Unicorn namespace.
2010-06-24tee_input: allow tuning of client_body_buffer_size/io_size
Some folks may require more fine-grained control of buffering and I/O chunk sizes, so we'll support them (unofficially, for now).
2010-06-24tee_input: (nitpick) use IO#rewind instead of IO#seek(0)
no need to pass an extra argument
2010-06-14tee_input: update documentation for Rack 1.2
Rack 1.2 removed the +size+ method requirement, but we'll still support it since Rack 1.2 doesn't _prohibit_ it, and Rack 1.[01] applications will continue to exist for a while.
2010-02-27tee_input: do not #dup string buffers
It's a waste of memory bandwidth to do memcpy() when we know Unicorn::HttpParser (via rb_str_resize()) will allocate new memory for the string for us. An empty String is "free", as we've already paid the Object cost regardless.
2010-02-26tee_input: avoid instance variables, it's a struct
We'll use struct members exclusively from now on instead of throwing ivars into the mix. This allows us to _unofficially_ support direct access to more members easily. Unofficial extensions may include the ability to splice(2)/tee(2) for better performance. This also makes our object size smaller across all Ruby implementations as well, too (helps Rainbows! out).
2009-12-21tee_input: rdoc for all public methods
2009-11-15tee_input: client_error always raises
We do not hide unforseen exceptions, as that could cause us to waste precious time attempting to continue processing after errors.
2009-11-15tee_input: expand client error handling
First move it to a separate method, this allows subclasses to reuse our error handler. Additionally, capture HttpParserError as well since backtraces are worthless when a client sends us a bad request, too.
2009-11-13tee_input: fix comment from an intermediate commit
2009-11-13raise Unicorn::ClientShutdown if client aborts in TeeInput
Leaving the EOFError exception as-is bad because most applications/frameworks run an application-wide exception handler to pretty-print and/or log the exception with a huge backtrace. Since there's absolutely nothing we can do in the server-side app to deal with clients prematurely shutting down, having a backtrace does not make sense. Having a backtrace can even be harmful since it creates unnecessary noise for application engineers monitoring or tracking down real bugs.
2009-11-13tee_input: don't shadow struct members
It's confusing when a local variable reuses the same name as a struct member.
2009-11-11tee_input: better premature disconnect handling
Just let the error bubble all the way up to where Unicorn calls process_client where it'll be appropriately handled. Additionally, we'l just check the return value of tee() in ensure_length and avoid it if it nils on us.
2009-11-11tee_input: fix RDoc argument definition for tee
2009-11-05Util::tmpio returns a TmpIO that responds to #size
Subclass off the core File class so we don't have to worry about #size being defined. This will mainly be useful to Rainbows! but allows us to simplify our TeeInput implementation a little, too.
2009-11-04tee_input: do not clobber trailer buffer on partial uploads
Found in Rainbows! testing. Reusing the buffer when finalizing input for headers could be problematic because it would lead to the @buf2 instance variable being clobbered; allowing the trailers to "leak" into the body.
2009-10-25workaround FreeBSD/OSX IO bug for large uploads
Under FreeBSD writing to the file in sync mode does not change current position, so change position to the end of the file. Without this patch multipart post requests with large data (image uploading) does not work correctly: Status: 500 Internal Server Error bad content body /usr/local/lib/ruby/gems/1.8/gems/rack-1.0.0/lib/rack/utils.rb:347:in `parse_multipart' /usr/local/lib/ruby/gems/1.8/gems/rack-1.0.0/lib/rack/utils.rb:319:in `loop' /usr/local/lib/ruby/gems/1.8/gems/rack-1.0.0/lib/rack/utils.rb:319:in `parse_multipart' File position behavior under FreeBSD : ruby -v ruby 1.8.7 (2009-04-08 patchlevel 160) [i386-freebsd7] irb(main):001:0> b = File.new("abc", "w+") => #<File:abc> irb(main):002:0> b.sync = true => true irb(main):004:0> b.write("abc") => 3 irb(main):005:0> b.pos => 0 Acked-by: Eric Wong <normalperson@yhbt.net>
2009-10-06more-compatible TeeInput#read for POSTs with Content-Length
There are existing applications and libraries that don't check the return value of env['rack.input'].read(length) (like Rails :x). Those applications became broken under the IO#readpartial semantics of TeeInput#read when handling larger request bodies. We'll preserve the IO#readpartial semantics _only_ when handling chunked requests (as long as Rack allows it, it's useful for real-time processing of audio/video streaming uploads, especially with Rainbows! and mobile clients) but use read-in-full semantics for TeeInput#read on requests with a known Content-Length.
2009-09-27Remove "Z" constant for binary strings
We've started using magic comments to ensure any strings we create are binary instead. Additionally, ensure we create any StringIO objects with an explicit string (which default to binary) to ensure the StringIO object is binary. This is because StringIO.new (with no arguments) will always use the process-wide default encoding since it does not know about magic comments (and couldn't, really...)
2009-09-08"encoding: binary" comments for all sources (1.9)
This ensures any string literals that pop up in *our* code will just be a bag of bytes. This shouldn't affect/fix/break existing apps in most cases, but most constants will always have the "correct" encoding (none!) to be consistent with HTTP/socket expectations. Since this comment affects things only on a per-source basis, it won't affect existing apps with the exception of strings we pass to the Rack application. This will eventually allow us to get rid of that Unicorn::Z constant, too.
2009-08-20tee_input: fix rdoc
the docs for the TeeInput class was clobbering the Unicorn module documentation instead.
2009-08-15tee_input: make interface more usable outside of Unicorn
TeeInput being needed is now (once again) an uncommon code path so there's no point in relying on global constants. While we're at it, allow StringIO to be used in the presence of small inputs; too.
2009-08-10http: rename read_body to filter_body
This method is strictly a filter, it does no I/O so "read" is not an appropriate name to give it.
2009-08-09Switch to Ragel/C-based chunk/trailer parser
This should be more robust, faster and easier to deal with than the ugly proof-of-concept regexp-based ones.
2009-07-19Remove core Tempfile dependency (1.9.2-preview1 compat)
With the 1.9.2preview1 release (and presumably 1.9.1 p243), the Ruby core team has decided that bending over backwards to support crippled operating/file systems was necessary and that files must be closed before unlinking. Regardless, this is more efficient than using Tempfile because: 1) no delegation is necessary, this is a real File object 2) no mkdir is necessary for locking, we can trust O_EXCL to work properly without unnecessary FS activity 3) no finalizer is needed to unlink the file, we unlink it as soon as possible after creation.
2009-07-16move all #gets logic to tee_input out of chunked_reader
This simplifies chunked_reader substantially with a slight increase in tee_input complexity. This is beneficial because chunked_reader is more complex to begin with and more likely to experience correctness issues.
2009-07-01Force streaming input onto apps by default
This change gives applications full control to deny clients from uploading unwanted message bodies. This also paves the way for doing things like upload progress notification within applications in a Rack::Lint-compatible manner. Since we don't support HTTP keepalive, so we have more freedom here by being able to close TCP connections and deny clients the ability to write to us (and thus wasting our bandwidth). While I could've left this feature off by default indefinitely for maximum backwards compatibility (for arguably broken applications), Unicorn is not and has never been about supporting the lowest common denominator.
2009-07-01tee_input: avoid ignoring initial body blob
This was causing the first part of the body to be missing when an HTTP client failed to delay between sending the header and body in the request.
2009-06-30TeeInput: use only one IO for tempfile
We can actually just use one IO and file descriptor here and simplify the code while we're at it.
2009-06-29tee_input: avoid rereading fresh data
Oops!
2009-06-29Make TeeInput easier to use
The complexity of making the object persistent isn't worth the potential performance gain here.
2009-06-29tee_input: avoid defining a @rd.size method
We don't ever expose the @rd object to the public so Rack-applications won't ever call size() on it.
2009-06-25tee_input: Don't expose the @rd object as a return value
Pay a performance penalty and always proxy reads through our TeeInput object to ensure nobody closes our internal reader.
2009-06-09Avoid duplicating the "Z" constant
Trying not to repeat ourselves. Unfortunately, Ruby 1.9 forces us to actually care about encodings of arbitrary byte sequences.
2009-06-06Put copyright text in new files, include GPL2 text
Just clarifying the license terms of the new code. Other files should really have this notice in there as well.