path: root/lib/unicorn/http_request.rb
Date Commit message
2009-06-05 Transfer-Encoding: chunked streaming input support
This adds support for handling POST/PUT request bodies sent with chunked transfer encodings ("Transfer-Encoding: chunked"). Attention has been paid to ensure that a client cannot OOM us by sending an extremely large chunk.
This implementation is pure Ruby as the Ragel-based implementation in rfuzz didn't offer a streaming interface. It should be reasonably close to RFC-compliant but please test it in an attempt to break it.
The more interesting part is the ability to stream data to the hosted Rack application as it is being transferred to the server. This can be done regardless of whether the input is chunked or not. Enabling the streaming of POST/PUT bodies allows the hosted Rack application to process input as it receives it. See examples/echo.ru for an example echo server over HTTP.
Enabling streaming also allows Rack applications to support upload progress monitoring previously supported by Mongrel handlers.
Since Rack specifies that the input needs to be rewindable, this input is written to a temporary file (a la tee(1)) as it is streamed to the application the first time. Subsequent reads after a rewind will read from the temporary file instead of the socket.
Streaming input to the application is disabled by default since applications may not necessarily read the entire input body before returning. Since this is a completely new feature we've never seen in any Ruby HTTP application server before, we're taking the safe route by leaving it disabled by default.
Enabling this can only be done globally by changing the Unicorn HttpRequest::DEFAULTS hash:
    Unicorn::HttpRequest::DEFAULTS["unicorn.stream_input"] = true
Similarly, a Rack application can check if streaming input is enabled by checking the value of the "unicorn.stream_input" key in the environment hash passed to it.
All of this code has only been lightly tested and test coverage is lacking at the moment.
[1] - http://tools.ietf.org/html/rfc2616#section-3.6.1
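The tee(1)-style rewindable input described above can be sketched roughly as follows. This is a minimal illustration, not Unicorn's actual code: the `TeeReader` class name, its method set, and the use of a `StringIO` in place of a client socket are all assumptions made for the example.

```ruby
require "stringio"
require "tempfile"

# Hypothetical sketch of tee(1)-style input; not Unicorn's actual class.
class TeeReader
  def initialize(source)
    @source = source                  # socket-like object responding to #read(length)
    @tmp = Tempfile.new("tee_sketch") # rewindable copy of everything read so far
    @tmp.binmode
    @source_done = false
  end

  # Returns up to +length+ bytes, or nil once the body is exhausted.
  def read(length)
    cached = @tmp.read(length)        # nil when positioned at the end of the copy
    return cached if cached && !cached.empty?
    return nil if @source_done

    chunk = @source.read(length)
    if chunk.nil? || chunk.empty?
      @source_done = true
      return nil
    end
    @tmp.write(chunk)                 # the copy is at EOF here, so this appends
    chunk
  end

  def rewind
    @tmp.rewind
  end

  def close
    @tmp.close!                       # close and unlink the temporary file
  end
end

input = TeeReader.new(StringIO.new("hello world"))
first_pass = input.read(5) + input.read(100)  # streamed from the source
input.rewind
second_pass = input.read(100)                 # served from the temporary file
input.close
```

The first pass pulls bytes from the source and copies them aside; after `rewind`, the same bytes come back from the temporary file without touching the source again.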
2009-06-05 http_request: fix typo for 1.9
2009-05-31 http_request: StringIO is binary for empty bodies (1.9)
2009-05-30 http_request: no need to reset the request
That method no longer exists, but Ruby would never know until it tried to run it. Yes, I miss my compiled languages.
2009-05-28 Make our HttpRequest object a global constant
This should be faster/cheaper than using an instance variable since it's accessed in a critical code path. Unicorn was never designed to be reentrant or thread-safe at all, either.
2009-05-11 HttpRequest::DEF_PARAMS => HttpRequest::DEFAULTS
Give this a more palatable name and unfreeze it, allowing users to modify it more easily.
2009-05-10 http_request: use Rack::InputWrapper-compatible methods
This allows alternative I/O implementations to be easier to use with Unicorn...
2009-05-04 Inline and remove the HttpRequest#reset method
This potentially leaves an open file handle around until the next request hits the process, but it makes the common case faster.
2009-05-03 http_request: switch to readpartial over sysread
readpartial is actually as low-level as sysread is, except it's less likely to throw exceptions and won't change the blocking/non-blocking status of a file descriptor (we explicitly enable blocking I/O).
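As a small illustration of how `IO#readpartial` behaves, the sketch below uses a pipe in place of a client socket (an assumption made so the example is self-contained). It returns whatever data is already available instead of waiting for the full requested length, and signals end of input with `EOFError`.

```ruby
reader, writer = IO.pipe
writer.write("GET / HTTP/1.1\r\n\r\n")

# Returns as soon as some data is available instead of waiting for
# all 16384 bytes to arrive.
data = reader.readpartial(16_384)

# Once the writing side is closed, readpartial raises EOFError.
writer.close
eof_seen = begin
  reader.readpartial(16_384)
  false
rescue EOFError
  true
end
reader.close
```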
2009-05-03 http_request: avoid StringIO.new for GET/HEAD requests
Since the vast majority of web traffic is GET/HEAD requests without bodies, avoid creating a StringIO object for every single request that comes in.
2009-04-25 Rack 1.0.0 compatibility
Keep in mind that it's plenty possible to use Unicorn as a library without using Rack itself. Most of the unit tests do not depend on Rack, for example.
2009-04-23 http_request: micro optimizations
This leads to a ~10% improvement in test/benchmark/request.rb. Some of these changes will need to be reworked for multi-threaded servers (Mongrel), but Unicorn will always be single-threaded.
2009-04-23 Get rid of UNICORN_TMP_BASE constant
It was just a waste of space and would've caused line wrapping. This reinstates the "unicorn" prefix when we create tempfiles, too.
2009-04-23 Fix data corruption with small uploads via browsers
StringIO.new(partial_body) does not update the offset for new writes. So instead create the StringIO object and then syswrite to it and try to follow the same code path used by large uploads which use Tempfiles.
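The offset problem is easy to reproduce in isolation. The sketch below is an illustration rather than the code from this commit: the variable names are made up, and it uses `StringIO#write` where the commit mentions syswrite.

```ruby
require "stringio"

partial_body = "abc"

# Problem: the write position starts at 0, so later writes clobber the
# bytes that were passed to the constructor.
broken = StringIO.new(String.new(partial_body))
broken.write("XY")
broken_result = broken.string   # "XYc"

# Approach described in the commit: start empty and write everything,
# so the position always sits after the data already received.
fixed = StringIO.new(String.new)
fixed.write(partial_body)
fixed.write("XY")
fixed_result = fixed.string     # "abcXY"
```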
2009-04-21 Stop extending core classes
This removes the #unicorn_peeraddr methods from TCPSocket and UNIXSocket core classes. Instead, just move that logic into the only place it needs to be used in HttpRequest.
2009-04-21 HttpParser: set QUERY_STRING for Rack-compliance
2009-04-21 http_request: freeze modifiable elements
Otherwise applications can change them behind our back and affect subsequent requests.
2009-04-21 Move absolute URI parsing into HTTP parser
It's part of HTTP/1.1 (RFC 2616), so we might as well handle it in there and set PATH_INFO while we're at it. Also, make the "OPTIONS *" test not fail Rack::Lint.
2009-04-08 http11: handle "X-Forwarded-Proto: https"
Pass "https" to "rack.url_scheme" if the X-Forwarded-Proto header matches "https". X-Forwarded-Proto is a semi-standard header that Ruby frameworks seem to respect; so we use that. We won't support ENV['HTTPS'] since that can only be set at start time and some app servers supporting https also support http. Currently, "rack.url_scheme" only allows "http" and "https", so we won't set anything else to avoid breaking Rack::Lint.
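The rule can be expressed in a few lines of Ruby. This is only a sketch: the `url_scheme` helper is hypothetical (the commit puts the logic in the http11 parser), and it assumes the header reaches the environment hash under the usual `HTTP_X_FORWARDED_PROTO` key.

```ruby
# Hypothetical helper, not part of Unicorn: anything other than an exact
# "https" match falls back to "http", the only other value allowed.
def url_scheme(env)
  env["HTTP_X_FORWARDED_PROTO"] == "https" ? "https" : "http"
end

behind_tls_proxy = url_scheme("HTTP_X_FORWARDED_PROTO" => "https")
plain_request    = url_scheme({})
odd_value        = url_scheme("HTTP_X_FORWARDED_PROTO" => "ftp")
```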
2009-03-29 http11: use :http_body instead of "HTTP_BODY"
"HTTP_BODY" could conflict with a "Body:" HTTP header if there ever is one. Also, try to hide this body from the Rack environment before @app is called since it is only used by Unicorn internally.
2009-03-27 Always try to send a valid HTTP response back
This reworks error handling throughout the entire stack to be more Ruby-ish. Exceptions are raised instead of forcing us to check return values.
If a client is sending us a bad request, we send a 400. If unicorn or app breaks in an unexpected way, we'll send a 500.
Both of these last-resort error responses are sent using IO#write_nonblock to avoid tying Unicorn up longer than necessary and all exceptions raised are ignored.
Sending a valid HTTP response back should reduce the chance of us being marked as down or broken by a load balancer. Previously, some load balancers would mark us as down if we close a socket without sending back a valid response; so make a best effort to send one. If for some reason we cannot write a valid response, we're still susceptible to being marked as down.
A successful HttpResponse.write() call will now close the socket immediately (instead of doing it higher up the stack). This ensures the errors will never get written to the socket on a successful response.
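A best-effort error writer along these lines might look like the sketch below. The `last_resort_response` helper and its return values are assumptions made for illustration, and a pipe stands in for the client socket; only the use of `IO#write_nonblock` and the ignore-all-errors policy come from the commit message.

```ruby
# Hypothetical helper, not Unicorn's actual code.
def last_resort_response(io, status_line)
  io.write_nonblock("HTTP/1.1 #{status_line}\r\n\r\n")
  true
rescue StandardError
  false   # best effort only: every error is deliberately ignored
ensure
  begin
    io.close
  rescue StandardError
    nil
  end
end

reader, writer = IO.pipe
sent = last_resort_response(writer, "400 Bad Request")
received = reader.read   # the writing side is closed, so this reads to EOF
reader.close

# A second attempt on the already-closed IO fails quietly instead of raising.
second_attempt = last_resort_response(writer, "500 Internal Server Error")
```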
2009-03-27 Remove needless line break
2009-03-25 Merge commit 'v0.2.3'
* commit 'v0.2.3':
  unicorn 0.2.3
  Ensure Tempfiles are unlinked after every request
  Don't bother unlinking UNIX sockets
Conflicts:
  lib/unicorn/socket.rb
2009-03-25 Ensure Tempfiles are unlinked after every request
Otherwise we bloat TMPDIR and run the host out of space, oops!
2009-03-24 simplify the HttpParser interface
This cuts the HttpParser interface down to the #execute and #reset methods. HttpParser#execute will return true if it completes and false if it does not. http->nread state is kept internally so we don't have to keep track of it in Ruby; removing one parameter from #execute. HttpParser#reset is unchanged.
All errors are handled through exceptions anyway, so the HttpParser#error? method stopped being useful.
Also added some more unit tests to the HttpParser since I know some folks are (rightfully) uncomfortable with changing stable C code. We now have tests for incremental parsing.
In summary, we have:
* more test cases
* less C code
* simpler interfaces
* small performance improvement
=> win \o/
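To make the shape of that interface concrete, here is a toy pure-Ruby stand-in. It is not the C parser: the `ToyParser` class, the `execute(env, buffer)` argument order, and the fields it fills in are assumptions for illustration. What it mirrors is the contract described above: `#execute` returns true or false, the scan position is kept internally, and errors surface as exceptions.

```ruby
# Toy stand-in for the parser; hypothetical, not Unicorn's C extension.
class ToyParser
  def initialize
    reset
  end

  def reset
    @nread = 0   # how much of the buffer has already been scanned (internal)
    self
  end

  # Returns true once the header block is complete, false if more data is needed.
  def execute(env, buffer)
    # Back up 3 characters so a terminator split across reads is still found.
    start = [@nread - 3, 0].max
    finish = buffer.index("\r\n\r\n", start)
    if finish.nil?
      @nread = buffer.size
      return false
    end

    request_line = buffer[0, buffer.index("\r\n")]
    verb, path, = request_line.split(" ", 3)
    raise ArgumentError, "malformed request line" if path.nil?

    env["REQUEST_METHOD"] = verb
    env["PATH_INFO"] = path
    true
  end
end

# Incremental use: keep appending to one buffer until execute returns true.
parser = ToyParser.new
env = {}
buffer = String.new
done_after_first  = parser.execute(env, buffer << "GET /index.html HT")
done_after_second = parser.execute(env, buffer << "TP/1.1\r\nHost: example.com\r\n\r\n")
```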
2009-03-24 HttpRequest: small improvement for GET requests
Most HTTP requests are GET requests and the majority of those GET requests are complete after one sysread. This is especially true since we're optimized for fast clients. So short-circuit the extra checks and trust our HTTP parser implementation to do the right thing (we have decent unit tests for it).
2009-03-22 Streamline rack environment generation
Ensure constants are used as hash keys and clean up unused constants. This gives a 10-15% improvement with test/benchmark/request.rb.
2009-03-21 HttpRequest: correctly reference logger
2009-03-21 http11: don't set headers Rack doesn't like
Fix the logic in HttpParser up front so we don't have to mess around with the following convoluted steps:
1. setting the HTTP_CONTENT_{LENGTH,TYPE} headers
2. reading the HTTP_CONTENT_{LENGTH,TYPE} headers again
3. setting the CONTENT_{LENGTH,TYPE} based on the HTTP_-prefixed one
4. deleting the HTTP_CONTENT_{LENGTH,TYPE} headers (since Rack doesn't like them)
1, 2, 3 were in the C code, 4 was in Ruby.
Now the logic is:
1. if CONTENT_{LENGTH,TYPE} headers are seen, don't prefix with "HTTP_".
All the branch logic for the new code is done at init time, too, so there's no additional overhead in the HTTP parsing phase. There's also no additional overhead of hash lookups in the extra steps.
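The resulting naming rule can be sketched in Ruby as follows. The `rack_key` helper is hypothetical and only illustrates the mapping; the real logic lives in the C parser and is set up at init time rather than computed per header.

```ruby
# Hypothetical helper illustrating the naming rule; not Unicorn's code.
def rack_key(header_name)
  key = header_name.upcase.tr("-", "_")
  # Content-Length and Content-Type are the two headers left unprefixed.
  %w[CONTENT_LENGTH CONTENT_TYPE].include?(key) ? key : "HTTP_#{key}"
end

length_key = rack_key("Content-Length")
type_key   = rack_key("Content-Type")
other_key  = rack_key("X-Forwarded-Proto")
```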
2009-03-10 HttpRequest: update comment regarding short writes v0.1.0
Or lack thereof on POSIX.
2009-03-10 HttpRequest: set binmode on tempfiles
Just in case this stupid Ruby 1.9-ism creeps up on someone; I haven't been able to reproduce I/O corruption from the test cases, but better safe than sorry here.
2009-03-03 Allow stderr_path and stdout_path to be set in the config
As opposed to doing this in the shell, this allows the files to be reopened reliably after rotation. While we're at it, use $stderr/$stdout instead of STDERR/STDOUT since they seem to be more favored.
2009-02-25 rename http11 => unicorn/http11
Avoid conflicting with existing (and future) Mongrel installs in case either changes. Of course, this also allows us more freedom to experiment and break the API if needed... However, I'm only planning on making minor changes to reduce the amount of C code we have to maintain and possibly some minor performance improvements.
2009-02-13 Remove tempfile reuse from HttpRequest, upload tests
Tempfile reuse was over-engineered and the problem was not nearly as big a problem as initially thought. Additionally, it could lead to a subtle bug in applications that link(2) or rename(2) the temporary file to a permanent location _without_ closing it after the request is done. Applications that suffer from the problem of directory bloat are still free to modify ENV['TMPDIR'] to influence the creation of Tempfiles.
2009-02-09 Refactor and get exec + FD inheritance working
Along with worker process management. This is nginx-style inplace upgrading (I don't know of another web server that does this). Basically we can preserve our opened listen sockets across entire executable upgrades.
Signals:
USR2 - Sending USR2 to the master unicorn process will cause it to exec a new master and keep the original workers running. This is useful to validate that the new code changes are valid and don't immediately die. Once the changes are validated (manually), you may send QUIT to the original master process to have it gracefully exit.
HUP - Sending this to the master will make it immediately exec a new binary and cause the old workers to gracefully exit. Use this if you're certain the latest changes to Unicorn (and your app) are ready and don't need validating.
Unlike nginx, re-execing a new binary will pick up any and all configuration changes. However listener sockets cannot be removed when exec-ing; only added (for now).
I apologize for making such a big change in one commit, but once I got the ability to replace the entire codebase while preserving connections, it was too tempting to continue working. So I wrote a large chunk of this while hitting the unicorn-hello-world app with the following loop:
    while curl -vSsfN http://0:8080; do date +%N; done
_Zero_ requests lost across multiple restarts.
2009-02-09 HttpRequest: restart read(2) on EINTR
Since we handle signals, read(2) syscalls can fail on sockets with EINTR. Restart the call if we hit this.
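The restart-on-EINTR pattern can be sketched like this. Both the `read_restarting` helper and the `InterruptedOnce` fake socket are hypothetical, introduced only so the retry path can be exercised without real signals.

```ruby
# Hypothetical wrapper; +io+ is any object responding to #sysread.
def read_restarting(io, length)
  io.sysread(length)
rescue Errno::EINTR
  retry   # the call was interrupted by a signal: issue it again
end

# Fake socket that is interrupted once before returning data.
class InterruptedOnce
  attr_reader :calls

  def initialize(data)
    @data = data
    @calls = 0
  end

  def sysread(_length)
    @calls += 1
    raise Errno::EINTR if @calls == 1
    @data
  end
end

sock = InterruptedOnce.new("body")
result = read_restarting(sock, 4096)
```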
2009-02-09 Refactor HTTP Request processing into HttpRequest
Keeping I/O out of unicorn.rb
2009-02-09 Skip EINTR/EAGAIN handling with syswrite
I'll be removing signal handling from worker processes...
2009-02-09 Use a persistent buffer with HttpRequest
This allows us to avoid the overhead of allocating a new buffer each and every time we call sysread (even when just parsing headers for GET requests).
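Ruby's `IO#sysread` accepts an optional output buffer as its second argument, which is what makes this kind of reuse possible. The sketch below uses a pipe in place of a client socket, and the variable names are illustrative assumptions.

```ruby
reader, writer = IO.pipe
buffer = String.new   # allocated once, reused for every read

writer.write("first")
returned = reader.sysread(16_384, buffer)
same_object = returned.equal?(buffer)   # sysread hands back the buffer itself
first_contents = buffer.dup

writer.write("second request")
reader.sysread(16_384, buffer)          # contents are replaced in place
second_contents = buffer.dup

reader.close
writer.close
```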
2009-02-09 HttpRequest#reset! => HttpRequest#reset
Keep this somewhat consistent with the HttpParser API which also exposes #reset instead of #reset!
2009-02-09 Make HttpRequest object (and temp files) persistent
This will help prevent TMPDIR from becoming bloated when handling thousands of large uploads a day. This is a problem in many UNIX filesystems (including ext3): names of entries never expire even after files are gone and the only way to clear it is to get rid of the directory itself.
2009-02-09 Don't set SCRIPT_NAME to "/" and then clear it for Rack
It's pointless...
2009-02-09 HttpRequest: avoid repeated hash lookups for HTTP_BODY
read_body can be a long-running loop; so avoid wasting CPU cycles by repeatedly performing a hash lookup to get to a temporary buffer.
2009-02-09 Remove threading and use worker processes instead
All tests for threading and semaphores have been removed. One test was changed because it depended on a shared variable. Tests will be replaced with tests to do process management instead.
2009-02-09 s/Mongrel/Unicorn/g
Avoid conflicting with existing Mongrel libraries since we'll be incompatible; this lets us break things without disrupting Mongrel installations.