about summary refs log tree commit homepage
path: root/lib/unicorn/http_request.rb
DateCommit message (Collapse)
2019-05-26remove kgio from all read(2) and write(2) wrappers no-kgio-wip
It's fairly easy given unicorn was designed with synchronous I/O in mind. The overhead of backtraces from EOFError on readpartial should be rare given our requirement to only accept requests from fast, reliable clients on LAN (e.g. nginx or yet-another-horribly-named-server).
2018-10-18doc: update more URLs to use HTTPS and avoid redirects
Latency from redirects is painful, and HTTPS can protect privacy in some cases.
2018-08-20shrink pipes under Linux
We have never had any need for pipes with the default 64K capacity on Linux. Our pipes are only used for tiny writes in signal handlers and to perform parent shutdown detection. With the current /proc/sys/fs/pipe-user-pages-soft default, only 1024 pipes can be created by an unprivileged user before Linux clamps down the pipe size to 4K (a single page) for newly-created pipes[1]. So avoid penalizing OTHER pipe users who could benefit from the increased capacity and use only a single page for ourselves. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pipe.c?h=v4.18#n642
2017-12-16avoid reusing env on hijack
Hijackers may capture and reuse `env' indefinitely, so we must not use it in those cases for future requests. For non-hijack requests, we continue to reuse the `env' object to reduce memory recycling. Reported-and-tested-by: Sam Saffron <sam.saffron@gmail.com>
2017-03-26Check for Socket::TCP_INFO constant before trying to get TCP_INFO
The ruby constant Socket::TCP_INFO is only defined if TCP_INFO is defined in C, so we can just check for the presence of that ruby constant instead of rescuing SocketError from the call to getsockopt.
2017-03-24Check for SocketError on first ccc attempt
On OpenBSD, getsockopt(2) does not support TCP_INFO. With the current code, this results in a 500 for all clients if check_client_connection is enabled on OpenBSD. This patch rescues SocketError on the first getsockopt call, and if SocketError is raised, it doesn't check in the future. This should be the same behavior as if TCP_INFO was supported but inspect did not return a string in the expected format.
2017-03-21http_request: support proposed Raindrops::TCP states on non-Linux
raindrops 0.18+ will have Raindrops::TCP state hash for portable mapping of TCP states to their respective numeric values. This was necessary because TCP state numbers (and even macro names) differ between FreeBSD and Linux (and possibly other OSes). Favor using the Raindrops::TCP state hash if available, but fall back to the hard-coded values since older versions of raindrops did not support TCP_INFO on non-Linux systems. While we're in the area, favor "const_defined?" over "defined?" to reduce the inline constant cache footprint for branches which are only evaluated once. Patches to implement Raindrops::TCP for FreeBSD are available at: https://bogomips.org/raindrops-public/20170316031652.17433-1-e@80x24.org/T/
2017-03-14http_request: reduce insn size for check_client_connection
Unlike constants and instance variables, class variable access is not optimized in the mainline Ruby VM. Use a constant instead, to take advantage of inline constant caching. This further reduces runtime instruction size by avoiding a branch by allocating the Raindrops::TCP_Info object up front. This reduces the method size by roughly 300 bytes on 64-bit.
2017-03-08support "struct tcp_info" on non-Linux and Ruby 2.2+
Ruby 2.2+ can show "struct tcp_info" as a string via Socket::Option#inspect, and we can attempt to parse it out to extract the information we need. Parsing this string is inefficient, but does not depend on the ordering of the tcp_info struct.
2017-03-08revert signature change to HttpServer#process_client
We can force kgio_tryaccept to return an internal class for TCP objects by subclassing Kgio::TCPServer. This avoids breakage in any unfortunate projects which depend on our undocumented internal APIs, such as gctools <https://github.com/tmm1/gctools>
2017-03-08check_client_connection: use tcp state on linux ccc-tcp-v2
* Use a frozen empty array and a class variable for TCP_Info to avoid garbage. As far as I can tell, this shouldn't result in any garbage on any requests (other than on the first request). * Pass listener socket to #read to only check the client connection on a TCP server. * Short circuit CLOSE_WAIT after ESTABLISHED since in my testing it's the most common state after ESTABLISHED, it makes the numbers un-ordered, though. But comment should make it OK. * Definition of of `check_client_connection` based on whether Raindrops::TCP_Info is defined, instead of the class variable approach. * Changed the unit tests to pass a `nil` listener. Tested on our staging environment, and still works like a dream. I should note that I got the idea between this patch into Puma as well! https://github.com/puma/puma/pull/1227 [ew: squashed in temporary change for oob_gc.rb, but we'll come up with a different change to avoid breaking gctools <https://github.com/tmm1/gctools>] Acked-by: Eric Wong <e@80x24.org>
2017-02-13http_request: freeze constant strings passed IO#write
This ensures we won't have duplicate objects in Ruby 2.0-2.4. For Ruby 2.5.0dev+, this avoids any duplicate cleanup introduced as of r57471: https://bugs.ruby-lang.org/issues/13085
2015-06-30reduce constants and optimize for Ruby 2.2
Ruby (MRI) 2.1 optimizes allocations away on String#freeze with literal strings. Furthermore, Ruby 2.2 optimizes away literal string allocations when they are used as arguments to Hash#[] and Hash#[]= Thus we can avoid expensive constant lookups and cache overhead by taking advantage of advancements in Ruby. Since Ruby 2.2 has been out for 7 months, now; it ought to be safe to introduce minor performance regressions for folks using older Rubies (1.9.3+ remains supported) to benefit folks on the latest Ruby. This should recover the performance lost in the "reflect changes in Rack::Utils::HTTP_STATUS_CODES" change in synthetic benchmarks.
2015-06-06http: move response_start_sent into the C ext
Combined with the previous commit to eliminate the `@socket' instance variable, this eliminates the last instance variable in the Unicorn::HttpRequest class. Eliminating the last instance variable avoids the creation of a internal hash table used for implementing the "generic" instance variables found in non-pure-Ruby classes. Method entry overhead remains the same. While this change doesn't do a whole lot for unicorn memory usage where the HttpRequest is a singleton, it helps other HTTP servers which rely on this code where thousands of clients may be connected.
2015-06-06move the socket into Rack env for hijacking
This avoids the expensive generic instance variable for @socket and exposes the socket as `env["unicorn.socket"]' to the Rack application. As as nice side-effect, applications may access `env["unicorn.socket"]' as part of the API may be useful for 3rd-party bits such as Raindrops::TCP_Info for reading the tcp_info struct on Linux-based systems. Yes, `env["unicorn.socket"]' is a proprietary API in unicorn! News at 11! But then again, unicorn is not the first Rack server to implement `env["#{servername}.socket"]', either...
2015-05-16avoid extra allocation for hijack proc creation
proc creation is expensive, so merely use a 48-byte generic ivar hash slot for @socket instead.
2015-05-16http_request: support rack.hijack by default
Rack 1.4 and earlier will soon die out, so avoid having extra code The only minor overhead is assigning two hash slots and the extra hash checks when running ancient versions of Rack, so it is unlikely anybody cares about that overhead with Rack 1.5 and later.
2015-03-01reduce and localize constant string use
Literal String#freeze avoids allocations since Ruby 2.1 via the opt_str_freeze instruction, so we can start relying on it in some places as Ruby 2.1 adoption increases. The 100-continue handling is a good place to start since it is an uncommonly-used code path which benefits from size reduction and the negative performance impact is restricted to a handful of users. HTTP_RESPONSE_START can safely live in http_request.rb as its usage does not cross namespace boundaries The goal is to eventually eliminate Unicorn::Const entirely.
2013-02-09http_request: drop conditional assignment for hijack
As far as I can tell, this was never necessary.
2013-02-08http_request: remove FIXME for rack.version clarification
commit a9474624a148fe58e0944664190b259787dcf51e in rack.git
2013-01-22ignore normal Rack response at request-time hijack
Once a connection is hijacked, we ignore it completely and leave the connection at the mercy of the application.
2013-01-22support for Rack hijack in request and response
Rack 1.5.0 (protocol version [1,2]) adds support for hijacking the client socket (removing it from the control of unicorn (or any other Rack webserver)). Tested with rack 1.5.0.
2012-11-29Begin writing HTTP request headers early to detect disconnected clients
This patch checks incoming connections and avoids calling the application if the connection has been closed. It works by sending the beginning of the HTTP response before calling the application to see if the socket can successfully be written to. By enabling this feature users can avoid wasting application rendering time only to find the connection is closed when attempting to write, and throwing out the result. When a client disconnects while being queued or processed, Nginx will log HTTP response 499 but the application will log a 200. Enabling this feature will minimize the time window during which the problem can arise. The feature is disabled by default and can be enabled by adding 'check_client_connection true' to the unicorn config. [ew: After testing this change, Tom Burns wrote: So we just finished the US Black Friday / Cyber Monday weekend running unicorn forked with the last version of the patch I had sent you. It worked splendidly and helped us handle huge flash sales without increased response time over the weekend. Whereas in previous flash traffic scenarios we would see the number of HTTP 499 responses grow past the number of real HTTP 200 responses, over the weekend we saw no growth in 499s during flash sales. Unexpectedly the patch also helped us ward off a DoS attack where the attackers were disconnecting immediately after making a request. ref: <CAK4qKG3rkfVYLyeqEqQyuNEh_nZ8yw0X_cwTxJfJ+TOU+y8F+w@mail.gmail.com> ] Signed-off-by: Eric Wong <normalperson@yhbt.net>
2011-05-04http_parser: new add_parse method
Combines the following sequence: http_parser.buf << socket.readpartial(0x4000) http_parser.parse Into: http_parser.add_parse(socket.readpartial(0x4000)) It was too damn redundant otherwise...
2011-04-13http_parser: remove RDoc
It's not needed for users, so avoid confusing them. Unicorn itself is not intended to be an API, it just hosts Rack applications.
2011-01-05http_parser: add clear method, deprecate reset
But allows small optimizations to be made to avoid constant/instance variable lookups later :)
2010-11-11configurator: enable "rewindable_input" directive
This allows users to override the current Rack spec and disable the rewindable input requirement. This can allow applications to use less I/O to minimize the performance impact when processing uploads.
2010-10-07start using more compact parser API
This should be easier for Rainbows! to use
2010-10-05upgrade to kgio 1.2.0
This provides the kgio_read! method which is like readpartial, only significantly cheaper when a client disconnects on us.
2010-10-05start using kgio, the kinder, gentler I/O library
This should hopefully make the non-blocking accept() situation more tolerable under Ruby 1.9.2.
2010-10-04http_request: reformat and small reorg
This hides more HTTP request logic inside our object.
2010-10-04tee_input: update interface to use HttpRequest
This should ensure we have less typing to do.
2010-10-04http_request: avoid globals
Rainbows! will be able to reuse this.
2010-07-08cleanup "stringio" require
"stringio" is part of the Ruby distro and we use it in multiple places, so avoid re-requiring it.
2010-07-08prefer "[]" to "first"/"last" where possible
"[]" is slightly faster under Ruby 1.9 (but slightly slower under 1.8).
2010-02-22avoid needlessly preallocating read buffer
Copy-on-write will always invalidate it regardless, and the first request is likely to be slow for any app.
2009-12-26bump "rack.version" env to [1,1]
This will match what's in Rack the 1.1.0 release.
2009-12-07http_request: small cleanup/code reduction
No point in bloating code for an unlikely path (and the memcpy() vs malloc() tradeoff is debatable...)
2009-09-27Remove "Z" constant for binary strings
We've started using magic comments to ensure any strings we create are binary instead. Additionally, ensure we create any StringIO objects with an explicit string (which default to binary) to ensure the StringIO object is binary. This is because StringIO.new (with no arguments) will always use the process-wide default encoding since it does not know about magic comments (and couldn't, really...)
2009-09-27http_request: simplify and remove handle_body method
It is simpler and even slightly faster in micro benchmarks when inlined.
2009-09-16Avoid freezing objects that don't benefit from it
This gives applications more rope to play with in case they have any reasons for changing some values of the default constants. Freezing strings for Hash assignments still speeds up MRI, so we'll keep on doing that for now (and as long as MRI supports frozen strings, I expect them to always be faster for Hashes though I'd be very happy to be proven wrong...)
2009-09-08"encoding: binary" comments for all sources (1.9)
This ensures any string literals that pop up in *our* code will just be a bag of bytes. This shouldn't affect/fix/break existing apps in most cases, but most constants will always have the "correct" encoding (none!) to be consistent with HTTP/socket expectations. Since this comment affects things only on a per-source basis, it won't affect existing apps with the exception of strings we pass to the Rack application. This will eventually allow us to get rid of that Unicorn::Z constant, too.
2009-08-15tee_input: make interface more usable outside of Unicorn
TeeInput being needed is now (once again) an uncommon code path so there's no point in relying on global constants. While we're at it, allow StringIO to be used in the presence of small inputs; too.
2009-08-15http_request: reinstate empty StringIO optimization
This makes a noticeable difference on light GET/HEAD requests. Heck, even the tests run a few seconds faster.
2009-08-09Switch to Ragel/C-based chunk/trailer parser
This should be more robust, faster and easier to deal with than the ugly proof-of-concept regexp-based ones.
2009-07-16HttpRequest: no need for a temporary variable
Regexps can be run against nil just fine
2009-07-16http_request: don't support apps that close env["rack.input"]
Anything that calls close on a rack.input body is violating Rack::Lint; so don't waste cycles supporting them. Being liberal in things we accept tolerates bad behavior and Unicorn doesn't have a large userbase that would scream bloody murder if we stopped supporting broken behavior.
2009-07-15Rename unicorn/http11 => unicorn_http
We couldn't do proper namespacing for the C module so there was a potential conflict with Init_http11() in Mongrel. This was needed because Mongrel's HTTP parser could be used in some applications and we may be unfortunate enough need to support them.
2009-07-04Favor Struct members to instance variables
There's a small memory reduction to be had when forking oodles of processes and the Perl hacker in me still gets confused into thinking those are arrays...
2009-07-01Force streaming input onto apps by default
This change gives applications full control to deny clients from uploading unwanted message bodies. This also paves the way for doing things like upload progress notification within applications in a Rack::Lint-compatible manner. Since we don't support HTTP keepalive, so we have more freedom here by being able to close TCP connections and deny clients the ability to write to us (and thus wasting our bandwidth). While I could've left this feature off by default indefinitely for maximum backwards compatibility (for arguably broken applications), Unicorn is not and has never been about supporting the lowest common denominator.