Date | Commit message |
|
This patch checks incoming connections and avoids calling the application
if the connection has been closed.
It works by sending the beginning of the HTTP response before calling
the application to see if the socket can successfully be written to.
By enabling this feature users can avoid wasting application rendering
time only to find the connection is closed when attempting to write, and
throwing out the result.
When a client disconnects while being queued or processed, Nginx will log
HTTP response 499 but the application will log a 200.
Enabling this feature will minimize the time window during which the problem
can arise.
The feature is disabled by default and can be enabled by adding
'check_client_connection true' to the unicorn config.
[ew: After testing this change, Tom Burns wrote:
So we just finished the US Black Friday / Cyber Monday weekend running
unicorn forked with the last version of the patch I had sent you. It
worked splendidly and helped us handle huge flash sales without
increased response time over the weekend.
Whereas in previous flash traffic scenarios we would see the number of
HTTP 499 responses grow past the number of real HTTP 200 responses,
over the weekend we saw no growth in 499s during flash sales.
Unexpectedly the patch also helped us ward off a DoS attack where the
attackers were disconnecting immediately after making a request.
ref: <CAK4qKG3rkfVYLyeqEqQyuNEh_nZ8yw0X_cwTxJfJ+TOU+y8F+w@mail.gmail.com>
]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
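A rough illustration of the probe described above, not the patch itself. It uses UNIXSocket.pair to stand in for a client connection, and the helper name client_gone? is hypothetical. Over real TCP a write to a peer that has just closed can still succeed, which is presumably why the message only promises to minimize the window.

```ruby
require 'socket'

# Hypothetical helper: try to write the start of the HTTP response and
# report whether the peer has already gone away.
def client_gone?(socket)
  socket.write("HTTP/1.1 ")
  false
rescue Errno::EPIPE, Errno::ECONNRESET
  true
end

server_side, client_side = UNIXSocket.pair
gone_before = client_gone?(server_side)  # peer still open, write succeeds

client_side.close                        # client disconnects before we respond
gone_after = client_gone?(server_side)   # write fails, so the app call can be skipped

puts "before close: gone=#{gone_before}, after close: gone=#{gone_after}"
```

If the probe reports the client as gone, the server can drop the request without invoking the application at all.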
|
|
The previous REQUEST_PATH limit of 1024 is relatively small and
some users encounter problems with long URLs. 4K is a common
limit for PATH_MAX on modern GNU/Linux systems and REQUEST_PATH
is likely to translate to a filesystem path name.
Thanks to Nuo Yan <yan.nuo@gmail.com> and Lawrence Pit
<lawrence.pit@gmail.com> for their feedback on this issue.
ref: http://mid.gmane.org/CB935F19-72B8-4EC2-8A1D-5084B37C09F2@gmail.com
|
|
Existing license terms (Ruby-specific) and GPLv2 remain
in place, but GPLv3 is preferred as it helps with
distribution of AGPLv3 code and is explicitly compatible
with Apache License (v2.0).
Many more reasons are documented by the FSF:
https://www.gnu.org/licenses/quick-guide-gplv3.html
http://gplv3.fsf.org/rms-why.html
ref: http://thread.gmane.org/gmane.comp.lang.ruby.unicorn.general/933
|
|
RFC 2616 doesn't appear to allow most CTL bytes even though
Mongrel always did. Rack::Lint disallows 0..31, too, though we
allow "\t" (HT, 09) since it's LWS and allowed by RFC 2616.
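As a sketch of the rule stated above (bytes 0..31 rejected, HT allowed), the check could look like this in plain Ruby. The method name valid_header_value? is made up for illustration, and the real parser is a generated state machine, not a byte loop. DEL (127) is left alone here because the message does not mention it.

```ruby
# Hypothetical check: reject control bytes 0..31 in a header value,
# except horizontal tab (9), which counts as linear white space.
def valid_header_value?(value)
  value.each_byte.none? { |b| b < 32 && b != 9 }
end

puts valid_header_value?("text/html;\tcharset=utf-8")  # tab is accepted
puts valid_header_value?("bad\x01value")               # 0x01 is rejected
```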
|
|
Not all invocations of filter_body will trigger CoW on the
given destination string. We can also avoid an unnecessary
rb_str_set_len() in the non-chunked path, too.
|
|
Needless line noise, kgio doesn't support tainting anyways.
|
|
chunk_ready! was my original name for it, but I'm indecisive
when it comes to naming things.
|
|
This allows one to enter the dechunker without parsing
HTTP headers beforehand. Because header parsing is skipped,
trailer parsing is not supported: we don't know
what trailers might be (to our knowledge, nobody uses trailers
anyways).
|
|
copy-on-write behavior doesn't help you if your common
use case triggers copies.
|
|
Makes things easier-to-understand since it's based on memcpy()
|
|
Ruby 1.9.3dev (trunk) requires it if the string size
is unchanged.
|
|
RFC 2616, section 4.2:
> The field-content does not include any leading or trailing LWS:
> linear white space occurring before the first non-whitespace
> character of the field-value or after the last non-whitespace
> character of the field-value. Such leading or trailing LWS MAY be
> removed without changing the semantics of the field value. Any LWS
> that occurs between field-content MAY be replaced with a single SP
> before interpreting the field value or forwarding the message
> downstream.
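A minimal Ruby sketch of the trimming the quoted section permits. It only handles SP and HT at the edges of the value, not folded continuation lines, and strip_lws is a hypothetical name rather than anything in the parser.

```ruby
# Hypothetical helper: drop leading and trailing spaces/tabs from a
# header field-value, leaving interior white space untouched.
def strip_lws(field_value)
  field_value.sub(/\A[ \t]+/, '').sub(/[ \t]+\z/, '')
end

puts strip_lws(" \t gzip, deflate \t").inspect
```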
|
|
Rainbows! wants to be able to lower this eventually...
|
|
Combines the following sequence:
http_parser.buf << socket.readpartial(0x4000)
http_parser.parse
Into:
http_parser.add_parse(socket.readpartial(0x4000))
It was too damn redundant otherwise...
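To show the shape of the shortcut, here is a toy stand-in written for illustration only. ToyParser is hypothetical and its parse merely reports whether the header block is complete, whereas the real parser does far more.

```ruby
# Toy stand-in for the parser, only to show how add_parse folds the
# "append to buffer, then parse" sequence into a single call.
class ToyParser
  attr_reader :buf

  def initialize
    @buf = String.new
  end

  # Stand-in parse step: true once the header block has ended.
  def parse
    @buf.include?("\r\n\r\n")
  end

  # Equivalent to `buf << data` followed by `parse`.
  def add_parse(data)
    @buf << data
    parse
  end
end

parser = ToyParser.new
puts parser.add_parse("GET / HTTP/1.1\r\n")    # headers still incomplete
puts parser.add_parse("Host: example\r\n\r\n") # header block finished
```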
|
|
There's an HTTP status code allocated for it in
<http://www.iana.org/assignments/http-status-codes>, so
return that instead of 400.
|
|
Just in case we have people that don't use DNS, we can support
folks who enter ugly IPv6 addresses...
IPv6 uses brackets around the address to avoid confusing
the colons used in the address with the colon used to denote
the TCP port number in URIs.
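The bracket rule can be sketched as follows. HOST_RE and split_host_port are hypothetical names, the default port of 80 is an assumption, and the sketch keeps the brackets in the returned host since the message does not say what the parser stores.

```ruby
# Hypothetical splitter: a bracketed IPv6 literal or a plain host name,
# optionally followed by ":port".
HOST_RE = /\A(\[[0-9A-Fa-f:.]+\]|[^:\[\]]+)(?::(\d+))?\z/

def split_host_port(host_header, default_port = 80)
  m = HOST_RE.match(host_header)
  raise ArgumentError, "unparseable host: #{host_header.inspect}" unless m
  [m[1], (m[2] || default_port).to_i]
end

p split_host_port("[::1]:8080")
p split_host_port("[2001:db8::1]")
p split_host_port("example.com:3000")
```

Without the brackets, "::1:8080" would be ambiguous, since the final colon could belong to either the address or the port.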
|
|
But allows small optimizations to be made to avoid
constant/instance variable lookups later :)
|
|
This can return a static string and be significantly
faster as it reduces object allocations and Ruby method
calls for the fastest websites that serve thousands of
requests a second.
It assumes the Ruby runtime is single-threaded, but that
is the case of Ruby 1.8 and 1.9 and also what Unicorn
is all about. This change is safe for Rainbows! under 1.8
and 1.9.
|
|
We do not link against any external libraries
|
|
We need to preserve our internal flags and only clear them on
HttpParser#parse. This allows the async concurrency models in
Rainbows! to work properly.
|
|
More config bloat, sadly this is necessary for Rainbows! :<
|
|
Evil clients may be exposed to the Unicorn parser via
Rainbows!, so we'll allow people to turn off blindly
trusting certain X-Forwarded* headers for "rack.url_scheme"
and rely on middleware to handle it.
|
|
rack.url_scheme handling and SERVER_{NAME,PORT} handling
each deserve their own functions.
|
|
The first value of X-Forwarded-Proto in rack.url_scheme should
be used as it can be chained. This header can be set multiple
times via different proxies in the chain, but consider the first
one to be valid.
Additionally, respect X-Forwarded-SSL as it may be passed with
the "on" flag instead of X-Forwarded-Proto.
ref: rack commit 85ca454e6143a3081d90e4546ccad602a4c3ad2e
and 35bb5ba6746b5d346de9202c004cc926039650c7
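A sketch of the header handling described above, assuming Rack-style env keys. The method name url_scheme and the order in which the two headers are consulted are assumptions, not taken from the parser source.

```ruby
# Hypothetical helper: derive rack.url_scheme from proxy headers.
def url_scheme(env)
  return "https" if env["HTTP_X_FORWARDED_SSL"] == "on"
  # Only the first value of a chained X-Forwarded-Proto is considered.
  first = env["HTTP_X_FORWARDED_PROTO"].to_s.split(",").first.to_s.strip
  first == "https" ? "https" : "http"
end

puts url_scheme("HTTP_X_FORWARDED_PROTO" => "https, http")
puts url_scheme("HTTP_X_FORWARDED_PROTO" => "http,https")
puts url_scheme("HTTP_X_FORWARDED_SSL" => "on")
```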
|
|
This limits the number of keepalive requests of a single
connection to prevent a single client from monopolizing server
resources. On multi-process servers (e.g. Rainbows!) with many
keepalive clients per worker process, this can force a client to
reconnect and increase its chances of being accepted on a
less-busy worker process.
This directive is named after the nginx directive which
is identical in function.
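The limit amounts to a per-connection countdown. A minimal sketch, with KeepaliveCounter as a hypothetical class that is not part of the parser:

```ruby
# Hypothetical per-connection counter: after `limit` requests the
# connection is no longer eligible for keepalive.
class KeepaliveCounter
  def initialize(limit)
    @remaining = limit
  end

  # Call once per request; true if the connection may stay open afterwards.
  def next_request?
    @remaining -= 1
    @remaining > 0
  end
end

counter = KeepaliveCounter.new(3)
3.times { |i| puts "request #{i + 1}: keep open? #{counter.next_request?}" }
```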
|
|
This allows apps/middlewares on Rainbows! that rely on env in
the response_body#close to hold onto the env.
|
|
Not that anybody uses trailers extensively, but it's
good to know it's there.
|
|
An easy combination of the existing HttpParser#keepalive? and
HttpParser#reset methods, this makes it easier to implement
persistence.
|
|
Yes, this means even POST/PUT bodies may be kept alive,
but only if the body (and trailers) are fully-consumed.
|
|
We cannot clear the buffer between requests because
clients may send multiple requests that get taken in
one read()/recv() call.
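To illustrate why the buffer must survive between requests, the sketch below pulls two bodiless requests out of one simulated read. take_request is a hypothetical helper and ignores request bodies entirely.

```ruby
# Hypothetical helper: remove one complete header block from the buffer
# and leave any following bytes in place for the next request.
def take_request(buf)
  head, sep, rest = buf.partition("\r\n\r\n")
  return nil if sep.empty?  # headers incomplete, keep buffering
  buf.replace(rest)         # pipelined data stays in the buffer
  head + sep
end

# Two requests arriving in a single read()/recv() call.
buf = String.new("GET /a HTTP/1.1\r\nHost: x\r\n\r\nGET /b HTTP/1.1\r\nHost: x\r\n\r\n")
first = take_request(buf)
second = take_request(buf)
puts first.lines.first
puts second.lines.first
```

Clearing buf after the first request would have silently discarded the second one.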
|
|
Rubinius no longer uses it, and it conflicts with a public
method in MRI.
|
|
The parser and request object become one and the
same, since the parser lives for the lifetime
of the request.
|
|
It's expensive to generate a backtrace and this exception
is only triggered by bad clients. So make it harder for
them to DoS us by sending bad requests.
|
|
It makes for messy documentation.
|
|
|
|
Rubinius now supports rb_str_set_len() and sets -fPIC.
We shouldn't check for rb_str_modify() since link-time detection
is broken under Rubinius and even 1.8.6 has rb_str_modify().
|
|
Since the "Version" header is uncommon and never hits our
optimized case, we don't need to check for it in the common
case.
|
|
When Unicorn receives a request with a "Version" header, the
HttpParser transforms it into "HTTP_VERSION". It then tries to add
it to the request hash, which already contains an "HTTP_VERSION" key
holding the actual HTTP version of the request, so it tries to append
the new value separated by a comma. But since the HTTP version is a
frozen constant, a TypeError exception is raised.
According to the HTTP RFC
(http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html#sec7.1) a
"Version" header is valid. However, it's not supported in rack, since
rack has a HTTP_VERSION env variable for the http version. So I think
the easiest way to deal with this problem is to just ignore the header
since it is extremely unusual. We were getting it from a crappy bot.
ref: http://mid.gmane.org/AANLkTimuGgcwNAMcVZdViFWdF-UcW_RGyZAue7phUXps@mail.gmail.com
Acked-by: Eric Wong <normalperson@yhbt.net>
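The collision can be reproduced in plain Ruby. In this sketch add_header is a hypothetical stand-in for the parser's header handling. The exception class for modifying a frozen string depends on the Ruby version (TypeError on 1.8, FrozenError on current releases), so the sketch rescues StandardError.

```ruby
HTTP_VERSION = "HTTP/1.1".freeze

# Hypothetical stand-in: map a header into a Rack-style env hash,
# comma-joining repeated headers and ignoring "Version".
def add_header(env, name, value)
  key = "HTTP_" + name.upcase.tr("-", "_")
  return env if key == "HTTP_VERSION"  # skip the client's "Version" header
  if env.key?(key)
    env[key] << "," << value           # this append is what failed on the frozen constant
  else
    env[key] = value.dup
  end
  env
end

env = { "HTTP_VERSION" => HTTP_VERSION }
add_header(env, "Version", "1.0")
add_header(env, "Accept", "text/html")
add_header(env, "Accept", "text/plain")
p env

# What happened before the fix: appending to the frozen constant raises.
append_failed = begin
  HTTP_VERSION << ",1.0"
  false
rescue StandardError
  true
end
puts "append to frozen constant failed: #{append_failed}"
```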
|
|
This is allowed by RFC 2616, section 2.2, where spaces and
horizontal tabs are counted as linear white space and linear
white space (not just regular spaces) may prefix field-values
(section 4.2).
This has _not_ been a real issue in ~4 years of using this
parser (starting with Mongrel) with clients in the wild.
Thanks to Iñaki Baz Castillo for pointing this out.
|
|
HTTP requests without trailers still need a CRLF after the last
chunk, that is: it must end as: "0\r\n\r\n", not "0\r\n". So
we'll always pretend there are trailers to parse for the
sake of TeeInput.
This is mostly a pedantic fix, as the two bytes in the socket
buffer are unlikely to trigger protocol errors.
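A small dechunker sketch makes the terminator rule concrete. dechunk is hypothetical, ignores chunk extensions, and treats any non-empty trailer section as incomplete, so it is far from a full implementation.

```ruby
# Hypothetical dechunker: returns [body, leftover] once the terminating
# "0\r\n\r\n" has been seen, or nil while the input is still incomplete.
def dechunk(data)
  body = String.new
  rest = data.dup
  loop do
    line, sep, after = rest.partition("\r\n")
    return nil if sep.empty?
    size = Integer(line, 16)
    if size.zero?
      # The last chunk is followed by an (empty) trailer section that
      # ends in CRLF, so "0\r\n" on its own is not the end.
      return nil unless after.start_with?("\r\n")
      return [body, after[2..-1]]
    end
    return nil if after.bytesize < size + 2
    body << after.byteslice(0, size)
    rest = after.byteslice(size + 2, after.bytesize)
  end
end

p dechunk("5\r\nhello\r\n0\r\n\r\n")
p dechunk("5\r\nhello\r\n0\r\n")
```

Consuming the final CRLF matters on persistent connections, where the leftover bytes would otherwise be read as the start of the next request.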
|
|
...instead of tripping an assertion.
This fixes a potential denial-of-service for servers exposed directly
to untrusted clients.
This bug does not affect supported Unicorn deployments as Unicorn is
only supported with trusted clients (such as nginx) on a LAN. nginx is
known to reject clients that send invalid Content-Length headers, so any
deployments on a trusted LAN and/or behind nginx are safe.
Servers affected by this bug include (but are not limited to) Rainbows!
and Zbatery. This does not affect Thin nor Mongrel which never got
request body filtering treatment that the Unicorn HTTP parser got in
August 2009.
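The general idea of rejecting bad input with an exception instead of an assertion can be sketched like this. ParserError and parse_content_length are hypothetical names, and the real fix lives in the C parser.

```ruby
# Hypothetical error class for malformed requests.
class ParserError < StandardError; end

# Hypothetical validator: accept only a run of ASCII digits.
def parse_content_length(value)
  unless value =~ /\A\d+\z/
    raise ParserError, "invalid Content-Length: #{value.inspect}"
  end
  value.to_i
end

puts parse_content_length("42")
begin
  parse_content_length("-1")
rescue ParserError => e
  puts e.message
end
```

A raised error can be rescued and answered with an error response, whereas a failed assertion takes the whole process down.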
|
|
this file may be sourced and used later, too
|
|
Not fun, but maybe this can help us spot _real_ problems
more easily in the future.
|
|
* init_globals() is a static function, avoid conflicting
with any potential libraries out there...
* mUnicorn and cHttpParser do not need to be static globals;
they're not used outside of Init_unicorn_http().
|
|
We never come close to the signed limits anywhere, so it
should be safe either way, but make paranoid compiler settings
less noisy if possible.
|
|
First off, this memory leak DOES NOT affect Unicorn itself.
Unicorn allocates the HttpParser once and always reuses it
in every sequential request.
This leak affects applications which repeatedly allocate a new
HTTP parser. Thus this bug affects _all_ deployments of
Rainbows! and Zbatery. These servers allocate a new parser for
every client connection.
I misread the Data_Make_Struct/Data_Wrap_Struct documentation
and ended up passing NULL as the "free" argument instead of -1,
causing the memory to never be freed.
From README.EXT in the MRI source which I misread:
> The free argument is the function to free the pointer
> allocation. If this is -1, the pointer will be just freed.
> The functions mark and free will be called from garbage
> collector.
|
|
This is not explicitly specified or listed as an example in
rfc2616. However, rfc2616 section 3.2.1 defers to rfc2396[1]
for the definition of absolute URIs, so the userinfo component
should be allowable, even if it does not make any sense.
In the real world, previous versions of Mongrel used URI.parse()
and thus allowed userinfo, so we also have precedent to allow
userinfo to be compatible *in case* our interpretation of the
RFCs is incorrect. This change is unfortunately needed because
*occasionally* real clients rely on them.
Reported-by: Scott Chacon
[1] rfc3986 obsoletes rfc2396, but also includes userinfo
|
|
This is allowed according to RFC 2396, section 3.3 and matches
the behavior of URI.parse, as well.
|
|
Rubinius appears to have changed semantics and our
rb_str_set_len() emulation doesn't work anymore. Since
rb_str_set_len() is just an optimized version of
rb_str_resize(), we'll just use rb_str_resize() for now
unless we notice something better.
test_http_parser and test_http_parser_ng tests pass under
Rubinius 0.13.0.
|
|
This allows clients to trickle headers and trailers. While
Unicorn itself does not support slow clients for many reasons,
this affects servers that depend on our parser like Rainbows!.
This actually does affect Unicorn when handling trailers, but
HTTP trailers are very rarely used in requests.
Fortunately this stupid bug does not seem able to trigger
out-of-bounds conditions.
|