path: root/ext
Date  Commit message
2012-04-17http: increase REQUEST_PATH maximum length to 4K
The previous REQUEST_PATH limit of 1024 is relatively small and some users encounter problems with long URLs. 4K is a common limit for PATH_MAX on modern GNU/Linux systems and REQUEST_PATH is likely to translate to a filesystem path name. Thanks to Nuo Yan <yan.nuo@gmail.com> and Lawrence Pit <lawrence.pit@gmail.com> for their feedback on this issue. ref: http://mid.gmane.org/CB935F19-72B8-4EC2-8A1D-5084B37C09F2@gmail.com
2011-08-29add GPLv3 option to the license
Existing license terms (Ruby-specific) and GPLv2 remain in place, but GPLv3 is preferred as it helps with distribution of AGPLv3 code and is explicitly compatible with Apache License (v2.0). Many more reasons are documented by the FSF: https://www.gnu.org/licenses/quick-guide-gplv3.html http://gplv3.fsf.org/rms-why.html ref: http://thread.gmane.org/gmane.comp.lang.ruby.unicorn.general/933
2011-07-13http: reject non-LWS CTL chars (0..31 + 127) in field values
RFC 2616 doesn't appear to allow most CTL bytes even though Mongrel always did. Rack::Lint disallows 0..31, too, though we allow "\t" (HT, 09) since it's LWS and allowed by RFC 2616.
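A minimal Ruby sketch of the check described above. The real check lives in the C parser; `valid_field_value?` and `FORBIDDEN_CTL` are hypothetical names, and the sketch assumes the value has already been split from its header line, so CR and LF are rejected along with the other CTL bytes.

```ruby
# Hypothetical helper, not the parser's actual code.
# Matches any CTL byte (0..31 or 127) except horizontal tab (9).
FORBIDDEN_CTL = /[\x00-\x08\x0A-\x1F\x7F]/

def valid_field_value?(value)
  # Work on a binary copy so invalid UTF-8 input cannot raise during matching.
  !value.b.match?(FORBIDDEN_CTL)
end

valid_field_value?("text/html;\tcharset=utf-8") # HT is LWS, so it is accepted
valid_field_value?("evil\x00value")             # NUL is rejected
```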
2011-06-15http: delay CoW string invalidations in filter_body
Not all invocations of filter_body will trigger CoW on the given destination string. We can also avoid an unnecessary rb_str_set_len() in the non-chunked path, too.
2011-06-15http: remove tainting flag
Needless line noise, kgio doesn't support tainting anyways.
2011-06-14http: fix documentation for dechunk!
chunk_ready! was my original name for it, but I'm indecisive when it comes to naming things.
2011-06-13http: dechunk! method to enter dechunk mode
This allows one to enter the dechunker without parsing HTTP headers beforehand. Because header parsing is skipped, trailer parsing is not supported: we don't know what trailers to expect (to our knowledge, nobody uses trailers anyways).
2011-06-13http: document reasoning for memcpy in filter_body
copy-on-write behavior doesn't help you if your common use case triggers copies.
2011-06-13http: rename variables in filter_body implementation
Makes things easier-to-understand since it's based on memcpy()
2011-05-23http: call rb_str_modify before rb_str_resize
Ruby 1.9.3dev (trunk) requires it if the string size is unchanged.
2011-05-23strip trailing and leading linear whitespace in headers
RFC 2616, section 4.2:

> The field-content does not include any leading or trailing LWS:
> linear white space occurring before the first non-whitespace
> character of the field-value or after the last non-whitespace
> character of the field-value. Such leading or trailing LWS MAY be
> removed without changing the semantics of the field value. Any LWS
> that occurs between field-content MAY be replaced with a single SP
> before interpreting the field value or forwarding the message
> downstream.
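The stripping rule quoted above can be sketched in Ruby as follows. `strip_lws` is a hypothetical helper; the parser itself does this in C while scanning, and this sketch treats only SP and HT as LWS and leaves interior whitespace untouched.

```ruby
# Hypothetical helper: remove leading and trailing SP/HT from a field value.
def strip_lws(value)
  value.sub(/\A[ \t]+/, "").sub(/[ \t]+\z/, "")
end

strip_lws(" \t gzip, deflate \t") # => "gzip, deflate"
```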
2011-05-05http_parser: add max_header_len accessor
Rainbows! wants to be able to lower this eventually...
2011-05-04http_parser: new add_parse method
Combines the following sequence:

  http_parser.buf << socket.readpartial(0x4000)
  http_parser.parse

Into:

  http_parser.add_parse(socket.readpartial(0x4000))

It was too damn redundant otherwise...
2011-05-04return 414 for URI length violations
There's an HTTP status code allocated for it in <http://www.iana.org/assignments/http-status-codes>, so return that instead of 400.
2011-02-02http: parser handles IPv6 bracketed IP hostnames
Just in case we have people that don't use DNS, we can support folks who enter ugly IPv6 addresses... IPv6 uses brackets around the address to avoid confusing the colons used in the address with the colon used to denote the TCP port number in URIs.
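To illustrate why the brackets matter, here is a rough Ruby sketch that splits a Host value into host and port. `split_host_port` and its default port are assumptions for illustration only, and keeping the brackets in the returned host is a choice made by this sketch, not a statement about the parser's output.

```ruby
# Hypothetical helper: split "host[:port]" where host may be a bracketed
# IPv6 literal. The brackets make the port colon unambiguous.
def split_host_port(value, default_port = 80)
  if (m = /\A(\[[0-9A-Fa-f:.]+\])(?::(\d+))?\z/.match(value))
    [m[1], m[2] ? m[2].to_i : default_port]
  elsif (m = /\A([^:\[\]]+)(?::(\d+))?\z/.match(value))
    [m[1], m[2] ? m[2].to_i : default_port]
  else
    raise ArgumentError, "malformed host: #{value.inspect}"
  end
end

split_host_port("[::1]:8080")      # => ["[::1]", 8080]
split_host_port("example.com:443") # => ["example.com", 443]
```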
2011-01-05http_parser: add clear method, deprecate reset
But allows small optimizations to be made to avoid constant/instance variable lookups later :)
2011-01-04http_response: implement httpdate in C
This can return a static string and be significantly faster as it reduces object allocations and Ruby method calls for the fastest websites that serve thousands of requests a second. It assumes the Ruby runtime is single-threaded, but that is the case of Ruby 1.8 and 1.9 and also what Unicorn is all about. This change is safe for Rainbows! under 1.8 and 1.9.
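One way to picture the "static string" idea in Ruby is a cache that only reformats the date when the second changes. This is a sketch under that assumption, not the C implementation; `HttpDateCache` is a hypothetical name, and like the original it is only safe when one thread uses it at a time.

```ruby
require "time"

# Hypothetical cache: reuse the formatted date until the clock's second changes.
class HttpDateCache
  def initialize
    @last_sec = nil
    @cached = nil
  end

  def httpdate(now = Time.now)
    sec = now.to_i
    if sec != @last_sec
      @last_sec = sec
      @cached = now.httpdate.freeze # format at most once per second
    end
    @cached
  end
end

cache = HttpDateCache.new
cache.httpdate(Time.at(0)) # => "Thu, 01 Jan 1970 00:00:00 GMT"
```

Requests served within the same second get the same String object back, which is where the saved allocations come from.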
2010-12-29http: remove unnecessary dir_config statement
We do not link against any external libraries
2010-12-26http: #keepalive? and #headers? work after #next?
We need to preserve our internal flags and only clear them on HttpParser#parse. This allows the async concurrency models in Rainbows! to work properly.
2010-12-21http: hook up "trust_x_forwarded" to configurator
More config bloat, sadly this is necessary for Rainbows! :<
2010-12-20http: allow ignoring X-Forwarded-* for url_scheme
Evil clients may be exposed to the Unicorn parser via Rainbows!, so we'll allow people to turn off blindly trusting certain X-Forwarded* headers for "rack.url_scheme" and rely on middleware to handle it.
2010-12-20http: refactor finalize_header function
rack.url_scheme handling and SERVER_{NAME,PORT} handling each deserve their own functions.
2010-12-20http: update setting of "https" for rack.url_scheme
The first value of X-Forwarded-Proto in rack.url_scheme should be used as it can be chained. This header can be set multiple times via different proxies in the chain, but consider the first one to be valid. Additionally, respect X-Forwarded-SSL as it may be passed with the "on" flag instead of X-Forwarded-Proto. ref: rack commit 85ca454e6143a3081d90e4546ccad602a4c3ad2e and 35bb5ba6746b5d346de9202c004cc926039650c7
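The rule described above can be sketched in Ruby like this. `url_scheme` is a hypothetical helper operating on a Rack-style env hash; the real logic is in the C extension and may differ in detail.

```ruby
# Hypothetical helper: decide rack.url_scheme from forwarded headers.
def url_scheme(env)
  # Only the first value of a chained X-Forwarded-Proto is considered.
  first_proto = env["HTTP_X_FORWARDED_PROTO"].to_s.split(",").first.to_s.strip
  return "https" if first_proto == "https"
  return "https" if env["HTTP_X_FORWARDED_SSL"] == "on"
  "http"
end

url_scheme("HTTP_X_FORWARDED_PROTO" => "https, http") # => "https"
url_scheme("HTTP_X_FORWARDED_PROTO" => "http, https") # => "http"
url_scheme("HTTP_X_FORWARDED_SSL" => "on")            # => "https"
```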
2010-12-20http: support keepalive_requests directive
This limits the number of keepalive requests of a single connection to prevent a single client from monopolizing server resources. On multi-process servers (e.g. Rainbows!) with many keepalive clients per worker process, this can force a client to reconnect and increase its chances of being accepted on a less-busy worker process. This directive is named after the nginx directive which is identical in function.
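As a rough illustration of the directive's effect, a per-connection counter might look like the following. `KeepaliveCounter` is a hypothetical class; the actual bookkeeping happens inside the parser.

```ruby
# Hypothetical per-connection counter for a keepalive_requests-style limit.
class KeepaliveCounter
  def initialize(limit)
    @limit = limit
    @served = 0
  end

  # Call once per completed request. Returns true while the connection
  # may be reused, false once the limit has been reached.
  def keepalive?
    @served += 1
    @served < @limit
  end
end

conn = KeepaliveCounter.new(3)
conn.keepalive? # => true  (1st request)
conn.keepalive? # => true  (2nd request)
conn.keepalive? # => false (3rd request; client must reconnect)
```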
2010-12-19http: delay clearing env on HttpParser#next?
This allows apps/middlewares on Rainbows! that rely on env in the response_body#close to hold onto the env.
2010-11-07tee_input: switch to simpler API for parsing trailers
Not that anybody uses trailers extensively, but it's good to know it's there.
2010-11-06http_parser: add HttpParser#next? method
An easy combination of the existing HttpParser#keepalive? and HttpParser#reset methods, this makes it easier to implement persistence.
2010-11-06enable HTTP keepalive support for all methods
Yes, this means even POST/PUT bodies may be kept alive, but only if the body (and trailers) are fully-consumed.
2010-10-07http: fix behavior with pipelined requests
We cannot clear the buffer between requests because clients may send multiple requests that get taken in one read()/recv() call.
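A small Ruby sketch of why the buffer must survive between requests. `extract_requests` is a hypothetical helper that only understands bodiless requests; it consumes complete header blocks and leaves any partial pipelined request in the buffer.

```ruby
# Hypothetical helper: pull every complete (bodiless) request off the front
# of the buffer, leaving any trailing partial request in place.
def extract_requests(buf)
  requests = []
  while (idx = buf.index("\r\n\r\n"))
    requests << buf.slice!(0, idx + 4)
  end
  requests
end

# One read() may deliver two full requests plus the start of a third.
buf = +"GET /a HTTP/1.1\r\nHost: x\r\n\r\nGET /b HTTP/1.1\r\nHost: x\r\n\r\nGET /c"
reqs = extract_requests(buf)
reqs.size # => 2
buf       # => "GET /c"
```

Clearing `buf` after the first request would silently drop `/b` and the beginning of `/c`.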
2010-10-07http: remove unnecessary rb_str_update() calls
Rubinius no longer uses it, and it conflicts with a public method in MRI.
2010-10-07http: allow this to be used as a request object
The parser and request object become one and the same, since the parser lives for the lifetime of the request.
2010-10-05http: raise empty backtrace for HttpParserError
It's expensive to generate a backtrace and this exception is only triggered by bad clients. So make it harder for them to DoS us by sending bad requests.
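In plain Ruby, the same effect can be sketched by passing an explicit empty backtrace as the third argument to `raise`. `ParserError` is a stand-in class for this example; the actual change is made in the C extension.

```ruby
# Stand-in exception class for illustration only.
class ParserError < StandardError; end

def reject_bad_request
  # The third argument to raise supplies the backtrace; an empty array
  # means no backtrace is attached to the exception.
  raise ParserError, "bad request", []
end

begin
  reject_bad_request
rescue ParserError => e
  e.backtrace # => []
end
```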
2010-06-24http: avoid (re-)declaring the Unicorn module
It makes for messy documentation.
2010-06-12http: fix rb_str_set_len() define for 1.8.6
2010-06-10http: cleanups for latest Rubinius
Rubinius now supports rb_str_set_len() and sets -fPIC. We shouldn't check for rb_str_modify() since link-time detection is broken under Rubinius and even 1.8.6 has rb_str_modify().
2010-06-08http: move Version: header check into a less common path
Since the "Version" header is uncommon and never hits our optimized case, we don't need to check for it in the common case.
2010-06-08http: ignore Version: header if explicitly set by client
When Unicorn receives a request with a "Version" header, the HttpParser transforms it into "HTTP_VERSION". It then tries to add it to the request hash, which already contains an "HTTP_VERSION" key with the actual HTTP version of the request, so it tries to append the new value separated by a comma. But since the HTTP version is a frozen constant, a TypeError exception is raised. According to the HTTP RFC (http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html#sec7.1) a "Version" header is valid. However, it's not supported in rack, since rack has an HTTP_VERSION env variable for the HTTP version. So I think the easiest way to deal with this problem is to just ignore the header since it is extremely unusual. We were getting it from a crappy bot. ref: http://mid.gmane.org/AANLkTimuGgcwNAMcVZdViFWdF-UcW_RGyZAue7phUXps@mail.gmail.com Acked-by: Eric Wong <normalperson@yhbt.net>
2010-05-07http: allow horizontal tab as leading whitespace in header values
This is allowed by RFC 2616, section 2.2, where spaces and horizontal tabs are counted as linear white space and linear white space (not just regular spaces) may prefix field-values (section 4.2). This has _not_ been a real issue in ~4 years of using this parser (starting with Mongrel) with clients in the wild. Thanks to Iñaki Baz Castillo for pointing this out.
2010-04-26http: pedantic fix for trailer-less chunked requests
HTTP requests without trailers still need a CRLF after the last chunk, that is: it must end as: "0\r\n\r\n", not "0\r\n". So we'll always pretend there are trailers to parse for the sake of TeeInput. This is mostly a pedantic fix, as the two bytes in the socket buffer are unlikely to trigger protocol errors.
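The terminator rule can be illustrated with a toy Ruby dechunker. `dechunk` is a hypothetical helper that ignores chunk extensions and trailers and does not verify the CRLF after each chunk's data; it returns nil until the final CRLF has arrived.

```ruby
# Hypothetical helper: decode a chunked body held entirely in buf.
# Returns [body, bytes_consumed], or nil if the message is incomplete.
def dechunk(buf)
  buf = buf.b # index by bytes
  body = +""
  pos = 0
  loop do
    eol = buf.index("\r\n", pos)
    return nil unless eol
    size = Integer(buf[pos...eol], 16) # raises ArgumentError on a bad size
    pos = eol + 2
    if size.zero?
      # Even without trailers, the last chunk must be followed by a CRLF.
      return nil unless buf[pos, 2] == "\r\n"
      return [body, pos + 2]
    end
    return nil if buf.bytesize < pos + size + 2
    body << buf[pos, size]
    pos += size + 2 # skip the chunk data and its trailing CRLF
  end
end

dechunk("5\r\nhello\r\n0\r\n\r\n") # => ["hello", 15]
dechunk("5\r\nhello\r\n0\r\n")     # => nil (final CRLF still missing)
```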
2010-04-19http: negative/invalid Content-Length raises exception
...instead of tripping an assertion. This fixes a potential denial-of-service for servers exposed directly to untrusted clients. This bug does not affect supported Unicorn deployments as Unicorn is only supported with trusted clients (such as nginx) on a LAN. nginx is known to reject clients that send invalid Content-Length headers, so any deployments on a trusted LAN and/or behind nginx are safe. Servers affected by this bug include (but are not limited to) Rainbows! and Zbatery. This does not affect Thin nor Mongrel which never got request body filtering treatment that the Unicorn HTTP parser got in August 2009.
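A minimal Ruby sketch of the behavior: validate the header value and raise an ordinary exception that the server can rescue, rather than letting a bad value reach an assertion. `ContentLengthError` and `parse_content_length` are hypothetical names for this example.

```ruby
# Stand-in exception class for illustration only.
class ContentLengthError < StandardError; end

# Hypothetical helper: accept only a run of ASCII digits.
def parse_content_length(value)
  unless /\A[0-9]+\z/.match?(value)
    raise ContentLengthError, "invalid Content-Length: #{value.inspect}"
  end
  value.to_i
end

parse_content_length("42") # => 42
# parse_content_length("-1") raises ContentLengthError
```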
2010-02-18http: document CFLAGS used for development
this file may be sourced and used later, too
2010-02-18http: const correctness fixes
Not fun, but maybe this can help us spot _real_ problems more easily in the future.
2010-02-18http: cleanup globals and ABI namespace
* init_globals() is a static function, avoid conflicting with any potential libraries out there...
* mUnicorn and cHttpParser do not need to be static globals; they're not used outside of Init_unicorn_http().
2010-02-18http: avoid signedness warnings
We never come close to the signed limits anywhere, so it should be safe either way, but make paranoid compiler settings less noisy if possible.
2010-02-13http: fix memory leak exposed in concurrent servers
First off, this memory leak DOES NOT affect Unicorn itself. Unicorn allocates the HttpParser once and always reuses it in every sequential request.

This leak affects applications which repeatedly allocate a new HTTP parser. Thus this bug affects _all_ deployments of Rainbows! and Zbatery. These servers allocate a new parser for every client connection.

I misread the Data_Make_Struct/Data_Wrap_Struct documentation and ended up passing NULL as the "free" argument instead of -1, causing the memory to never be freed.

From README.EXT in the MRI source which I misread:

> The free argument is the function to free the pointer
> allocation. If this is -1, the pointer will be just freed.
> The functions mark and free will be called from garbage
> collector.
2009-12-19http: allow userinfo component in absoluteURIs
This is not explicitly specified or listed as an example in rfc2616. However, rfc2616 section 3.2.1 defers to rfc2396[1] for the definition of absolute URIs, so the userinfo component should be allowable, even if it does not make any sense. In the real world, previous versions of Mongrel used URI.parse() and thus allowed userinfo, so we also have precedent to allow userinfo to be compatible *in case* our interpretation of the RFCs is incorrect. This change is unfortunately needed because *occasionally* real clients rely on them.

Reported-by: Scott Chacon

[1] rfc3986 obsoletes rfc2396, but also includes userinfo
2009-12-06http: PATH_INFO/REQUEST_PATH includes semi-colons
This is allowed according to RFC 2396, section 3.3 and matches the behavior of URI.parse, as well.
2009-11-21http: Rubinius 0.13.0 compatibility fix
Rubinius appears to have changed semantics and our rb_str_set_len() emulation doesn't work anymore. Since rb_str_set_len() is just an optimized version of rb_str_resize(), we'll just use rb_str_resize() for now unless we notice something better. test_http_parser and test_http_parser_ng tests pass under Rubinius 0.13.0.
2009-11-04http: allow headers/trailers to be written byte-wise
This allows clients to trickle headers and trailers. While Unicorn itself does not support slow clients for many reasons, this affects servers that depend on our parser like Rainbows!. This actually does affect Unicorn when handling trailers, but HTTP trailers are very rarely used in requests. Fortunately this stupid bug does not seem able to trigger out-of-bounds conditions.
2009-09-18http: don't force -fPIC if it can't be used
Not everybody can use it, even if most of the world can.