path: root/ext
Date Commit message
2010-11-07 tee_input: switch to simpler API for parsing trailers
Not that anybody uses trailers extensively, but it's good to know it's there.
2010-11-06 http_parser: add HttpParser#next? method
An easy combination of the existing HttpParser#keepalive? and HttpParser#reset methods, this makes it easier to implement persistence.
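A minimal sketch of the idea, using a hypothetical ToyParser class rather than the real Unicorn::HttpParser. It assumes that next? simply resets the parser whenever the connection may be kept alive:

```ruby
# Hypothetical stand-in for the real parser; only the interplay between
# keepalive? and reset is modeled here.
class ToyParser
  attr_reader :headers

  def initialize(keepalive)
    @keepalive = keepalive
    @headers = { "REQUEST_METHOD" => "GET" }
  end

  def keepalive?
    @keepalive
  end

  def reset
    @headers = {}
    self
  end

  # Assumed semantics: return true (and be ready for the next request)
  # only when the connection can be kept alive.
  def next?
    return false unless keepalive?
    reset
    true
  end
end
```

A server loop could then call next? once per response instead of calling keepalive? and reset separately.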
2010-11-06 enable HTTP keepalive support for all methods
Yes, this means even POST/PUT bodies may be kept alive, but only if the body (and trailers) are fully-consumed.
2010-10-07 http: fix behavior with pipelined requests
We cannot clear the buffer between requests because clients may send multiple requests that get taken in one read()/recv() call.
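To illustrate why the buffer must survive between requests, here is a toy sketch (not the real parser) in which a single read delivers two pipelined requests. The helper name take_request! is hypothetical:

```ruby
# Toy illustration only: split one complete header block off the front of
# the buffer and leave any following bytes in place for the next request.
def take_request!(buf)
  idx = buf.index("\r\n\r\n")
  return nil unless idx        # headers incomplete, wait for more data
  buf.slice!(0, idx + 4)       # remove and return just the first request
end

# Two requests that arrived in one read()/recv() call.
buf = "GET /a HTTP/1.1\r\nHost: x\r\n\r\nGET /b HTTP/1.1\r\nHost: x\r\n\r\n".dup
first  = take_request!(buf)
second = take_request!(buf)
```

Clearing buf after the first call would silently drop the second request.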
2010-10-07 http: remove unnecessary rb_str_update() calls
Rubinius no longer uses it, and it conflicts with a public method in MRI.
2010-10-07 http: allow this to be used as a request object
The parser and request object become one and the same, since the parser lives for the lifetime of the request.
2010-10-05 http: raise empty backtrace for HttpParserError
It's expensive to generate a backtrace and this exception is only triggered by bad clients. So make it harder for them to DoS us by sending bad requests.
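In plain Ruby, the same effect can be sketched by passing an explicit empty backtrace as the third argument to raise. ToyParserError and reject are hypothetical names; the real change lives in the C extension:

```ruby
# Hypothetical error class standing in for HttpParserError.
class ToyParserError < StandardError; end

def reject(msg)
  # The third argument to raise sets the backtrace explicitly; an empty
  # array means no backtrace is attached to the exception.
  raise ToyParserError, msg, []
end

begin
  reject("bad request line")
rescue ToyParserError => e
  captured = e
end
```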
2010-06-24 http: avoid (re-)declaring the Unicorn module
It makes for messy documentation.
2010-06-12 http: fix rb_str_set_len() define for 1.8.6
2010-06-10 http: cleanups for latest Rubinius
Rubinius now supports rb_str_set_len() and sets -fPIC. We shouldn't check for rb_str_modify() since link-time detection is broken under Rubinius and even 1.8.6 has rb_str_modify().
2010-06-08 http: move Version: header check into a less common path
Since the "Version" header is uncommon and never hits our optimized case, we don't need to check for it in the common case.
2010-06-08 http: ignore Version: header if explicitly set by client
When Unicorn receives a request with a "Version" header, the HttpParser transforms it into "HTTP_VERSION". It then tries to add it to the request hash, which already contains an "HTTP_VERSION" key holding the actual HTTP version of the request, so it tries to append the new value separated by a comma. But since the HTTP version is a frozen constant, a TypeError exception is raised. According to the HTTP RFC (http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html#sec7.1), a "Version" header is valid. However, it's not supported in Rack, since Rack has an HTTP_VERSION env variable for the HTTP version. So I think the easiest way to deal with this problem is to just ignore the header since it is extremely unusual. We were getting it from a crappy bot. ref: http://mid.gmane.org/AANLkTimuGgcwNAMcVZdViFWdF-UcW_RGyZAue7phUXps@mail.gmail.com Acked-by: Eric Wong <normalperson@yhbt.net>
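The fix can be sketched in Ruby as follows. This is an assumption-laden toy: add_header and HTTP_1_1 are hypothetical names, and the real logic is implemented in the C/Ragel parser:

```ruby
# Frozen constant standing in for the parser's memoized version string.
HTTP_1_1 = "HTTP/1.1".freeze

# Toy header merge: repeated headers are joined with a comma, except that
# a client-supplied "Version" header is dropped so it never touches the
# frozen HTTP_VERSION value.
def add_header(env, name, value)
  key = "HTTP_" + name.upcase.tr("-", "_")
  return env if key == "HTTP_VERSION"
  if env.key?(key)
    env[key] = "#{env[key]},#{value}"
  else
    env[key] = value
  end
  env
end

env = { "HTTP_VERSION" => HTTP_1_1 }
add_header(env, "Version", "1.0")  # ignored
add_header(env, "Accept", "a")
add_header(env, "Accept", "b")     # merged with the earlier value
```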
2010-05-07 http: allow horizontal tab as leading whitespace in header values
This is allowed by RFC 2616, section 2.2, where spaces and horizontal tabs are counted as linear white space and linear white space (not just regular spaces) may prefix field-values (section 4.2). This has _not_ been a real issue in ~4 years of using this parser (starting with Mongrel) with clients in the wild. Thanks to Iñaki Baz Castillo for pointing this out.
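As a rough Ruby equivalent of the rule (the real parser handles this inside its state machine; strip_leading_lws is a hypothetical helper):

```ruby
# Strip leading linear white space (spaces and horizontal tabs) from a raw
# header value. Trailing bytes are left untouched.
def strip_leading_lws(raw)
  raw.sub(/\A[ \t]+/, "")
end
```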
2010-04-26 http: pedantic fix for trailer-less chunked requests
HTTP requests without trailers still need a CRLF after the last chunk, that is: it must end as: "0\r\n\r\n", not "0\r\n". So we'll always pretend there are trailers to parse for the sake of TeeInput. This is mostly a pedantic fix, as the two bytes in the socket buffer are unlikely to trigger protocol errors.
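A small Ruby sketch of the terminator check described above. It is a simplification that only inspects the end of an already de-framed buffer, and chunked_body_finished? is a hypothetical name:

```ruby
# True only when the buffer ends with the last-chunk line ("0" on its own
# line) followed by the blank line that closes the empty trailer section.
def chunked_body_finished?(buf)
  buf.match?(/(?:\A|\r\n)0\r\n\r\n\z/)
end
```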
2010-04-19 http: negative/invalid Content-Length raises exception
...instead of tripping an assertion. This fixes a potential denial-of-service for servers exposed directly to untrusted clients. This bug does not affect supported Unicorn deployments as Unicorn is only supported with trusted clients (such as nginx) on a LAN. nginx is known to reject clients that send invalid Content-Length headers, so any deployments on a trusted LAN and/or behind nginx are safe. Servers affected by this bug include (but are not limited to) Rainbows! and Zbatery. This does not affect Thin nor Mongrel which never got request body filtering treatment that the Unicorn HTTP parser got in August 2009.
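The behavior can be approximated in Ruby like this. ToyParserError and parse_content_length are hypothetical names; the actual check is in the C extension:

```ruby
# Hypothetical error class standing in for HttpParserError.
class ToyParserError < StandardError; end

# Accept only a run of ASCII digits; anything else (a sign, letters, an
# empty string) raises instead of being trusted.
def parse_content_length(value)
  unless value.match?(/\A\d+\z/)
    raise ToyParserError, "invalid Content-Length"
  end
  value.to_i
end
```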
2010-02-18 http: document CFLAGS used for development
this file may be sourced and used later, too
2010-02-18 http: const correctness fixes
Not fun, but maybe this can help us spot _real_ problems more easily in the future.
2010-02-18 http: cleanup globals and ABI namespace
* init_globals() is a static function, avoid conflicting with any potential libraries out there...
* mUnicorn and cHttpParser do not need to be static globals; they're not used outside of Init_unicorn_http().
2010-02-18 http: avoid signedness warnings
We never come close to the signed limits anywhere, so it should be safe either way, but make paranoid compiler settings less noisy if possible.
2010-02-13 http: fix memory leak exposed in concurrent servers
First off, this memory leak DOES NOT affect Unicorn itself. Unicorn allocates the HttpParser once and always reuses it in every sequential request. This leak affects applications which repeatedly allocate a new HTTP parser. Thus this bug affects _all_ deployments of Rainbows! and Zbatery. These servers allocate a new parser for every client connection. I misread the Data_Make_Struct/Data_Wrap_Struct documentation and ended up passing NULL as the "free" argument instead of -1, causing the memory to never be freed. From README.EXT in the MRI source which I misread:
> The free argument is the function to free the pointer allocation. If this is -1, the pointer will be just freed. The functions mark and free will be called from garbage collector.
2009-12-19 http: allow userinfo component in absoluteURIs
This is not explicitly specified or listed as an example in rfc2616. However, rfc2616 section 3.2.1 defers to rfc2396[1] for the definition of absolute URIs, so the userinfo component should be allowable, even if it does not make any sense. In the real world, previous versions of Mongrel used URI.parse() and thus allowed userinfo, so we also have precedent to allow userinfo to be compatible *in case* our interpretation of the RFCs is incorrect. This change is unfortunately needed because *occasionally* real clients rely on them. Reported-by: Scott Chacon [1] rfc3986 obsoletes rfc2396, but also includes userinfo
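For reference, Ruby's standard URI.parse accepts a userinfo component, as the message notes. The URL below is made up for illustration:

```ruby
require "uri"

# An absolute URI carrying a userinfo component before the host.
uri = URI.parse("http://user:secret@example.com/path?q=1")
userinfo = uri.userinfo
host = uri.host
```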
2009-12-06 http: PATH_INFO/REQUEST_PATH includes semi-colons
This is allowed according to RFC 2396, section 3.3 and matches the behavior of URI.parse, as well.
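The URI.parse behavior mentioned here can be checked directly. The URL is a made-up example:

```ruby
require "uri"

# Semicolon-delimited path parameters stay part of the path; only the
# query string is split off.
path = URI.parse("http://example.com/foo;jsessionid=abc/bar?x=1").path
```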
2009-11-21 http: Rubinius 0.13.0 compatibility fix
Rubinius appears to have changed semantics and our rb_str_set_len() emulation doesn't work anymore. Since rb_str_set_len() is just an optimized version of rb_str_resize(), we'll just use rb_str_resize() for now unless we notice something better. test_http_parser and test_http_parser_ng tests pass under Rubinius 0.13.0.
2009-11-04 http: allow headers/trailers to be written byte-wise
This allows clients to trickle headers and trailers. While Unicorn itself does not support slow clients for many reasons, this affects servers that depend on our parser like Rainbows!. This actually does affect Unicorn when handling trailers, but HTTP trailers are very rarely used in requests. Fortunately this stupid bug does not seem able to trigger out-of-bounds conditions.
2009-09-18 http: don't force -fPIC if it can't be used
Not everybody can use it, even if most of the world can.
2009-09-15 http: add #endif comment labels where appropriate
Sometimes I end up hacking on 10-row high terminals and need more context :x
2009-09-15 http: cleanup assertion for memoized header strings
assert_frozen() should not be checking what type of object it is, instead put an extra assertion in there to ensure we have a string.
2009-09-14 http: create a new string buffer on empty values
Since empty values on one line can be a heuristic to determine future lines are continuation lines (and as a result, a decently long header), pre-allocate a string buffer just in case. This is to work around what appears to be a bug in the Rubinius C API, but it could be considered (intended) DWIM behavior, too...
2009-09-14 http: use rb_str_{update,flush} if available
Rubinius supports these functions as of 039091066244cfcf483310b86b5c4989aaa6302b. This allows the test_http_parser_ng.rb test to run under Rubinius db612aa62cad9e5cc41a4a4be645642362029d20.
2009-09-14 http: compile with -fPIC
Rubinius doesn't seem to set this by default
2009-09-14 http: no-op rb_str_modify() for Rubies without it
Rubinius has no rb_str_modify() function, it /may/ not need it.
2009-09-14 http: define OFFT2NUM macro on Rubies without it
Hope they have the LL2NUM macro (Rubinius does)
2009-09-14 http: support Rubies without the OBJ_FROZEN macro
Rubinius does not support frozen objects, maybe other Rubies lack support for it as well.
2009-09-08 "encoding: binary" comments for all sources (1.9)
This ensures any string literals that pop up in *our* code will just be a bag of bytes. This shouldn't affect/fix/break existing apps in most cases, but most constants will always have the "correct" encoding (none!) to be consistent with HTTP/socket expectations. Since this comment affects things only on a per-source basis, it won't affect existing apps with the exception of strings we pass to the Rack application. This will eventually allow us to get rid of that Unicorn::Z constant, too.
2009-09-06 http: ignore Host: continuation lines with absolute URIs
This probably doesn't affect anyone with HTTP/1.1, but future versions of HTTP will use absolute URIs and maybe we'll eventually get clients that (mistakenly) send us Host: headers along with absolute URIs.
2009-09-06 http: rb_gc_mark already ignores immediates
No need to add an extra check, even if it does avoid a function call.
2009-09-06 http: NIL_P(var) instead of var == Qnil
This should be more in line with Ruby standards/coding style and probably more future-proof, as well.
2009-09-06 http: verbose assertions
This makes it easier for bug reporters to tell us what's wrong in case line numbers change.
2009-09-06 http: extra assertion when advancing p manually
Just in case, it'll be easier to track down if bugs pop up.
2009-09-06 http: remove needless goto
There's no need to use a goto here to avoid one level of nesting.
2009-09-06 http: use explicit elses for readability
This should make code easier to read and follow.
2009-09-06 http: refactor keepalive tracking to functions
In case we modify our struct to not use bitflags, this should make it easier to change the parser code. This also adds extra clarification for how we track keepalive and why we only do it for certain request methods.
2009-09-06 http: switch to macros for bitflag handling
These are similar to the macros found in MRI, and can more easily allow us to swap out the bitflags for real struct members...
2009-09-06 http: clarify the setting of the actual header in the hash
Avoid a negative conditional in the process and having an explicit else in there makes this piece easier to track. Also explain /why/ the Host: header can get ignored.
2009-09-06 http: cleanup and avoid potential signedness warning
Just pass the http_parser struct pointer when checking for invalid headers in the trailer. The compiler should be smart enough to inline and not relookup the flags. This avoids having to worry about the flags being signed or not (they should never be) and also makes it easier to maintain if we move away from using bitfields.
2009-09-03 http: add HttpParser#headers? method
This method determines if there are headers in the request. Simple HTTP/0.9 requests did not have headers in the request (and the responses we make should not have them, either).
2009-09-02 http: SERVER_PROTOCOL matches HTTP_VERSION
And it'll default to HTTP/0.9 if HTTP_VERSION is not specified (as version-less HTTP requests imply HTTP/0.9).
2009-09-01 http: support for simple HTTP/0.9 GET requests
HTTP/0.9 only supports GET requests and didn't require a version number in the request line. Additionally, only a single CRLF was required. Note: we don't correctly generate HTTP/0.9 responses, yet.
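A toy request-line parser showing the version-less case. It is a sketch under simplifying assumptions (single spaces, no validation of the URI), and parse_request_line is a hypothetical name unrelated to the Ragel grammar the real parser uses:

```ruby
# Parse "METHOD URI[ HTTP/x.y]\r\n". A missing version token is treated as
# HTTP/0.9, which is only accepted for GET.
def parse_request_line(line)
  m = %r{\A(\S+) (\S+)(?: (HTTP/\d+\.\d+))?\r\n\z}.match(line)
  return nil unless m
  return nil if m[3].nil? && m[1] != "GET"
  version = m[3] || "HTTP/0.9"
  {
    "REQUEST_METHOD"  => m[1],
    "REQUEST_URI"     => m[2],
    "HTTP_VERSION"    => version,
    "SERVER_PROTOCOL" => version,
  }
end
```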
2009-09-01 http: extension-methods allow any tokens
ref: rfc 2616, section 5.1.1 http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.1 Current version of Rack::Lint agrees with us, too. While I've yet to encounter actual usage of non-upper REQUEST_METHODs, we might as well support what Rack supports.
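The token rule from RFC 2616 section 2.2 (any US-ASCII character except control characters and separators) can be approximated with a character class. This is a sketch; valid_method? is a hypothetical helper and not part of Unicorn or Rack:

```ruby
# Characters permitted in an RFC 2616 "token": printable US-ASCII minus
# the separators ( ) < > @ , ; : \ " / [ ] ? = { } and whitespace.
TOKEN_RE = %r{\A[!$%&'*+\-.^_`|~0-9A-Za-z#]+\z}

def valid_method?(name)
  TOKEN_RE.match?(name)
end
```

Under this rule lowercase or hyphenated method names are acceptable, while names containing separators or spaces are not.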
2009-08-29 unicorn_http: "fix" const warning
neither buffer nor p should be const (since we modify buffer in $snake_upcase_char), but this is a much smaller change _for now_