Date | Commit message (Collapse) |
|
Ruby just recently bump the master version to 3.0.
This requirement bump is necessary to test unicorn
against ruby master.
[ew: wrap at <80 columns for hackers with poor eyesight]
Acked-by: Eric Wong <bofh@yhbt.net>
|
|
We need to favor "Transfer-Encoding: chunked" over
"Content-Length" in the request header if they both exist.
Furthermore, we now reject redundant chunking and cases where
"chunked" is not the final encoding.
We currently do not and have no plans to decode "gzip",
"deflate", or "compress" encoding as described by RFC 7230.
That's a job more appropriate for middleware, anyways.
cf. https://tools.ietf.org/html/rfc7230
https://www.rfc-editor.org/errata_search.php?rfc=7230
|
|
bogomips.org is due to expire, soon, and I'm not willing to pay
extortionist fees to Ethos Capital/PIR/ICANN to keep a .org. So
it's at yhbt.net, for now, but it will change again to
whatever's affordable... Identity is overrated.
Tor users can use .onions and kick ICANN to the curb:
torsocks w3m http://unicorn.ou63pmih66umazou.onion/
torsocks git clone http://ou63pmih66umazou.onion/unicorn.git/
torsocks w3m http://ou63pmih66umazou.onion/unicorn-public/
While we're at it, `s/news.gmane.org/news.gmane.io/g', too.
(but I suspect that'll need to be resynched since our mail
"List-Id:" header is changing).
|
|
Since Ruby 2.6, it's a documented part of the API and we may depend
on it: https://bugs.ruby-lang.org/issues/9894
It's been around since the early Ruby 1.9 days, and reduces
overhead compared to relying on rb_global_variable:
https://bogomips.org/unicorn-public/20170301002854.29198-1-e@80x24.org/
|
|
String#-@ deduplicates strings starting with Ruby 2.5.0
Hash#[]= deduplicates strings starting in Ruby 2.6.0-rc1
This allows us to save a small amount of memory by sharing
objects with other parts of the stack (e.g. Rack).
|
|
Hijackers may capture and reuse `env' indefinitely, so we must
not use it in those cases for future requests. For non-hijack
requests, we continue to reuse the `env' object to reduce
memory recycling.
Reported-and-tested-by: Sam Saffron <sam.saffron@gmail.com>
|
|
We need to add the array to ruby's global_list right after created it;
otherwise it probably gets GCed.
|
|
rb_global_variable registers the address of the variable which
refers to the object, instead of the object itself. This adds
extra overhead to each global variable for our case, where the
variable is frozen and never changed.
Given there are currently 59 elements in this array, this saves
58 singly-linked list entries and associated malloc calls and
associated overhead in the current mainline Ruby 2.x
implementation. On 64-bit GNU libc malloc, this is already
16 * 58 = 928 bytes; more than the extra object slot and array
slack space used by the new mark array.
Mainline Ruby 1.9+ currently has a rb_gc_register_mark_object
public function which would suite our needs, too, but it is
currently undocumented, and may not be available in the future.
|
|
While it is innocuous after compiling, it can be a confusing
source of errors for users with broken installations of Ruby
itself:
https://bogomips.org/unicorn-public/5ace6a20-e094-293d-93df-b557480e12d5@anyces.com/
https://bogomips.org/unicorn-public/02994a55-9c07-a3c5-f06b-a4c15551a67e@anyces.com/
rb_str_set_len has been provided since Ruby 1.8.7+, so we have
not needed it since we dropped all 1.8.x support in unicorn 5.x.
|
|
Hi all.
We're implementing client certificate authentication with nginx and
unicorn.
Nginx configured in the following way:
proxy_set_header X-SSL-Client-Cert $ssl_client_cert;
When client submits certificate and nginx passes it to the unicorn,
unicorn responds with 400 (Bad Request). This caused because nginx
doesn't use "\r\n" they using just "\n" and multilne headers is failed
to parse (I've added test).
Accorording to RFC2616 section 19.3:
https://www.w3.org/Protocols/rfc2616/rfc2616-sec19.html#sec19.3
"The line terminator for message-header fields is the sequence CRLF.
However, we recommend that applications, when parsing such headers,
recognize a single LF as a line terminator and ignore the leading CR."
CRLF changed to ("\r\n" | "\n")
Github commit
https://github.com/uno4ki/unicorn/commit/ed127b66e162aaf176de05720f6be758f8b41b1f
PS: Googling "nginx unicorn ssl_client_cert" shows the problem.
|
|
This provides some extra type safety if combined with other
C extensions, as well as allowing us to account for memory usage of
the HTTP parser in ObjectSpace.
This requires Ruby 1.9.3+ and has remained a stable API since
then. This will become officially supported when Ruby 2.3.0 is
released later this month.
This API has only been documented in doc/extension.rdoc (formerly
README.EXT) in the Ruby source tree since April 2015, r50318
|
|
They'll continue to be maintained, but we're no longer advertising
them. Also, favor lowercase "unicorn" while we're at it since that
matches the executable and gem name to avoid unnecessary escaping
for RDoc.
|
|
Combined with the previous commit to eliminate the `@socket'
instance variable, this eliminates the last instance variable
in the Unicorn::HttpRequest class.
Eliminating the last instance variable avoids the creation of a
internal hash table used for implementing the "generic" instance
variables found in non-pure-Ruby classes. Method entry overhead
remains the same.
While this change doesn't do a whole lot for unicorn memory usage
where the HttpRequest is a singleton, it helps other HTTP servers
which rely on this code where thousands of clients may be connected.
|
|
Calling the function directly avoids the overhead of Ruby method
table lookup and global method cache. The only downside is this is
now hidden from tracers and cannot be overridden from Ruby, but I
doubt anybody cares about that.
|
|
It was never used anywhere AFAIK and wastes precious bytes.
|
|
We use the `clear' method everywhere nowadays.
|
|
This allows requiring just the C extension part of "unicorn_http",
without requiring the rest of unicorn, allowing other HTTP servers
using the same parser to be slimmer.
On my x86-64 Debian 7.0 system:
text data bss dec hex filename
44026 1976 488 46490 b59a lib/unicorn_http.so
43930 1976 456 46362 b51a lib/unicorn_http.so
|
|
Tested on x86_64, clang version 3.5-1ubuntu1 (trunk) (LLVM 3.5)
These warnings were introduced on
commit 4b2782a926d8f131b1e7382be35e3abb77bf4be5
("http: reduce parser from 72 to 56 bytes on 64-bit")
and did not affect any releases.
These length checks should not be necessary in reality because
HTTP header sizes never come close to 4GB in size.
Fixup a minor coding style (inherited from Mongrel) violation
while we're at it (tabs => spaces).
|
|
This allows the parser struct to fit in one cache line on
x86-64 systems where cache lines are 64 bytes.
Using 32-bit integer lengths is safe here because these are only for
tracking offsets within the HTTP header buffer. We can safely limit
HTTP headers and in-memory buffers to be less than 4GB without
anybody complaining.
HTTP bodies continue to use off_t (usually 64-bit, even on 32-bit
systems) sizes and support as much as the OS/hardware can handle.
|
|
This was a hack for some event loops such as those found in nginx
and some Rainbows! concurrency models. Using epoll/kqueue with
one-shot notification (which yahns does) avoids all fairness
problems.
|
|
This has long been considered a mistake and not documented for very
long.
I considered removing X-Forwarded-Proto and X-Forwarded-SSL
handling, too, so rack.url_scheme is always "http", but that might
lead to compatibility issues in rare apps if Rack::Request#scheme is
not used.
|
|
There is currently no GPLv4, so this change has no effect at the
moment.
In case the GPLv4 arrives and I am not alive to approve/review it,
the lesser of evils is have give blanket approval of all future GPL
versions (as published by the FSF). The worse evil is to be stuck
with a license which cannot guarantee the Free-ness of this project
in the future.
This unfortunately means the FSF can theoretically come out with
license terms I do not agree with, but the GPLv2 and GPLv3 will
always be an option to all users.
|
|
This could allow servers with persistent connection support[1]
to support our check_client_connection in the future.
[1] - Rainbows!/zbatery, possibly others
|
|
Our rb_str_modify() became no-ops due to incomplete reverts
of workarounds for old Rubinius, causing rb_str_set_len to
fail with: can't set length of shared string (RuntimeError)
This bug was introduced due to improper workarounds for old
versions of Rubinius in 2009 and 2010:
commit 5e8979ad38efdc4de3a69cc53aea33710d478406
("http: cleanups for latest Rubinius")
commit f37c23704cb73d57e9e478295d1641df1d9104c7
("http: no-op rb_str_modify() for Rubies without it")
|
|
Extra pointers waste space in the DSO. Normally I wouldn't
care, but the string lengths are identical and this code
already made it into another project in this form.
size(1) output:
text data bss dec hex filename
before: 42881 2040 336 45257 b0c9 unicorn_http.so
after: 42499 1888 336 44723 aeb3 unicorn_http.so
ref: http://www.akkadia.org/drepper/dsohowto.pdf
|
|
This patch checks incoming connections and avoids calling the application
if the connection has been closed.
It works by sending the beginning of the HTTP response before calling
the application to see if the socket can successfully be written to.
By enabling this feature users can avoid wasting application rendering
time only to find the connection is closed when attempting to write, and
throwing out the result.
When a client disconnects while being queued or processed, Nginx will log
HTTP response 499 but the application will log a 200.
Enabling this feature will minimize the time window during which the problem
can arise.
The feature is disabled by default and can be enabled by adding
'check_client_connection true' to the unicorn config.
[ew: After testing this change, Tom Burns wrote:
So we just finished the US Black Friday / Cyber Monday weekend running
unicorn forked with the last version of the patch I had sent you. It
worked splendidly and helped us handle huge flash sales without
increased response time over the weekend.
Whereas in previous flash traffic scenarios we would see the number of
HTTP 499 responses grow past the number of real HTTP 200 responses,
over the weekend we saw no growth in 499s during flash sales.
Unexpectedly the patch also helped us ward off a DoS attack where the
attackers were disconnecting immediately after making a request.
ref: <CAK4qKG3rkfVYLyeqEqQyuNEh_nZ8yw0X_cwTxJfJ+TOU+y8F+w@mail.gmail.com>
]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
|
|
The previous REQUEST_PATH limit of 1024 is relatively small and
some users encounter problems with long URLs. 4K is a common
limit for PATH_MAX on modern GNU/Linux systems and REQUEST_PATH
is likely to translate to a filesystem path name.
Thanks to Nuo Yan <yan.nuo@gmail.com> and Lawrence Pit
<lawrence.pit@gmail.com> for their feedback on this issue.
ref: http://mid.gmane.org/CB935F19-72B8-4EC2-8A1D-5084B37C09F2@gmail.com
|
|
Existing license terms (Ruby-specific) and GPLv2 remain
in place, but GPLv3 is preferred as it helps with
distribution of AGPLv3 code and is explicitly compatible
with Apache License (v2.0).
Many more reasons are documented by the FSF:
https://www.gnu.org/licenses/quick-guide-gplv3.html
http://gplv3.fsf.org/rms-why.html
ref: http://thread.gmane.org/gmane.comp.lang.ruby.unicorn.general/933
|
|
RFC 2616 doesn't appear to allow most CTL bytes even though
Mongrel always did. Rack::Lint disallows 0..31, too, though we
allow "\t" (HT, 09) since it's LWS and allowed by RFC 2616.
|
|
Not all invocations of filter_body will trigger CoW on the
given destination string. We can also avoid an unnecessary
rb_str_set_len() in the non-chunked path, too.
|
|
Needless line noise, kgio doesn't support tainting anyways.
|
|
chunk_ready! was my original name for it, but I'm indecisive
when it comes to naming things.
|
|
This allows one to enter the dechunker without parsing
HTTP headers beforehand. Since we skipped header parsing,
trailer parsing is not supported since we don't know
what trailers might be (to our knowledge, nobody uses trailers
anyways)
|
|
copy-on-write behavior doesn't help you if your common
use case triggers copies.
|
|
Makes things easier-to-understand since it's based on memcpy()
|
|
Ruby 1.9.3dev (trunk) requires it if the string size
is unchanged.
|
|
RFC 2616, section 4.2:
> The field-content does not include any leading or trailing LWS:
> linear white space occurring before the first non-whitespace
> character of the field-value or after the last non-whitespace
> character of the field-value. Such leading or trailing LWS MAY be
> removed without changing the semantics of the field value. Any LWS
> that occurs between field-content MAY be replaced with a single SP
> before interpreting the field value or forwarding the message
> downstream.
|
|
Rainbows! wants to be able to lower this eventually...
|
|
Combines the following sequence:
http_parser.buf << socket.readpartial(0x4000)
http_parser.parse
Into:
http_parser.add_parse(socket.readpartial(0x4000))
It was too damn redundant otherwise...
|
|
There's an HTTP status code allocated for it in
<http://www.iana.org/assignments/http-status-codes>, so
return that instead of 400.
|
|
Just in case we have people that don't use DNS, we can support
folks who enter ugly IPv6 addresses...
IPv6 uses brackets around the address to avoid confusing
the colons used in the address with the colon used to denote
the TCP port number in URIs.
|
|
But allows small optimizations to be made to avoid
constant/instance variable lookups later :)
|
|
This can return a static string and be significantly
faster as it reduces object allocations and Ruby method
calls for the fastest websites that serve thousands of
requests a second.
It assumes the Ruby runtime is single-threaded, but that
is the case of Ruby 1.8 and 1.9 and also what Unicorn
is all about. This change is safe for Rainbows! under 1.8
and 1.9.
|
|
We do not link against any external libraries
|
|
We need to preserve our internal flags and only clear them on
HttpParser#parse. This allows the async concurrency models in
Rainbows! to work properly.
|
|
More config bloat, sadly this is necessary for Rainbows! :<
|
|
Evil clients may be exposed to the Unicorn parser via
Rainbows!, so we'll allow people to turn off blindly
trusting certain X-Forwarded* headers for "rack.url_scheme"
and rely on middleware to handle it.
|
|
rack.url_scheme handling and SERVER_{NAME,PORT} handling
each deserve their own functions.
|
|
The first value of X-Forwarded-Proto in rack.url_scheme should
be used as it can be chained. This header can be set multiple
times via different proxies in the chain, but consider the first
one to be valid.
Additionally, respect X-Forwarded-SSL as it may be passed with
the "on" flag instead of X-Forwarded-Proto.
ref: rack commit 85ca454e6143a3081d90e4546ccad602a4c3ad2e
and 35bb5ba6746b5d346de9202c004cc926039650c7
|
|
This limits the number of keepalive requests of a single
connection to prevent a single client from monopolizing server
resources. On multi-process servers (e.g. Rainbows!) with many
keepalive clients per worker process, this can force a client to
reconnect and increase its chances of being accepted on a
less-busy worker process.
This directive is named after the nginx directive which
is identical in function.
|