Date | Commit message |
|
Bad connections or dead upstreams cannot be solved by looking at a
backtrace, so avoid polluting logs with them and making other
problems less visible.
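
A minimal sketch of this logging policy, assuming a hypothetical
log_exception helper and an illustrative list of network errors
(neither is taken from the yahns source): connection-level failures
get a single line, anything else keeps its backtrace.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: log network errors as one line and include a
# backtrace only for unexpected exceptions.
def log_exception(logger, err)
  case err
  when Errno::ECONNRESET, Errno::EPIPE, Errno::ECONNREFUSED, Errno::ETIMEDOUT
    logger.error("#{err.class}: #{err.message}")
  else
    trace = Array(err.backtrace).join("\n")
    logger.error("#{err.class}: #{err.message}\n#{trace}")
  end
end

out = StringIO.new
log_exception(Logger.new(out), Errno::ECONNRESET.new)
puts out.string.lines.size # => 1
```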
|
|
We don't need optimized dispatch methods in cold code, so use
the more space-efficient "nil?" method dispatch to save us one
word per-call site for a rough total of 24 bytes saving.
|
|
We may set binary mode upon open by passing "b" in the mode string,
so avoid the extra method dispatch and bytecode/cache overhead that
entails.
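
A small sketch of the technique, assuming a throwaway file in a
temporary directory (the helper and file name are illustrative only):
the "b" in the mode string gives a binary-mode handle without a
separate IO#binmode call.

```ruby
require 'tmpdir'

# Open a file in binary mode in a single call by including "b" in the
# mode string, rather than calling IO#binmode after the open.
def open_binary(path)
  File.open(path, 'w+b')
end

Dir.mktmpdir do |dir|
  fp = open_binary(File.join(dir, 'example.bin')) # hypothetical file name
  begin
    puts fp.binmode? # => true
  ensure
    fp.close
  end
end
```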
|
|
The entire idea of a one-shot-based design is that all the mutual
exclusion is handled by the event dispatch mechanism (epoll or
kqueue) without burdening the user with extra locking. However, the
way the hijack works means we check the Rack env for the
'rack.hijack_io' key which is shared across requests and may
be cleared.
Ideally, this would not be a problem if the Rack dispatch allowed
returning a special value (e.g. ":ignore") instead of the normal
status-headers-body array, much like what the non-standard
"async.callback" API Thin started.
We could also avoid this problem by disallowing our "unhijack-ing"
of the socket but at a significant cost of crippling code
reusability, including that of existing middleware.
Thus, we allocate a new, empty request object here to avoid a TOCTTOU
in the following timeline:
original thread:                  | another thread
HttpClient#yahns_step             |
r = k.app.call(env = @hs.env)  # socket hijacked into epoll queue
<thread is scheduled away>        | epoll_wait readiness
                                  | ReqRes#yahns_step
                                  | proxy dispatch ...
                                  | proxy_busy_mod_done
************************** DANGER BELOW ********************************
                                  | HttpClient#yahns_step
                                  | # clears env
# sees empty env:                 |
return :ignore if env.include?('rack.hijack_io') |
In other words, we cannot ever touch the original env seen by the
original thread since it must see the 'rack.hijack_io' value because
both are operating in the same Yahns::HttpClient object. This will
happen regardless of GVL existence.
Avoiding errors like this is absolutely critical to every one-shot
based design.
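
The difference between clearing a shared env and allocating a new one
can be sketched in plain Ruby. SketchClient below is a hypothetical
stand-in, not the real Yahns::HttpClient, and the race is replayed
sequentially rather than with real threads.

```ruby
# Hypothetical stand-in for a client object whose env Hash is shared
# between the thread that dispatched the app and any later thread.
class SketchClient
  attr_reader :env

  def initialize
    @env = {}
  end

  # Unsafe: mutates the Hash the original thread still holds.
  def reset_by_clearing
    @env.clear
  end

  # Safe: the original thread keeps its own Hash untouched.
  def reset_by_replacing
    @env = {}
  end
end

unsafe = SketchClient.new
seen = unsafe.env                    # reference held by the original thread
seen['rack.hijack_io'] = :socket
unsafe.reset_by_clearing             # "another thread" reuses the client
puts seen.include?('rack.hijack_io') # => false, the hijack is missed

safe = SketchClient.new
seen = safe.env
seen['rack.hijack_io'] = :socket
safe.reset_by_replacing
puts seen.include?('rack.hijack_io') # => true, the hijack is still seen
```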
|
|
Upstreams may shut us down while we're writing a request body, so
attempt to forward any responses from the upstream back to the
client which may explain the rejection reason for giant uploads.
|
|
This giant method needs to spew an error response in case uploads
to the upstream fail, so ensure the local variable is defined early.
|
|
Not all backends are capable of generating chunked responses
(especially not to HTTP/1.0 clients) nor can they generate
the Content-Length (especially when gzipping), so they'll
close the socket to signal EOF instead.
|
|
kgio_writev returns nil on success instead of the number of bytes
written, so we must manually calculate the number of bytes written
instead :x
This is triggerable when sending giant chunked responses.
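
A sketch of the manual byte count, assuming a hypothetical
iov_bytesize helper (kgio itself is not loaded here): the total is the
sum of each buffer's bytesize, which differs from String#size for
multibyte data.

```ruby
# Total number of bytes in an array of strings, for use when the write
# call itself reports success without a byte count.
def iov_bytesize(bufs)
  bufs.inject(0) { |sum, buf| sum + buf.bytesize }
end

bufs = ["5\r\n", "hello", "\r\n"]
puts iov_bytesize(bufs) # => 10
```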
|
|
A dumb string comparison will do here, so there's no point
in paying the memory and CPU cost of a regexp match when we
already extracted the suffix from a header key.
|
|
hijack seems incompatible with many middlewares, so return a
wonky response tuple just in case...
|
|
Instance variable attr methods are cheaper and we can shove the
complexity down to tmpio by allowing it to accept a nil argument
for the temporary directory.
This adds 4 bytes to tmpio but removes over 1K in http_context
on a 32-bit system.
|
|
Literal regexps cost over 400 bytes of memory on x86-64 per-site,
so there's no point in using them to cause bloat at cold call sites
where runtime performance does not matter.
|
|
We cannot pass trailers from upstreams to HTTP/1.0 clients without
fully-buffering the response body AND trailers before forwarding the
response header.
Of course, one of the reasons yahns exists is to support lazy
buffering, so fully-buffering up front is wasteful and hurts
latency. So instead, degrade to 1.0 requests to upstreams for
HTTP/1.0 clients; this should prevent upstreams from sending
trailers in the first place.
HTTP/1.0 clients on Rails apps may suffer, but there probably
are not too many HTTP/1.0 clients out there.
|
|
Rack apps may (through a round-about way) send HTTP trailers
to HTTP/1.1 clients, and we need a way to forward those responses
through without losing the trailers.
|
|
We need to ensure more uncommon cases such as gigantic upstream
headers and truncated upstream responses are handled properly
and predictably.
|
|
We were incorrectly stashing the return value of detach_rbuf!
into the inter-thread buffer which is bound to the client.
|
|
This allows our reverse proxy to avoid having an inefficient 1:1
relationship between threads and upstream connections, reducing
memory usage when there are many upstream connections (possibly to
multiple backend machines).
|
|
This allows us to write chunked response bodies without extra
copying to clients which support streaming.
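
As a rough sketch of the idea, assuming a hypothetical chunk_iov
helper: each chunk is kept as an array of separate strings (size line,
payload, terminator) so a vectored write can send them without first
concatenating the payload into a new string.

```ruby
# Build the pieces of one HTTP/1.1 chunk as separate strings so they
# can be handed to a vectored write without being concatenated first.
def chunk_iov(body_part)
  ["#{body_part.bytesize.to_s(16)}\r\n", body_part, "\r\n"]
end

part = 'hello world'
iov = chunk_iov(part)
p iov                    # => ["b\r\n", "hello world", "\r\n"]
puts iov[1].equal?(part) # => true, the payload string is not copied
```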
|
|
We'll be supporting "un-hijacking" a client socket for proxy_pass
and we must preserve state for pipelined requests.
|
|
This should make it easier to track state for asynchronous
proxy_pass buffering.
|
|
This will allow us to write arrays for chunked output without
unnecessary data copies.
|
|
We will support "un-hijacking", so the repeated ep_insert/ep_remove
sequences in the kernel will get expensive and complicated for our
user-land code, too.
|
|
This will rely on rack.hijack in the future to support
asynchronous execution without tying up a thread when waiting
for upstreams. For now, this allows simpler code with fewer
checks and the use of monotonic time on newer versions of Ruby.
|
|
While rack.hijack usage during application dispatch normally
prevents yahns from writing an HTTP response out of the Rack
response array, this was not correctly prevented when the
application emitted a 100-continue response when the client
was too slow to read the 100-continue response without
triggering response buffering in the server.
This bug only affects exceedingly rare apps which rely on both
rack.hijack use during app dispatch _and_ emit 100-continue
responses, and even then it only affects slow clients which
refuse to read the 100-continue response sent by yahns without
blocking.
|
|
We probably do not want env["rack.input"] to become unusable
upon hijacking. Only drop the internal reference to it so
it can eventually become garbage-collected, but there's no
point in making env["rack.input"] unreadable.
|
|
No point in bloating our bytecode for single-use variables.
|
|
When inheriting sockets from the parent via YAHNS_FD, we must close
sockets ASAP if they are unconfigured in the child. This bug exists
in yahns (and not unicorn) because of the trickier shutdown routine
we do for blocking accept system calls to work reliably with the
threading support in mainline Ruby 2.x. This bug would not exist
in a purely C server using blocking accept, either.
|
|
This saves over 400 bytes on x86-64.
|
|
The monotonic clock is immune to time adjustments so it is not
thrown off by misconfigured clocks. Process.clock_gettime also
generates less garbage on 64-bit systems due to the use of Flonum.
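
A minimal sketch of measuring elapsed time this way, assuming a
hypothetical monotonic_now helper and Ruby 2.1 or later:

```ruby
# Current monotonic time in seconds as a Float; unaffected by
# adjustments to the wall clock.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

start = monotonic_now
sleep 0.01
elapsed = monotonic_now - start
puts elapsed >= 0 # => true
```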
|
|
Install workarounds for running with unreleased versions of unicorn
for now, as unicorn 5.x will be dropping many needless features.
|
|
We may not need this temporary file if we've flushed everything
out and entered bypass mode.
|
|
The state management has evolved slightly over time,
so update the comments to reflect that.
|
|
Since we only support 1.9.3+, io.stat.size may be simplified
to io.size to reduce allocations of File::Stat objects.
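
A quick sketch showing that both calls report the same value, using a
temporary file and a hypothetical sizes helper:

```ruby
require 'tempfile'

# Return the file size two ways: File#size asks directly, while
# File#stat allocates an intermediate File::Stat object first.
def sizes(io)
  [io.size, io.stat.size]
end

Tempfile.create('size-sketch') do |io| # hypothetical name prefix
  io.write('12345')
  io.flush # make sure buffered data is visible to stat
  p sizes(io) # => [5, 5]
end
```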
|
|
Otherwise, we may encounter too much log spam from ordinary
shutdown or malicious (or dumb) clients which send us invalid
data to an SSL port.
|
|
Not everybody needs to serve or even buffer to regular files,
so make sendfile optional to avoid the extra memory use and
relocations.
|
|
If we're streaming large files and sendfile fails (due to a
client aborting the connection), we need to ensure middleware
proxies are closed to ensure proper logging of a partial request.
This affects users of the "clogger" gem serving static files.
Unfortunately with clogger (or any Rack API-compliant middleware
using "to_path"), we still cannot log the amount of bytes
transferred for a static file.
|
|
The current CA model and code quality of OpenSSL have long put me off
from supporting TLS; however, efforts such as "Let's Encrypt"
and the fallout from Heartbleed give me hope for the future.
This implements, as much as possible, a "hands-off" approach to TLS
support via OpenSSL. This implementation allows us to shift
responsibility away from us to users and upstreams (the Ruby 'openssl'
extension maintainers, software packagers, and OpenSSL project itself).
This is also perhaps the easiest way for now for us, while being most
powerful for users. It requires users to configure their own OpenSSL
context object which we'll use as-is.
This context object is used as the :ssl_ctx parameter to the "listen"
directive in the yahns configuration file:
    require 'openssl' # we will not do this for the user, even
    ctx = OpenSSL::SSL::SSLContext.new
    # user must configure ctx here...
    listen 443, ssl_ctx: ctx
This way, in case we support GnuTLS or other TLS libraries, there'll
be less confusion as to what a user is actually using.
Note: this feature requires Ruby 2.1 and later for non-kgio
{read,write}_nonblock(.. exception: false) support.
|
|
We only need to open files with O_APPEND to allow appending to the
temporary buffer while leaving the read offset unchanged.
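
The append behavior can be sketched as follows, assuming a POSIX
system and a hypothetical append_after_rewind helper. With O_APPEND,
each write lands at the end of the file no matter where the file
offset was moved beforehand, so a reader that tracks its own offset
is not disturbed. How yahns tracks its read position is not shown
here.

```ruby
require 'tmpdir'

# Write, rewind the file offset, then write again on an O_APPEND
# descriptor. Returns the resulting file contents.
def append_after_rewind(path)
  File.open(path, File::RDWR | File::CREAT | File::APPEND) do |io|
    io.syswrite('abc')
    io.sysseek(0)      # move the file offset back to the start
    io.syswrite('def') # O_APPEND still writes at the end
  end
  File.binread(path)
end

Dir.mktmpdir do |dir|
  puts append_after_rewind(File.join(dir, 'wbuf-sketch')) # => abcdef
end
```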
|
|
This barely reduces garbage objects at startup, but less
garbage is usually better.
|
|
Ruby 2.1 optimizes String#freeze by deduplicating string literal
calls to freeze. Ruby 2.2 _may_ also optimize away allocations to
Hash#delete in the future. In any case, this is uncommon code and
not worth trading permanent space to reduce temporal garbage.
While this favors Ruby 2.1 and later, it remains completely
compatible with Ruby 1.9.3 and 2.0.0.
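
A small sketch of the deduplication, assuming MRI 2.1 or later (the
literal and method names are illustrative): two separate call sites
that freeze the same literal return the same object.

```ruby
# Two independent call sites freezing an identical string literal.
def frozen_a
  'rack.input'.freeze
end

def frozen_b
  'rack.input'.freeze
end

puts frozen_a.frozen?          # => true
puts frozen_a.equal?(frozen_b) # same object when literals are deduplicated
```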
|
|
Until we drop 1.9.3 support, we'll save some bytecode by using
[ :literal, :symbols, :in, :arrays ]
In 2.0.0 and later, we may use %i(terser syntax)
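
For reference, the two spellings produce equal arrays (the %i form
needs Ruby 2.0.0 or later):

```ruby
plain = [ :literal, :symbols, :in, :arrays ]
terse = %i(literal symbols in arrays)
puts plain == terse # => true
```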
|
|
We may not be able to support this in a more performant
way just yet. Since this was never documented, we'll remove
the current knobs for silently setting and ignoring it.
Users should use Unicorn::HttpParser.max_header_len= for now,
instead. We may change Unicorn::HttpParser in the future if enough
people care about making this functionality per-app.
|
|
Our kv_str method already fails if `$,' is a non-empty string.
Rack::Chunked and likely other middlewares fail when `$,' is
not empty, too, so supporting apps which set `$,' is probably
not feasible.
|
|
Replacing a Regexp argument to a rarely-called String#split with a
literal String can save some memory. Each removed Regexp memsize is
469 bytes on Ruby 2.1, and Ruby does not currently deduplicate
literal Regexps.
On Ruby 2.1, the "objspace" extension shows:
ObjectSpace.memsize_of(/,/) => 469
This is slightly smaller at 453 bytes on 2.2.0dev (r48474), and
these numbers do not include the 40-byte object overhead.
Nevertheless, this is a waste for non-performance-critical code
during the startup phase. Identical literal strings are
automatically deduplicated by Ruby 2.1, and add no additional
overhead because Rack (and likely some real apps) also includes
several instances of the literal "," byte.
We'll drop the unnecessary "encoding: binary" magic comment
in yahns/server.rb as well, as that file includes no literal
binary strings.
The downside of using a literal string argument in these cases is
a 40-byte object gets allocated on every call, but the affected
pieces of code are only called once in a process lifetime.
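
A sketch of the substitution, assuming a hypothetical split_list
helper: for a plain "," delimiter the String and Regexp arguments
split identically, and the "objspace" extension reports the size of a
Regexp literal. The reported size varies by Ruby version, so no
specific number is asserted here.

```ruby
require 'objspace'

# Split on a literal String instead of a Regexp literal.
def split_list(str)
  str.split(',')
end

list = 'a,b,c'
puts split_list(list) == list.split(/,/) # => true
puts ObjectSpace.memsize_of(/,/)         # varies by Ruby version
```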
|
|
HTTP headers are compared case-insensitively, so we must filter out
the Date header case-insensitively. Found via casual code
inspection, I doubt anybody sets a Date header in Rack apps.
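
A minimal sketch of a case-insensitive filter, assuming a hypothetical
without_date helper operating on a plain Hash of headers:

```ruby
# Drop any Date header regardless of how the app capitalized it.
def without_date(headers)
  headers.reject { |key, _| key.casecmp('date') == 0 }
end

headers = { 'DATE' => 'ignored', 'Content-Type' => 'text/plain' }
p without_date(headers).keys # => ["Content-Type"]
```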
|
|
String#clear has a simpler dispatch and requires no object
allocation. The only reason we used String#replace before
was because it was inherited from a Ruby 1.8-compatible
project where String#clear was not available.
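
A brief sketch of the two ways to empty a buffer in place; both keep
the same String object, and String#clear does so without needing an
argument:

```ruby
buf = String.new('partial request data')
id = buf.object_id
buf.clear
puts buf.empty?            # => true
puts buf.object_id == id   # => true, the same object is reused

old = String.new('partial request data')
old.replace('')            # older spelling, needs an extra argument
puts old == buf            # => true
```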
|
|
This speeds up searching during startup and prevents accidentally
misloading different, potentially incompatible versions of yahns
code.
|
|
We have kqueue support since yahns 1.2.0 (March 2014).
|
|
We should no longer need the HTTP parser or input body upon hijacking.
Remove references to them so the GC can clean those up. This relies
on the Rack application deleting "rack.input" from the Rack env,
too.
|
|
This bug is noticeable on an amd64 FreeBSD 9.2 VM, and possible under
Linux, too. This happens as a zero-copy sendfile implementation means
pages queued for transmission by the sendfile system call should not be
modified at any point after the sendfile syscall is made.
To prevent modification, we replace the temporary file with a new one.
This has a similar effect as truncate and can still prevent a dirty
flush in cases when a client consumes the response fast enough.
This reverts the misguided ade89b5142bedbcf07f38aa062bfdbfcb8bc48d3
commit ("wbuf: hack to avoid response corruption on FreeBSD")
Note: this bug was finally fixed because I finally noticed this flaw
in a different (non-Ruby, non-HTTP) server of mine.
|