Date | Commit message (Collapse) |
|
For some reason, I thought invalid descriptors passed to UNICORN_FD
would be automatically closed by the master process; but apparently
this hasn't been the case. On the other hand, this bug has been
around for over 6 years now and nobody noticed or cared enough to
tell us, so fixing it might break existing setups.
Since there may be users relying on this behavior, we cannot change
the behavior anymore; so update the documentation and write at test
to ensure we can never "fix" this bug at the expense of breaking
any working setups which may be out there.
Keep in mind that a before_exec hook may always be used to modify
the UNICORN_FD environment by setting the close_on_exec flag and
removing the appropriate descriptor from the environment.
I originally intended to add the ability to inherit new listeners
without a config file specification so systemd users can avoid
repeating themselves in the systemd and unicorn config files,
but apparently there is nothing to change in our code.
|
|
Re-enable and expand on the test case while we're at it for new
Rubies. The bug is now fixed in Ruby 2.3.0dev as of r51576. We
shall assume anybody running a pre-release 2.3.0 at this point is
running a fairly recent snapshot, so we won't bother doing a
finer-grained check in the test for an exact revision number.
|
|
Turns out ruby does have trouble emulating systemd, for now:
[ruby-core:69895] https://bugs.ruby-lang.org/issues/11336
When we re-enable this test, we'll only enable it for fixed Rubies.
The actual socket inheritance functionality works in any version of
Ruby, of course, it's just that emulating systemd won't work until
ruby-core fixes mainline Ruby.
|
|
systemd socket emulation shares FDs across execve, just like
the built-in SIGUSR2 upgrade process in unicorn. Thus it is
easy to support inheriting sockets from systemd.
Tested-by: Christos Trochalakis <yatiohi@ideopolis.gr>
|
|
This fixes a bug introduced in
commit fe83ead4eae6f011fa15f506cd80cb4256813a92
(GNUmakefile: fix clean gem build + reduce build cruft)
which broke clean Ruby installations without an existing
unicorn gem installed :x
|
|
This off-by-one error was incorrectly rejecting a line which
would've been readable without wrapping on an 80-column terminal.
|
|
This patch checks incoming connections and avoids calling the application
if the connection has been closed.
It works by sending the beginning of the HTTP response before calling
the application to see if the socket can successfully be written to.
By enabling this feature users can avoid wasting application rendering
time only to find the connection is closed when attempting to write, and
throwing out the result.
When a client disconnects while being queued or processed, Nginx will log
HTTP response 499 but the application will log a 200.
Enabling this feature will minimize the time window during which the problem
can arise.
The feature is disabled by default and can be enabled by adding
'check_client_connection true' to the unicorn config.
[ew: After testing this change, Tom Burns wrote:
So we just finished the US Black Friday / Cyber Monday weekend running
unicorn forked with the last version of the patch I had sent you. It
worked splendidly and helped us handle huge flash sales without
increased response time over the weekend.
Whereas in previous flash traffic scenarios we would see the number of
HTTP 499 responses grow past the number of real HTTP 200 responses,
over the weekend we saw no growth in 499s during flash sales.
Unexpectedly the patch also helped us ward off a DoS attack where the
attackers were disconnecting immediately after making a request.
ref: <CAK4qKG3rkfVYLyeqEqQyuNEh_nZ8yw0X_cwTxJfJ+TOU+y8F+w@mail.gmail.com>
]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
|
|
It's better to show errors and backtraces when stuff
breaks
|
|
The testcase for this was broken, too, so we didn't notice
this :<
Reported-by: ghazel@gmail.com on the Rainbows! mailing list,
http://mid.gmane.org/BANLkTi=oQXK5Casq9SuGD3edeUrDPvRm3A@mail.gmail.com
|
|
for i in `git ls-files '*.rb'`; do ruby -w -c $i; done
|
|
These nasty hacks were breaking Rubinius compatibility.
This can be further cleaned up, too.
|
|
We do an extra check in the application dispatch to ensure
ENV['PWD'] is set correctly to match Dir.pwd (even if the
string path is different) as this is required for Capistrano
deployments.
These tests should now pass under OSX where /var is apparently
a symlink to /private/var.
|
|
As of rbx commit cf4a5a759234faa3f7d8a92d68fa89d8c5048f72,
most of the issues uncovered in our test suite are fixed.
|
|
They cannot be worked around, but tickets have been filed
upstream (I still hate all bug trackers besides Debian's).
TCPServer.for_fd (needed for zero-downtime upgrades):
http://github.com/evanphx/rubinius/issues/354
UnixServer.for_fd (needed for zero-downtime upgrades):
http://github.com/evanphx/rubinius/issues/355
Signal handling behavior seems broken (OOM or segfaults):
http://github.com/evanphx/rubinius/issues/356
|
|
I'm not sure what I was smoking when I originally (and
knowingly) wrote the racy code.
|
|
This behavior change also means our grandparent (launched
from a controlling terminal or script) will wait until
the master process is ready before returning.
Thanks to IƱaki Baz Castillo for the initial implementations
and inspiration.
|
|
On heavily loaded machines, this test can take a while,
fortunately our test suite is parallelization-friendly.
|
|
Shells already expand '~' before the executables see it, and
relative paths inside symlinks can get set incorrectly to the
actual directory name, and not the (usually desired) symlink
name for things like Capistrano.
Since our paths are now unexpanded, we must now check the
"working_directory" directive and raise an error if the user
specifies the config file in a way that makes the config file
unreloadable.
|
|
The relative working_directory test runs so quickly
that the master may not even have signal handlers
setup by the time we're done with it. The proper
way would be to not start workers until the master
is ready, but that breaks some test cases horribly.
|
|
Prevent ourselves from breaking things in case applications
start depending on this.
|
|
It makes more sense this way since users usually expect config
file directives to be order-independent.
|
|
This basically a prettier way of saying:
Dir.chdir(Unicorn::HttpServer::START_CTX[:cwd] = path)
In the config file. Unfortunately, this is configuration
directive where order matters and you should specify it
before any other path[1] directives if you're using relative
paths (relative paths are not recommended anyways)
[1] pid, stderr_path, stdout_path
|
|
There's always been a small window of opportunity for a script
to do File.read(pid).to_i would cause File.read() to read an
empty file and return "". This closes that window while
hopefully retaining backwards compatibility...
We've always checked for dirname(pid) writability in
Configurator, so we can safely write to a temporary file in the
intended directory and then atomically rename() it to the
destination path.
|
|
When SIGHUP reloads the config, we didn't account for the case
where the listen socket was completely unspecified. Thus the
default listener (0.0.0.0:8080), did not get preserved and
re-injected into the config properly.
Note that relying on the default listen or specifying listeners
on the command-line means it's /practically/ impossible to
_unbind_ those listeners with a configuration file reload. We
also need to preserve the (unspecified) default listener across
upgrades that later result in SIGHUP, too; so the easiest way is
to inject the default listener into the command-line for
upgrades.
Many thanks to James Golick for reporting and helping me track
down the bug since this behavior is difficult to write reliable
automated tests for.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
|
|
Just to ensure we handle HUP correctly since preload_app
changes the behavior of HUP handling a bit.
|
|
This ensures any string literals that pop up in *our* code will
just be a bag of bytes. This shouldn't affect/fix/break
existing apps in most cases, but most constants will always have
the "correct" encoding (none!) to be consistent with HTTP/socket
expectations. Since this comment affects things only on a
per-source basis, it won't affect existing apps with the
exception of strings we pass to the Rack application.
This will eventually allow us to get rid of that Unicorn::Z
constant, too.
|
|
Followup to commit 7f74a16406c92c4362ac20af4ccb8bc821cf978b,
we want to ensure error messages do not get swallowed up into
/dev/null when daemonizing; so we defer the default redirects
to "/dev/null" to as late as possible.
|
|
Otherwise Ruby could get confused and not be able to reap the
process correctly (and thus wait a long time for timeout).
|
|
This allows dynamic tuning of the worker_processes count without
having to restart existing ones. This also allows
worker_processes to be set to a low initial amount in the config
file for low-traffic deployments/upgrades and then scaled up as
the old processes are killed off.
Remove the proposed reexec_worker_processes from TODO since this
is far more flexible and powerful.
This will allow not-yet-existent third-party monitoring tools to
dynamically change and scale worker processes according to site
load without increasing the complexity of Unicorn itself.
|
|
|
|
We don't (and won't ever) do log rotation within the process.
That's the job of logrotate and tools like that. We just
reopen logs like other reasonable daemons out there.
|
|
By reraising SignalException in workers. Since we just rely on
default signal handlers for the majority of signals now, ensure
those signals actually exit the process.
|
|
Instead of just worker.nr. This is a configuration file/API
change and will break existing configurations.
This allows worker.tempfile to be exposed to the hooks
so ownership changes can still happen on it.
On the other hand, I don't know of many people actually
using this feature (or Unicorn).
|
|
We need to ensure children are spawned by waiting
until the master is ready.
|
|
Sockets may be unintentionally unlinked on the filesystem.
When reloading our config, ensure that the socket exists
on the filesystem. If not, close the listener (since it's
unusable by outside apps) and reopen it.
|
|
Rather than blindly appending to our listener set
with every "listen" directive read in the config
file, reset our internal array.
Listeners specified on the command-line are always
preserved between config reloads.
|
|
We'll allow before_exec to override that setting, however.
There are cases where someone setting
Logger.new("/path/to/file") will create new file descriptors in
the master process. This will prevent FD leakage
and a test case (for Linux only) proves it.
|
|
|
|
|
|
|
|
They were non-conformant for the longest time
|
|
We need to ensure the QUIT signal to the old processes
are processed when fixing the config.
Additionally, the log rotation checker was not reliable
because the master log emitted a similar message to
the workers and we were not distinguishing between
them. Check for all 5 logs (1 master + 4 workers)
to be rotated.
|
|
We still need to support "listeners" for easy use of
command-line options, but folks using the config file should use
"listen" as it is more flexible.
|
|
Instead of having global options for all listeners,
make all socket options per-listener. This allows
reverse-proxies to pick different listeners to get
different options on different sockets.
Given a cluster of machines (10.0.0.1, 10.0.0.2, 10.0.0.3)
running Unicorn with the following config:
------------------ 8< ----------------
listen "/tmp/local.sock", :backlog => 1
listen "*:8080" # use the backlog=1024 default
------------------ 8< ----------------
It is possible to configure a reverse proxy to try to use
"/tmp/local.sock" first and then fall back to using the
TCP listener on port 8080 in a failover configuration.
Thus the nginx upstream configuration on 10.0.0.1 to
compliment this would be:
------------------ 8< ----------------
upstream unicorn_cluster {
# reject connections ASAP if we are overloaded
server unix:/tmp/local.sock;
# fall back to other machines in the cluster via "backup"
# listeners which have a large backlog queue.
server 10.0.0.2:8080 backup;
server 10.0.0.3:8080 backup;
}
------------------ 8< ----------------
This removes the global "backlog" config option which
was inflexible with multiple machines in a cluster
and exposes the ability to change SO_SNDBUF/SO_RCVBUF
via setsockopt(2) for the first time.
|
|
They're easier for me to type and read and just barely faster
when doing comparisons on.
|
|
Instead of rotating logs immediately when SIGUSR1 is caught,
defer it until the current client is processing is complete.
This allows multi-line log messages generated by apps to not be
broken up if SIGUSR1 is received while the app is running.
If we're sleeping inside IO.select, we close a pipe in the
exceptfds set to cause EBADF to be raised.
This also adds a small reliability improvement to test_exec
so we wait until signals are ready before sending USR1
to rotate logs.
|
|
The master _may_ run with different user/group/umask than the
workers. Since the logs were always created by the master
process, the master should rotate them first to ensure correct
ownership and permissions.
This way if the workers fail log rotation and die, they'll
be automatically respawned with the new logs in place.
|
|
Although I didn't like the idea initially, signal queueing
allows test_exec to run more reliably and the limited signal
queue size will prevent scary queued signal behavior.
Also, always wakeup the master immediately when CHLD is trapped
to reduce the performance impact of SIGHUP-based config
reloading. Combined with an extra check in test_exec, this
should make test_exec run much more reliably than before.
|
|
Hopefully this increases test reliability
|
|
This reverts commit 4414d9bf34f162a604d2aacc765ab1ca2fc90404.
|