unicorn.git - Rack HTTP server for Unix and fast clients

Date	Commit message
2021-10-04	use EPOLLEXCLUSIVE on Linux 4.5+
	While the capabilities of epoll cannot be fully exploited given our primitive design, avoiding thundering herd wakeups on larger SMP machines while below 100% utilization is possible with Linux 4.5+.

	With this change, only one worker wakes up per-connect(2) (instead of all of them via select(2)), avoiding the thundering herd effect when the system is mostly idle.

	Saturated instances should not notice the difference if they rarely had multiple workers sleeping in select(2). This change benefits non-saturated instances.

	With 2 parallel clients and 8 workers on a nominally (:P) 8-core CPU (AMD FX-8320), the uconnect.perl test script invocation showed a reduction from ~3.4s to ~2.5s when reading an 11-byte response body:

	  echo worker_processes 8 >u.conf.rb
	  bs=11 ruby -I lib -I test/ruby-2.5.5/ext/unicorn_http/ bin/unicorn \
	    test/benchmark/dd.ru -E none -l /tmp/u.sock -c u.conf.rb
	  time perl -I lib -w test/benchmark/uconnect.perl \
	    -n 100000 -c 2 /tmp/u.sock

	Times improve less as "-c" increases for uconnect.perl (system noise and timings are inconsistent). The benefit of this change should be more noticeable on systems with more workers (and more cores).

	I wanted to use EPOLLET (Edge-Triggered) to further reduce syscalls here (similar to the old select()-avoidance bet), but that would've either added too much complexity to deduplicate wakeup sources, or run into the same starvation problem we solved in April 2020[1].

	Since the kernel already has the complexity and deduplication built-in for Level-Triggered epoll support, we'll just let the kernel deal with it.

	Note: do NOT take this as an example of how epoll should be used in a sophisticated server. unicorn is primitive by design and cannot use threads nor handle multiple clients at once, thus it only uses epoll in this extremely limited manner.

	Linux 4.5+ users will notice a regression of one extra epoll FD per-worker and at least two epoll watches, so /proc/sys/fs/epoll/max_user_watches may need to be changed along with RLIMIT_NOFILE.

	This change has also been tested on Linux 3.10.x (CentOS 7.x) and FreeBSD 11.x to ensure compatibility with systems without EPOLLEXCLUSIVE.

	Various EPOLLEXCLUSIVE discussions over the years: https://yhbt.net/lore/lkml/?q=s:EPOLLEXCLUSIVE+d:..20211001&x=t&o=-1

	[1] https://yhbt.net/unicorn-public/CAMBWrQ=Yh42MPtzJCEO7XryVknDNetRMuA87irWfqVuLdJmiBQ@mail.gmail.com/
2021-10-04	worker_loop: get rid of select() avoidance hack
	It doesn't seem to do anything since commit 221340c4ebc15666 (prevent single listener from monopolizing a worker, 2020-04-16).
2021-10-04	http_server: get rid of Process.ppid check
	It's actually been unnecessary since commit 6f6e4115b4bb03e5 (rework master-to-worker signaling to use a pipe, 2013-12-09).
2021-09-26	drop unnecessary IO#close_on_exec=true assignment
	Ruby 2.0+ sets FD_CLOEXEC by default on all FDs.
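A quick way to observe this default, as a minimal sketch using only core IO methods (the helper name `cloexec_defaults` is made up for illustration):

```ruby
# Ruby 2.0+ marks newly created descriptors close-on-exec by default,
# so an explicit `io.close_on_exec = true` after creation is redundant.
def cloexec_defaults
  r, w = IO.pipe
  [r.close_on_exec?, w.close_on_exec?]
ensure
  [r, w].each { |io| io.close if io && !io.closed? }
end

p cloexec_defaults
```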
2021-09-26	drop Ruby 1.9.3 support, require 2.0+ for now
	Ruby 1.9.3 was released nearly a decade ago, so there are probably few (if any) legacy users left, and they can continue using old versions of unicorn. We'll be able to take advantage of some Ruby 2.0+-only features down the road (and hopefully 2.3+). Also, I no longer have an installation of Ruby 1.8 and getting it working probably isn't worth the effort, so 4.x support is gone.
2021-03-13	http_request: drop unnecessary #clear call
	Since we allocate a new request object for each request, the #clear call is now unnecessary.
2021-03-13	Allocate a new request for each client
	This removes the reuse of the parser between requests. Reusing it is risky when any other threads run within the unicorn process, including threads that run background tasks. If any other thread accidentally grabs hold of the request, it can modify things for the next request in flight. The downside is that we allocate more for each request, but that is worth the trade-off given the security risk we would otherwise carry of leaking incorrect data.
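The hazard can be illustrated with a small sketch. `SketchRequest`, `serve_reusing`, and `serve_fresh` are hypothetical stand-ins, not unicorn's classes; the point is only that a reused object hands every client the same Hash, while a fresh object per client does not.

```ruby
# Hypothetical stand-in for a per-client request/parser object.
class SketchRequest
  def initialize
    @env = {}
  end

  # Pretend to parse a request and return its env Hash.
  def read(path)
    @env["PATH_INFO"] = path
    @env
  end
end

# One object for every client: all returned envs are the same Hash,
# so anything still holding an old env sees the next client's data.
def serve_reusing(paths)
  req = SketchRequest.new
  paths.map { |path| req.read(path) }
end

# One object per client: each env stays independent.
def serve_fresh(paths)
  paths.map { |path| SketchRequest.new.read(path) }
end

p serve_reusing(%w[/a /b]).map { |env| env["PATH_INFO"] }
p serve_fresh(%w[/a /b]).map { |env| env["PATH_INFO"] }
```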
2020-12-09	Add rack.after_reply functionality
	This adds `rack.after_reply` functionality which allows rack middleware to pass lambdas that will be executed after the client connection has been closed. This was driven by a need to perform actions in a request that shouldn't block the request from completing but also don't make sense as background jobs. There is prior art for this in a few gems, and the same functionality exists in other rack-based servers (e.g. Puma). [ew: check if `env' is set in ensure statement] Acked-by: Eric Wong <e@80x24.org>
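A minimal sketch of how an app might use this. The `handle` method is a simplified, hypothetical stand-in for a server loop (it is not unicorn's implementation), and it assumes the server places an Array under `env["rack.after_reply"]`.

```ruby
AFTER_REPLY_LOG = []

# Rack app that queues work to run once the response has been sent.
AFTER_REPLY_APP = lambda do |env|
  env["rack.after_reply"] << -> { AFTER_REPLY_LOG << "cleanup for #{env['PATH_INFO']}" }
  [200, { "content-type" => "text/plain" }, ["ok\n"]]
end

# Hypothetical server loop: respond first, then run the queued callables.
def handle(app, path)
  env = { "PATH_INFO" => path, "rack.after_reply" => [] }
  status, _headers, body = app.call(env)
  # Stand-in for writing the response and closing the client socket.
  AFTER_REPLY_LOG << "sent #{status} #{body.join.strip}"
ensure
  env["rack.after_reply"].each(&:call) if env
end

handle(AFTER_REPLY_APP, "/report")
p AFTER_REPLY_LOG
```

The callables run in the worker after the reply, so the client does not wait on them, but the worker itself stays busy until they finish.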
2020-07-24	configurator: SIGHUP resets early_hints if unset
	If a user removes "early_hints" entirely from the config file, a SIGHUP needs to restore the default value. This is consistent with the behavior of all the other configuration variables. Cc: Jean Boussier <jean.boussier@gmail.com>
2020-07-16	Add early hints support
	While not part of the rack spec, this API is exposed by both puma and falcon, and Rails uses it when available. The 103 Early Hints response code is specified in RFC 8297.
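A rough sketch of the app-side usage, assuming the server exposes a callable under `env["rack.early_hints"]` as puma and falcon do. `format_early_hints` is a hypothetical helper that only illustrates what a 103 response looks like on the wire; it is not unicorn's code.

```ruby
# App side: send hints only when the server provides the callable.
EARLY_HINTS_APP = lambda do |env|
  hints = env["rack.early_hints"]
  hints.call({ "Link" => "</style.css>; rel=preload; as=style" }) if hints
  [200, { "content-type" => "text/html" }, ["<html>...</html>"]]
end

# Hypothetical server-side rendering of an informational 103 response.
def format_early_hints(headers)
  lines = headers.map { |key, value| "#{key}: #{value}\r\n" }.join
  "HTTP/1.1 103 Early Hints\r\n#{lines}\r\n"
end

wire = String.new
env = { "rack.early_hints" => ->(headers) { wire << format_early_hints(headers) } }
EARLY_HINTS_APP.call(env)
print wire
```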
2020-04-16	prevent single listener from monopolizing a worker
	In setups with multiple listeners, it's possible for our greedy select(2)-avoidance optimization to get pinned on a single, busy listener and starve the other listener(s). Prevent starvation by retrying the select(2)-avoidance optimization if and only if all listeners were active. This should have no effect on the majority of deployments with only a single listener. Thanks to Stan Hu for reporting and testing. Reported-by: Stan Hu <stanhu@gmail.com> Tested-by: Stan Hu <stanhu@gmail.com> Link: https://yhbt.net/unicorn-public/CAMBWrQ=Yh42MPtzJCEO7XryVknDNetRMuA87irWfqVuLdJmiBQ@mail.gmail.com/
2020-03-19	http: improve RFC 7230 conformance
	We need to favor "Transfer-Encoding: chunked" over "Content-Length" in the request header if they both exist. Furthermore, we now reject redundant chunking and cases where "chunked" is not the final encoding. We currently do not and have no plans to decode "gzip", "deflate", or "compress" encoding as described by RFC 7230. That's a job more appropriate for middleware, anyways. cf. https://tools.ietf.org/html/rfc7230 https://www.rfc-editor.org/errata_search.php?rfc=7230
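The rules above can be sketched in plain Ruby. `body_framing` is a hypothetical helper written for illustration (unicorn's real parser is a C extension), and it assumes header names have already been lowercased.

```ruby
# Decide how a request body is framed:
#   - "chunked" wins over Content-Length when both are present
#   - "chunked" must be the final transfer coding
#   - "chunked" may appear only once
def body_framing(headers)
  te = headers["transfer-encoding"]
  if te.nil?
    return headers.key?("content-length") ? :content_length : :none
  end

  codings = te.downcase.split(",").map(&:strip)
  raise ArgumentError, "chunked must be the final encoding" unless codings.last == "chunked"
  raise ArgumentError, "redundant chunking" if codings.count("chunked") > 1
  :chunked
end

p body_framing({ "transfer-encoding" => "chunked", "content-length" => "5" })
p body_framing({ "content-length" => "5" })
```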
2020-01-20	doc: s/bogomips.org/yhbt.net/g
	bogomips.org is due to expire soon, and I'm not willing to pay extortionist fees to Ethos Capital/PIR/ICANN to keep a .org. So it's at yhbt.net, for now, but it will change again to whatever's affordable... Identity is overrated.

	Tor users can use .onions and kick ICANN to the curb:

	  torsocks w3m http://unicorn.ou63pmih66umazou.onion/
	  torsocks git clone http://ou63pmih66umazou.onion/unicorn.git/
	  torsocks w3m http://ou63pmih66umazou.onion/unicorn-public/

	While we're at it, `s/news.gmane.org/news.gmane.io/g', too (but I suspect that'll need to be resynched since our mail "List-Id:" header is changing).
2019-12-11	tmpio: workaround File#path being tainted on unlink
	Ruby mistakenly taints the file path, causing File.unlink to fail: https://bugs.ruby-lang.org/issues/14485 Work around the Ruby bug by keeping the path as a local variable and passing that to File.unlink, instead of the return value of File#path. Link: https://bogomips.org/unicorn-public/CABg1sXrvGv9G6CDQxePDUqTe6N-5UpLXm7eG3YQO=dda-Cgg7A@mail.gmail.com/
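The shape of the workaround, as a sketch. `anonymous_tmpfile` and its file naming are invented for illustration and are not unicorn's TmpIO code; the only point is that the string passed to File.unlink is the local variable rather than the result of File#path.

```ruby
require "tmpdir"

# Create a temp file, unlink it right away, and keep using the open handle.
def anonymous_tmpfile
  path = File.join(Dir.tmpdir, "sketch-#{Process.pid}-#{rand(1 << 32)}")
  fp = File.open(path, File::RDWR | File::CREAT | File::EXCL, 0600)
  File.unlink(path)   # pass the local `path`, not fp.path
  fp
end

fp = anonymous_tmpfile
fp.write("hello")
fp.rewind
p fp.read
fp.close
```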
2019-05-03	Rescue failed pipe resizes due to permissions
	When running:

	```
	require 'kgio'
	require 'raindrops'
	F_SETPIPE_SZ = 1031 if RUBY_PLATFORM =~ /linux/

	Kgio::Pipe.new.each do |io|
	  io.close_on_exec = true
	  if defined?(F_SETPIPE_SZ)
	    begin
	      puts "setting"
	      io.fcntl(F_SETPIPE_SZ, Raindrops::PAGE_SIZE)
	    rescue Errno::EINVAL
	      puts "rescued"
	    rescue => e
	      puts ["FAILED HARD", e].inspect
	    end
	  end
	end
	```

	on a few servers to test some Unicorn boot failures I saw:

	```
	["FAILED HARD", #<Errno::EPERM: Operation not permitted>]
	```

	The `EPERM` error gets raised by the Linux kernel if:

	```
	(too_many_pipe_buffers_hard(pipe->user) || too_many_pipe_buffers_soft(pipe->user)) &&
	        !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)
	```

	Given that the resize is not strictly necessary, Unicorn should rescue the error and continue booting.
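A sketch of the resulting behavior using only core Ruby. `shrink_pipe` and the constant name are hypothetical; 1031 is the F_SETPIPE_SZ value from the script in the commit message, and 4096 is an assumed page size.

```ruby
# Only defined on Linux, mirroring the script in the commit message.
F_SETPIPE_SZ_SKETCH = 1031 if RUBY_PLATFORM =~ /linux/

# Try to shrink a pipe; treat failure as non-fatal because the resize
# is only an optimization.
def shrink_pipe(io, size = 4096)   # 4096: assumed page size
  return :unsupported unless defined?(F_SETPIPE_SZ_SKETCH)
  io.fcntl(F_SETPIPE_SZ_SKETCH, size)
  :resized
rescue Errno::EINVAL, Errno::EPERM
  :skipped
end

r, w = IO.pipe
p shrink_pipe(w)
[r, w].each(&:close)
```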
2018-10-18	doc: update more URLs to use HTTPS and avoid redirects
	Latency from redirects is painful, and HTTPS can protect privacy in some cases.
2018-09-21	Support default_middleware configuration option
	This allows for the equivalent of the -N/--no-default_middleware command line option to be specified in the configuration file so it doesn't need to be specified on the command line every time unicorn is executed. It explicitly excludes the use of -N/--no-default_middleware as an embedded configuration option in the rackup file, by ignoring the options after ARGV is parsed. In order to allow the configuration method to work, have the lambda that Unicorn.builder returns accept two arguments. Technically, only one argument is needed for the HttpServer instance, but I'm guessing if the lambda accepts a single argument, we expect that to be a rack application instead of a lambda that returns a rack application. The command line option to disable default middleware will take precedence over the unicorn configuration file option if both are present. For backwards compatibility, if the lambda passed to HttpServer accepts 0 arguments, then call it without arguments. [ew: fix precedence for arity checking in build_app! configurator: ensure -N is respected when set in command-line]
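In a unicorn configuration file, this would look roughly like the fragment below (the file name is a placeholder, and the fragment is only meaningful when loaded by unicorn):

```ruby
# unicorn.conf.rb (placeholder name)
# Equivalent to passing -N/--no-default_middleware on the command line.
default_middleware false
```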
2018-09-13	Make Worker#user support different process primary group and log file group
	Previously, Unicorn always used the process's primary group as the group of the log file. However, there are reasons to use a separate group for the log files, such as when you have many applications where each application uses its own user and primary group, but you want to be able to have a user read the log files for all applications. Some operating systems have a fairly small limit on the number of groups per user, and it may not be feasible to have a user be in the primary group for all applications.
2018-08-20	socket_helper: add hint for FreeBSD users for accf_http(9)
	Because I forget to load accf_http on new FreeBSD installs, too :x
2018-08-20	shrink pipes under Linux
	We have never had any need for pipes with the default 64K capacity on Linux. Our pipes are only used for tiny writes in signal handlers and to perform parent shutdown detection. With the current /proc/sys/fs/pipe-user-pages-soft default, only 1024 pipes can be created by an unprivileged user before Linux clamps down the pipe size to 4K (a single page) for newly-created pipes[1]. So avoid penalizing OTHER pipe users who could benefit from the increased capacity and use only a single page for ourselves. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pipe.c?h=v4.18#n642
2018-07-23	use IO#wait instead of kgio_wait_readable
	Slowly reducing dependencies on kgio. 'io/wait' is required by 'socket' these days, so it's no extra relocations for .so loading, either.
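For illustration, a readiness check built on the stdlib 'io/wait' extension instead of kgio. This sketch uses `IO#wait_readable` from the same library, and `readable_within?` is a hypothetical helper name.

```ruby
require "io/wait"

# Wait up to `timeout` seconds for `io` to become readable.
def readable_within?(io, timeout)
  !!io.wait_readable(timeout)
end

r, w = IO.pipe
p readable_within?(r, 0.01)   # nothing written yet
w.write("x")
p readable_within?(r, 1)
[r, w].each(&:close)
```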
2018-07-23	remove random seed reset atfork
	It's not unicorn 6, yet, but we dropped Ruby 1.8 support at unicorn 5. Stable Ruby 1.9+ releases have always reseeded the PRNG at fork.
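One way to observe the reseeding, as a sketch for Unix-like systems with fork(2). `child_and_parent_rand` is a hypothetical helper: if the PRNG were not reseeded at fork, the child and the parent would draw the same next value from the inherited state.

```ruby
def child_and_parent_rand
  rand                        # make sure the parent's PRNG is seeded before forking
  r, w = IO.pipe
  pid = fork do
    r.close
    w.write(rand(1 << 64).to_s)
    w.close
    exit!(0)                  # skip at_exit handlers inherited from the parent
  end
  w.close
  child = r.read.to_i
  r.close
  Process.wait(pid)
  [child, rand(1 << 64)]
end

child, parent = child_and_parent_rand
puts(child == parent ? "same sequence" : "reseeded")
```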
2018-05-01	quiet some mismatched indentation warnings
	Ruby trunk started warning about more mismatched indentations starting around r62836.
2017-12-16	avoid reusing env on hijack
	Hijackers may capture and reuse `env' indefinitely, so we must not use it in those cases for future requests. For non-hijack requests, we continue to reuse the `env' object to reduce memory recycling. Reported-and-tested-by: Sam Saffron <sam.saffron@gmail.com>
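The reuse policy can be sketched as follows. `SketchEnvCache` is a hypothetical class (not unicorn's code), and how a hijack is detected is left to the caller.

```ruby
class SketchEnvCache
  def initialize
    @env = {}
  end

  # Returns an empty Hash for the next request. The previous Hash is
  # recycled unless a hijacker may still be holding on to it.
  def next_env(previous_hijacked)
    @env = {} if previous_hijacked
    @env.clear
  end
end

cache = SketchEnvCache.new
first = cache.next_env(false)
first["PATH_INFO"] = "/socket"
second = cache.next_env(true)      # the previous request was hijacked
p first                            # still intact for the hijacker
p second.equal?(first)
```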
2017-11-16	require 'pp' if $DEBUG is set by Rack app
	While "unicorn -d" requires 'pp' when setting $DEBUG, we did not account for (rare) Rack applications setting $DEBUG at load time. Thanks-to: James P (Jim) Robinson Jr <James.Robinson3@Cigna.com>
2017-04-08	reduce method calls with String#start_with?
	Using String#start_with? at these three cold call sites reduces instruction sequence size by a few hundred bytes combined; this is possible since we no longer support Ruby 1.8.6. The "?/" shorthand is esoteric and no longer avoids allocation in Ruby 1.9+ (not that this is hot code).
2017-03-26	Check for Socket::TCP_INFO constant before trying to get TCP_INFO
	The ruby constant Socket::TCP_INFO is only defined if TCP_INFO is defined in C, so we can just check for the presence of that ruby constant instead of rescuing SocketError from the call to getsockopt.
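A minimal sketch of the constant check. `tcp_info_supported?` and `client_tcp_info` are hypothetical helper names, not unicorn methods.

```ruby
require "socket"

# Socket::TCP_INFO only exists when the C macro was available at build time.
def tcp_info_supported?
  Socket.const_defined?(:TCP_INFO)
end

# Returns a Socket::Option, or nil where TCP_INFO is unavailable,
# without relying on rescuing SocketError.
def client_tcp_info(sock)
  return nil unless tcp_info_supported?
  sock.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_INFO)
end

p tcp_info_supported?
```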
2017-03-24	Check for SocketError on first ccc attempt
	On OpenBSD, getsockopt(2) does not support TCP_INFO. With the current code, this results in a 500 for all clients if check_client_connection is enabled on OpenBSD. This patch rescues SocketError on the first getsockopt call, and if SocketError is raised, it doesn't check in the future. This should be the same behavior as if TCP_INFO was supported but inspect did not return a string in the expected format.
2017-03-24	doc: note after_worker_exit is also 5.3.0+
	Followup-to: 650e01ab0b118803486b56f3ee59521d59042dae ("doc: add version annotations for new features")
2017-03-23	doc: add version annotations for new features
	We will inevitably have people running old unicorn versions for many years to come; but they may be reading the latest documentation online. Annotate when the new features (will) appear to avoid misleading users on old versions.
2017-03-23	Merge remote-tracking branch 'origin/worker_exec'
	* origin/worker_exec:
	  Don't pass a block for fork when forking workers
	  Add worker_exec configuration option
2017-03-23	Merge branch 'ccc-tcp-v3'
	* ccc-tcp-v3:
	  test_ccc: use a pipe to synchronize test
	  http_request: support proposed Raindrops::TCP states on non-Linux
2017-03-23	http_server: initialize @pid ivar
	This quiets down warnings when run with '-w'
2017-03-23	input: update documentation and hide internals.
	rack 2.x exists nowadays and still allows rewindable input as an option, and we will still enable it by default to avoid breaking any existing code. Hide the internal documentation since we do not want people depending on unicorn internals; so there's no reason to confuse or overwhelm people with documentation about it. Arguably, TeeInput and StreamInput should not be documented publicly at all, but I guess that ship has long sailed...
2017-03-21	http_request: support proposed Raindrops::TCP states on non-Linux
	raindrops 0.18+ will have Raindrops::TCP state hash for portable mapping of TCP states to their respective numeric values. This was necessary because TCP state numbers (and even macro names) differ between FreeBSD and Linux (and possibly other OSes). Favor using the Raindrops::TCP state hash if available, but fall back to the hard-coded values since older versions of raindrops did not support TCP_INFO on non-Linux systems. While we're in the area, favor "const_defined?" over "defined?" to reduce the inline constant cache footprint for branches which are only evaluated once. Patches to implement Raindrops::TCP for FreeBSD are available at: https://bogomips.org/raindrops-public/20170316031652.17433-1-e@80x24.org/T/
2017-03-15	Merge remote-tracking branch 'origin/ccc-tcp-v3'
	* origin/ccc-tcp-v3:
	  http_request: reduce insn size for check_client_connection
	  support "struct tcp_info" on non-Linux and Ruby 2.2+
	  revert signature change to HttpServer#process_client
	  new test for check_client_connection
	  check_client_connection: use tcp state on linux
2017-03-14	doc: fix links to raindrops project
	bogomips.org is dropping prefixes to reduce subjectAltName bloat in TLS certificates.
2017-03-14	freebsd: avoid EINVAL when setting accept filter
	Accept filters can only be set on listen sockets, and it also fails with EINVAL if it's already set. Untested, but I suppose changing the accept filter on a listening socket is not supported, either; since that could affect in-flight sockets.
2017-03-14	http_request: reduce insn size for check_client_connection
	Unlike constants and instance variables, class variable access is not optimized in the mainline Ruby VM. Use a constant instead, to take advantage of inline constant caching. This further reduces runtime instruction size by avoiding a branch by allocating the Raindrops::TCP_Info object up front. This reduces the method size by roughly 300 bytes on 64-bit.
2017-03-13	Don't pass a block for fork when forking workers [worker_exec]
	This reduces the stack depth, making GC more efficient.
2017-03-10	Add worker_exec configuration option
	The worker_exec configuration option makes all worker processes exec after forking. This initializes the worker processes with separate memory layouts, defeating address space discovery attacks on operating systems supporting address space layout randomization, such as Linux, MacOS X, NetBSD, OpenBSD, and Solaris. Support for execing workers is very similar to support for reexecing the master process. The main difference is the worker's to_i and master pipes also need to be inherited after worker exec just as the listening sockets need to be inherited after reexec. Because execing workers is similar to reexecing the master, this extracts a couple of methods from reexec (listener_sockets and close_sockets_on_exec), so they can be reused in worker_spawn.
2017-03-08	oob_gc: rely on opt_aref_with optimization on Ruby 2.2+
	oob_gc probably isn't heavily used anymore, but maybe some Ruby 2.2+ users will benefit from this constant reduction. Followup-to: fb2f10e1d7a72e67 ("reduce constants and optimize for Ruby 2.2")
2017-03-08	support "struct tcp_info" on non-Linux and Ruby 2.2+
	Ruby 2.2+ can show "struct tcp_info" as a string via Socket::Option#inspect, and we can attempt to parse it out to extract the information we need. Parsing this string is inefficient, but does not depend on the ordering of the tcp_info struct.
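A sketch of the string-parsing approach. `tcp_state_from_inspect` is a hypothetical helper, and the sample strings are illustrative only; real Socket::Option#inspect output varies by OS and Ruby version.

```ruby
# Pull the TCP state name out of an inspect-style string. The \b anchor
# keeps "ca_state=" from matching, and the field order does not matter.
def tcp_state_from_inspect(str)
  str[/\bstate=(\S+)/, 1]
end

# Illustrative sample, not captured from a real socket.
sample = "#<Socket::Option: INET TCP INFO state=ESTABLISHED ca_state=Open retransmits=0>"
p tcp_state_from_inspect(sample)
```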
2017-03-08	revert signature change to HttpServer#process_client
	We can force kgio_tryaccept to return an internal class for TCP objects by subclassing Kgio::TCPServer. This avoids breakage in any unfortunate projects which depend on our undocumented internal APIs, such as gctools <https://github.com/tmm1/gctools>
2017-03-08	check_client_connection: use tcp state on linux [ccc-tcp-v2]
	* Use a frozen empty array and a class variable for TCP_Info to avoid garbage. As far as I can tell, this shouldn't result in any garbage on any requests (other than on the first request).
	* Pass listener socket to #read to only check the client connection on a TCP server.
	* Short circuit CLOSE_WAIT after ESTABLISHED since in my testing it's the most common state after ESTABLISHED; it makes the numbers un-ordered, though. But the comment should make it OK.
	* Definition of `check_client_connection` based on whether Raindrops::TCP_Info is defined, instead of the class variable approach.
	* Changed the unit tests to pass a `nil` listener.

	Tested on our staging environment, and still works like a dream. I should note that I got the idea behind this patch into Puma as well! https://github.com/puma/puma/pull/1227

	[ew: squashed in temporary change for oob_gc.rb, but we'll come up with a different change to avoid breaking gctools <https://github.com/tmm1/gctools>]

	Acked-by: Eric Wong <e@80x24.org>
2017-02-23	Add after_worker_ready configuration option [chroot]
	This adds a hook that is called after the application has been loaded by the worker process, directly before it starts accepting requests. This hook is necessary if your application needs to gain access to resources during initialization, and then drop privileges before serving requests. This is especially useful in conjunction with chroot support so the app can load all the normal ruby libraries it needs to function, and then chroot before accepting requests. If you are preloading the app, it's possible to drop privileges or chroot in after_fork, but if you are not preloading the app, the only way to currently do this is to override the private HttpServer#init_worker_process method, and overriding private methods is a recipe for future breakage if the internals are modified. This hook allows for such functionality to be supported and not break in future versions of Unicorn.
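In a unicorn configuration file, dropping privileges from this hook might look like the fragment below. It is a sketch: the user and group names are placeholders, and the fragment is only meaningful when loaded by unicorn.

```ruby
# unicorn.conf.rb (placeholder name)
after_worker_ready do |server, worker|
  # The app is already loaded here, so it is safe to give up privileges.
  server.logger.info("worker #{worker.nr} ready, dropping privileges")
  worker.user("app_user", "app_group")   # placeholder user and group
end
```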
2017-02-23	Add support for chroot to Worker#user
	Any chrooting would need to happen inside Worker#user, because you can't chroot until after you have parsed the list of groups, and you must chroot before dropping root privileges. chroot adds an extra layer of security, so that if the unicorn process is exploited, file system access is limited to the chroot directory instead of the entire file system.
2017-02-23	Fix code example in after_worker_exit documentation
	Fixes: 2af91a1fef70d654 ("Add after_worker_exit configuration option")
2017-02-21	Add after_worker_exit configuration option
	This option is executed in the master process following all worker process exits. It is most useful in the case where the worker process crashes the ruby interpreter, as the worker process may not be able to send error notifications appropriately.

	For example, let's say you have a specific request that crashes a worker process, which you expect to be due to an improperly programmed C extension. By modifying your worker to save request related data in a temporary file and using this option, you can get a record of what request is crashing the application, which will make debugging easier.

	Example:

	    after_worker_exit do |server, worker, status|
	      server.logger.info "worker #{status.success? ? 'exit' : 'crash'}: #{status}"

	      file = "request.#{status.pid}.txt"
	      if File.exist?(file)
	        do_something_with(File.read(file)) unless status.success?
	        File.delete(file)
	      end
	    end
2017-02-13	http_request: freeze constant strings passed IO#write
	This ensures we won't have duplicate objects in Ruby 2.0-2.4. For Ruby 2.5.0dev+, this avoids any duplicate cleanup introduced as of r57471: https://bugs.ruby-lang.org/issues/13085