Documentation updates, prep for 0.4.1 release

author: Eric Wong <normalperson@yhbt.net> 2009-04-01 03:35:47 -0700
committer: Eric Wong <normalperson@yhbt.net> 2009-04-01 03:35:47 -0700
commit: c0d79dbb2e5f0f23236c60a0e7c5bb92be2512aa (patch)
tree: d8c0651521f1730af82e885734d985e5158dd499
parent: f30bfcff2564d6114db9a44cccbad87863dcb913 (diff)
download: unicorn-c0d79dbb2e5f0f23236c60a0e7c5bb92be2512aa.tar.gz
7 files changed, 219 insertions, 69 deletions
diff --git a/.document b/.document
index 24d584f..432d4bb 100644
--- a/.document
+++ b/.document
@@ -1,6 +1,6 @@
  README
+PHILOSOPHY
  DESIGN
-CHANGELOG
  CONTRIBUTORS
  LICENSE
  SIGNALS
diff --git a/CHANGELOG b/CHANGELOG
index c4d1289..9089148 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,4 @@
+v0.4.1 - Rails support, per-listener backlog and {snd,rcv}buf
  v0.2.3 - Unlink Tempfiles after use (they were closed, just not unlinked)
  v0.2.2 - small bug fixes, fix Rack multi-value headers (Set-Cookie:)
  v0.2.1 - Fix broken Manifest that cause unicorn_rails to not be bundled
diff --git a/DESIGN b/DESIGN
index 288502b..8f8c63d 100644
--- a/DESIGN
+++ b/DESIGN
@@ -79,3 +79,7 @@
  * If the master process dies unexpectedly for any reason,
    workers will notice within :timeout/2 seconds and follow
    the master to its death.
+
+* There is never any explicit real-time dependency or communication
+  between the worker processes themselves nor to the master process.
+  Synchronization is handled entirely by the OS kernel.
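
Reviewer note: the DESIGN addition above says workers never talk to each other and that the kernel does the synchronization. A minimal sketch of that idea follows, assuming a plain TCP listener on localhost; the worker count, the reply format, and all variable names are hypothetical and are not taken from Unicorn's source. Each worker blocks in accept() on one inherited socket, and the kernel decides which worker gets each connection.

```ruby
require "socket"

# The listening socket is created once, before forking, so every
# worker inherits the same file descriptor.
listener = TCPServer.new("127.0.0.1", 0) # port 0: the kernel picks a free port
port = listener.addr[1]

worker_count = 2 # hypothetical value, for illustration only

worker_pids = Array.new(worker_count) do
  fork do
    # Workers share nothing but the socket. Whichever worker the kernel
    # wakes up in accept() serves the client; no worker-to-worker or
    # worker-to-master messages are exchanged.
    loop do
      client = listener.accept
      client.write("#{Process.pid}\n")
      client.close
    end
  end
end

# Act as a fast local client and record which process answered.
served_by = Array.new(6) do
  TCPSocket.open("127.0.0.1", port) { |sock| sock.gets.to_i }
end

# Shut the workers down and reap them.
worker_pids.each do |pid|
  Process.kill("TERM", pid)
  Process.wait(pid)
end
listener.close

puts "replies came from worker PIDs: #{served_by.uniq.inspect}"
```

How the replies are spread across workers is up to the kernel and may differ between runs; the sketch only shows that no userspace coordination is needed.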
diff --git a/PHILOSOPHY b/PHILOSOPHY
new file mode 100644
index 0000000..ce7763a
--- /dev/null
+++ b/PHILOSOPHY
@@ -0,0 +1,139 @@
+= The Philosophy Behind Unicorn
+
+Being a server that only runs on Unix-like platforms, Unicorn is
+strongly tied to the Unix philosophy of doing one thing and (hopefully)
+doing it well.  Despite using HTTP, Unicorn is strictly a _backend_
+application server for running Rack-based Ruby applications.
+
+== Avoid Complexity
+
+Instead of attempting to be efficient at serving slow clients, Unicorn
+relies on a buffering reverse proxy to efficiently deal with slow
+clients.
+
+Unicorn uses an old-fashioned preforking worker model with blocking I/O.
+Our processing model is the antithesis of more modern (and theoretically
+more efficient) server processing models using threads or non-blocking
+I/O with events.
+
+=== Threads and Events Are Hard
+
+...to many developers.  Reasons for this are beyond the scope of this
+document.  Unicorn avoids concurrency within each worker process so you
+have fewer things to worry about when developing your application.  Of
+course Unicorn can use multiple worker processes to utilize multiple
+CPUs or spindles.  Applications can still use threads internally, however.
+
+== Slow Clients Are Problematic
+
+Most benchmarks we've seen don't tell you this, and Unicorn doesn't
+care about slow clients... but <i>you</i> should.
+
+A "slow client" can be any client outside of your datacenter.  Network
+traffic within a local network is always faster than traffic that
+crosses outside of it.  The laws of physics do not allow otherwise.
+
+Persistent connections were introduced in HTTP/1.1 to reduce latency from
+connection establishment and TCP slow start.  They also waste server
+resources when clients are idle.
+
+Persistent connections mean one of the Unicorn worker processes
+(depending on your application, it can be very memory hungry) would
+spend a significant amount of its time idle keeping the connection alive
+<i>and not doing anything else</i>.  Being single-threaded and using
+blocking I/O, a worker cannot serve other clients while keeping a
+connection alive.  Thus Unicorn does not implement persistent
+connections.
+
+If your application responses are larger than the socket buffer or if
+you're handling large requests (uploads), worker processes will also be
+bottlenecked by the speed of the *client* connection.  You should
+not allow Unicorn to serve clients outside of your local network.
+
+== Application Concurrency != Network Concurrency
+
+Performance is asymmetric across the different subsystems of the machine
+and parts of the network.  CPUs and main memory can process gigabytes of
+data in a second; clients on the Internet are usually only capable of a
+tiny fraction of that.  Unicorn deployments should avoid dealing with
+slow clients directly and instead rely on a reverse proxy to shield it
+from the effects of slow I/O.
+
+== Improved Performance Through Reverse Proxying
+
+By acting as a buffer to shield Unicorn from slow I/O, a reverse proxy
+will inevitably incur overhead in the form of extra data copies.
+However, as I/O within a local network is fast (and faster still
+with local sockets), this overhead is negligible for the vast majority
+of HTTP requests and responses.
+
+The ideal reverse proxy complements the weaknesses of Unicorn.
+A reverse proxy for Unicorn should meet the following requirements:
+
+  1. It should fully buffer all HTTP requests (and large responses).
+     Each request should be "corked" in the reverse proxy and sent
+     as fast as possible to the backend Unicorn processes.  This is
+     the most important feature to look for when choosing a
+     reverse proxy for Unicorn.
+
+  2. It should spend minimal time in userspace.  Network (and disk) I/O
+     are system-level tasks and usually managed by the kernel.
+     This may change if userspace TCP stacks become more popular in the
+     future; but the reverse proxy should not waste time with
+     application-level logic.  These concerns should be separated.
+
+  3. It should avoid context switches and CPU scheduling overhead.
+     In many (most?) cases, network devices and their interrupts are
+     only handled by one CPU at a time.  It should avoid contention
+     within the system by serializing all network I/O into one (or few)
+     userspace processes.  Network I/O is not a CPU-intensive task and
+     it is not helpful to use multiple CPU cores (at least not for GigE).
+
+  4. It should efficiently manage persistent connections (and
+     pipelining) to slow clients.  If you care to serve slow clients
+     outside your network, then these features of HTTP/1.1 will help.
+
+  5. It should (optionally) serve static files.  If you have static
+     files on your site (especially large ones), they are far more
+     efficiently served with as few data copies as possible (e.g. with
+     sendfile() to completely avoid copying the data to userspace).
+
+nginx is the only (Free) solution we know of that meets the above
+requirements.
+
+Indeed, the author of Unicorn has deployed nginx as a reverse-proxy not
+only for Ruby applications, but also for production applications running
+Apache/mod_perl, Apache/mod_php and Apache Tomcat.  In every single
+case, performance improved because application servers were able to use
+backend resources more efficiently and spend less time waiting on slow
+I/O.
+
+== Worse Is Better
+
+Requirements and scope for applications change frequently and
+drastically.  Thus languages like Ruby and frameworks like Rails were
+built to give developers fewer things to worry about in the face of
+rapid change.
+
+On the other hand, stable protocols which host your applications (HTTP
+and TCP) only change rarely.  This is why we recommend you NOT tie your
+rapidly-changing application logic directly into the processes that deal
+with the stable outside world.  Instead, use HTTP as a common RPC
+protocol to communicate between your frontend and backend.
+
+In short: separate your concerns.
+
+Of course a theoretical "perfect" solution would combine the pieces
+and _maybe_ give you better performance at the end of the day, but
+that is not the Unix way.
+
+== Just Worse in Some Cases
+
+Unicorn is not suited for all applications.  Unicorn is optimized for
+applications that are CPU/memory/disk intensive and spend little time
+waiting on external resources (e.g. a database server or external API).
+
+Unicorn is highly inefficient for Comet/reverse-HTTP/push applications
+where the HTTP connection spends a large amount of time idle.
+Nevertheless, the ease of troubleshooting, debugging, and management of
+Unicorn may still outweigh the drawbacks for these applications.
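
Reviewer note: the argument in "Application Concurrency != Network Concurrency" can be made concrete with a little arithmetic. The sketch below assumes a hypothetical 1 MB response, a hypothetical 1 Mbit/s client link, and a hypothetical 1 Gbit/s LAN link to the reverse proxy; none of these figures come from the document. It ignores socket buffers, latency, and application time, and only estimates how long a blocking worker is tied up writing one response.

```ruby
# Seconds a single-threaded, blocking worker stays busy writing one
# response when the link to its peer is the bottleneck.
def seconds_to_send(response_bytes, bytes_per_second)
  response_bytes.to_f / bytes_per_second
end

response_bytes = 1_000_000   # hypothetical 1 MB response
slow_client    = 125_000     # hypothetical 1 Mbit/s client link, in bytes/s
local_proxy    = 125_000_000 # hypothetical 1 Gbit/s LAN link, in bytes/s

direct  = seconds_to_send(response_bytes, slow_client)
proxied = seconds_to_send(response_bytes, local_proxy)

puts format("direct to slow client: %.3f s per response", direct)
puts format("via buffering proxy:   %.3f s per response", proxied)
puts format("responses per worker-second: %.3f vs %.1f", 1 / direct, 1 / proxied)
```

Under these assumed numbers the worker is busy for 8 seconds when writing directly to the slow client and for 0.008 seconds when handing the response to a local proxy, which is the asymmetry the text describes.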
diff --git a/README b/README
index b53d7c6..4c7d1ab 100644
--- a/README
+++ b/README
@@ -1,36 +1,48 @@
-= Unicorn: UNIX + LAN/localhost-only fork of Mongrel
+= Unicorn: Unix + LAN/localhost-optimized fork of Mongrel
  
-Only run this behind a full-HTTP-request-buffering reverse proxy if
-you're serving slow clients.  That said, nginx is the only reverse
-proxy we know of that meets this requirement.
+Unicorn is designed to only serve fast clients.  See the PHILOSOPHY
+and DESIGN documents for more details regarding this.
  
  == Features
  
-* process management: Unicorn will reap and restart workers that
-  die because of broken apps and there is no need to manage
-  multiple processes yourself.
+* Built on the solid Mongrel code base and takes full advantage
+  of functionality exclusive to Unix-like operating systems.
  
-* does not care if your application is thread-safe or not, workers
+* Mostly written in Ruby; only the HTTP parser (stolen and trimmed
+  down from Mongrel) is written in C.  Unicorn is compatible with
+  both Ruby 1.8 and 1.9.
+
+* Process management: Unicorn will reap and restart workers that
+  die from broken apps.  There is no need to manage multiple processes
+  yourself.
+
+* Load balancing is done entirely by the operating system kernel.
+  Requests never pile up behind a busy worker.
+
+* Does not care if your application is thread-safe or not; workers
    all run within their own isolated address space and only serve one
    client at a time...
  
-* able to listen on multiple interfaces, including UNIX sockets,
-  each worker process can also bind to a private port via the
-  after_fork hook for easy debugging.
+* Supports all Rack applications, along with pre-Rack versions of
+  Ruby on Rails via a Rack wrapper.
  
-* supports all Rack applications
+* Builtin log rotation of all log files in your application via USR1
+  signal.
  
  * nginx-style binary re-execution without losing connections.
-  You can upgrade unicorn, your entire application, libraries
-  and even your Ruby interpreter as long as unicorn is
+  You can upgrade Unicorn, your entire application, libraries
+  and even your Ruby interpreter as long as Unicorn is
    installed in the same path.
  
  * before_fork and after_fork hooks in case your application
    has special needs when dealing with forked processes.
  
-* builtin log rotation via USR1 signal
+* Can be used with copy-on-write-friendly memory management
+  to save memory.
  
-* Ruby 1.9-compatible (at least the test cases all pass :>)
+* Able to listen on multiple interfaces including UNIX sockets;
+  each worker process can also bind to a private port via the
+  after_fork hook for easy debugging.
  
  == License
  
@@ -41,6 +53,8 @@ Mongrel is copyright 2007 Zed A. Shaw and contributors. It is licensed
  under the Ruby license and the GPL2. See the include LICENSE file for
  details.
  
+Unicorn is 100% Free Software.
+
  == Install
  
  The library consists of a C extension so you'll need a C compiler or at
@@ -74,45 +88,41 @@ your web browser and download the latest snapshot tarballs here:
  
  === non-Rails Rack applications
  
-Unicorn will look for the config.ru file used by rackup in APP_ROOT.
-Optionally, it can use a config file for unicorn-specific options
-specified by the --config-file/-c command-line switch.  See
-Unicorn::Configurator for the syntax of the unicorn-specific
-config options.
-
-In APP_ROOT, just run:
+In APP_ROOT, run:
  
    unicorn
  
-Unicorn should be capable of running most Rack applications.  Since this
-is a preforking webserver, you do not have to worry about thread-safety
-of your application or libraries. However, your Rack application may use
-threads internally (and should even be able to continue running threads
-after the request is complete).
-
  === for Rails applications (should work for all 1.2 or later versions)
  
  In RAILS_ROOT, run:
  
    unicorn_rails
  
+Unicorn will bind to TCP port 8080 on all interfaces by default.
+You may use the '-l/--listen' switch to bind to a different
+address:port or a UNIX socket.
+
+=== Configuration File(s)
+
+Unicorn will look for the config.ru file used by rackup in APP_ROOT.
+
+For deployments, it can use a config file for Unicorn-specific options
+specified by the --config-file/-c command-line switch.  See
+Unicorn::Configurator for the syntax of the Unicorn-specific options.
+The default settings are designed for maximum out-of-the-box
+compatibility with existing applications.
+
  Most command-line options for other Rack applications (above) are also
-supported.  The unicorn_rails launcher attempts to combine the best
-features of the Rails-bundled "script/server" with the "rackup"-like
-functionality of the `unicorn' launcher.
+supported.  Run `unicorn -h` or `unicorn_rails -h` to see command-line
+options.
  
  == Disclaimer
  
-There are only a few instances of Unicorn deployed anywhere in the
-world.  The only public site known to run Unicorn at this time is
-http://git.bogomips.org/cgit which runs Unicorn::App::ExecCgi to
-fork()+exec() cgit.
+Like the creatures themselves, production deployments of Unicorn are rare.
+There is NO WARRANTY whatsoever if anything goes wrong, but let us know and
+we'll try our best to fix it.
  
-Be one of the first brave guinea pigs to run it on your production site!
-Of course there is NO WARRANTY whatsoever if anything goes wrong, but
-let us know and we'll try our best to fix it.  Unicorn is still in the
-early stages and testing + feedback would be *greatly* appreciated;
-maybe you'll get Rainbows as a reward!
+Rainbows are NOT included.
  
  == Known Issues
  
diff --git a/SIGNALS b/SIGNALS
index b1a3141..671e8a5 100644
--- a/SIGNALS
+++ b/SIGNALS
@@ -61,27 +61,27 @@ The procedure is exactly like that of nginx:
     of unicorn running now, both of which will have workers servicing
     requests.  Your process tree should look something like this:
  
-   unicorn master (old)
-   \_ unicorn worker[0]
-   \_ unicorn worker[1]
-   \_ unicorn worker[2]
-   \_ unicorn worker[3]
-   \_ unicorn master
-      \_ unicorn worker[0]
-      \_ unicorn worker[1]
-      \_ unicorn worker[2]
-      \_ unicorn worker[3]
-
-4. You can now send WINCH to the old master process so only the new workers
+     unicorn master (old)
+     \_ unicorn worker[0]
+     \_ unicorn worker[1]
+     \_ unicorn worker[2]
+     \_ unicorn worker[3]
+     \_ unicorn master
+        \_ unicorn worker[0]
+        \_ unicorn worker[1]
+        \_ unicorn worker[2]
+        \_ unicorn worker[3]
+
+3. You can now send WINCH to the old master process so only the new workers
     serve requests.  If your unicorn process is bound to an interactive
     terminal, you can skip this step.  Step 5 will be more difficult but
     you can also skip it if your process is not daemonized.
  
-5. You should now ensure that everything is running correctly with the
+4. You should now ensure that everything is running correctly with the
     new workers as the old workers die off.
  
-6a. If everything seems ok, then send QUIT to the old master.  You're done!
+5. If everything seems ok, then send QUIT to the old master.  You're done!
  
-6b. If something is broken, then send HUP to the old master to reload
-    the config and restart its workers.  Then send QUIT to the new master
-    process.
+   If something is broken, then send HUP to the old master to reload
+   the config and restart its workers.  Then send QUIT to the new master
+   process.
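
Reviewer note: the renumbered steps 3 to 5 above can be scripted. The following is a minimal sketch, assuming the new master has already been started by the earlier steps of the procedure (not shown in this hunk); the method name finish_upgrade, its parameters, and the PIDs in the dry run are hypothetical. The signal sender is injectable so the sequence can be rehearsed without touching real processes; by default it is Ruby's Process.kill.

```ruby
# Carries out steps 3 to 5 of the upgrade procedure.
# new_workers_healthy is any callable returning true or false (step 4);
# how health is checked is left to the operator.
def finish_upgrade(old_master_pid, new_master_pid, new_workers_healthy,
                   signaller = Process.method(:kill))
  # Step 3: WINCH to the old master so only the new workers serve requests.
  # (The text notes this step can be skipped if the process is bound to
  # an interactive terminal; this sketch always sends it.)
  signaller.call("WINCH", old_master_pid)

  if new_workers_healthy.call
    # Step 5, success: retire the old master.
    signaller.call("QUIT", old_master_pid)
    :upgraded
  else
    # Step 5, failure: the old master reloads its config and restarts its
    # workers, then the new master is told to quit.
    signaller.call("HUP", old_master_pid)
    signaller.call("QUIT", new_master_pid)
    :rolled_back
  end
end

# Dry run with a recording lambda instead of Process.kill.
sent = []
record = lambda { |signal, pid| sent << [signal, pid] }

ok_result = finish_upgrade(1001, 2002, lambda { true }, record)
ok_signals = sent.dup

sent.clear
bad_result = finish_upgrade(1001, 2002, lambda { false }, record)
bad_signals = sent.dup

puts "#{ok_result}: #{ok_signals.inspect}"
puts "#{bad_result}: #{bad_signals.inspect}"
```

Passing the signaller as an argument keeps the ordering of signals easy to check before it is pointed at a production master.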
diff --git a/TODO b/TODO
index 9342cf1..2241ff9 100644
--- a/TODO
+++ b/TODO
@@ -1,7 +1,5 @@
  == 1.0.0
  
-  * integration tests for Rails 1.2.x-2.2.x and 2.3.x+
-
    * tests for preload_app boolean
  
    * reexec_worker_processes config option:
@@ -12,16 +10,8 @@
  
    * integration tests with nginx including bad client handling
  
-  * Unicorn philosophy documentation
-
-  * cleanup HttpParser handling of HTTP_BODY
-
-  * read CLI switches in config.ru at startup, not load time
-
    * tests for timeout
  
-  * QA behaviour on 1.9 (with Rails 2.3.x+)
-
    * manpages (why do so few Ruby executables come with proper manpages?)
  
  == 1.1.0
@@ -29,3 +19,9 @@
    * Transfer-Encoding: chunked request handling.  Testcase:
  
        curl -T- http://host:port/path < file_from_stdin
+
+  * code cleanups (launchers)
+
+  * Pure Ruby HTTP parser
+
+  * Rubinius support?