From c0d79dbb2e5f0f23236c60a0e7c5bb92be2512aa Mon Sep 17 00:00:00 2001
From: Eric Wong <normalperson@yhbt.net>
Date: Wed, 1 Apr 2009 03:35:47 -0700
Subject: Documentation updates, prep for 0.4.1 release

---
 PHILOSOPHY | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 139 insertions(+)
 create mode 100644 PHILOSOPHY

(limited to 'PHILOSOPHY')
diff --git a/PHILOSOPHY b/PHILOSOPHY
new file mode 100644
index 0000000..ce7763a
--- /dev/null
+++ b/PHILOSOPHY
@@ -0,0 +1,139 @@
+= The Philosophy Behind Unicorn
+
+Being a server that only runs on Unix-like platforms, Unicorn is
+strongly tied to the Unix philosophy of doing one thing and (hopefully)
+doing it well.  Despite using HTTP, Unicorn is strictly a _backend_
+application server for running Rack-based Ruby applications.
+
+== Avoid Complexity
+
+Instead of attempting to be efficient at serving slow clients, Unicorn
+relies on a buffering reverse proxy to efficiently deal with slow
+clients.
+
+Unicorn uses an old-fashioned preforking worker model with blocking I/O.
+Our processing model is the antithesis of more modern (and theoretically
+more efficient) server processing models using threads or non-blocking
+I/O with events.
+
+=== Threads and Events Are Hard
+
+...to many developers.  The reasons for this are beyond the scope of this
+document.  Unicorn avoids concurrency within each worker process so you
+have fewer things to worry about when developing your application.  Of
+course Unicorn can use multiple worker processes to utilize multiple
+CPUs or spindles.  Applications can still use threads internally, however.
+
+== Slow Clients Are Problematic
+
+Most benchmarks we've seen don't tell you this, and Unicorn doesn't
+care about slow clients... but <i>you</i> should.
+
+A "slow client" can be any client outside of your datacenter.  Network
+traffic within a local network is always faster than traffic that
+crosses outside of it.  The laws of physics do not allow otherwise.
+
+Persistent connections were introduced in HTTP/1.1 to reduce latency from
+connection establishment and TCP slow start.  They also waste server
+resources when clients are idle.
+
+Persistent connections mean one of the Unicorn worker processes
+(depending on your application, it can be very memory hungry) would
+spend a significant amount of its time idle keeping the connection alive
+<i>and not doing anything else</i>.  Being single-threaded and using
+blocking I/O, a worker cannot serve other clients while keeping a
+connection alive.  Thus Unicorn does not implement persistent
+connections.
+
+If your application responses are larger than the socket buffer or if
+you're handling large requests (uploads), worker processes will also be
+bottlenecked by the speed of the *client* connection.  You should
+not allow Unicorn to serve clients outside of your local network.
+
+== Application Concurrency != Network Concurrency
+
+Performance is asymmetric across the different subsystems of the machine
+and parts of the network.  CPUs and main memory can process gigabytes of
+data in a second; clients on the Internet are usually only capable of a
+tiny fraction of that.  Unicorn deployments should avoid dealing with
+slow clients directly and instead rely on a reverse proxy to shield them
+from the effects of slow I/O.
+
+== Improved Performance Through Reverse Proxying
+
+By acting as a buffer to shield Unicorn from slow I/O, a reverse proxy
+will inevitably incur overhead in the form of extra data copies.
+However, as I/O within a local network is fast (and faster still
+with local sockets), this overhead is negligible for the vast majority
+of HTTP requests and responses.
+
+The ideal reverse proxy complements the weaknesses of Unicorn.
+A reverse proxy for Unicorn should meet the following requirements:
+
+  1. It should fully buffer all HTTP requests (and large responses).
+     Each request should be "corked" in the reverse proxy and sent
+     as fast as possible to the backend Unicorn processes.  This is
+     the most important feature to look for when choosing a
+     reverse proxy for Unicorn.
+
+  2. It should spend minimal time in userspace.  Network (and disk) I/O
+     are system-level tasks and usually managed by the kernel.
+     This may change if userspace TCP stacks become more popular in the
+     future; but the reverse proxy should not waste time with
+     application-level logic.  These concerns should be separated.
+
+  3. It should avoid context switches and CPU scheduling overhead.
+     In many (most?) cases, network devices and their interrupts are
+     only handled by one CPU at a time.  It should avoid contention
+     within the system by serializing all network I/O into one (or few)
+     userspace processes.  Network I/O is not a CPU-intensive task and
+     it is not helpful to use multiple CPU cores (at least not for GigE).
+
+  4. It should efficiently manage persistent connections (and
+     pipelining) to slow clients.  If you care to serve slow clients
+     outside your network, then these features of HTTP/1.1 will help.
+
+  5. It should (optionally) serve static files.  If you have static
+     files on your site (especially large ones), they are far more
+     efficiently served with as few data copies as possible (e.g. with
+     sendfile() to completely avoid copying the data to userspace).
+
+nginx is the only (Free) solution we know of that meets the above
+requirements.
+
+Indeed, the author of Unicorn has deployed nginx as a reverse-proxy not
+only for Ruby applications, but also for production applications running
+Apache/mod_perl, Apache/mod_php and Apache Tomcat.  In every single
+case, performance improved because application servers were able to use
+backend resources more efficiently and spend less time waiting on slow
+I/O.
+
+== Worse Is Better
+
+Requirements and scope for applications change frequently and
+drastically.  Thus languages like Ruby and frameworks like Rails were
+built to give developers fewer things to worry about in the face of
+rapid change.
+
+On the other hand, stable protocols which host your applications (HTTP
+and TCP) change only rarely.  This is why we recommend you NOT tie your
+rapidly-changing application logic directly into the processes that deal
+with the stable outside world.  Instead, use HTTP as a common RPC
+protocol to communicate between your frontend and backend.
+
+In short: separate your concerns.
+
+Of course a theoretical "perfect" solution would combine the pieces
+and _maybe_ give you better performance at the end of the day, but
+that is not the Unix way.
+
+== Just Worse in Some Cases
+
+Unicorn is not suited for all applications.  Unicorn is optimized for
+applications that are CPU/memory/disk intensive and spend little time
+waiting on external resources (e.g. a database server or external API).
+
+Unicorn is highly inefficient for Comet/reverse-HTTP/push applications
+where the HTTP connection spends a large amount of time idle.
+Nevertheless, the ease of troubleshooting, debugging, and management of
+Unicorn may still outweigh the drawbacks for these applications.
-- 
cgit v1.2.3-24-ge0c7