unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
From: Eric Wong <e@80x24.org>
To: unicorn-public@bogomips.org
Subject: Re: WTF is up with memory usage nowadays?
Date: Wed, 8 Feb 2017 20:00:51 +0000	[thread overview]
Message-ID: <20170208200050.GA10723@whir> (raw)
In-Reply-To: <20161212021000.GA15226@untitled>

Eric Wong <e@80x24.org> wrote:
>   The Rack response body only needs to respond to #each.
>   There should be no reason to build giant response
>   documents in memory before sending them to a client.
>   unicorn can't do the following for you automatically since
>   we don't know how/if a Rack app will reuse a string;
>   but upstack authors can String#clear after yielding
>   in #each to ensure any malloced heap memory is immediately
>   available for future use (but beware of downstream middlewares
>   which do not expect this, too(**)):
>     def each
>       # .. do something to generate a giant string
>       yield giant_string

... The yield above is so unicorn (or any server) can call
IO#write or similar (send(,_nonblock), write_nonblock, etc...).

That means once IO#write is complete, the contents of the string
is shipped off to the OS TCP stack and Ruby can forget about it:

>       giant_string.clear # String#clear
>     end

However, this is largely ineffective since Ruby 2.0.0 - 2.4.0
has a thread-safety fix which causes excessive garbage:


And it looks like that went unnoticed for a few years...
Shame on all of us! :<

Anyways, it looks like an acceptable fix finally got accepted
for Ruby 2.5 (December 2017): https://bugs.ruby-lang.org/issues/13085

I've considered backporting a workaround into unicorn; but I'm
leaning against it since most apps do not recycle buffers at the
moment, so they make a lot of garbage, anyways.

Another place unicorn uses IO#write is buffering large files in
the TeeInput class; but I'm not sure if enough people care about
large uploads.  Embarrassingly, most of the large I/O apps I
maintain are still on 1.9.3, so I did not notice it :x
In any case, anybody running an up-to-date trunk or willing
to wait until Ruby 2.5 won't have to worry or think about this.

But yeah, all of this means there's still a both a runtime and
code complexity cost for supporting 1:1 native threads (at least
the way MRI does it).  This cost is there regardless of whether
or not the code you run uses threads.

      parent reply	other threads:[~2017-02-08 20:00 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-12  2:10 WTF is up with memory usage nowadays? Eric Wong
2016-12-12  4:05 ` Sam Saffron
2016-12-12  5:48   ` Eric Wong
2016-12-12  9:49 ` hukl
2017-02-08 20:00 ` Eric Wong [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

  List information: https://yhbt.net/unicorn/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170208200050.GA10723@whir \
    --to=e@80x24.org \
    --cc=unicorn-public@bogomips.org \


* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).