From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <e@80x24.org>
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level: 
X-Spam-ASN:  
X-Spam-Status: No, score=-4.1 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,
	T_SCC_BODY_TEXT_LINE shortcircuit=no autolearn=ham autolearn_force=no
	version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 38EC11F403;
	Thu,  7 Jul 2022 10:23:30 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=80x24.org;
	s=selector1; t=1657189410;
	bh=gkh0wfCZ7ypdbMksajHOjjvfL3oxHbd4u1JM/vGsSBw=;
	h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
	b=PcVv0Di/XyiDDeYfaw92dTy4cM4KVR/iT+/NXBOZi9p4AZIoMaQCQtvAZsWF1UVco
	 PN3jDykVA1ZWfxMJma5DxEyAzRppTnjUOuNNXDeJm1JPsN9m5tUFW3y4K+ikhSiMR6
	 YLLKF2BqGdRW7/WH7pBCvIocdCdCQvgS3m+KNJcc=
Date: Thu, 7 Jul 2022 10:23:30 +0000
From: Eric Wong <e@80x24.org>
To: Jean Boussier <jean.boussier@shopify.com>
Cc: unicorn-public@yhbt.net
Subject: Re: [PATCH] Master promotion with SIGURG (CoW optimization)
Message-ID: <20220707102330.M110641@dcvr>
References: <b9a7c1b1-0242-af8c-71a2-4d659ab4647d@shopify.com>
 <20220706023352.M393316@dcvr>
 <CANPRWbHmZz42YLVBDfngU20ODfdVytQtv9LQ-nVn3b8bMP0c1w@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <CANPRWbHmZz42YLVBDfngU20ODfdVytQtv9LQ-nVn3b8bMP0c1w@mail.gmail.com>
List-Id: <unicorn-public.yhbt.net>

Jean Boussier <jean.boussier@shopify.com> wrote:
> > OK, any numbers from Puma users which can be used to project
> > improvements for unicorn (and other servers)?
> 
> There are some user reports here: https://github.com/puma/puma/issues/2258
> but they are mixed in reports for two other new featurs.
> 
> Some reports are up to 20-30% savings, but I'd expect unicorn to benefit
> even more from it, given that typical puma users spawn less processes
> than with unicorn.

OK, those numbers sound very good (but I can't view
graphics/screenshots from w3m)

> > > They also include automatic reforking after a predetermined amount
> > > of requests (1k by default).
> >
> > 1k seems like a lot of requests (more on this below)
> 
> Agreed. Shared pages start being invalidated much faster than that.
> 
> > > - master: on `SIGURG`
> > >   - Forward `SIGURG` to a chosen worker.
> >
> > OK, I guess SIGURG is safe to use since nobody else relies on it (right?).
> 
> That's my understanding, an alternative could be to re-use USR2 and have a
> config flag to define wether it is a rexec or refork.

SIGURG is fine, having SIGUSR2 mean two things seems fragile and error-prone

> > Right.  PID file races and corner cases were painful to deal
> > with in the earliest days of this project, and I don't look
> > forward to having to deal with them ever again...
> >
> > It looked like the world was moving towards socket-activation
> > with dedicated process managers and away from fragile PID files;
> > so being self-daemonization dependent seems like a step back.
> 
> Agreed. We're currently running unicorn as PID 1 inside containers
> and I'm not exactly looking forward to have to minitor a PID file.
> 
> Another avenue I explored was to keep the existing master and
> refork from one of the worker like puma, but re-assign the parent
> to the orignal master using PR_SET_CHILD_SUBREAPER.
> 
> Here's my notes of how it works:
> 
> ### Requirements:
> 
> - Need `PR_SET_CHILD_SUBREAPER`, so Linux 3.4+ only (May 2012)

I'm fine with requiring Linux for this feature, existing
features need to continue working on *BSD...

> - Need `File.mkfifo`, so Ruby 2.3 (Dec 2015), but can be shimed for older
>   rubies.

I don't think it's necessary to use FIFOs at all (see below)

> ### Flow:
> 
> - master: set `PR_SET_CHILD_SUBREAPER` during boot.
> - master: create a temporary directory (`$TMP`).
> - master: spawn initial workers the old way.
> 
> - master: on `SIGCHLD` (a child died):
>   - Fake signal oldest worker (`SIGURG`).
>   - Write the new worker number in the fake signal pipe (at the same time).
> 
> - worker: on `SIGURG` (should spawn a sibling):
>   - Note: worker fake signals are processed after the current request is
>     completed.
>   - Run `before_fork` callback (MAYBE: need a special `before_refork`
>     callback?)
>   - create a pipe.
>   - daemonize (fork twice so that the new process is attributed to the
>     nearest `CHILD_SUBREAPER`, aka the original master).
>   - wait for the first child to exit.
>   - write into the pipe to signal the parent is dead.
>   - Run `after_fork` callback (MAYBE: need a special `after_refork` callback?)

*_fork callbacks already work for your current patch, though, right?;
so hopefully *_refork won't be needed...

> - new sibling after fork:
>   - wait for the parent to exit (though the temporary pipe).
>   - create a named pipe (`File.mkfifo`) at `$TMP/$WORKER_NUM.pipe`.
>   - create a pidfile at `$TMP/WORKER_NUM.pid`.
>   - Open the named pipe in `IO::RDONLY | IO::NONBLOCK` (otherwise it would
>     block until the master open in write mode).
>     - NB: Need to convert it to a `Kgio::Pipe` with
>       `Kgio::Pipe.for_fd(f.to_i)`, and keep `f` need `autoclose = false`.

sidenote: no need for kgio, I've been meaning to get rid of it
(Ruby 2.3+ has everything we'd need, I think...)

>   - Create the `Unicorn::Worker` instance with that pipe and worker number.
>   - Send `SIGURG` to the parent process (should be the master).
>   - Wait for `SIGCONT` in the named pipe.
>   - enter the worker loop.
> 
> - master: on `SIGURG`:
>   - for each outstanding refork order:
>     - Look for `$TMP/$WORKER_NUM.pid` and `$TMP/$WORKER_NUM.pipe`
>     - Read the pidfile
>     - Open the pipe with `IO::RDONLY | IO::NONBLOCK`
>       - NB: Need to convert it to a `Kgio::Pipe` with
>         `Kgio::Pipe.for_fd(f.to_i)`, and keep `f` need `autoclose = false`.
>     - Delete pidfile and named pipe.
>     - Register that new worker normally.
>     - Write `SIGCONT` into the pipe.
> 
> I can share the patch if you are interested, but the extra complexity
> and Linux only features made me prefer the master-promotion approach.

I would prefer this Linux-only approach if it gets rid of PID
files and doesn't introduce new temporary files/FIFOs.

It seems socketpair(..., SOCK_SEQPACKET) can be used for
packetized bidirectional IPC, perhaps with send_io + recv_io iff
necessary.  There shouldn't be any need for new, fragile FS
interactions.


Another idea, have you considered letting master work on some requests?
Signal handling would be delayed, and EINTR handling bugs in
gems could be exposed, but it'd be completely portable...

> > > However it work well enough for demonstration.
> > >
> > > So I figured I'd ask wether the feature is desired upstream
> > > before investing more effort into it.
> >
> > It really depends on:
> >
> > 1) what memory savings and/or speedups can be achieved
> >
> > 2) what support can we expect from you (and other users) for
> >    this feature and potential regressions going forward?
> >
> > I don't have the free time nor mental capacity to deal with
> > fallout from major new features such as this, myself.
> 
> Yeah, this is just a RFC, I wouldn't submit a final patch without
> first deploying it on our infra with good results. I'm just on the
> fence between trying to get this upstream vs maintaining our own
> server that solely work like this, hence somewhat simpler and that
> can make more assumptions.

Of course you also realize unicorn remains culturally-incompatible
with 99.9% of the open source world since I refuse to rely on
graphics drivers to work on a daemon, proprietary software,
terms-of-service, etc...

If anything goes wrong, I'd likely CC you anyways.

> > The hook interface would be yet another documentation+support
> > burden I'm worried about (and also more unicorn-specific stuff
> > to lock people in :<).
> 
> Understandable. I suppose it can be done with a monitoring process.
> Check `smaps_rollup` and send `SIGURG` when deemed necessary.
> 
> More moving pieces but keep unicorn simpler.

*shrug*  I've also been favoring more vertical integration in
other places to keep the overall stack simpler.

It depends on whether the monitoring process/library can work
with other Rack servers, probably; and signal compatibility.

> > A completely different idea which should achieve the same goal,
> > completely independent of unicorn:
> >
> >   App startup/loading can be made to trigger warmup paths which
> >   hit common endpoints.  IOW, app loading could run a small
> >   warmup suite which simulates common requests to heat up caches
> >   and trigger JIT.
> 
> Yeah, I don't really believe in this for a few reasons:
> 
>   - That would slow boot time.

Yes, and startup time is already a nasty thing, anyways...

>   - On large apps there's just too many codepath for this to be realistic.

I was thinking a lazy solution could be to have some middleware
measure coverage and log requests to support auto-replay, somehow.

>   - Many endpoints require database and other network access you probably
>     don't want in the master.

Wouldn't the *_fork hooks handle that, anyways?

>   - Endpoints may have side effects.

Yeah, this would optimize the GET endpoints, at least.
Not sure what percentage of POST needs to be optimized...

> > Ultimately (long-term for ruby-core), it would be better to make
> > JIT (and perhaps even VM cache) results persistent and shareable
> > across different processes, somehow...  That would help all Ruby
> > tools, even short-lived CLI tools.  Portable locking would be a
> > pain, I suppose, but proprietary OSes are always a pain :P
> 
> Based on my limited understanding of both JIT and VM caches, I don't think
> that's really possible.

Ugh, I still think it can be made possible, somehow.  But I no longer
use Ruby for new projects...

> The VM itself could definitely do better CoW wise, and I have some proposals
> on that front (https://bugs.ruby-lang.org/issues/18885) but that will take
> time and will never be perfect.
> 
> That's why I think periodically promoting a new master has potential.
> 
> > There's some line-wrapping issues caused by your MUA;
> 
> Urk. Ok, trying another client now, and I'll resend the patch.

Nope, still wrapped.  According to the git-format-patch(1) manpage,
Thunderbird can work w/o extensions, actually.

IME attachments might work mostly well, nowadays (test sending
to yourself and using "git am" to apply, first, of course).
Most on vger will disagree with me on attachments, though...

"git send-email" and "git imap-send" remain the recommended
options, but mutt works well, too.

> > Perhaps documenting this as experimental and subject to removal
> > at any time would make the addition of a major new feature an
> > easier pill to swallow; but I still think this is better outside
> > of app servers.
> 
> Up to you, if you don't feel like maintaining such a feature I would
> perfectly understand.

I'm fine with "maintaining it" if it just means Cc-ing you on
bugs related to this :>