From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-2.9 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00,
  URIBL_BLOCKED shortcircuit=no autolearn=unavailable version=3.3.2
X-Original-To: yahns-public@yhbt.net
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
  by dcvr.yhbt.net (Postfix) with ESMTP id 0FF8B1F79E;
  Fri, 8 May 2015 17:03:12 +0000 (UTC)
Date: Fri, 8 May 2015 17:03:11 +0000
From: Eric Wong
To: "Lin Jen-Shin (godfat)"
Cc: yahns-public@yhbt.net, wildjcrt@gmail.com
Subject: Re: What would happen if a worker thread died?
Message-ID: <20150508170311.GA1260@dcvr.yhbt.net>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
List-Id:

"Lin Jen-Shin (godfat)" wrote:
> During the experiments, we found that whenever a worker
> thread died due to LoadError raised from the application,
> which is not a StandardError therefore was not rescued at all,
> crashing the worker thread (assumed, not verified).

Ugh, I guess since it happened in a thread, the error message
got swallowed unless you were running in $DEBUG.

Loading code after the server is ready and serving requests
is a bad idea. It leads to really nasty thread-safety problems
as well as invalidating the method/constant caches.

> When this happened, the client just hanged forever with yahns.
> Is there something we can do about this? Would yahns respawn
> a new worker thread? Can we close the socket when this happen?

It's unfortunately difficult to detect thread death from ruby
(no SIGCHLD handler unlike for processes) besides polling
Thread#join

We had this issue in ruby-core a few years back, but apparently
it was forgotten/ignored by matz. Care to chime in?

  https://bugs.ruby-lang.org/issues/6647

> I am aware that yahns is *extremely sensitive to fatal bugs in the
> applications it hosts*, so I am just curious.
>
> For reference, Puma would immediately close the socket without
> sending anything, and Unicorn would log the error backtrace and
> kill the worker (If I read it correctly).
>
> In this case, Unicorn helped me figure out what's happened.

yahns can probably rescue Exception (or Object(!) like puma) and
then log + abort the entire process.
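
Roughly, the "rescue Exception, log, abort" idea could look like the
sketch below. This is plain Ruby, not yahns code: spawn_guarded_worker,
on_fatal and log are hypothetical names made up for illustration. The
demo handler only records the error so the snippet stays runnable; a
real server would presumably end the process there instead.

```ruby
require "stringio"

# Hypothetical helper (not part of yahns): run a block in a thread and
# catch everything, including non-StandardError exceptions like LoadError.
def spawn_guarded_worker(on_fatal:, log: $stderr, &work)
  Thread.new do
    begin
      work.call
    rescue Exception => e # deliberately broader than a bare `rescue`
      log.puts("worker thread died: #{e.class}: #{e.message}")
      log.puts(e.backtrace.join("\n")) if e.backtrace
      on_fatal.call(e)
    end
  end
end

# Demo: the handler only records the error. A real server would more
# likely end the process here (assumption), e.g. with Process.exit!(1).
seen = []
log = StringIO.new
worker = spawn_guarded_worker(on_fatal: ->(e) { seen << e }, log: log) do
  raise LoadError, "cannot load such file -- missing_lib"
end
worker.join
puts log.string
```

A bare `rescue` would not catch this, since LoadError descends from
ScriptError rather than StandardError; that is why the sketch names
Exception explicitly.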