From: "Lin Jen-Shin (godfat)"
Date: Sat, 9 May 2015 15:26:36 +0800
In-Reply-To: <20150509010349.GA23261@dcvr.yhbt.net>
References: <20150508170311.GA1260@dcvr.yhbt.net> <20150509010349.GA23261@dcvr.yhbt.net>
Subject: Re: What would happen if a worker thread died?
To: Eric Wong
Cc: yahns-public@yhbt.net, wildjcrt@gmail.com

On Sat, May 9, 2015 at 9:03 AM, Eric Wong wrote:
> "Lin Jen-Shin (godfat)" wrote:
>> On Sat, May 9, 2015 at 1:03 AM, Eric Wong wrote:
>> > It's unfortunately difficult to detect thread death from ruby (no
>> > SIGCHLD handler unlike for processes) besides polling Thread#join
>> >
>> > We had this issue in ruby-core a few years back, but apparently
>> > it was forgotten/ignored by matz. Care to chime in?
>> > https://bugs.ruby-lang.org/issues/6647
>>
>> I just sent a few characters, hope that would speed up the process.
>
> Thanks for reminding us of this, care to examine/fix some of the MRI
> test failures in the patch I posted to MRI? :)

Haha, cool. Probably not now though. I just took a quick look and,
ignoring the warnings, I guess some of the tests were trying to capture
stdout or stderr and assert on the messages. Combined with
abort_on_exception and using join to peek at the exception, this
probably breaks those tests. So I assume most of the failures are bugs
in the tests, not in MRI itself. Testing error messages is hard :(

>> I think rescuing Object is misleading. AFAIK, we cannot raise
>> an instance which is not a kind of Exception.
>
> I guess, there's some internal non-object interrupts in MRI for threads
> (eKillSignal, eTerminateSignal) but I don't think those get exposed to
> Ruby-land...

Got it, makes sense.

>> However for a worker thread, I guess that might be ok?
>
> Maybe limiting it to the common types {Standard,Load,Syntax}Error
> is sufficient.

Those are what I can think of right now, too.

> Below, I'm choosing to both leave the socket open and keep the worker
> running to slow down a potentially malicious client if this happens and
> to hopefully prevent an evil client from taking others down with it.

I am curious how this could slow down a malicious client. Is it because
it might fool them into thinking the worker is still working?
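
For reference, a minimal Ruby sketch of the rescue behavior discussed
above. It is an illustration only, not yahns code, and the helper names
are hypothetical. It shows that a bare `rescue` only catches
StandardError, that LoadError and SyntaxError (both ScriptError
subclasses) are caught only when listed explicitly, and that raising an
object which is not an Exception just produces a TypeError.

```ruby
# Hypothetical helpers, for illustration only (not part of yahns).

# True if a bare `rescue` (shorthand for `rescue StandardError`)
# catches an exception of the given class.
def caught_by_bare_rescue?(exc_class)
  begin
    raise exc_class, "demo"
  rescue
    true
  end
rescue Exception # outer net so the demo itself never aborts
  false
end

# True if the explicit list proposed in the thread catches the class.
def caught_by_listed_rescue?(exc_class)
  begin
    raise exc_class, "demo"
  rescue StandardError, LoadError, SyntaxError
    true
  end
rescue Exception # outer net so the demo itself never aborts
  false
end

# Raising a non-Exception object does not raise that object;
# Ruby raises TypeError instead, which a bare rescue catches.
def raising_non_exception
  raise Object.new
rescue => e
  e.class
end

p caught_by_bare_rescue?(RuntimeError)   # => true
p caught_by_bare_rescue?(LoadError)      # => false
p caught_by_listed_rescue?(LoadError)    # => true
p caught_by_listed_rescue?(SyntaxError)  # => true
p caught_by_listed_rescue?(Interrupt)    # => false
p raising_non_exception                  # => TypeError
```

Interrupt is included only to show that signal-style exceptions still
pass through the explicit list.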
> The process may be in bad state from Load/SyntaxErrors anyways with
> partially loaded code, though.
>
> yahns cannot be made error-tolerant when given buggy code, but it should
> at least allow users to find problems since the Ruby default behavior
> sucks right now:
>
> diff --git a/lib/yahns/queue_epoll.rb b/lib/yahns/queue_epoll.rb
> index 4f3289e..2875920 100644
> --- a/lib/yahns/queue_epoll.rb
> +++ b/lib/yahns/queue_epoll.rb
> @@ -64,7 +64,7 @@ class Yahns::Queue < SleepyPenguin::Epoll::IO # :nodoc:
>              raise "BUG: #{io.inspect}#yahns_step returned: #{rv.inspect}"
>            end
>          end
> -      rescue => e
> +      rescue StandardError, LoadError, SyntaxError => e
>          break if closed? # can still happen due to shutdown_timeout
>          Yahns::Log.exception(logger, 'queue loop', e)
>        end while true
> diff --git a/lib/yahns/queue_kqueue.rb b/lib/yahns/queue_kqueue.rb
> index 4176f7a..33f5f8b 100644
> --- a/lib/yahns/queue_kqueue.rb
> +++ b/lib/yahns/queue_kqueue.rb
> @@ -72,7 +72,7 @@ class Yahns::Queue < SleepyPenguin::Kqueue::IO # :nodoc:
>              raise "BUG: #{io.inspect}#yahns_step returned: #{rv.inspect}"
>            end
>          end
> -      rescue => e
> +      rescue StandardError, LoadError, SyntaxError => e
>          break if closed? # can still happen due to shutdown_timeout
>          Yahns::Log.exception(logger, 'queue loop', e)
>        end while true
>
> Thoughts?

A backtrace showing what happened is quite enough for me for now.

I'm still curious, though: could this worker do anything else after
this happens? I am guessing that if the application no longer does
anything, then this worker would not do anything either. Or would the
socket eventually time out?
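
To make the rescue-and-continue pattern in the quoted diff concrete,
here is a toy sketch. It assumes a plain in-memory Queue of callables
and the stdlib Logger as stand-ins for the epoll/kqueue loop and
Yahns::Log.exception; run_worker and demo are hypothetical names, and
none of this is yahns's actual implementation. It only shows that, in
this simplified setup, the thread logs the error and keeps serving
later jobs instead of dying silently.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in for a worker thread: pops callables off a queue
# and runs them, logging (rather than dying on) the listed exceptions.
def run_worker(jobs, results, logger)
  Thread.new do
    begin
      job = jobs.pop
      break if job == :stop # sentinel so the demo can shut down cleanly
      results << job.call
    rescue StandardError, LoadError, SyntaxError => e
      logger.error("queue loop: #{e.message} (#{e.class})")
    end while true
  end
end

# Feeds the worker two good jobs and two failing ones, then returns
# the results, the captured log text, and the final thread status.
def demo
  log_io  = StringIO.new
  jobs    = Queue.new
  results = []
  worker  = run_worker(jobs, results, Logger.new(log_io))

  jobs << -> { 1 + 1 }
  jobs << -> { raise "buggy app code" }            # StandardError
  jobs << -> { require "no/such/file_for_demo" }   # LoadError
  jobs << -> { 2 + 2 }
  jobs << :stop

  worker.join
  [results, log_io.string, worker.status]
end

results, log, status = demo
p results # => [2, 4]
p status  # => false (thread finished normally, not via an exception)
puts log
```

Thread#status returns false for a thread that terminated normally and
nil for one that died from an unhandled exception, which is one way to
tell the two cases apart after a join.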