From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS3215 2.6.0.0/16 X-Spam-Status: No, score=-3.2 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0 Received: from mail-ot0-x22f.google.com (mail-ot0-x22f.google.com [IPv6:2607:f8b0:4003:c0f::22f]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id 04FF41FADF for ; Mon, 15 Jan 2018 02:18:40 +0000 (UTC) Received: by mail-ot0-x22f.google.com with SMTP id 37so9456262otv.6 for ; Sun, 14 Jan 2018 18:18:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=YePeuTfPk/uf0Fti7DzLNSgmNwJozxnqSfLdW+UvKwU=; b=mA//fwQTpPEezXm//rX6+grVIZIVh3xW3uztT7zOXBjx7GYnop9iwIMPSHXi6jkeA/ 3xvdWYzb8sXqVbgyyr0KTP9ciKITdGg67c0i+zreGBHDKHYYLPZ2peM3EIr6Er1WAoV0 8Nn4UOZF/c/Ijh7loomY+fhBSUH9rcSQS2n4vD0hiSdUWI2/1xdvs2Rn3DfKmtut8RT9 wTp3jKs2QaEgLDByu+/sfXcLsT8NQgZuDVRLuua4t9MvyQLXZUapW98+n0x2bEvlZE2z DL/KJ1nXpWlTzRCh06geP1iIpx89i9+i2ljwtwkj4yn/mhXc4vgoACP5tNoTs2hVUQsW R+mQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=YePeuTfPk/uf0Fti7DzLNSgmNwJozxnqSfLdW+UvKwU=; b=oInZ9ecHQmxhhTRTk2qYcDzSxhFXStO+B/r3COFEi/bZoFB1jlpC5gKHRzAvxlEI4F yreFNvem3bXyBOl3O5xTtcGjJba5P+WkmYdE9qwdeKhl4RV8bOaTT9JAvKJ2Y6LNLDVD BbwHDbh+YTxhWM4G8dk4Ja9Cwzfk54oZ1ItK4sUZ9nPDuDTArxTAEgiLBx6QRgd3PvzO KaZTZUE+SdpaSAQigRdFMUajtvYZOdDYHtFpSqPPW62dq+hhq5N9WxLPVgMfL3Ah198O 5Lo7/YOutC4dMQenQj4eOdCCtFk9hpnVAqD8pEpOsAyW0sqPI2E1FlSM4OI4qCwRESsN Oliw== X-Gm-Message-State: AKwxytfwXgAZG9X1DRZC18d+6UOaDX8I2T0vT6vHOnTHGCjqY0WIpmiW bcsv7gycMH3vLhlG0kvkQzMjPBuubVUW6y1gbkE= X-Google-Smtp-Source: ACJfBouiovAzYF7mZA21akcDiLRBBlPuB8k+70Vi7ECLPbB/E9z5MzFBEVV6Y5AOCTlv6w5OSAfH3NC77OZML76nD2c= X-Received: by 10.157.52.100 with SMTP id v91mr20084239otb.121.1515982718896; Sun, 14 Jan 2018 18:18:38 -0800 (PST) MIME-Version: 1.0 Received: by 10.74.121.78 with HTTP; Sun, 14 Jan 2018 18:18:38 -0800 (PST) In-Reply-To: <20180115015740.GA850@dcvr> References: <20180115015740.GA850@dcvr> From: Sam Saffron Date: Mon, 15 Jan 2018 13:18:38 +1100 Message-ID: Subject: Re: Log URL with murder_lazy_workers To: Eric Wong Cc: unicorn-public Content-Type: text/plain; charset="UTF-8" List-Id: > I seem to remember it was possible to get that information out of Rails logs pretty easily, already; and I seem recall doing that back many years ago when I used Rails. (This is probably why USR1 log reopening waits until a response is done before triggering...) Yes I am able to get the info out of our logs, I just read through haproxy logs filtering on stuff that takes longer than N. Getting alerts though is a bit trickier, but I can probably workaround without hacking at unicorn. > I also believe the unicorn `timeout' is a misfeature that probably set the entire Rack/Ruby ecosystem back 10 years or more This is super duper tricky and I do not think this is deficiency of timeout. It is merely being employed as a workaround for design problems. There is usually only 1 or 2 spots that could cause timeouts, 99% of the time it is slow database queries under extreme load. It is super likely this could be handled in the app if we had: db_connection.timeout_at Time.now + 29 Then the connection could trigger the timeout and kill off the request without needing to tear down the entire process and re-forking. Making this happen is a bit tricky cause it would require some hacking on the pg gem. I am just not sure how hacking at timeout can make stuff any better, it is an escape hatch, just in case code misbehaves. On Mon, Jan 15, 2018 at 12:57 PM, Eric Wong wrote: > Sam Saffron wrote: >> I would love to start logging the actual URL that timed out when >> murder_lazy_workers does its thing. >> >> Clearly the master process has no knowledge here, but perhaps if we >> had a named pipe from each child to master we could quickly post >> current url down the pipe so we would have something to log when we >> murder a url. > > That would make the master a bottleneck. > > Instead, I suggest logging a START action with the > URL+PID+Thread(*)+serial number; and then matching it a > corresponding END action in the response_body#close Anything > without a corresponding END action can be deemed a loss and > matched up with the KILL action based on PID. > > (*) Log Thread/Fiber so it can work with other servers, too. > > > I seem to remember it was possible to get that information out of > Rails logs pretty easily, already; and I seem recall doing that > back many years ago when I used Rails. (This is probably why > USR1 log reopening waits until a response is done before > triggering...) > > > And as I've stated many times before, I don't want any sort of > lock-in or even guide-in to make people feel like they're stuck > using unicorn (by having code which depends on it). I also > believe the unicorn `timeout' is a misfeature that probably set > the entire Rack/Ruby ecosystem back 10 years or more, so I'd > rather people stop depending on it and fix their timeouts. > > (To that end, I may see about making timeout.rb in stdlib better > for Ruby 2.6...) > >> Clearly an opt-in thing, but would be very handy for quick diagnostics >> cause we can then avoid deeper log analysis and raise events just as >> this happens. > > Sorry, I prefer generic solutions which work with other servers, too.