References: <20210111212621.GA12555@dcvr> <20210117095109.GA28219@dcvr> <20210120085745.GB29704@dcvr>
In-Reply-To: <20210120085745.GB29704@dcvr>
From: Xiao Yu
Date: Wed, 20 Jan 2021 21:13:05 +0000
Subject: Re: Segfaults on http_close?
To: Eric Wong
Cc: Xiao Yu, Arkadi Colson, cmogstored-public@yhbt.net

On Wed, Jan 20, 2021 at 8:57 AM Eric Wong wrote:
>
> Xiao Yu wrote:
> > Thanks for the quick response! Sorry about the delay but I ran into a
> > couple issues (sorry kinda learning gdb and compiling binaries in
> > general on the go here) and have not been able to capture more useful
> > logs yet as crashes seem to have slowed / stopped since
> > recompiling and reloading. For the record I recompiled cmogstored with
> > the newer RH `devtoolset-9-toolchain` (9.1) and it has not crashed
> > since.
>
> No worries. Based on the toolchain change being a success, I
> more strongly suspect the tiny patch I sent you guys will fix
> the problem with all compilers.

That's good to know. Would it help if I applied the patch and then
compiled with the older compiler to check?

> > Also sorry about the lack of useful logs in my initial message but
> > neither kernel logs nor messages contained anything interesting around
> > the segfaults. Making matters worse, we didn't consistently reload
> > cmogstored as various versions of the compiled binary were installed
> > across the cluster and didn't really save the debugging symbols from
> > each of the compilations so can't really reply with a more useful
> > stack trace with the current core dumps.
>
> That's fine. One thing I suggest is NOT using the --daemonize/-d
> flag and instead relying on systemd or something similar that can
> capture stderr.
>
> The --daemonize/-d flag (unfortunately) matches Perlbal
> mogstored behavior, which causes stderr to be unconditionally
> redirected to /dev/null and hides some errors.

I noticed the lack of verbosity / logs :) so right now we're running
daemonized but also using syslog to capture logs. It's still pretty
quiet overall, with nothing logged around the times the segfaults
happened. Pretty much the only entries are about premature EOFs when
timeouts happen, plus things like the following when we reload
cmogstored:

```
2021-01-14T15:04:15.888261-05:00 host cmogstored[44351]: upgrade spawned PID:27575
2021-01-14T15:04:15.891551-05:00 host cmogstored[27575]: inherited 0.0.0.0:7501 on fd=4
2021-01-14T15:04:15.891564-05:00 host cmogstored[27575]: inherited 0.0.0.0:7500 on fd=5
```

Would non-daemonized mode and capturing stderr provide more verbosity?

Thanks again,
Xiao
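
For illustration only, here is a minimal Python sketch of the general effect described in the quoted text above: a message written to stderr disappears when stderr points at /dev/null, and survives when a supervisor keeps the stream. It does not run or model cmogstored itself. The child command and the function names `run_discarding_stderr` and `run_capturing_stderr` are hypothetical stand-ins, not part of any real tool.

```python
import subprocess
import sys

# Hypothetical stand-in for a daemon that reports a problem on stderr only.
CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('worker: something went wrong\\n')",
]


def run_discarding_stderr():
    """Mimic stderr being redirected to /dev/null: the message is lost."""
    proc = subprocess.run(
        CHILD, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True
    )
    # Only stdout is available, and the child wrote nothing there.
    return proc.stdout


def run_capturing_stderr():
    """Mimic a supervisor that keeps stderr: the message is preserved."""
    proc = subprocess.run(
        CHILD, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    return proc.stderr


if __name__ == "__main__":
    print("discarded:", repr(run_discarding_stderr()))
    print("captured: ", repr(run_capturing_stderr()))
```

Whether the captured stream would actually contain anything useful around the segfaults is a separate question that this sketch cannot answer.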