From: Xiao Yu <xyu@automattic.com>
To: cmogstored-public@yhbt.net
Subject: Segfaults on http_close?
Date: Mon, 11 Jan 2021 20:48:53 +0000
Message-ID: <CABfxMcXPr7q8o1ayRdn1x-Fukuh7-s3YG=04KX=vAzz4DYqhuQ@mail.gmail.com>
Howdy, we are running a 96-node cmogstored cluster and have noticed
that when the cluster is busy with lots of writes, we occasionally get
segfaults in cmogstored. This has happened 7 times in the past week,
each time on a different, seemingly random cmogstored node. The abrt
backtrace of the core dump looks similar to the following in each
instance:
---
{ "signal": 11
, "executable": "/usr/local/bin/cmogstored"
, "stacktrace":
[ { "crash_thread": true
, "frames":
[ { "address": 140389358944542
, "build_id": "3c61131d1dac9da79b73188e7702bef786c2ad54"
, "build_id_offset": 528670
, "function_name": "_int_free"
, "file_name": "/usr/lib64/libc-2.17.so"
}
, { "address": 4225373
, "build_id": "9ca387b687027c0bac678943337d72b109fdf1e7"
, "build_id_offset": 31069
, "function_name": "http_close"
, "file_name": "/usr/local/bin/cmogstored"
}
, { "address": 4228819
, "build_id": "9ca387b687027c0bac678943337d72b109fdf1e7"
, "build_id_offset": 34515
, "function_name": "mog_http_queue_step"
, "file_name": "/usr/local/bin/cmogstored"
}
, { "address": 4256381
, "build_id": "9ca387b687027c0bac678943337d72b109fdf1e7"
, "build_id_offset": 62077
, "function_name": "mog_queue_loop"
, "file_name": "/usr/local/bin/cmogstored"
}
, { "address": 140389362433493
, "build_id": "3d9441083d079dc2977f1bd50c8068d11767232d"
, "build_id_offset": 32213
, "function_name": "start_thread"
, "file_name": "/usr/lib64/libpthread-2.17.so"
}
, { "address": 140389359455917
, "build_id": "3c61131d1dac9da79b73188e7702bef786c2ad54"
, "build_id_offset": 1040045
, "function_name": "__clone"
, "file_name": "/usr/lib64/libc-2.17.so"
} ]
} ]
}
---
We are using the latest 1.8.0 release on SL 7 (kernel
5.8.7-1.el7.elrepo.x86_64), and here's what it's linked against:
---
# ldd -v /usr/local/bin/cmogstored
    linux-vdso.so.1 => (0x00007ffc2898d000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f5c0ccff000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f5c0c932000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f5c0cf1b000)

    Version information:
    /usr/local/bin/cmogstored:
        libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
        libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
        libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.9) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.10) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.6) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.17) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /lib64/libpthread.so.0:
        ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /lib64/ld-linux-x86-64.so.2
        ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
        ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
        libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
        libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
        libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /lib64/libc.so.6:
        ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
        ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
---
Looking at http_close(), it does not appear to do all that much, and
mog_rbuf_free() appears to already check whether the rbuf pointer is
NULL before freeing it, so I'm not sure what the issue is. (Sorry, I'm
not really a C dev, so I don't have a strong grasp of what is
happening.) I'm not sure how to debug this issue further. Is there any
other data I could collect, or something I can do to try to track down
the issue?
Thanks!
Xiao Yu