cmogstored dev/user discussion/issues/patches/etc
* Segfaults on http_close?
From: Xiao Yu @ 2021-01-11 20:48 UTC (permalink / raw)
  To: cmogstored-public

Howdy, we are running a 96-node cmogstored cluster and have noticed
that when the cluster is busy with lots of writes, we occasionally get
segfaults in cmogstored. This has happened 7 times in the past week,
each time on a different, seemingly random cmogstored node. The abrt
backtrace of the core dump shows something similar to the following
in each instance:

---
{   "signal": 11
,   "executable": "/usr/local/bin/cmogstored"
,   "stacktrace":
      [ {   "crash_thread": true
        ,   "frames":
              [ {   "address": 140389358944542
                ,   "build_id": "3c61131d1dac9da79b73188e7702bef786c2ad54"
                ,   "build_id_offset": 528670
                ,   "function_name": "_int_free"
                ,   "file_name": "/usr/lib64/libc-2.17.so"
                }
              , {   "address": 4225373
                ,   "build_id": "9ca387b687027c0bac678943337d72b109fdf1e7"
                ,   "build_id_offset": 31069
                ,   "function_name": "http_close"
                ,   "file_name": "/usr/local/bin/cmogstored"
                }
              , {   "address": 4228819
                ,   "build_id": "9ca387b687027c0bac678943337d72b109fdf1e7"
                ,   "build_id_offset": 34515
                ,   "function_name": "mog_http_queue_step"
                ,   "file_name": "/usr/local/bin/cmogstored"
                }
              , {   "address": 4256381
                ,   "build_id": "9ca387b687027c0bac678943337d72b109fdf1e7"
                ,   "build_id_offset": 62077
                ,   "function_name": "mog_queue_loop"
                ,   "file_name": "/usr/local/bin/cmogstored"
                }
              , {   "address": 140389362433493
                ,   "build_id": "3d9441083d079dc2977f1bd50c8068d11767232d"
                ,   "build_id_offset": 32213
                ,   "function_name": "start_thread"
                ,   "file_name": "/usr/lib64/libpthread-2.17.so"
                }
              , {   "address": 140389359455917
                ,   "build_id": "3c61131d1dac9da79b73188e7702bef786c2ad54"
                ,   "build_id_offset": 1040045
                ,   "function_name": "__clone"
                ,   "file_name": "/usr/lib64/libc-2.17.so"
                } ]
        } ]
}
---
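
For anyone wanting to feed these frames to a symbol tool, note that abrt
reports each "address" in decimal next to its "build_id_offset". A
minimal sketch of the arithmetic follows; the helper names
frame_load_base() and print_frame() are made up for illustration and are
not part of abrt or cmogstored. Subtracting the offset from the address
gives the base the object was loaded at, and converting to hex gives the
form that tools such as addr2line usually take.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of abrt or cmogstored): the load base
 * implied by one frame is its absolute address minus its offset. */
static uint64_t frame_load_base(uint64_t address, uint64_t build_id_offset)
{
	assert(address >= build_id_offset);
	return address - build_id_offset;
}

/* Print one frame in hex, the form most symbol tools expect. */
static void print_frame(const char *function_name, uint64_t address,
			uint64_t build_id_offset)
{
	printf("%-20s addr=0x%" PRIx64 " offset=0x%" PRIx64
	       " base=0x%" PRIx64 "\n",
	       function_name, address, build_id_offset,
	       frame_load_base(address, build_id_offset));
}
```

With the three cmogstored frames above (http_close, mog_http_queue_step,
mog_queue_loop), the subtraction gives the same base, 0x400000, each
time, so the frames are at least consistent with one another. Mapping
them to source lines would additionally need a binary built with debug
info, which is an assumption about the build and not something the
backtrace shows.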

We are using the latest release, 1.8.0, on SL 7
(kernel 5.8.7-1.el7.elrepo.x86_64), and here's what it's linked against:

---
# ldd -v /usr/local/bin/cmogstored
  linux-vdso.so.1 =>  (0x00007ffc2898d000)
  libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f5c0ccff000)
  libc.so.6 => /lib64/libc.so.6 (0x00007f5c0c932000)
  /lib64/ld-linux-x86-64.so.2 (0x00007f5c0cf1b000)

  Version information:
  /usr/local/bin/cmogstored:
    libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
    libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
    libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.9) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.10) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.6) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.17) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
  /lib64/libpthread.so.0:
    ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /lib64/ld-linux-x86-64.so.2
    ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
    ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
    libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
    libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
    libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
  /lib64/libc.so.6:
    ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
    ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
---

Looking at http_close(), it does not appear to do all that much, and
mog_rbuf_free() appears to already check whether the rbuf pointer is
NULL before freeing it, so I'm not sure what the issue is. (Sorry, I'm
not really a C dev, so I don't have a strong grasp of what is
happening.) I'm not sure how to debug this further. Is there any other
data I could collect, or anything else I can do to track down the
issue?
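
As general background on the NULL check mentioned above (and not a
diagnosis of this crash): a NULL test before free() only catches a
pointer that was never set or was explicitly cleared. It cannot tell
whether a non-NULL pointer was already freed, or whether the heap was
damaged elsewhere, and both of those can surface as a fault inside the
allocator. The sketch below uses a hypothetical struct conn and helper
names that are not cmogstored's; it only shows the difference between a
by-value NULL-checked free and one that also clears the owner's field.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-connection object; this is NOT
 * cmogstored's actual structure. */
struct conn {
	void *rbuf;
};

/* NULL-checked free that takes the pointer by value: the caller's
 * copy of the pointer is left unchanged after the call. */
static void buf_free(void *buf)
{
	if (buf)
		free(buf);
}

/* Variant that also clears the owner's field, so calling it a second
 * time on the same connection is a harmless no-op. */
static void conn_release_rbuf(struct conn *c)
{
	assert(c != NULL);
	buf_free(c->rbuf);
	c->rbuf = NULL;
}
```

Calling conn_release_rbuf() twice on the same struct conn is safe
because the second call sees NULL. Calling buf_free(c->rbuf) twice
without clearing the field would pass the NULL check both times and
free the same block twice, which is undefined behavior.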

Thanks!
Xiao Yu


Thread overview: 13+ messages
2021-01-11 20:48 Segfaults on http_close? Xiao Yu
2021-01-11 21:26 ` Eric Wong
2021-01-17  9:51   ` Eric Wong
2021-01-20  5:21     ` Xiao Yu
2021-01-20  8:57       ` Eric Wong
2021-01-20 21:13         ` Xiao Yu
2021-01-20 21:22           ` Eric Wong
2021-01-25 17:36             ` Xiao Yu
2021-01-25 17:47               ` Eric Wong
2021-01-25 19:27                 ` Xiao Yu
2021-02-12  6:54                   ` Eric Wong
2021-02-12 21:18                     ` Xiao Yu
2021-02-13  2:19                       ` Eric Wong

Code repositories for project(s) associated with this public inbox

	https://yhbt.net/cmogstored.git/