kgio RubyGem user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
From: Ngan Pham <nganpham@gmail.com>
To: Eric Wong <e@80x24.org>
Cc: kgio-public@yhbt.net
Subject: Re: Patch for GC.compact memory issue
Date: Mon, 24 May 2021 12:00:26 -0700	[thread overview]
Message-ID: <CAAvYYt5Z5f2rMuXO5DMpR1-6uRvu_gXKDvqcyoZ+oNcLiTH39g@mail.gmail.com> (raw)
In-Reply-To: <20210524172208.GA7627@dcvr>

Hey Eric, actually there's 2 more references that need to be marked.
I've removed the comments since it's not really worth mentioning.
Here's the final patch:

diff --git a/ext/kgio/accept.c b/ext/kgio/accept.c
index f1de39d..4bd9cfe 100644
--- a/ext/kgio/accept.c
+++ b/ext/kgio/accept.c
@@ -498,9 +498,12 @@ void init_kgio_accept(void)
  rb_define_const(mKgio, "SOCK_CLOEXEC", INT2NUM(SOCK_CLOEXEC));

  localhost = rb_const_get(mKgio, rb_intern("LOCALHOST"));
+ rb_gc_register_mark_object(localhost);
  cKgio_Socket = rb_const_get(mKgio, rb_intern("Socket"));
+ rb_gc_register_mark_object(cKgio_Socket);
  cClientSocket = cKgio_Socket;
  mSocketMethods = rb_const_get(mKgio, rb_intern("SocketMethods"));
+ rb_gc_register_mark_object(mSocketMethods);

  rb_define_method(mSocketMethods, "kgio_addr!", addr_bang, 0);

Twitter conversation:
@tenderlove: It introduces a new function `GC.compact` which we added
to the `before_fork` Unicorn hook
@nganpham: Have you tried this on Unicorn 6.0? We seem to be getting
memory corruption. Specifically, the `env` hash is getting unexpected
data. More specifically, `env["REMOTE_ADDR"]` is resolving to random
non-string objects.
@nganpham: This was the big change:
https://github.com/defunkt/unicorn/commit/c917ac526df214611ec33c21de2b070452ec8434
@tenderlove: Can you send me a repro? I haven't heard of anything like it
@nganpham: Not at the moment :-(. We're only seeing it happen pretty
randomly...1 pod out of ~100 pods will exhibit this problem. I'm
trying to reproduce it. Will keep you posted.
@nganpham: No reproduction yet...but I've been digging into kgio and
noticed `addr` is never rb_gc_mark'd:
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n25. maybe GC.compact
in before_fork botches the pointer?
https://github.com/defunkt/unicorn/blob/master/lib/unicorn/http_request.rb#L74…
seems to return the same "random" object for subsequent requests.
@tenderlove: Hmm. `addr` isn’t a Ruby object, so it needn’t be marked.
I would find the implementation of `kgio_addr`. It’s probably pointing
at a global that doesn’t get pinned.
@nganpham: So `kgio_addr` always returns the value of
`Kgio::LOCALHOST`, which is defined in Ruby:
https://yhbt.net/kgio.git/tree/lib/kgio.rb#n10. It's then assigned on
`accept` with `rb_const_get`:
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n500. Pinning needed?
Or something has overwritten that global constant?
@nganpham: Yea, I think that's it. Ptr to `localhost` is only resolved
once at boot: https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n500
but doesn't get pinned. If object is moved during compact, `localhost`
now references a different Ruby obj:
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n220
@tenderlove: Yep, that's the problem. The localhost C global needs to
be pinned. Just pass it to `rb_gc_register_mark_object` and that
should solve the issue. Thanks for finding this!!
@nganpham: Downgrading to Unicorn 5 seems to mitigated the issue for
us. My hypothesis is that in 5, `Unicorn::http://HttpRequest.new` is
allocated much earlier in the process so its internals are less likely
to be moved during compact. In 6, it’s initialized on request.
@nganpham: Regarding actually fixing this on kgio, do you have
contacts to maintainer(s)? Or should I follow the instructions on the
(super confusing) site? https://yhbt.net/kgio/
@tenderlove: Unfortunately I don't have any better contact with the
author than you. Though if you follow the instructions I'm sure he'll
help. My only advice is to make very sure you send a plain text email.
Alternatively I can send a patch tomorrow (up to you just lmk!)
@nganpham: I can take a crack it in the morning (will lyk). Also, I’ve
confirmed that this was affecting Unicorn 5, just much less so. Not
sure what version you’re running on, but I’d be curious to know if
you’re seeing `NoMethodError`s.
@nganpham: Patch submitted:
https://yhbt.net/kgio-public/CAAvYYt480L4izvVmLnOvQCe5N=bq3cBZ=FrDyQoo=jD3e0qfsg@mail.gmail.com/T/#u
@tenderlove: I think the other "const_gets" in that file probably need
the same treatment.
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n501 and
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n503 If a constant is
defined in pure Ruby (not with rb_define_class), then it's also
allowed to move.
@nganpham: oof! You're right! I will amend the patch. Good catch.

Thanks,
Ngan

On Mon, May 24, 2021 at 10:22 AM Eric Wong <e@80x24.org> wrote:
>
> Ngan Pham <nganpham@gmail.com> wrote:
> > Hello, I'd like to submit a patch to fix an issue after running
> > `GC.compact`. The LOCALHOST variable is defined in Ruby and is
> > resolved and memoized (to `localhost`) in C once. However, the pointer
> > in C is not GC marked. Running `GC.compact` potentially moves the
> > `LOCALHOST` object in Ruby causing `localhost` to point to a bad or
> > empty slot. This causes `env['REMOTE_ADDR']` to  a bad object in
> > Unicorn.
>
> Seems to make sense.
>
> > Some useful links:
> > - `GC.compact` introduced: https://bugs.ruby-lang.org/issues/15626.
> > - My conversation with tenderlove over twitter on this issue:
> > https://twitter.com/tenderlove/status/1396625048861954049
>
> Can you please copy+paste the conversation here?
> Accessibility-challenged users such as myself cannot
> read it without JavaScript.  Thanks.

  reply	other threads:[~2021-05-24 19:00 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-24 14:14 Patch for GC.compact memory issue Ngan Pham
2021-05-24 17:22 ` Eric Wong
2021-05-24 19:00   ` Ngan Pham [this message]
2021-05-24 20:33     ` Eric Wong
2021-05-24 21:11       ` Ngan Pham
2021-05-24 21:48         ` [PATCH] HACKING: update docs with "git request-pull" info Eric Wong
2021-05-25 16:41         ` Patch for GC.compact memory issue Aaron Patterson
2021-05-25 22:50           ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://yhbt.net/kgio/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAAvYYt5Z5f2rMuXO5DMpR1-6uRvu_gXKDvqcyoZ+oNcLiTH39g@mail.gmail.com \
    --to=nganpham@gmail.com \
    --cc=e@80x24.org \
    --cc=kgio-public@yhbt.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://yhbt.net/kgio.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).