kgio RubyGem user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
* Patch for GC.compact memory issue
@ 2021-05-24 14:14 Ngan Pham
  2021-05-24 17:22 ` Eric Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Ngan Pham @ 2021-05-24 14:14 UTC (permalink / raw)
  To: kgio-public

Hello, I'd like to submit a patch to fix an issue after running
`GC.compact`. The LOCALHOST variable is defined in Ruby and is
resolved and memoized (to `localhost`) in C once. However, the pointer
in C is not GC marked. Running `GC.compact` potentially moves the
`LOCALHOST` object in Ruby causing `localhost` to point to a bad or
empty slot. This causes `env['REMOTE_ADDR']` to  a bad object in
Unicorn.

Some useful links:
- `GC.compact` introduced: https://bugs.ruby-lang.org/issues/15626.
- My conversation with tenderlove over twitter on this issue:
https://twitter.com/tenderlove/status/1396625048861954049

Thanks,
Ngan

diff --git a/ext/kgio/accept.c b/ext/kgio/accept.c
index f1de39d..5a5f15a 100644
--- a/ext/kgio/accept.c
+++ b/ext/kgio/accept.c
@@ -497,7 +497,14 @@ void init_kgio_accept(void)
  */
  rb_define_const(mKgio, "SOCK_CLOEXEC", INT2NUM(SOCK_CLOEXEC));

+ /*
+ * Resolve LOCALHOST from Ruby land and memoize it for fast lookup.
+ * We have to GC mark the object since it's defined in Ruby and we
+ * don't want GC.compact to move it.
+ */
  localhost = rb_const_get(mKgio, rb_intern("LOCALHOST"));
+ rb_gc_register_mark_object(localhost);
+
  cKgio_Socket = rb_const_get(mKgio, rb_intern("Socket"));
  cClientSocket = cKgio_Socket;
  mSocketMethods = rb_const_get(mKgio, rb_intern("SocketMethods"));

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Patch for GC.compact memory issue
  2021-05-24 14:14 Patch for GC.compact memory issue Ngan Pham
@ 2021-05-24 17:22 ` Eric Wong
  2021-05-24 19:00   ` Ngan Pham
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Wong @ 2021-05-24 17:22 UTC (permalink / raw)
  To: Ngan Pham; +Cc: kgio-public

Ngan Pham <nganpham@gmail.com> wrote:
> Hello, I'd like to submit a patch to fix an issue after running
> `GC.compact`. The LOCALHOST variable is defined in Ruby and is
> resolved and memoized (to `localhost`) in C once. However, the pointer
> in C is not GC marked. Running `GC.compact` potentially moves the
> `LOCALHOST` object in Ruby causing `localhost` to point to a bad or
> empty slot. This causes `env['REMOTE_ADDR']` to  a bad object in
> Unicorn.

Seems to make sense.

> Some useful links:
> - `GC.compact` introduced: https://bugs.ruby-lang.org/issues/15626.
> - My conversation with tenderlove over twitter on this issue:
> https://twitter.com/tenderlove/status/1396625048861954049

Can you please copy+paste the conversation here?
Accessibility-challenged users such as myself cannot
read it without JavaScript.  Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Patch for GC.compact memory issue
  2021-05-24 17:22 ` Eric Wong
@ 2021-05-24 19:00   ` Ngan Pham
  2021-05-24 20:33     ` Eric Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Ngan Pham @ 2021-05-24 19:00 UTC (permalink / raw)
  To: Eric Wong; +Cc: kgio-public

Hey Eric, actually there's 2 more references that need to be marked.
I've removed the comments since it's not really worth mentioning.
Here's the final patch:

diff --git a/ext/kgio/accept.c b/ext/kgio/accept.c
index f1de39d..4bd9cfe 100644
--- a/ext/kgio/accept.c
+++ b/ext/kgio/accept.c
@@ -498,9 +498,12 @@ void init_kgio_accept(void)
  rb_define_const(mKgio, "SOCK_CLOEXEC", INT2NUM(SOCK_CLOEXEC));

  localhost = rb_const_get(mKgio, rb_intern("LOCALHOST"));
+ rb_gc_register_mark_object(localhost);
  cKgio_Socket = rb_const_get(mKgio, rb_intern("Socket"));
+ rb_gc_register_mark_object(cKgio_Socket);
  cClientSocket = cKgio_Socket;
  mSocketMethods = rb_const_get(mKgio, rb_intern("SocketMethods"));
+ rb_gc_register_mark_object(mSocketMethods);

  rb_define_method(mSocketMethods, "kgio_addr!", addr_bang, 0);

Twitter conversation:
@tenderlove: It introduces a new function `GC.compact` which we added
to the `before_fork` Unicorn hook
@nganpham: Have you tried this on Unicorn 6.0? We seem to be getting
memory corruption. Specifically, the `env` hash is getting unexpected
data. More specifically, `env["REMOTE_ADDR"]` is resolving to random
non-string objects.
@nganpham: This was the big change:
https://github.com/defunkt/unicorn/commit/c917ac526df214611ec33c21de2b070452ec8434
@tenderlove: Can you send me a repro? I haven't heard of anything like it
@nganpham: Not at the moment :-(. We're only seeing it happen pretty
randomly...1 pod out of ~100 pods will exhibit this problem. I'm
trying to reproduce it. Will keep you posted.
@nganpham: No reproduction yet...but I've been digging into kgio and
noticed `addr` is never rb_gc_mark'd:
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n25. maybe GC.compact
in before_fork botches the pointer?
https://github.com/defunkt/unicorn/blob/master/lib/unicorn/http_request.rb#L74…
seems to return the same "random" object for subsequent requests.
@tenderlove: Hmm. `addr` isn’t a Ruby object, so it needn’t be marked.
I would find the implementation of `kgio_addr`. It’s probably pointing
at a global that doesn’t get pinned.
@nganpham: So `kgio_addr` always returns the value of
`Kgio::LOCALHOST`, which is defined in Ruby:
https://yhbt.net/kgio.git/tree/lib/kgio.rb#n10. It's then assigned on
`accept` with `rb_const_get`:
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n500. Pinning needed?
Or something has overwritten that global constant?
@nganpham: Yea, I think that's it. Ptr to `localhost` is only resolved
once at boot: https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n500
but doesn't get pinned. If object is moved during compact, `localhost`
now references a different Ruby obj:
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n220
@tenderlove: Yep, that's the problem. The localhost C global needs to
be pinned. Just pass it to `rb_gc_register_mark_object` and that
should solve the issue. Thanks for finding this!!
@nganpham: Downgrading to Unicorn 5 seems to mitigated the issue for
us. My hypothesis is that in 5, `Unicorn::http://HttpRequest.new` is
allocated much earlier in the process so its internals are less likely
to be moved during compact. In 6, it’s initialized on request.
@nganpham: Regarding actually fixing this on kgio, do you have
contacts to maintainer(s)? Or should I follow the instructions on the
(super confusing) site? https://yhbt.net/kgio/
@tenderlove: Unfortunately I don't have any better contact with the
author than you. Though if you follow the instructions I'm sure he'll
help. My only advice is to make very sure you send a plain text email.
Alternatively I can send a patch tomorrow (up to you just lmk!)
@nganpham: I can take a crack it in the morning (will lyk). Also, I’ve
confirmed that this was affecting Unicorn 5, just much less so. Not
sure what version you’re running on, but I’d be curious to know if
you’re seeing `NoMethodError`s.
@nganpham: Patch submitted:
https://yhbt.net/kgio-public/CAAvYYt480L4izvVmLnOvQCe5N=bq3cBZ=FrDyQoo=jD3e0qfsg@mail.gmail.com/T/#u
@tenderlove: I think the other "const_gets" in that file probably need
the same treatment.
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n501 and
https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n503 If a constant is
defined in pure Ruby (not with rb_define_class), then it's also
allowed to move.
@nganpham: oof! You're right! I will amend the patch. Good catch.

Thanks,
Ngan

On Mon, May 24, 2021 at 10:22 AM Eric Wong <e@80x24.org> wrote:
>
> Ngan Pham <nganpham@gmail.com> wrote:
> > Hello, I'd like to submit a patch to fix an issue after running
> > `GC.compact`. The LOCALHOST variable is defined in Ruby and is
> > resolved and memoized (to `localhost`) in C once. However, the pointer
> > in C is not GC marked. Running `GC.compact` potentially moves the
> > `LOCALHOST` object in Ruby causing `localhost` to point to a bad or
> > empty slot. This causes `env['REMOTE_ADDR']` to  a bad object in
> > Unicorn.
>
> Seems to make sense.
>
> > Some useful links:
> > - `GC.compact` introduced: https://bugs.ruby-lang.org/issues/15626.
> > - My conversation with tenderlove over twitter on this issue:
> > https://twitter.com/tenderlove/status/1396625048861954049
>
> Can you please copy+paste the conversation here?
> Accessibility-challenged users such as myself cannot
> read it without JavaScript.  Thanks.

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Patch for GC.compact memory issue
  2021-05-24 19:00   ` Ngan Pham
@ 2021-05-24 20:33     ` Eric Wong
  2021-05-24 21:11       ` Ngan Pham
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Wong @ 2021-05-24 20:33 UTC (permalink / raw)
  To: Ngan Pham; +Cc: kgio-public, Aaron Patterson

Ngan Pham <nganpham@gmail.com> wrote:
> Hey Eric, actually there's 2 more references that need to be marked.

Thanks, will apply those.  More about this at bottom.. +Cc tenderlove

> @nganpham: Regarding actually fixing this on kgio, do you have
> contacts to maintainer(s)? Or should I follow the instructions on the
> (super confusing) site? https://yhbt.net/kgio/

Side note: Sorry if you found it confusing, I figured "README"
  and "HACKING" are clearly labeled.  Not sure how to improve on
  a directory listing for ease-of-navigation (works with "lftp" :>)

> @tenderlove: I think the other "const_gets" in that file probably need
> the same treatment.
> https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n501 and
> https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n503 If a constant is
> defined in pure Ruby (not with rb_define_class), then it's also
> allowed to move.
> @nganpham: oof! You're right! I will amend the patch. Good catch.

OK, so rb_const_get for things like Socket, Errno::EPIPE,
Errno::ECONNRESET are fine (for now)?

Anyways, I hope to find time to drop kgio from unicorn later
this year (I started a few years ago...).

And I really wish unicorn never became popular, it was only
created to support buggy legacy apps that were too expensive to
fix.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Patch for GC.compact memory issue
  2021-05-24 20:33     ` Eric Wong
@ 2021-05-24 21:11       ` Ngan Pham
  2021-05-24 21:48         ` [PATCH] HACKING: update docs with "git request-pull" info Eric Wong
  2021-05-25 16:41         ` Patch for GC.compact memory issue Aaron Patterson
  0 siblings, 2 replies; 8+ messages in thread
From: Ngan Pham @ 2021-05-24 21:11 UTC (permalink / raw)
  To: Eric Wong; +Cc: kgio-public, Aaron Patterson

> Side note: Sorry if you found it confusing, I figured "README"
>   and "HACKING" are clearly labeled.  Not sure how to improve on
>   a directory listing for ease-of-navigation (works with "lftp" :>)

My apologies! After actually spending time and reading your
docs, it's pretty straight forward. Perhaps I'm spoiled with GitHub
pull requests and whatnot. The HACKING doc also says "pull
requests" welcome, although I'm not sure where...

> OK, so rb_const_get for things like Socket, Errno::EPIPE,
> Errno::ECONNRESET are fine (for now)?

I'll defer to Aaron. :-)

> Anyways, I hope to find time to drop kgio from unicorn later
> this year (I started a few years ago...).

Cool.

> And I really wish unicorn never became popular, it was only
> created to support buggy legacy apps that were too expensive to
> fix.

Shame you feel this way. Unicorn is amazing work and empowered
people/businesses across the world to do their job. You should be proud.

On Mon, May 24, 2021 at 1:33 PM Eric Wong <e@80x24.org> wrote:
>
> Ngan Pham <nganpham@gmail.com> wrote:
> > Hey Eric, actually there's 2 more references that need to be marked.
>
> Thanks, will apply those.  More about this at bottom.. +Cc tenderlove
>
> > @nganpham: Regarding actually fixing this on kgio, do you have
> > contacts to maintainer(s)? Or should I follow the instructions on the
> > (super confusing) site? https://yhbt.net/kgio/
>
> Side note: Sorry if you found it confusing, I figured "README"
>   and "HACKING" are clearly labeled.  Not sure how to improve on
>   a directory listing for ease-of-navigation (works with "lftp" :>)
>
> > @tenderlove: I think the other "const_gets" in that file probably need
> > the same treatment.
> > https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n501 and
> > https://yhbt.net/kgio.git/tree/ext/kgio/accept.c#n503 If a constant is
> > defined in pure Ruby (not with rb_define_class), then it's also
> > allowed to move.
> > @nganpham: oof! You're right! I will amend the patch. Good catch.
>
> OK, so rb_const_get for things like Socket, Errno::EPIPE,
> Errno::ECONNRESET are fine (for now)?
>
> Anyways, I hope to find time to drop kgio from unicorn later
> this year (I started a few years ago...).
>
> And I really wish unicorn never became popular, it was only
> created to support buggy legacy apps that were too expensive to
> fix.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] HACKING: update docs with "git request-pull" info
  2021-05-24 21:11       ` Ngan Pham
@ 2021-05-24 21:48         ` Eric Wong
  2021-05-25 16:41         ` Patch for GC.compact memory issue Aaron Patterson
  1 sibling, 0 replies; 8+ messages in thread
From: Eric Wong @ 2021-05-24 21:48 UTC (permalink / raw)
  To: Ngan Pham; +Cc: kgio-public

Ngan Pham <nganpham@gmail.com> wrote:
> Eric Wong wrote:
> > Side note: Sorry if you found it confusing, I figured "README"
> >   and "HACKING" are clearly labeled.  Not sure how to improve on
> >   a directory listing for ease-of-navigation (works with "lftp" :>)
> 
> My apologies! After actually spending time and reading your
> docs, it's pretty straight forward. Perhaps I'm spoiled with GitHub
> pull requests and whatnot. The HACKING doc also says "pull
> requests" welcome, although I'm not sure where...

No worries, thanks for the feedback; I guess "git request-pull"
is not well-known.  I don't care where people host their repos,
as long as it's anonymously accessible from any country via
"git fetch".

----------8<---------
Subject: [PATCH] HACKING: update docs with "git request-pull" info

"git request-pull" is not a well-known command, so add a
mention about it to avoid misleading people into thinking
we use a centralized service.

While we're in the area, put an extra link to the HTTPS
archives, since SMTP-based subscriptions are not
centralization-resistant.

Thanks to Ngan Pham for the feedback.

Link: https://yhbt.net/kgio-public/CAAvYYt6ruhhZcA4OmF=NQZ+-j0RJ0vT5f33tUCrREezo5mUCVQ@mail.gmail.com/
---
 HACKING | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/HACKING b/HACKING
index e124c33..55bdd99 100644
--- a/HACKING
+++ b/HACKING
@@ -23,11 +23,12 @@ characters wide) and NOT the indentation style of Matz Ruby.
 
 == Contributing
 
-Contributions are welcome in the form of patches, pull requests, code
-review, testing, documentation, user support or any other feedback.  The
-{kgio mailing list}[mailto:kgio-public@yhbt.net] is the
-central coordination point for all user and developer feedback and bug
-reports.
+Contributions are welcome in the form of patches, pull requests
+(format email with "git request-pull"), code review, testing,
+documentation, user support or any other feedback.
+<mailto:kgio-public@yhbt.net> <https://yhbt.net/kgio-public/> is the
+current coordination point for all user and developer feedback and
+bug reports (domain is subject to change since ICANN cannot be trusted).
 
 === Submitting Patches
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Patch for GC.compact memory issue
  2021-05-24 21:11       ` Ngan Pham
  2021-05-24 21:48         ` [PATCH] HACKING: update docs with "git request-pull" info Eric Wong
@ 2021-05-25 16:41         ` Aaron Patterson
  2021-05-25 22:50           ` Eric Wong
  1 sibling, 1 reply; 8+ messages in thread
From: Aaron Patterson @ 2021-05-25 16:41 UTC (permalink / raw)
  To: Ngan Pham; +Cc: Eric Wong, kgio-public

On Mon, May 24, 2021 at 2:11 PM Ngan Pham <nganpham@gmail.com> wrote:
>
> > Side note: Sorry if you found it confusing, I figured "README"
> >   and "HACKING" are clearly labeled.  Not sure how to improve on
> >   a directory listing for ease-of-navigation (works with "lftp" :>)
>
> My apologies! After actually spending time and reading your
> docs, it's pretty straight forward. Perhaps I'm spoiled with GitHub
> pull requests and whatnot. The HACKING doc also says "pull
> requests" welcome, although I'm not sure where...
>
> > OK, so rb_const_get for things like Socket, Errno::EPIPE,
> > Errno::ECONNRESET are fine (for now)?
>
> I'll defer to Aaron. :-)


Yes, they should be fine.  It looks like they're all defined with
`rb_define_const`, and that function adds them to a list that gets
pinned.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Patch for GC.compact memory issue
  2021-05-25 16:41         ` Patch for GC.compact memory issue Aaron Patterson
@ 2021-05-25 22:50           ` Eric Wong
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2021-05-25 22:50 UTC (permalink / raw)
  To: Aaron Patterson, Ngan Pham; +Cc: kgio-public

Thanks all, pushed as 4db359156adb5aed102ec853f56c030ba40b1eca
with the commit message below.  Will release in a bit..

-----8<-----
From: Ngan Pham <nganpham@gmail.com>
Date: Mon, 24 May 2021 20:34:36 +0000
Subject: [PATCH] fix compatibility with GC.compact

Constants defined in Ruby need to be explicitly marked
when made static inside a C extension.

Link: https://yhbt.net/kgio-public/CAAvYYt5Z5f2rMuXO5DMpR1-6uRvu_gXKDvqcyoZ+oNcLiTH39g@mail.gmail.com/T/
Commit-message-by: Eric Wong <e@80x24.org>
---
 ext/kgio/accept.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/ext/kgio/accept.c b/ext/kgio/accept.c
index f1de39d..4bd9cfe 100644
--- a/ext/kgio/accept.c
+++ b/ext/kgio/accept.c
@@ -498,9 +498,12 @@ void init_kgio_accept(void)
 	rb_define_const(mKgio, "SOCK_CLOEXEC", INT2NUM(SOCK_CLOEXEC));
 
 	localhost = rb_const_get(mKgio, rb_intern("LOCALHOST"));
+	rb_gc_register_mark_object(localhost);
 	cKgio_Socket = rb_const_get(mKgio, rb_intern("Socket"));
+	rb_gc_register_mark_object(cKgio_Socket);
 	cClientSocket = cKgio_Socket;
 	mSocketMethods = rb_const_get(mKgio, rb_intern("SocketMethods"));
+	rb_gc_register_mark_object(mSocketMethods);
 
 	rb_define_method(mSocketMethods, "kgio_addr!", addr_bang, 0);
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-05-25 22:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-24 14:14 Patch for GC.compact memory issue Ngan Pham
2021-05-24 17:22 ` Eric Wong
2021-05-24 19:00   ` Ngan Pham
2021-05-24 20:33     ` Eric Wong
2021-05-24 21:11       ` Ngan Pham
2021-05-24 21:48         ` [PATCH] HACKING: update docs with "git request-pull" info Eric Wong
2021-05-25 16:41         ` Patch for GC.compact memory issue Aaron Patterson
2021-05-25 22:50           ` Eric Wong

Code repositories for project(s) associated with this public inbox

	https://yhbt.net/kgio.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).