raindrops RubyGem user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
From: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
To: Eric Wong <e@80x24.org>
Cc: raindrops-public@yhbt.net
Subject: Re: [PATCH] Allow Raindrops objects to be backed by a memfd file
Date: Mon, 22 Nov 2021 21:13:13 +1100	[thread overview]
Message-ID: <CAJ7wOOt20nZaeT-UM=cB5MXb_nogCZSOuNEYRdR73N8AFqfBng@mail.gmail.com> (raw)
In-Reply-To: <20211122084200.GA10182@dcvr>

On Mon, Nov 22, 2021 at 7:48 PM Eric Wong <e@80x24.org> wrote:
>
> KJ Tsanaktsidis <ktsanaktsidis@zendesk.com> wrote:
> > My use-case for this feature is that I want to collect memory statistics
> > of a Unicorn master process in a way that does not itself cause any
> > allocations. This is both because that would bias the measurement, but
> > more importantly because we very tightly control when the GC runs in our
> > Unicorn masters and any garbage created by high-frequency polling of
> > GC.stat would potentially live for a long time.
> >
> > With this solution, we can simply store values from rb_objspace into a
> > Raindrops counter directly, and read the values out-of-process for
> > submission to our metrics collection system.
>
> Thanks for the description and justification.  I will accept
> this feature, but some tweaks to the patch will be needed.
> If this is too much for you, I'll try to take care of
> adjustments (but still give you credit) sometime this week.
>
> See comments inline below...
>
> > diff --git a/ext/raindrops/extconf.rb b/ext/raindrops/extconf.rb
> > index 792e509..1ddcdf2 100644
> > --- a/ext/raindrops/extconf.rb
> > +++ b/ext/raindrops/extconf.rb
> > @@ -7,6 +7,7 @@
> >
> >  $CPPFLAGS += " -D_GNU_SOURCE "
> >  have_func('mremap', 'sys/mman.h')
> > +have_func('memfd_create', 'sys/mman.h')
> >  headers = %w(sys/types.h netdb.h string.h sys/socket.h netinet/in.h)
> >  if have_header('linux/tcp.h')
> >    headers << 'linux/tcp.h'
> > diff --git a/ext/raindrops/raindrops.c b/ext/raindrops/raindrops.c
> > index 837084c..c9bf8c8 100644
> > --- a/ext/raindrops/raindrops.c
> > +++ b/ext/raindrops/raindrops.c
> > @@ -34,6 +34,7 @@ struct raindrops {
> >       size_t size;
> >       size_t capa;
> >       pid_t pid;
> > +     int fd;
> >       struct raindrop *drops;
> >  };
>
> I suggest using "VALUE io" here.  I think it'll make some
> code below easier-to-follow and maintain while offering
> some flexibility for portability.
>
> The downside is higher memory use (few hundred for the IO object
> if the IO is used, 4 bytes if not on 64-bit).  Adding an rd_mark
> callback to call rb_gc_mark will be required, too.
>
> > @@ -47,6 +48,11 @@ static void rd_free(void *ptr)
> >               if (rv != 0)
> >                       rb_bug("munmap failed in gc: %s", strerror(errno));
> >       }
> > +     if (r->fd != -1) {
> > +       int rv = close(r->fd);
> > +       if (rv != 0)
> > +         rb_bug("close failed in gc: %s", strerror(errno));
> > +     }
>
> The using an IO here would eliminate this block.
>
> Regardless, whenever possible and reasonable, favor comparisons
> against zero instead of -1 since compilers can generate smaller
> code.  Pedantically it's not POSIX in some cases, but glibc does
> it, at least.
>
> Also, follow existing indentation conventions (this applies to
> any project(*)).  The C code here is mostly formatted the same way
> as git.git or the Linux kernel, so it uses hard tabs.
>
> (*) I have deal with indentation styles which annoy me for other
>     projects, too.
>
> >       xfree(ptr);
> >  }
> > @@ -88,34 +94,74 @@ static struct raindrops *get(VALUE self)
> >
> >  /*
> >   * call-seq:
> > - *   Raindrops.new(size)     -> raindrops object
> > + *   Raindrops.new(size, name: nil)  -> raindrops object
> >   *
> >   * Initializes a Raindrops object to hold +size+ counters.  +size+ is
> >   * only a hint and the actual number of counters the object has is
> >   * dependent on the CPU model, number of cores, and page size of
> >   * the machine.  The actual size of the object will always be equal
> >   * or greater than the specified +size+.
> > + * If +name+ is provided, and the platform is supported, the raindrop
> > + * memory region will be backed by a memfd object with the provided
> > + * name, so that it can be shared with other, non-child processes.
> >   */
> > -static VALUE init(VALUE self, VALUE size)
> > +static VALUE init(int argc, VALUE *argv, VALUE self)
> >  {
> >       struct raindrops *r = DATA_PTR(self);
> >       int tries = 1;
> >       size_t tmp;
> > +  VALUE size;
> > +  VALUE kwargs_hash;
> > +  ID kwargs_ids[1] = { rb_intern_const("name") };
>
> That may be C99-specific, and I think Ruby itself only went to
> C99 in 2.7 or 3.x?  In any case, I don't think it's appropriate
> to introduce C99 at this time.
>
> (I like using C99 in other projects, too, but I don't think it's
>  appropriate for us, just yet).
>
> > +  VALUE kwargs[1];
> > +  VALUE name;
> >
> >       if (r->drops != MAP_FAILED)
> >               rb_raise(rb_eRuntimeError, "already initialized");
> >
> > +  rb_scan_args(argc, argv, "1:", &size, &kwargs_hash);
> > +  rb_get_kwargs(kwargs_hash, kwargs_ids, 0, 1, kwargs);
>
> rb_get_kwargs is 2.2+ only, isn't it?
>
> I suggest using a Ruby wrapper and rename the C method.
>
> Since you mentioned you were using this for unicorn, that's still
> Ruby 1.9.3-compatible and will be 2.0+ for the next release.
>
> I'm OK with bumping raindrops to Ruby 2.0+, for now; but the
> requirement should be <= the Ruby version used by its users
> (e.g. unicorn).
>
> Fwiw, some other popular gems I keep an eye on (e.g. `sequel',
> `mail') tend to be conservative when it comes to Ruby version
> compatibility, too.
>
> For portability to non-Linux systems, I wonder if:
>
>         Raindrops.new(size, io: Tempfile.new('foo'))
>
> can also be supported.  I can give this a shot on a FreeBSD VM
> I sometimes have access to...
>
> And then maybe drop support for `name:' entirely:
>
>         Raindrops.new(size, io: Raindrops::Memfd.new('foo'))
>
> > +  name = kwargs[0];
> > +
> >       r->size = NUM2SIZET(size);
> >       if (r->size < 1)
> >               rb_raise(rb_eArgError, "size must be >= 1");
> >
> > +  if (name != Qundef && name != Qnil) {
>
> RTEST() reads more easily, and covers Qfalse.
>
> > +#ifdef HAVE_MEMFD_CREATE
> > +    r->fd = memfd_create(StringValueCStr(name), MFD_CLOEXEC);
> > +    if (r->fd == -1) {
> > +      int err = errno;
> > +      if (err == ENOSYS) {
> > +        rb_raise(rb_eRuntimeError, "system does not support memfd_create");
> > +      } else {
> > +        rb_raise(rb_eRuntimeError, "error calling memfd_create: %s", strerror(err));
> > +      }
>
> I don't think the special case for ENOSYS is necessary.  And
> rb_sys_fail("memfd_create") is sufficient (it matches the
> rb_sys_fail(funcname) pattern used in Ruby code.
>
> Something like:
>
>         int fd = memfd_create(StringValueCStr(name), MFD_CLOEXEC);
>         if (fd < 0)
>                 rb_sys_fail("memfd_create");
>         rd->io = rb_funcall(klass, rb_intern("for_fd"), 1, INT2NUM(fd));
>
> > +    }
> > +#else
> > +    rb_raise(rb_eRuntimeError, "extension not compiled with memfd_create");
> > +#endif
> > +  } else {
> > +    r->fd = -1;
> > +  }
> > +
> >       tmp = PAGE_ALIGN(raindrop_size * r->size);
> >       r->capa = tmp / raindrop_size;
> >       assert(PAGE_ALIGN(raindrop_size * r->capa) == tmp && "not aligned");
> >
> >  retry:
> > -     r->drops = mmap(NULL, tmp,
> > -                     PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED, -1, 0);
> > +  if (r->fd != -1) {
> > +    if (ftruncate(r->fd, tmp) == -1) {
> > +      r->drops = MAP_FAILED;
> > +    } else {
> > +    r->drops = mmap(NULL, tmp,
> > +                    PROT_READ|PROT_WRITE, MAP_SHARED, r->fd, 0);
> > +    }
> > +  } else {
> > +     r->drops = mmap(NULL, tmp,
> > +                     PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED, -1, 0);
> > +  }
> > +
> >       if (r->drops == MAP_FAILED) {
> >               int err = errno;
> >
> > @@ -153,6 +199,10 @@ static void resize(struct raindrops *r, size_t new_rd_size)
> >       if (r->pid != getpid())
> >               rb_raise(rb_eRuntimeError, "cannot mremap() from child");
> >
> > +     if (r->fd != -1) {
> > +       rb_raise(rb_eRuntimeError, "resize not implemented with named raindrops");
> > +     }
> > +
> >       rv = mremap(old_address, old_size, new_size, MREMAP_MAYMOVE);
> >       if (rv == MAP_FAILED) {
> >               int err = errno;
> > @@ -213,6 +263,18 @@ static VALUE capa(VALUE self)
> >       return SIZET2NUM(get(self)->capa);
> >  }
> >
> > +/*
> > + * call-seq:
> > + *   rd.fd    -> Integer
> > + *
> > + * Returns the file descriptor number associated with this Raindrop, if
> > + * it was created with a name.
> > + */
> > +static VALUE fd(VALUE self)
> > +{
> > +  return INT2NUM(get(self)->fd);
> > +}
>
> If we go with "VALUE io" as I suggested above, I suggest a
> "to_io" method.  That would allow compatibility with other APIs
> (e.g. IO.select), and access to IO#stat, IO#close_on_exec=,
> IO#ioctl, IO#fileno, and any number of potentially useful
> methods.
>
> I've never used memfd before, but the mmap-ed regions are usable
> after a regular file FD is closed.  So an FD-starved process
> could use rd.to_io.close and continue using the mmap-ed section
> as normal.
>
> Regardless, naming a method "fd" is unnecessary, the
> function/method called "fileno" means the same thing across
> C stdio.h, Ruby, Perl, and likely other languages.
>
> >  /*
> >   * call-seq:
> >   *   rd.dup          -> rd_copy
> > @@ -223,8 +285,9 @@ static VALUE init_copy(VALUE dest, VALUE source)
> >  {
> >       struct raindrops *dst = DATA_PTR(dest);
> >       struct raindrops *src = get(source);
> > +     VALUE init_argv[1] = { SIZET2NUM(src->size) };
> >
> > -     init(dest, SIZET2NUM(src->size));
> > +     init(1, init_argv, dest);
> >       memcpy(dst->drops, src->drops, raindrop_size * src->size);
>
> Same comments as before wrt. C compiler and Ruby compatibility.
>
> >       return dest;
> > @@ -372,6 +435,11 @@ static VALUE evaporate_bang(VALUE self)
> >       r->drops = MAP_FAILED;
> >       if (munmap(addr, raindrop_size * r->capa) != 0)
> >               rb_sys_fail("munmap");
> > +     if (r->fd != -1) {
> > +       if (close(r->fd) != 0)
> > +         rb_sys_fail("close");
> > +       r->fd = -1;
> > +     }
>
> This could just use rb_io_close.  No need to comment on the
> rest... (or I'm too sleepy atm and missed something :x)
>
> Anyways thanks again for bringing this up.  Let us know if you
> want to tackle handling my suggestions or if you want to wait
> for me to do it.
>

I think I committed the two cardinal sins of mailing list etiquette at
once - replied off list and in HTML /facepalm.

Copying my reply in here again.... sorry!

--------------------------------------------------------------

Hi Eric,

Thank you for the really detailed feedback on this patch! I haven’t really
written much in the way of Ruby C extensions before, so it’s really
valuable to me to get your thoughts.

I’m definitely happy to send another version of this later this week.

At a high level, I think your idea of allowing any file backing, rather than
just memfd, makes sense - both for portability and also because it’s easier
to share an actually linked file between processes in eg Kubernetes.

Anyway - I’ll be in touch soon with another patch, thanks!


  reply	other threads:[~2021-11-22 10:13 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-22  1:03 [PATCH] Allow Raindrops objects to be backed by a memfd file KJ Tsanaktsidis
2021-11-22  8:42 ` Eric Wong
2021-11-22 10:13   ` KJ Tsanaktsidis [this message]
2021-11-22 17:56     ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://yhbt.net/raindrops/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJ7wOOt20nZaeT-UM=cB5MXb_nogCZSOuNEYRdR73N8AFqfBng@mail.gmail.com' \
    --to=ktsanaktsidis@zendesk.com \
    --cc=e@80x24.org \
    --cc=raindrops-public@yhbt.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://yhbt.net/raindrops.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).