From: Eric Wong <e@80x24.org>
To: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
Cc: raindrops-public@yhbt.net
Subject: Re: [PATCH v2] Allow Raindrops objects to be backed by a file
Date: Fri, 26 Nov 2021 18:34:01 +0000 [thread overview]
Message-ID: <20211126183400.GA9579@dcvr> (raw)
In-Reply-To: <20211125065618.3432-1-ktsanaktsidis@zendesk.com>
KJ Tsanaktsidis <ktsanaktsidis@zendesk.com> wrote:
> Thanks for your feedback on my previous patch Eric, I think backing
> Raindrops by an arbitrary file (which can actually be on tmpfs) is much
> neater than harcdoding some stuff in there for memfd.
>
> Looking forward to hearing your thoughts on this one!
Thanks. A few minor things, more inline, but it looks good
and I'm inclined to squash the changes in the --range-diff
(below).
> ---------------------------------------
>
> Currently, all memory used by Raindrops is mapped as MAP_ANONYMOUS. This
<snip>
Note that applies to other projects, too: instead of a solid
"-------" line, using a "----8<----" line as below
(without indentation) for --scissors:
<prose not intended for a git commit message>
----8<----
Subject: [PATCH] Allow Raindrops objects to be backed by a file
<rest of the commit message + patch...>
The above allows maintainers to use `git am --scissors' to apply
a patch while ignoring any text above the "----8<----" line.
<snip>
> Note theat the provided IO object _must_ implement #truncate; this is
s/theat/that/ (I can squash this in on my end)
<snip>
> On Linux, counter values can easily be shared between processes
> in-memory only without touching the disk by passing in a File on a
> tmpfs as the io object.
AFAIK this applies to any OS with mmap support, and raindrops
already requires mmap. I'll remove the "On Linux," part before
pushing.
<snip>
> - * Raindrops.new(size) -> raindrops object
> - *
> - * Initializes a Raindrops object to hold +size+ counters. +size+ is
> - * only a hint and the actual number of counters the object has is
> - * dependent on the CPU model, number of cores, and page size of
> - * the machine. The actual size of the object will always be equal
> - * or greater than the specified +size+.
> + * This is the _acutal_ implementation of #initialize - the Ruby wrapper
s/acutal/actual/
> + * handles keyword-argument handling then calls this method.
> */
> -static VALUE init(VALUE self, VALUE size)
> +static VALUE init_cimpl(VALUE self, VALUE size, VALUE io, VALUE zero)
> {
> struct raindrops *r = DATA_PTR(self);
> int tries = 1;
> @@ -113,9 +116,18 @@ static VALUE init(VALUE self, VALUE size)
> r->capa = tmp / raindrop_size;
> assert(PAGE_ALIGN(raindrop_size * r->capa) == tmp && "not aligned");
>
> + r->io = io;
> +
> retry:
> - r->drops = mmap(NULL, tmp,
> - PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED, -1, 0);
> + if (RTEST(r->io)) {
> + rb_funcall(r->io, rb_intern("truncate"), 1, SIZET2NUM(tmp));
> + int fd = NUM2INT(rb_funcall(r->io, rb_intern("fileno"), 0));
I'll swap the lines so "int fd = ..." is first in the block
to avoid declaration-after-statement problems with older compilers.
> + r->drops = mmap(NULL, tmp,
> + PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
> + } else {
> + r->drops = mmap(NULL, tmp,
> + PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED, -1, 0);
And I'll wrap this at 80-columns wrap (I need giant fonts).
> @@ -127,6 +139,10 @@ retry:
> }
> r->pid = getpid();
>
> + if (RTEST(zero)) {
> + memset(r->drops, 0, tmp);
> + }
No need for {} to waste vertical space. Again, I need giant
fonts :x Omitting unnecessary {} also matches Linux kernel
and git.git coding style.
> - * Duplicates and snapshots the current state of a Raindrops object.
> + * Duplicates and snapshots the current state of a Raindrops object. Even
> + * if the given Raindrops object is backed by a file, the copy will be backed
> + * by independent, anonymously mapped memory.
> */
> static VALUE init_copy(VALUE dest, VALUE source)
> {
> struct raindrops *dst = DATA_PTR(dest);
> struct raindrops *src = get(source);
>
> - init(dest, SIZET2NUM(src->size));
> + init_cimpl(dest, SIZET2NUM(src->size), Qnil, Qfalse);
> memcpy(dst->drops, src->drops, raindrop_size * src->size);
Switching from file-backed to anonymous seems strange to me,
here, but I'm not sure of a better solution that's compatible
with Object#dup ... I also don't know if anybody uses
Raindrops#dup, either. So if you're satisfied with this
behavior, that's good enough for me :>
> + * to_io -> IO
> + *
> + * Returns the IO object backing the memory for this raindrop, if
> + * one was specified when constructing this Raindrop. If this
> + * Raindrop is backed by anonymous memory, this method returns nil.
> + */
> +static VALUE to_io(VALUE self)
> +{
> + struct raindrops *r = get(self);
> + return r->io;
> +}
OK. There may be a need to call "to_io" on r->io itself
(because Tempfile delegates, but isn't actually an IO object
itself). I'm not sure, as examples found in the Ruby stdlib
seem to contradict each other:
* Zlib::GzipFile returns the ->io as-is:
https://public-inbox.org/ruby-core/be5f148bcd9f/s/?b=ext/zlib/zlib.c#n3233
* csv calls #to_io:
https://public-inbox.org/ruby-core/42e99435cbf2/s/?b=lib/csv.rb#n2102
I'm not sure which is correct, here, actually :x I'm inclined to
just leave it as-is, for now, following zlib's example. Any
future changes here can be done separately.
Everything else looked good. Below is the range-diff of the
minor things I'll squash in before pushing (probably in a few
days). No need to resend.
Also, was there anything else you had planned? (e.g. for memfd)
Otherwise, I'm thinking of tagging and releasing a new version
in a week or so (at least before Ruby 3.1 on Dec 25)
Thanks again.
Range-diff:
1: 79be82a ! 1: 18ee38a Allow Raindrops objects to be backed by a file
@@ Commit message
either by mmap'ing the file itself (or using Raindrops to do so), or by
making ordinary seek()/read() calls if performance is not a concern.
- Note theat the provided IO object _must_ implement #truncate; this is
+ Note that the provided IO object _must_ implement #truncate; this is
used to set the size of the file to be right-sized for the memory
mapping that is created.
@@ Commit message
attempt to msync the values, so they are not durable to e.g. system
crashes).
- On Linux, counter values can easily be shared between processes
+ Counter values can easily be shared between processes
in-memory only without touching the disk by passing in a File on a
tmpfs as the io object.
@@ ext/raindrops/raindrops.c: static struct raindrops *get(VALUE self)
- * dependent on the CPU model, number of cores, and page size of
- * the machine. The actual size of the object will always be equal
- * or greater than the specified +size+.
-+ * This is the _acutal_ implementation of #initialize - the Ruby wrapper
++ * This is the _actual_ implementation of #initialize - the Ruby wrapper
+ * handles keyword-argument handling then calls this method.
*/
-static VALUE init(VALUE self, VALUE size)
@@ ext/raindrops/raindrops.c: static VALUE init(VALUE self, VALUE size)
- r->drops = mmap(NULL, tmp,
- PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED, -1, 0);
+ if (RTEST(r->io)) {
-+ rb_funcall(r->io, rb_intern("truncate"), 1, SIZET2NUM(tmp));
+ int fd = NUM2INT(rb_funcall(r->io, rb_intern("fileno"), 0));
++ rb_funcall(r->io, rb_intern("truncate"), 1, SIZET2NUM(tmp));
+ r->drops = mmap(NULL, tmp,
+ PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
+ } else {
+ r->drops = mmap(NULL, tmp,
-+ PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED, -1, 0);
++ PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED,
++ -1, 0);
+ }
if (r->drops == MAP_FAILED) {
int err = errno;
@@ ext/raindrops/raindrops.c: retry:
}
r->pid = getpid();
-+ if (RTEST(zero)) {
++ if (RTEST(zero))
+ memset(r->drops, 0, tmp);
-+ }
+
return self;
}
next prev parent reply other threads:[~2021-11-26 18:34 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-25 6:56 [PATCH v2] Allow Raindrops objects to be backed by a file KJ Tsanaktsidis
2021-11-26 18:34 ` Eric Wong [this message]
2021-11-29 0:13 ` KJ Tsanaktsidis
2021-11-30 6:49 ` Eric Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://yhbt.net/raindrops/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211126183400.GA9579@dcvr \
--to=e@80x24.org \
--cc=ktsanaktsidis@zendesk.com \
--cc=raindrops-public@yhbt.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://yhbt.net/raindrops.git/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).