Some benchmarks

io_splice RubyGem user+dev discussion/patches/pulls/bugs/help

* Some benchmarks
       [not found] <AANLkTi=M+qGa7G1PvYyB+fbJUzrersJQwtDkct3hZEiy@mail.gmail.com>
@ 2010-12-22 14:01 ` Iñaki Baz Castillo
  2010-12-22 19:56   ` Eric Wong
  0 siblings, 1 reply; 7+ messages in thread
From: Iñaki Baz Castillo @ 2010-12-22 14:01 UTC (permalink / raw)
  To: ruby.io.splice

Hi, I've run some benchmarks comparing FileUtils with io_splice when
copying files on my computer (AMD 64 Phenom II X4 965):


Test data:
- Source file size:  496 bytes
- Number of iterations:  1
Results:
- FileUtils.cp:            0.00013375282287597656
- FileUtils.copy_stream:   0.0001666545867919922
- IO.splice:               6.341934204101562e-05


Test data:
- Source file size:  496 bytes
- Number of iterations:  10
Results:
- FileUtils.cp:            0.0027534961700439453
- FileUtils.copy_stream:   0.002769947052001953
- IO.splice:               0.0014998912811279297


Test data:
- Source file size:  496 bytes
- Number of iterations:  100
Results:
- FileUtils.cp:            0.02746438980102539
- FileUtils.copy_stream:   0.018292665481567383
- IO.splice:               0.010256767272949219


Test data:
- Source file size:  496 bytes
- Number of iterations:  1000
Results:
- FileUtils.cp:            0.25453615188598633
- FileUtils.copy_stream:   0.13935613632202148
- IO.splice:               0.05126452445983887


Test data:
- Source file size:  16102 bytes
- Number of iterations:  1
Results:
- FileUtils.cp:            0.00014328956604003906
- FileUtils.copy_stream:   0.0004949569702148438
- IO.splice:               0.00028967857360839844


Test data:
- Source file size:  16102 bytes
- Number of iterations:  10
Results:
- FileUtils.cp:            0.0018320083618164062
- FileUtils.copy_stream:   0.001699686050415039
- IO.splice:               0.0004978179931640625


Test data:
- Source file size:  16102 bytes
- Number of iterations:  100
Results:
- FileUtils.cp:            0.039102792739868164
- FileUtils.copy_stream:   0.0330507755279541
- IO.splice:               0.01671743392944336


Test data:
- Source file size:  16102 bytes
- Number of iterations:  1000
Results:
- FileUtils.cp:            0.3610107898712158
- FileUtils.copy_stream:   0.35822200775146484
- IO.splice:               0.08929753303527832


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  1
Results:
- FileUtils.cp:            0.001172780990600586
- FileUtils.copy_stream:   0.0011954307556152344
- IO.splice:               0.0009520053863525391


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  10
Results:
- FileUtils.cp:            0.08811688423156738
- FileUtils.copy_stream:   0.09790825843811035
- IO.splice:               0.014630317687988281


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  100
Results:
- FileUtils.cp:            1.1194334030151367
- FileUtils.copy_stream:   1.5543222427368164
- IO.splice:               0.0931394100189209


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  1000
Results:
- FileUtils.cp:            12.707785606384277
- FileUtils.copy_stream:   13.745135068893433
- IO.splice:               9.723489761352539



The script is below.

It's clear that using io_splice pays off for big files (or for many
copies of the same small source file).

I don't understand why io_splice takes so long in the last test
(big file, 1000 copies). Maybe it spends more time initializing each
object within the benchmark block?




Script:
---------------------------------------------------------
#!/usr/bin/ruby

require "fileutils"
require "benchmark"
require "io/splice"


SRC_FILE = ARGV[0]
DST_FILE = ARGV[1]
TIMES = ( ARGV[2] ? ARGV[2].to_i : 1 )


puts "Test data:"
puts "- Source file size:  #{File.size(SRC_FILE)} bytes"
puts "- Number of iterations:  #{TIMES}"


puts "Results:"

print "- FileUtils.cp:            "
puts Benchmark.realtime {
 TIMES.times do
   FileUtils.cp SRC_FILE, DST_FILE
 end
}


print "- FileUtils.copy_stream:   "
puts Benchmark.realtime {
 TIMES.times do
   FileUtils.copy_stream SRC_FILE, DST_FILE
 end
}


print "- IO.splice:               "
puts Benchmark.realtime {
 TIMES.times do |n|
   source = File.open(SRC_FILE, 'rb')
   dest = File.open(DST_FILE + "_#{n}", 'wb')
   source_fd = source.fileno
   dest_fd = dest.fileno

   # We use a pipe as a ring buffer in kernel space.
   # pipes may store up to IO::Splice::PIPE_CAPA bytes
   pipe = IO.pipe
   rfd, wfd = pipe.map { |io| io.fileno }

   begin
     nread = begin
       # first pull as many bytes as possible into the pipe
       IO.splice(source_fd, nil, wfd, nil, IO::Splice::PIPE_CAPA, 0)
     rescue EOFError
       break
     end

     # now move the contents of the pipe buffer into the destination file
     # the copied data never enters userspace
     nwritten = IO.splice(rfd, nil, dest_fd, nil, nread, 0)

     nwritten == nread or
       abort "short splice to destination file: #{nwritten} != #{nread}"
   end while true
 end
}
---------------------------------------------------------


I've tried to declare source, source_fd and pipe = IO.pipe before the
benchmark block, but then I get empty copied files (I need to declare
all of them within the benchmark block). I assume the test script can
be improved.



Regards.


--
Iñaki Baz Castillo
<ibc@aliax.net>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Some benchmarks
  2010-12-22 14:01 ` Some benchmarks Iñaki Baz Castillo
@ 2010-12-22 19:56   ` Eric Wong
  2010-12-23 15:41     ` Iñaki Baz Castillo
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Wong @ 2010-12-22 19:56 UTC (permalink / raw)
  To: ruby.io.splice

Iñaki Baz Castillo <ibc@aliax.net> wrote:
> Hi, I've run some benchmarks comparing FileUtils with io_splice when
> copying files on my computer (AMD 64 Phenom II X4 965):

<snip>
> Test data:
> - Source file size:  1156222 bytes
> - Number of iterations:  1000
> Results:
> - FileUtils.cp:            12.707785606384277
> - FileUtils.copy_stream:   13.745135068893433
> - IO.splice:               9.723489761352539
> 
> 
> The script is below.
> 
> It's clear that using io_splice pays off for big files (or for many
> copies of the same small source file).
> 
> I don't understand why io_splice takes so long in the last test
> (big file, 1000 copies). Maybe it spends more time initializing each
> object within the benchmark block?

It's likely the test wrote enough to force blocking writes to disk,
and your disk is now the bottleneck.  In that case, all the memory
tricks in the world won't help :)

Try it on a tmpfs mount.

Also, relying on GC to close file descriptors could be a small
performance problem given the number of iterations.
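
A minimal sketch of the explicit-close point, assuming only the Ruby
standard library. Block-form File.open closes each descriptor as soon as
its block returns, so no iteration depends on the GC. IO.copy_stream
stands in for the splice loop here, and the helper name
copy_with_explicit_close is hypothetical.

```ruby
require "tmpdir"

# Hypothetical helper: copy src_path to dst_path and close both
# descriptors before returning, instead of leaving them to the GC.
# Returns the number of bytes copied.
def copy_with_explicit_close(src_path, dst_path)
  File.open(src_path, "rb") do |src|
    File.open(dst_path, "wb") do |dst|
      # Block-form File.open closes src and dst when each block exits,
      # even if the copy raises.
      IO.copy_stream(src, dst)
    end
  end
end

Dir.mktmpdir do |dir|
  src = File.join(dir, "src.bin")
  File.binwrite(src, "x" * 4096)

  10.times do |n|
    copied = copy_with_explicit_close(src, File.join(dir, "dst_#{n}"))
    raise "short copy: #{copied}" unless copied == 4096
  end
end
```

The same shape applies to the pipe: closing both ends with IO#close
(or reusing one pipe for every iteration) keeps the descriptor count
flat across iterations.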

> Script:
> ---------------------------------------------------------
> #!/usr/bin/ruby
> 
> require "fileutils"
> require "benchmark"
> require "io/splice"
> 
> 
> SRC_FILE = ARGV[0]
> DST_FILE = ARGV[1]
> TIMES = ( ARGV[2] ? ARGV[2].to_i : 1 )
> 
> 
> puts "Test data:"
> puts "- Source file size:  #{File.size(SRC_FILE)} bytes"
> puts "- Number of iterations:  #{TIMES}"
> 
> 
> puts "Results:"
> 
> print "- FileUtils.cp:            "
> puts Benchmark.realtime {
>  TIMES.times do
>    FileUtils.cp SRC_FILE, DST_FILE
>  end
> }
> 
> 
> print "- FileUtils.copy_stream:   "
> puts Benchmark.realtime {
>  TIMES.times do
>    FileUtils.copy_stream SRC_FILE, DST_FILE
>  end
> }
> 
> 
> print "- IO.splice:               "
> puts Benchmark.realtime {
>  TIMES.times do |n|
>    source = File.open(SRC_FILE, 'rb')
>    dest = File.open(DST_FILE + "_#{n}", 'wb')
>    source_fd = source.fileno
>    dest_fd = dest.fileno
> 
>    # We use a pipe as a ring buffer in kernel space.
>    # pipes may store up to IO::Splice::PIPE_CAPA bytes
>    pipe = IO.pipe

You can reuse the pipe object through multiple runs assuming you drain
it properly.

>    rfd, wfd = pipe.map { |io| io.fileno }
> 
>    begin
>      nread = begin
>        # first pull as many bytes as possible into the pipe
>        IO.splice(source_fd, nil, wfd, nil, IO::Splice::PIPE_CAPA, 0)
>      rescue EOFError
>        break
>      end
> 
>      # now move the contents of the pipe buffer into the destination file
>      # the copied data never enters userspace
>      nwritten = IO.splice(rfd, nil, dest_fd, nil, nread, 0)
> 
>      nwritten == nread or
>        abort "short splice to destination file: #{nwritten} != #{nread}"
>    end while true
>  end
> }
> ---------------------------------------------------------
> 
> I've tried to declare source, source_fd and pipe = IO.pipe before the
> benchmark block, but then I get empty copied files (I need to declare
> all of them within the benchmark block). I assume the test script can
> be improved.

You need to rewind source if you don't specify an input offset for splice.
Likewise for the dest and destination offset.  The open+close is part of
the test for the FileUtils things, though, so I would just open and close
explicitly (you were leaving the close up to GC, which is usually fine for
MRI but not for benchmark purposes).
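
The empty copies come from the shared file offset: once the first pass
reaches EOF, a second pass over the same open file has nothing left to
read. Below is a minimal sketch of that effect. It uses IO.copy_stream
from the standard library as a stand-in for IO.splice (an assumption, so
that the example runs without the io_splice gem), and its explicit source
offset plays the role of splice's input offset. The helper name
copy_counts is hypothetical.

```ruby
require "tmpdir"

# Hypothetical helper: copy the same open file four times and return
# the number of bytes each pass copied.
def copy_counts(dir, size = 1000)
  src_path = File.join(dir, "src.bin")
  File.binwrite(src_path, "a" * size)

  File.open(src_path, "rb") do |src|
    # First pass: copies everything and leaves src positioned at EOF.
    first = IO.copy_stream(src, File.join(dir, "copy_1"))

    # Second pass without rewinding: nothing left to read.
    second = IO.copy_stream(src, File.join(dir, "copy_2"))

    # Rewinding resets the offset, so the whole file is copied again.
    src.rewind
    third = IO.copy_stream(src, File.join(dir, "copy_3"))

    # An explicit source offset (4th argument) reads from that position
    # regardless of where the file's own offset currently points.
    fourth = IO.copy_stream(src, File.join(dir, "copy_4"), src.size, 0)

    [first, second, third, fourth]
  end
end

Dir.mktmpdir { |dir| p copy_counts(dir) }
```

The second count is zero, which matches the empty destination files
described above; rewinding or passing an offset restores the full copy.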


* Re: Some benchmarks
  2010-12-22 19:56   ` Eric Wong
@ 2010-12-23 15:41     ` Iñaki Baz Castillo
  2010-12-23 18:06       ` Eric Wong
  2010-12-27 17:33       ` Iñaki Baz Castillo
  0 siblings, 2 replies; 7+ messages in thread
From: Iñaki Baz Castillo @ 2010-12-23 15:41 UTC (permalink / raw)
  To: ruby.io.splice

2010/12/22 Eric Wong <normalperson@yhbt.net>:
>> I don't understand why io_splice takes so long in the last test
>> (big file, 1000 copies). Maybe it spends more time initializing each
>> object within the benchmark block?
>
> It's likely the test wrote enough to force blocking writes to disk,
> and your disk is now the bottleneck.  In that case, all the memory
> tricks in the world won't help :)

Aha, it makes sense.


> Try it on a tmpfs mount.

Yes, that would be my next test.


> Also, relying on GC to close file descriptors could be a small
> performance problem given the number of iterations.

Well, this is just a test of an unusual use case. Indeed, I should
close the files myself.



>>    # We use a pipe as a ring buffer in kernel space.
>>    # pipes may store up to IO::Splice::PIPE_CAPA bytes
>>    pipe = IO.pipe
>
> You can reuse the pipe object through multiple runs assuming you drain
> it properly.

So I need to learn what "draining a pipe" means. I assume I need to
empty/rewind it. I suppose there are API methods for this.



>> I've tried to declare source, source_fd and pipe = IO.pipe before the
>> benchmark block, but then I get empty copied files (I need to declare
>> all of them within the benchmark block). I assume the test script can
>> be improved.
>
> You need to rewind source if you don't specify an input offset for splice.
> Likewise for the dest and destination offset.  The open+close is part of
> the test for the FileUtils things, though, so I would just open and close
> explicitly (you were leaving the close up to GC, which is usually fine for
> MRI but not for benchmark purposes).

Ok. Thanks a lot!



-- 
Iñaki Baz Castillo
<ibc@aliax.net>


* Re: Some benchmarks
  2010-12-23 15:41     ` Iñaki Baz Castillo
@ 2010-12-23 18:06       ` Eric Wong
  2010-12-27 10:01         ` Iñaki Baz Castillo
  2010-12-27 17:33       ` Iñaki Baz Castillo
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Wong @ 2010-12-23 18:06 UTC (permalink / raw)
  To: ruby.io.splice

Iñaki Baz Castillo <ibc@aliax.net> wrote:
> 2010/12/22 Eric Wong <normalperson@yhbt.net>:
> > You can reuse the pipe object through multiple runs assuming you drain
> > it properly.
> 
> So I need to learn what "draining a pipe" means. I assume I need to
> empty/rewind it. I suppose there are API methods for this.

Just ensure you read everything from the pipe between each iteration
(which you seem to be doing).  A pipe is just a ring buffer inside the
kernel.  Traditionally write() puts data into the buffer, read() removes
it, and nowadays splice() can do it without copying the data into
userspace.
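
As an illustration of "draining", here is a minimal sketch using only
plain IO.pipe, write and non-blocking reads from the standard library
(no splice involved). The helper name drain_pipe and its chunk size are
hypothetical.

```ruby
# Hypothetical helper: read from rd until the pipe buffer is empty and
# return everything that was still buffered.
def drain_pipe(rd, chunk_size = 4096)
  drained = String.new
  loop do
    chunk = rd.read_nonblock(chunk_size, exception: false)
    # :wait_readable means the buffer is empty; nil means the write
    # end was closed and EOF has been reached.
    break if chunk == :wait_readable || chunk.nil?
    drained << chunk
  end
  drained
end

rd, wr = IO.pipe
wr.write("left over from the previous iteration")

leftover = drain_pipe(rd)
puts "drained #{leftover.bytesize} bytes"

# The pipe is empty again, so it can be reused for the next copy.
wr.write("next iteration")
puts drain_pipe(rd)

rd.close
wr.close
```

In the benchmark loop the second splice call already removes what the
first one put in, so a separate drain step like this would only matter
after a short or failed write.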

-- 
Eric Wong


* Re: Some benchmarks
  2010-12-23 18:06       ` Eric Wong
@ 2010-12-27 10:01         ` Iñaki Baz Castillo
  0 siblings, 0 replies; 7+ messages in thread
From: Iñaki Baz Castillo @ 2010-12-27 10:01 UTC (permalink / raw)
  To: ruby.io.splice

2010/12/23 Eric Wong <normalperson@yhbt.net>:
> Just ensure you read everything from the pipe between each iteration
> (which you seem to be doing).  A pipe is just a ring buffer inside the
> kernel.  Traditionally write() puts data into the buffer, read() removes
> it, and nowadays splice() can do it without copying the data into
> userspace.

Ok, thanks a lot. I'm coding a server using EventMachine, and the
reactor process communicates with other processes via Posix_MQ. If
the message body is too big, I plan to have the reactor copy the body
data to a file (maybe on ramfs) and pass the file path to the worker
processes, which would then read it.
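
A rough sketch of that plan, with the message queue itself left out.
Everything named here is hypothetical: the MAX_INLINE_BYTES limit, the
pack_body and unpack_body helpers, and the spool directory (which could
point at a ramfs or tmpfs mount).

```ruby
require "tmpdir"
require "securerandom"

# Hypothetical limit: bodies above this many bytes go to a file.
MAX_INLINE_BYTES = 8192

# Reactor side: returns [:inline, body] for small bodies, or
# [:path, file_path] after writing a large body into spool_dir.
def pack_body(body, spool_dir, max_inline = MAX_INLINE_BYTES)
  return [:inline, body] if body.bytesize <= max_inline

  path = File.join(spool_dir, "body-#{SecureRandom.hex(8)}")
  File.binwrite(path, body)
  [:path, path]
end

# Worker side: recover the body from either form.
def unpack_body(kind, payload)
  kind == :path ? File.binread(payload) : payload
end

Dir.mktmpdir do |spool|
  kind, payload = pack_body("x" * 100_000, spool)
  puts "#{kind}: #{unpack_body(kind, payload).bytesize} bytes"
end
```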

So I'll come back with some questions soon :)



-- 
Iñaki Baz Castillo
<ibc@aliax.net>


* Re: Some benchmarks
  2010-12-23 15:41     ` Iñaki Baz Castillo
  2010-12-23 18:06       ` Eric Wong
@ 2010-12-27 17:33       ` Iñaki Baz Castillo
  2010-12-27 21:38         ` Eric Wong
  1 sibling, 1 reply; 7+ messages in thread
From: Iñaki Baz Castillo @ 2010-12-27 17:33 UTC (permalink / raw)
  To: ruby.io.splice

[-- Attachment #1: Type: text/plain, Size: 1850 bytes --]

2010/12/23 Iñaki Baz Castillo <ibc@aliax.net>:
>> Try it on a tmpfs mount.
>
> Yes, that would be my next test.

Hmmm, I get unexpected results when using a tmpfs as the destination
for the copy. I attach a new test file which uses IO splice
in different ways ("manual", using IO objects and using file names).

Results:


Test data:
- Source file size:  464 bytes
- Number of iterations:  1
Results:
- FileUtils.cp:                     0.0004868507385253906
- FileUtils.copy_stream:            7.581710815429688e-05
- IO.splice (manual):               0.0004825592041015625
- IO.splice (using IO):             0.00017380714416503906
- IO.splice (using file names):     0.00012922286987304688


Test data:
- Source file size:  464 bytes
- Number of iterations:  100
Results:
- FileUtils.cp:                     0.005738258361816406
- FileUtils.copy_stream:            0.0026025772094726562
- IO.splice (manual):               0.038815975189208984
- IO.splice (using IO):             0.016385793685913086
- IO.splice (using file names):     0.009386062622070312


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  1
Results:
- FileUtils.cp:                     0.0017499923706054688
- FileUtils.copy_stream:            0.0020220279693603516
- IO.splice (manual):               0.001650094985961914
- IO.splice (using IO):             0.001832723617553711
- IO.splice (using file names):     0.0017614364624023438


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  100
Results:
- FileUtils.cp:                     0.20581936836242676
- FileUtils.copy_stream:            0.19777441024780273
- IO.splice (manual):               0.1709754467010498
- IO.splice (using IO):             0.19835829734802246
- IO.splice (using file names):     0.1958324909210205

-- 
Iñaki Baz Castillo
<ibc@aliax.net>


* Re: Some benchmarks
  2010-12-27 17:33       ` Iñaki Baz Castillo
@ 2010-12-27 21:38         ` Eric Wong
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2010-12-27 21:38 UTC (permalink / raw)
  To: ruby.io.splice

Iñaki Baz Castillo <ibc@aliax.net> wrote:
> 2010/12/23 Iñaki Baz Castillo <ibc@aliax.net>:
> >> Try it on a tmpfs mount.
> >
> > Yes, that would be my next test.
> 
> Hmmm, I get unexpected results when using a tmpfs as the destination
> for the copy. I attach a new test file which uses IO splice
> in different ways ("manual", using IO objects and using file names).

The setup of the splice methods is rather expensive (both the pure Ruby
portion and also the kernel pipe()).
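
One way to get a feel for the pipe() part of that setup cost is to time
pipe creation on its own. This is a minimal sketch using only the
standard library; the helper name time_pipe_setup is hypothetical, and
the figure it prints depends entirely on the machine.

```ruby
require "benchmark"

# Hypothetical helper: wall-clock time for creating and closing a
# fresh pipe on each of `times` iterations.
def time_pipe_setup(times)
  Benchmark.realtime do
    times.times do
      rd, wr = IO.pipe
      rd.close
      wr.close
    end
  end
end

iterations = 1000
per_iteration = time_pipe_setup(iterations) / iterations
printf("pipe() + close: about %.9f s per iteration\n", per_iteration)
```

For small files this fixed cost is paid once per copy, which is why
reusing a single pipe across iterations matters more there than for
large files.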

I don't think a 1M file is really enough to test with (consider the size
of L2 caches these days), try playing around with bigger files, maybe
tens or even hundreds of megs (if it can fit in tmpfs).

You should also get Benchmark to report system and user time;
realtime can be misleading.
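
A minimal sketch of reporting user and system time, using
Benchmark.measure from the standard library instead of
Benchmark.realtime. IO.copy_stream stands in for the copy under test,
and the helper name measure_copies, the 1 MiB file size and the
iteration count are all placeholders.

```ruby
require "benchmark"
require "tmpdir"

# Hypothetical helper: copy src to dst `times` times and return the
# Benchmark::Tms measurement for the whole loop.
def measure_copies(src, dst, times)
  Benchmark.measure do
    times.times { IO.copy_stream(src, dst) }
  end
end

Dir.mktmpdir do |dir|
  src = File.join(dir, "src.bin")
  dst = File.join(dir, "dst.bin")
  File.binwrite(src, "x" * (1 << 20)) # 1 MiB; use far more for a real test

  tms = measure_copies(src, dst, 100)

  # utime: CPU time spent in user space, stime: CPU time spent in the
  # kernel, real: elapsed wall-clock time.
  printf("user %.6f  system %.6f  real %.6f\n", tms.utime, tms.stime, tms.real)
end
```

Dir.mktmpdir uses the default temporary directory, which is not
necessarily a tmpfs; passing a tmpfs mount point as its second argument
would be needed to keep the disk out of the measurement.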

-- 
Eric Wong


Code repositories for project(s) associated with this public inbox

	https://yhbt.net/ruby_io_splice.git/