unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
* [PATCH] Add early hints support
@ 2020-07-16 10:05 Jean Boussier
  2020-07-16 10:50 ` Eric Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Jean Boussier @ 2020-07-16 10:05 UTC (permalink / raw)
  To: unicorn-public

While not part of the rack spec, this API is exposed
by both puma and falcon, and Rails use it when available.

The 103 Early Hints response code is specified in RFC 8297.
---
 lib/unicorn/configurator.rb |  5 +++++
 lib/unicorn/http_server.rb  | 33 +++++++++++++++++++++++++++++++--
 test/unit/test_server.rb    | 30 ++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index c3a4f2d..43e91f4 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -276,6 +276,11 @@ def default_middleware(bool)
     set_bool(:default_middleware, bool)
   end
 
+  # sets wether to enable rack's early hints API.
+  def early_hints(bool)
+    set_bool(:early_hints, bool)
+  end
+
   # sets listeners to the given +addresses+, replacing or augmenting the
   # current set.  This is for the global listener pool shared by all
   # worker processes.  For per-worker listeners, see the after_fork example
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 45a2e97..3e0c9a4 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -15,7 +15,7 @@ class Unicorn::HttpServer
                 :before_fork, :after_fork, :before_exec,
                 :listener_opts, :preload_app,
                 :orig_app, :config, :ready_pipe, :user,
-                :default_middleware
+                :default_middleware, :early_hints
   attr_writer   :after_worker_exit, :after_worker_ready, :worker_exec
 
   attr_reader :pid, :logger
@@ -588,6 +588,27 @@ def handle_error(client, e)
   rescue
   end
 
+  def e103_response_write(client, headers)
+    response = if @request.response_start_sent
+      "103 Early Hints\r\n"
+    else
+      "HTTP/1.1 103 Early Hints\r\n"
+    end
+
+    headers.each_pair do |k, vs|
+      if vs.respond_to?(:to_s) && !vs.to_s.empty?
+        vs.to_s.split("\n".freeze).each do |v|
+          response << "#{k}: #{v}\r\n"
+        end
+      else
+        response << "#{k}: #{vs}\r\n"
+      end
+    end
+    response << "\r\n".freeze
+    response << "HTTP/1.1 ".freeze if @request.response_start_sent
+    client.write(response)
+  end
+
   def e100_response_write(client, env)
     # We use String#freeze to avoid allocations under Ruby 2.1+
     # Not many users hit this code path, so it's better to reduce the
@@ -602,7 +623,15 @@ def e100_response_write(client, env)
   # once a client is accepted, it is processed in its entirety here
   # in 3 easy steps: read request, call app, write app response
   def process_client(client)
-    status, headers, body = @app.call(env = @request.read(client))
+    env = @request.read(client)
+
+    if early_hints
+      env["rack.early_hints"] = lambda do |headers|
+        e103_response_write(client, headers)
+      end
+    end
+
+    status, headers, body = @app.call(env)
 
     begin
       return if @request.hijacked?
diff --git a/test/unit/test_server.rb b/test/unit/test_server.rb
index 8096955..d706243 100644
--- a/test/unit/test_server.rb
+++ b/test/unit/test_server.rb
@@ -23,6 +23,16 @@ def call(env)
   end
 end
 
+class TestEarlyHintsHandler
+  def call(env)
+    while env['rack.input'].read(4096)
+    end
+    env['rack.early_hints'].call(
+      "Link" => "</style.css>; rel=preload; as=style\n</script.js>; rel=preload"
+    )
+    [200, { 'Content-Type' => 'text/plain' }, ['hello!\n']]
+  end
+end
 
 class WebServerTest < Test::Unit::TestCase
 
@@ -84,6 +94,26 @@ def test_preload_app_config
     tmp.close!
   end
 
+  def test_early_hints
+    teardown
+    redirect_test_io do
+      @server = HttpServer.new(TestEarlyHintsHandler.new,
+                               :listeners => [ "127.0.0.1:#@port"],
+                               :early_hints => true)
+      @server.start
+    end
+
+    sock = TCPSocket.new('127.0.0.1', @port)
+    sock.syswrite("GET / HTTP/1.0\r\n\r\n")
+
+    responses = sock.sysread(4096)
+    assert_match %r{\AHTTP/1.[01] 103\b}, responses
+    assert_match %r{^Link: </style\.css>}, responses
+    assert_match %r{^Link: </script\.js>}, responses
+
+    assert_match %r{^HTTP/1.[01] 200\b}, responses
+  end
+
   def test_broken_app
     teardown
     app = lambda { |env| raise RuntimeError, "hello" }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-16 10:05 [PATCH] Add early hints support Jean Boussier
@ 2020-07-16 10:50 ` Eric Wong
  2020-07-16 11:41   ` Jean Boussier
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Wong @ 2020-07-16 10:50 UTC (permalink / raw)
  To: Jean Boussier; +Cc: unicorn-public

Jean Boussier <jean.boussier@shopify.com> wrote:
> While not part of the rack spec, this API is exposed
> by both puma and falcon, and Rails use it when available.
> 
> The 103 Early Hints response code is specified in RFC 8297.

Thanks, I can probably accept it with some minor tweaks.
Some comment below...

> ---
>  lib/unicorn/configurator.rb |  5 +++++
>  lib/unicorn/http_server.rb  | 33 +++++++++++++++++++++++++++++++--
>  test/unit/test_server.rb    | 30 ++++++++++++++++++++++++++++++
>  3 files changed, 66 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
> index c3a4f2d..43e91f4 100644
> --- a/lib/unicorn/configurator.rb
> +++ b/lib/unicorn/configurator.rb
> @@ -276,6 +276,11 @@ def default_middleware(bool)
>      set_bool(:default_middleware, bool)
>    end
>  
> +  # sets wether to enable rack's early hints API.

spelling: "whether".

Since it's not officially part of Rack, yet, perhaps something
like:

	Enable the proposed early hints Rack API.

I'm no grammar expert, so I also rephrased that sentence to
avoid the apostrophe.

Since this RDoc ends up on the website, links to any relevant
Rails documentation and RFC 8297 would also be appropriate.
Otherwise non-Rails users like me might have no clue what
it's for.

> +  def early_hints(bool)
> +    set_bool(:early_hints, bool)
> +  end
> +
>    # sets listeners to the given +addresses+, replacing or augmenting the
>    # current set.  This is for the global listener pool shared by all
>    # worker processes.  For per-worker listeners, see the after_fork example
> diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
> index 45a2e97..3e0c9a4 100644
> --- a/lib/unicorn/http_server.rb
> +++ b/lib/unicorn/http_server.rb
> @@ -15,7 +15,7 @@ class Unicorn::HttpServer
>                  :before_fork, :after_fork, :before_exec,
>                  :listener_opts, :preload_app,
>                  :orig_app, :config, :ready_pipe, :user,
> -                :default_middleware
> +                :default_middleware, :early_hints
>    attr_writer   :after_worker_exit, :after_worker_ready, :worker_exec
>  
>    attr_reader :pid, :logger
> @@ -588,6 +588,27 @@ def handle_error(client, e)
>    rescue
>    end
>  
> +  def e103_response_write(client, headers)
> +    response = if @request.response_start_sent
> +      "103 Early Hints\r\n"
> +    else
> +      "HTTP/1.1 103 Early Hints\r\n"
> +    end
> +
> +    headers.each_pair do |k, vs|
> +      if vs.respond_to?(:to_s) && !vs.to_s.empty?
> +        vs.to_s.split("\n".freeze).each do |v|

Are the method calls for .to_s necessary?  At least for regular
Rack responses, this bit from unicorn/http_response.rb seems to
handle odd apps which (improperly) give `nil' value:

          if value =~ /\n/
            # avoiding blank, key-only cookies with /\n+/
            value.split(/\n+/).each { |v| buf << "#{key}: #{v}\r\n" }
          else
            buf << "#{key}: #{value}\r\n"
          end

The split would never be attempted on nil since it wouldn't
match /\n/, and the /\n+/ was needed in the Rails 2.3.5 days:

https://public-inbox.org/rack-devel/20100105235845.GB3377@dcvr.yhbt.net/
https://yhbt.net/unicorn-public/52400de1c9e9437b5c9df899f273485f663bb5b5/s/

> +          response << "#{k}: #{v}\r\n"
> +        end
> +      else
> +        response << "#{k}: #{vs}\r\n"
> +      end
> +    end
> +    response << "\r\n".freeze
> +    response << "HTTP/1.1 ".freeze if @request.response_start_sent
> +    client.write(response)
> +  end
> +
>    def e100_response_write(client, env)
>      # We use String#freeze to avoid allocations under Ruby 2.1+
>      # Not many users hit this code path, so it's better to reduce the
> @@ -602,7 +623,15 @@ def e100_response_write(client, env)
>    # once a client is accepted, it is processed in its entirety here
>    # in 3 easy steps: read request, call app, write app response
>    def process_client(client)
> -    status, headers, body = @app.call(env = @request.read(client))
> +    env = @request.read(client)
> +
> +    if early_hints
> +      env["rack.early_hints"] = lambda do |headers|
> +        e103_response_write(client, headers)
> +      end
> +    end

Eep, extra branch...  What's the performance impact for existing
users when not activated? (on Unix sockets).

Perhaps bypassing the method and accessing the @early_hints ivar
directly can be slightly faster w/o method dispatch.  That
should also allow using attr_writer instead of attr_accessor,
I think.

Not sure how much people here care about minor performance
regressions, here...  I don't really upgrade or touch old
Rack apps, anymore; and I'm certainly never going to buy
a faster CPU.

> +    status, headers, body = @app.call(env)
>  
>      begin
>        return if @request.hijacked?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-16 10:50 ` Eric Wong
@ 2020-07-16 11:41   ` Jean Boussier
  2020-07-16 12:16     ` Eric Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Jean Boussier @ 2020-07-16 11:41 UTC (permalink / raw)
  To: Eric Wong; +Cc: unicorn-public

Thanks for the very timely response.

> Since this RDoc ends up on the website, links to any relevant
> Rails documentation and RFC 8297 would also be appropriate.
> Otherwise non-Rails users like me might have no clue what
> it's for.

I updated the documentation, let me know what you think of it.

> Are the method calls for .to_s necessary?

I don't think they are, I mostly took inspiration from the puma implementation
that does all this defensive checks. Based on how that interface is
used by Rails, we could assume both keys and values are strings already.

I simplified the implementation.

> Eep, extra branch...  What's the performance impact for existing
> users when not activated? (on Unix sockets).

Extremely small.

> 
> Perhaps bypassing the method and accessing the @early_hints ivar
> directly can be slightly faster w/o method dispatch.  That
> should also allow using attr_writer instead of attr_accessor,
> I think.

 attr_reader is very optimized in MRI, it's barely slower than @early_hints.
Also it ensure that it doesn't emit a warning in verbose mode if the variable
isn't initialized.

From e0494e10de6549d1b513eef03e68bfa58a6b26ec Mon Sep 17 00:00:00 2001
From: Jean Boussier <jean.boussier@gmail.com>
Date: Thu, 16 Jul 2020 11:39:30 +0200
Subject: [PATCH] Add early hints support

While not part of the rack spec, this API is exposed
by both puma and falcon, and Rails use it when available.

The 103 Early Hints response code is specified in RFC 8297.
---
 lib/unicorn/configurator.rb |  9 +++++++++
 lib/unicorn/http_server.rb  | 31 +++++++++++++++++++++++++++++--
 test/unit/test_server.rb    | 30 ++++++++++++++++++++++++++++++
 3 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index c3a4f2d..b0606af 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -276,6 +276,15 @@ def default_middleware(bool)
     set_bool(:default_middleware, bool)
   end
 
+  # sets whether to enable the proposed early hints Rack API.
+  # If enabled, Rails 5.2+ will automatically send a 103 Early Hint
+  # for all the `javascript_include_tag` and `stylesheet_link_tag`
+  # in your response. See: https://api.rubyonrails.org/v5.2/classes/ActionDispatch/Request.html#method-i-send_early_hints
+  # See also https://tools.ietf.org/html/rfc8297
+  def early_hints(bool)
+    set_bool(:early_hints, bool)
+  end
+
   # sets listeners to the given +addresses+, replacing or augmenting the
   # current set.  This is for the global listener pool shared by all
   # worker processes.  For per-worker listeners, see the after_fork example
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 45a2e97..05dad99 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -15,7 +15,7 @@ class Unicorn::HttpServer
                 :before_fork, :after_fork, :before_exec,
                 :listener_opts, :preload_app,
                 :orig_app, :config, :ready_pipe, :user,
-                :default_middleware
+                :default_middleware, :early_hints
   attr_writer   :after_worker_exit, :after_worker_ready, :worker_exec
 
   attr_reader :pid, :logger
@@ -588,6 +588,25 @@ def handle_error(client, e)
   rescue
   end
 
+  def e103_response_write(client, headers)
+    response = if @request.response_start_sent
+      "103 Early Hints\r\n"
+    else
+      "HTTP/1.1 103 Early Hints\r\n"
+    end
+
+    headers.each_pair do |k, vs|
+      next if !vs || vs.empty?
+      values = vs.to_s.split("\n".freeze)
+      values.each do |v|
+        response << "#{k}: #{v}\r\n"
+      end
+    end
+    response << "\r\n".freeze
+    response << "HTTP/1.1 ".freeze if @request.response_start_sent
+    client.write(response)
+  end
+
   def e100_response_write(client, env)
     # We use String#freeze to avoid allocations under Ruby 2.1+
     # Not many users hit this code path, so it's better to reduce the
@@ -602,7 +621,15 @@ def e100_response_write(client, env)
   # once a client is accepted, it is processed in its entirety here
   # in 3 easy steps: read request, call app, write app response
   def process_client(client)
-    status, headers, body = @app.call(env = @request.read(client))
+    env = @request.read(client)
+
+    if early_hints
+      env["rack.early_hints"] = lambda do |headers|
+        e103_response_write(client, headers)
+      end
+    end
+
+    status, headers, body = @app.call(env)
 
     begin
       return if @request.hijacked?
diff --git a/test/unit/test_server.rb b/test/unit/test_server.rb
index 8096955..d706243 100644
--- a/test/unit/test_server.rb
+++ b/test/unit/test_server.rb
@@ -23,6 +23,16 @@ def call(env)
   end
 end
 
+class TestEarlyHintsHandler
+  def call(env)
+    while env['rack.input'].read(4096)
+    end
+    env['rack.early_hints'].call(
+      "Link" => "</style.css>; rel=preload; as=style\n</script.js>; rel=preload"
+    )
+    [200, { 'Content-Type' => 'text/plain' }, ['hello!\n']]
+  end
+end
 
 class WebServerTest < Test::Unit::TestCase
 
@@ -84,6 +94,26 @@ def test_preload_app_config
     tmp.close!
   end
 
+  def test_early_hints
+    teardown
+    redirect_test_io do
+      @server = HttpServer.new(TestEarlyHintsHandler.new,
+                               :listeners => [ "127.0.0.1:#@port"],
+                               :early_hints => true)
+      @server.start
+    end
+
+    sock = TCPSocket.new('127.0.0.1', @port)
+    sock.syswrite("GET / HTTP/1.0\r\n\r\n")
+
+    responses = sock.sysread(4096)
+    assert_match %r{\AHTTP/1.[01] 103\b}, responses
+    assert_match %r{^Link: </style\.css>}, responses
+    assert_match %r{^Link: </script\.js>}, responses
+
+    assert_match %r{^HTTP/1.[01] 200\b}, responses
+  end
+
   def test_broken_app
     teardown
     app = lambda { |env| raise RuntimeError, "hello" }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-16 11:41   ` Jean Boussier
@ 2020-07-16 12:16     ` Eric Wong
  2020-07-16 12:24       ` Jean Boussier
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Wong @ 2020-07-16 12:16 UTC (permalink / raw)
  To: Jean Boussier; +Cc: unicorn-public

Jean Boussier <jean.boussier@shopify.com> wrote:
> Thanks for the very timely response.
> 
> > Since this RDoc ends up on the website, links to any relevant
> > Rails documentation and RFC 8297 would also be appropriate.
> > Otherwise non-Rails users like me might have no clue what
> > it's for.
> 
> I updated the documentation, let me know what you think of it.

It's fine, pushed for now.  Will release in a few days or a week
in case there's comments from others.

> > Are the method calls for .to_s necessary?
> 
> I don't think they are, I mostly took inspiration from the puma implementation
> that does all this defensive checks. Based on how that interface is
> used by Rails, we could assume both keys and values are strings already.
> 
> I simplified the implementation.

OK.

> > Eep, extra branch...  What's the performance impact for existing
> > users when not activated? (on Unix sockets).
> 
> Extremely small.

Alright, numbers would've been helpful but I'll take your word
for it.

If we get more branches in the main loop for other new features;
I wonder if generating the code for process_client dynamically
at initialize-time is worth it to avoid branches in the main
loop... (or if there's some way to hint MJIT to do that).

> > Perhaps bypassing the method and accessing the @early_hints ivar
> > directly can be slightly faster w/o method dispatch.  That
> > should also allow using attr_writer instead of attr_accessor,
> > I think.
> 
>  attr_reader is very optimized in MRI, it's barely slower than @early_hints.
> Also it ensure that it doesn't emit a warning in verbose mode if the variable
> isn't initialized.

Thanks again.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-16 12:16     ` Eric Wong
@ 2020-07-16 12:24       ` Jean Boussier
  2020-07-17  1:19         ` Eric Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Jean Boussier @ 2020-07-16 12:24 UTC (permalink / raw)
  To: Eric Wong; +Cc: unicorn-public

> Alright, numbers would've been helpful but I'll take your word
> for it.

My bad, I didn't get the hint:

```
require 'benchmark/ips'

class Foo
  attr_accessor :foo

  def initialize
    @foo = 1
  end

  def missing_ivar
    @bar
  end

  def ivar
    @foo
  end
  
  def accessor
    foo
  end
end

foo = Foo.new

Benchmark.ips do |x|
  x.report('ivar') { foo.ivar }
  x.report('attr') { foo.accessor }
  x.report('ivar:missing') { foo.missing_ivar }
  x.compare!
end
```

ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-darwin19]

Warming up --------------------------------------
                ivar     2.105M i/100ms
                attr     1.824M i/100ms
        ivar:missing     1.795M i/100ms
Calculating -------------------------------------
                ivar     20.935M (± 1.5%) i/s -    105.248M in   5.028625s
                attr     18.622M (± 0.8%) i/s -     94.845M in   5.093471s
        ivar:missing     18.408M (± 1.1%) i/s -     93.357M in   5.072160s

Comparison:
                ivar: 20934740.7 i/s
                attr: 18622122.9 i/s - 1.12x  (± 0.00) slower
        ivar:missing: 18408273.1 i/s - 1.14x  (± 0.00) slower

So yeah, the difference between raw @ivar and an attr_reader is barely noticeable, and both are very fast.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-16 12:24       ` Jean Boussier
@ 2020-07-17  1:19         ` Eric Wong
  2020-07-20  9:18           ` Jean Boussier
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Wong @ 2020-07-17  1:19 UTC (permalink / raw)
  To: Jean Boussier; +Cc: unicorn-public

Jean Boussier <jean.boussier@shopify.com> wrote:
>                 ivar: 20934740.7 i/s
>                 attr: 18622122.9 i/s - 1.12x  (± 0.00) slower
>         ivar:missing: 18408273.1 i/s - 1.14x  (± 0.00) slower
> 
> So yeah, the difference between raw @ivar and an attr_reader
> is barely noticeable, and both are very fast.

I guess... I was wondering more in terms of the big picture with
HTTP parsing, I/O, and response generation taken into account.
(and there may be improvements in those areas later this year,
 assuming the world doesn't end sooner...)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-17  1:19         ` Eric Wong
@ 2020-07-20  9:18           ` Jean Boussier
  2020-07-20 10:09             ` Eric Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Jean Boussier @ 2020-07-20  9:18 UTC (permalink / raw)
  To: Eric Wong; +Cc: unicorn-public

> I guess... I was wondering more in terms of the big picture with
> HTTP parsing, I/O, and response generation taken into account.

I wasn't too sure how to benchmark the entire loop. But regardless
18/20M i/s shows that it should be meaningless. It's only executed
once per request, so would "waste" 1 second for 18 million requests
processed.

> (and there may be improvements in those areas later this year,
> assuming the world doesn't end sooner...)

If you are interested in general optimizations, I did spot a bunch
of sub optimal patterns, such as `=~` in places where a string
comparison, or `match?` would do.

However many optimizations are only available on more recent
rubies.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-20  9:18           ` Jean Boussier
@ 2020-07-20 10:09             ` Eric Wong
  2020-07-20 10:27               ` Jean Boussier
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Wong @ 2020-07-20 10:09 UTC (permalink / raw)
  To: Jean Boussier; +Cc: unicorn-public

Jean Boussier <jean.boussier@shopify.com> wrote:
> > I guess... I was wondering more in terms of the big picture with
> > HTTP parsing, I/O, and response generation taken into account.
> 
> I wasn't too sure how to benchmark the entire loop. But regardless
> 18/20M i/s shows that it should be meaningless. It's only executed
> once per request, so would "waste" 1 second for 18 million requests
> processed.

Yeah, it can't support persistent connections, so Unix sockets
are required to avoid TCP port exhaustion and ensure repeatable
results.

> > (and there may be improvements in those areas later this year,
> > assuming the world doesn't end sooner...)
> 
> If you are interested in general optimizations, I did spot a bunch
> of sub optimal patterns, such as `=~` in places where a string
> comparison, or `match?` would do.
> 
> However many optimizations are only available on more recent
> rubies.

`match?' is Ruby 2.4+, which is probably too big a jump since
we're still on Ruby 1.9.3 at the moment... (though maybe 2.3+
is/was on the horizon).

String comparison as in `==' and `!='?  Would be interested to know
where and what improvements can be had.

But yeah, Ruby just seems hopeless performance-wise (and
compatibility-wise :P); so unicorn mainly exists to keep legacy
projects alive.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-20 10:09             ` Eric Wong
@ 2020-07-20 10:27               ` Jean Boussier
  2020-07-20 10:55                 ` Eric Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Jean Boussier @ 2020-07-20 10:27 UTC (permalink / raw)
  To: Eric Wong; +Cc: unicorn-public

> `match?' is Ruby 2.4+, which is probably too big a jump since
> we're still on Ruby 1.9.3 at the moment...

That's what I figured.

> String comparison as in `==' and `!='?  Would be interested to know
> where and what improvements can be had.

One place that jumped to mind when I saw it is http_response_write.
But there are many other places where Regexp are used to do case
insensitive comparisons.

```
require 'benchmark/ips'

def http_response_write(headers)
  headers.each do |key, value|
    case key
    when %r{\A(?:Date|Connection)\z}i
      next
    end
  end
end

def http_response_write_upcase(headers)
  headers.each do |key, value|
    case key.upcase
    when 'DATE'.freeze, 'CONNECTION'.freeze
      next
    end
  end
end

def http_response_write_casecmp(headers)
  headers.each do |key, value|
    case key
    when key.casecmp?('Date'.freeze) || key.casecmp?('Connection'.freeze)
      next
    end
  end
end

HEADERS = {
  'Foo' => 'bar',
  'Date' => 'plop',
  'User-Agent' => 'blah',
}

Benchmark.ips do |x|
  x.report('original') { http_response_write(HEADERS) }
  x.report('upcase') { http_response_write_upcase(HEADERS) }
  x.report('casecmp?') { http_response_write_casecmp(HEADERS) }
  x.compare!
end
```

```
Warming up --------------------------------------
            original    82.066k i/100ms
              upcase   177.429k i/100ms
            casecmp?    96.288k i/100ms
Calculating -------------------------------------
            original    831.610k (± 1.6%) i/s -      4.185M in   5.034146s
              upcase      1.770M (± 1.6%) i/s -      8.871M in   5.013796s
            casecmp?    979.618k (± 1.3%) i/s -      4.911M in   5.013678s

Comparison:
              upcase:  1769883.2 i/s
            casecmp?:   979618.3 i/s - 1.81x  (± 0.00) slower
            original:   831610.2 i/s - 2.13x  (± 0.00) slower
```

Similarly, that method use `value =~ /\n/` which could be replaced
favorably for `value.include?("\n".freeze)`

```
VAL = "foobar"
Benchmark.ips do |x|
  x.report('=~') { VAL =~ /\n/ }
  x.report('include?') { VAL.include?("\n".freeze) }
  x.compare!
end
```

```
Warming up --------------------------------------
                  =~   409.096k i/100ms
            include?     1.322M i/100ms
Calculating -------------------------------------
                  =~      4.083M (± 2.1%) i/s -     20.455M in   5.011539s
            include?     13.053M (± 1.5%) i/s -     66.125M in   5.067097s

Comparison:
            include?: 13052859.2 i/s
                  =~:  4083401.3 i/s - 3.20x  (± 0.00) slower

```

> Ruby just seems hopeless performance-wise 

Well, the gap between 1.9.3 and 2.5+ is pretty big performance-wise.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-20 10:27               ` Jean Boussier
@ 2020-07-20 10:55                 ` Eric Wong
  2020-07-20 11:53                   ` Jean Boussier
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Wong @ 2020-07-20 10:55 UTC (permalink / raw)
  To: Jean Boussier; +Cc: unicorn-public

Jean Boussier <jean.boussier@shopify.com> wrote:
> > `match?' is Ruby 2.4+, which is probably too big a jump since
> > we're still on Ruby 1.9.3 at the moment...
> 
> That's what I figured.
> 
> > String comparison as in `==' and `!='?  Would be interested to know
> > where and what improvements can be had.

Thanks for the benchmarks.

> One place that jumped to mind when I saw it is http_response_write.
> But there are many other places where Regexp are used to do case
> insensitive comparisons.
> 
> ```
> require 'benchmark/ips'
> 
> def http_response_write(headers)
>   headers.each do |key, value|
>     case key
>     when %r{\A(?:Date|Connection)\z}i
>       next
>     end
>   end
> end
> 
> def http_response_write_upcase(headers)
>   headers.each do |key, value|
>     case key.upcase
>     when 'DATE'.freeze, 'CONNECTION'.freeze
>       next
>     end
>   end
> end
> 
> def http_response_write_casecmp(headers)
>   headers.each do |key, value|
>     case key
>     when key.casecmp?('Date'.freeze) || key.casecmp?('Connection'.freeze)
>       next
>     end
>   end
> end
> 
> HEADERS = {
>   'Foo' => 'bar',
>   'Date' => 'plop',
>   'User-Agent' => 'blah',
> }
> 
> Benchmark.ips do |x|
>   x.report('original') { http_response_write(HEADERS) }
>   x.report('upcase') { http_response_write_upcase(HEADERS) }
>   x.report('casecmp?') { http_response_write_casecmp(HEADERS) }
>   x.compare!
> end
> ```
> 
> ```
> Warming up --------------------------------------
>             original    82.066k i/100ms
>               upcase   177.429k i/100ms
>             casecmp?    96.288k i/100ms
> Calculating -------------------------------------
>             original    831.610k (± 1.6%) i/s -      4.185M in   5.034146s
>               upcase      1.770M (± 1.6%) i/s -      8.871M in   5.013796s
>             casecmp?    979.618k (± 1.3%) i/s -      4.911M in   5.013678s
> 
> Comparison:
>               upcase:  1769883.2 i/s
>             casecmp?:   979618.3 i/s - 1.81x  (± 0.00) slower
>             original:   831610.2 i/s - 2.13x  (± 0.00) slower
> ```

upcase seems VERY compelling in a micro benchmark since it can
go straight into opt_case_dispatch.  But I worry the extra
garbage might have a different effect in a real app, especially
with more headers.

> Similarly, that method use `value =~ /\n/` which could be replaced
> favorably for `value.include?("\n".freeze)`

Yes, we tried that a few years ago and broke existing code
that had `nil' value, so it was promptly reverted:

https://yhbt.net/unicorn-public/CAO47=rJa=zRcLn_Xm4v2cHPr6c0UswaFC_omYFEH+baSxHOWKQ@mail.gmail.com/

Maybe:

	 (val || ''.freeze).include?("\n".freeze)

Can work for those buggy apps, though...

> ```
> VAL = "foobar"
> Benchmark.ips do |x|
>   x.report('=~') { VAL =~ /\n/ }
>   x.report('include?') { VAL.include?("\n".freeze) }
>   x.compare!
> end

> > Ruby just seems hopeless performance-wise 
> 
> Well, the gap between 1.9.3 and 2.5+ is pretty big performance-wise.

True, but it's still pretty slow :>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-20 10:55                 ` Eric Wong
@ 2020-07-20 11:53                   ` Jean Boussier
  2020-07-20 20:27                     ` Eric Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Jean Boussier @ 2020-07-20 11:53 UTC (permalink / raw)
  To: Eric Wong; +Cc: unicorn-public

> upcase seems VERY compelling in a micro benchmark since it can
> go straight into opt_case_dispatch.  But I worry the extra
> garbage might have a different effect in a real app, especially
> with more headers.

It is indeed debatable. The thing is that on match, that Regexp
will generate a MatchData and a garbage string. So I'd think
the extra GC pressure would be totally negligible.

But I agree it's hard to tell without a larger more representative benchmark.

> Maybe:
> 
> 	 (val || ''.freeze).include?("\n".freeze)
> 
> Can work for those buggy apps, though...

Yes. or:

   vall && val.include?...


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Add early hints support
  2020-07-20 11:53                   ` Jean Boussier
@ 2020-07-20 20:27                     ` Eric Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2020-07-20 20:27 UTC (permalink / raw)
  To: Jean Boussier; +Cc: unicorn-public

Jean Boussier <jean.boussier@shopify.com> wrote:
> > upcase seems VERY compelling in a micro benchmark since it can
> > go straight into opt_case_dispatch.  But I worry the extra
> > garbage might have a different effect in a real app, especially
> > with more headers.
> 
> It is indeed debatable. The thing is that on match, that Regexp
> will generate a MatchData and a garbage string. So I'd think
> the extra GC pressure would be totally negligible.

Ah good point.  I take it Ruby doesn't lazy-allocate or have
optimizations to quickly recycle unused MatchData?  (Maybe it
could, if somebody who hacks on the VM is reading this).

> But I agree it's hard to tell without a larger more representative benchmark.

Yes, I would accept patches to speed these up with comprehensive
benchmark :)

> > Maybe:
> > 
> > 	 (val || ''.freeze).include?("\n".freeze)
> > 
> > Can work for those buggy apps, though...
> 
> Yes. or:
> 
>    vall && val.include?...

Yes.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-07-20 20:27 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-16 10:05 [PATCH] Add early hints support Jean Boussier
2020-07-16 10:50 ` Eric Wong
2020-07-16 11:41   ` Jean Boussier
2020-07-16 12:16     ` Eric Wong
2020-07-16 12:24       ` Jean Boussier
2020-07-17  1:19         ` Eric Wong
2020-07-20  9:18           ` Jean Boussier
2020-07-20 10:09             ` Eric Wong
2020-07-20 10:27               ` Jean Boussier
2020-07-20 10:55                 ` Eric Wong
2020-07-20 11:53                   ` Jean Boussier
2020-07-20 20:27                     ` Eric Wong

Code repositories for project(s) associated with this public inbox

	https://yhbt.net/unicorn.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).