raindrops RubyGem user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
From: Dale Hamel <dale.hamel@shopify.com>
To: raindrops-public@yhbt.net
Cc: Dale Hamel <dale.hamel@shopify.com>
Subject: [PATCH] Fix queue stats for sockets with SO_REUSEPORT
Date: Fri, 24 Feb 2023 13:35:44 -0500	[thread overview]
Message-ID: <20230224183544.51837-1-dale.hamel@shopify.com> (raw)

SO_REUSEPORT was introduced in 2013, see https://lwn.net/Articles/542629/.

However, the current logic for calculating socket queue backlog stats
is from 2011, predating the advent of SO_REUSEPORT.

The current strategy was thus written before the notion that the mapping of
INET_ADDR:socket could potentially be 1:N instead of always 1:1.

This causes raindrops to provide invalid socket backlog values when
SO_REUSEPORT is used, which Unicorn supports since 2013. The current behaviour
will set the queue metric to the queue depth of the last inet_diag_msg
it processes matching the INET_ADDR. In practice, this will result in
the backlog being off by a factor N, assuming relatively even
distribution of requests to sockets, which the kernel should guarantee
through consistent hashing.

The fix here is to accumulate the socket queue depth as we iterate
over the the socket diagnostics, rather than reset it each time.

I have a provided a test, but it only checks the queues, not the
accept metrics, as those are not affected by this bug, and it is
not possible to know which of the listeners will be dispatched the
request by the kernel, and thus which should call accept.
---
 ext/raindrops/linux_inet_diag.c               |  4 +-
 test/test_linux_reuseport_tcp_listen_stats.rb | 51 +++++++++++++++++++
 2 files changed, 53 insertions(+), 2 deletions(-)
 create mode 100644 test/test_linux_reuseport_tcp_listen_stats.rb

diff --git a/ext/raindrops/linux_inet_diag.c b/ext/raindrops/linux_inet_diag.c
index cabd427..2a2360c 100644
--- a/ext/raindrops/linux_inet_diag.c
+++ b/ext/raindrops/linux_inet_diag.c
@@ -306,7 +306,7 @@ static void table_set_queued(st_table *table, struct inet_diag_msg *r)
 {
 	struct listen_stats *stats = stats_for(table, r);
 	stats->listener_p = 1;
-	stats->queued = r->idiag_rqueue;
+	stats->queued += r->idiag_rqueue;
 }
 
 /* inner loop of inet_diag, called for every socket returned by netlink */
@@ -328,7 +328,7 @@ static inline void r_acc(struct nogvl_args *args, struct inet_diag_msg *r)
 		if (args->table)
 			table_set_queued(args->table, r);
 		else
-			args->stats.queued = r->idiag_rqueue;
+			args->stats.queued += r->idiag_rqueue;
 	}
 	/*
 	 * we wont get anything else because of the idiag_states filter
diff --git a/test/test_linux_reuseport_tcp_listen_stats.rb b/test/test_linux_reuseport_tcp_listen_stats.rb
new file mode 100644
index 0000000..c977c43
--- /dev/null
+++ b/test/test_linux_reuseport_tcp_listen_stats.rb
@@ -0,0 +1,51 @@
+# -*- encoding: binary -*-
+require "./test/rack_unicorn"
+require 'test/unit'
+require 'socket'
+require 'raindrops'
+$stderr.sync = $stdout.sync = true
+
+class TestLinuxReuseportTcpListenStats < Test::Unit::TestCase
+  include Raindrops::Linux
+  include Unicorn::SocketHelper
+  TEST_ADDR = ENV['UNICORN_TEST_ADDR'] || '127.0.0.1'
+  DEFAULT_BACKLOG = 10
+
+  def setup
+    @socks = []
+  end
+
+  def teardown
+    @socks.each { |io| io.closed? or io.close }
+  end
+
+  def new_socket_server(**kwargs)
+    s = new_tcp_server TEST_ADDR, kwargs[:port] || 0, kwargs
+    s.listen(kwargs[:backlog] || DEFAULT_BACKLOG)
+    @socks << s
+    [ s, s.addr[1] ]
+  end
+
+  def new_client(port)
+    s = TCPSocket.new("127.0.0.1", port)
+    @socks << s
+    s
+  end
+
+  def test_reuseport_queue_stats
+    listeners = 10
+    _, port = new_socket_server(reuseport: true)
+    addr = "#{TEST_ADDR}:#{port}"
+    listeners.times do
+      new_socket_server(reuseport: true, port: port)
+    end
+
+    listeners.times do |i|
+      all = Raindrops::Linux.tcp_listener_stats
+      assert_equal [0, i], all[addr].to_a
+      new_client(port)
+      all = Raindrops::Linux.tcp_listener_stats
+      assert_equal [0, i+1], all[addr].to_a
+    end
+  end
+end if RUBY_PLATFORM =~ /linux/
-- 
2.34.1


             reply	other threads:[~2023-02-24 18:36 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-24 18:35 Dale Hamel [this message]
2023-02-24 18:47 ` [PATCH] Fix off by one error in test Dale Hamel
2023-02-24 19:09 ` [PATCH] Fix queue stats for sockets with SO_REUSEPORT Eric Wong
2023-02-24 19:42   ` Dale Hamel
2023-02-25  0:36     ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://yhbt.net/raindrops/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230224183544.51837-1-dale.hamel@shopify.com \
    --to=dale.hamel@shopify.com \
    --cc=raindrops-public@yhbt.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://yhbt.net/raindrops.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).