* [PATCH net-next v2 0/3] mlx4: Add support for netdev-genl API
@ 2024-04-26 18:33 Joe Damato
  2024-04-26 18:33 ` [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Joe Damato @ 2024-04-26 18:33 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
	Eric Dumazet, Jakub Kicinski,
	open list:MELLANOX MLX4 core VPI driver, Paolo Abeni

Greetings:

Welcome to v2.

This series adds support to mlx4 for the netdev-genl API, which makes it
much easier for users and user programs to map NAPI IDs back to
ifindexes, queues, and IRQs. This is extremely useful for a number of
use cases, including epoll-based busy poll.
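
For illustration (my sketch, not part of the series; print_napi_id is a
hypothetical helper): an epoll busy-poll program first reads the NAPI ID
its socket traffic arrives on via the existing SO_INCOMING_NAPI_ID
sockopt, and with this series that ID can then be resolved to a queue
and IRQ through the netdev-genl GET APIs:

  #include <stdio.h>
  #include <sys/socket.h>

  #ifndef SO_INCOMING_NAPI_ID
  #define SO_INCOMING_NAPI_ID 56 /* value from uapi asm-generic/socket.h */
  #endif

  /* Print the NAPI ID that last delivered data to this socket; the
   * netdev-genl queue/napi GET ops can then map that ID back to an
   * ifindex, queue, and IRQ.
   */
  static int print_napi_id(int fd)
  {
          unsigned int napi_id = 0;
          socklen_t len = sizeof(napi_id);

          if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_NAPI_ID,
                         &napi_id, &len))
                  return -1;

          printf("socket %d -> napi_id %u\n", fd, napi_id);
          return 0;
  }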

In addition, this series includes a patch to export per-queue
statistics via the netlink API.
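
Once this lands, those stats should be dumpable with the in-tree YNL CLI
along these lines (a sketch; the qstats-get op and "scope" attribute are
assumed from the netdev netlink spec, not taken from this series):

  $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
        --dump qstats-get --json '{"scope": "queue"}'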

To facilitate the stats, patch 1/3 makes use of an existing field,
"dropped", which was already being exported in the driver's ethtool
stats but never incremented. Patch 1/3 now increments it in an
appropriate place, and patch 3/3 reports it as alloc_fail.

Please note: I do not have access to mlx4 hardware, but I've been
working closely with Martin Karsten from the University of Waterloo
(CC'd), who has very graciously tested my patches on their mlx4 hardware
(hence his Tested-by attribution in each commit). His latest research
work is particularly interesting [1], and this series helps to support
that (and future) work.

Martin has re-tested this v2 using Jakub's tool [2], and the
stats.pkt_byte_sum and stats.qstat_by_ifindex tests passed.

[1]: https://dl.acm.org/doi/pdf/10.1145/3626780
[2]: https://lore.kernel.org/lkml/20240423175718.4ad4dc5a@kernel.org/

Thanks,
Joe

v1 -> v2:
 - Patch 1/3 now initializes dropped to 0.
 - Patch 2/3 fixes a warning about qtype being used uninitialized.
 - Patch 3/3 includes several changes:
   - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
     valid before proceeding.
   - All initialization of stats fields to 0xff has been omitted. The
     network stack does this before calling into the driver functions, so
     I've adjusted the driver functions to only set values if there is
     data to set, leaving the network stack's 0xff in place if not.
   - mlx4_get_base_stats sets all stat fields to 0 individually if there
     are RX and TX queues.

Joe Damato (3):
  net/mlx4: Track RX allocation failures in a stat
  net/mlx4: link NAPI instances to queues and IRQs
  net/mlx4: support per-queue statistics via netlink

 drivers/net/ethernet/mellanox/mlx4/en_cq.c    | 14 ++++
 .../net/ethernet/mellanox/mlx4/en_netdev.c    | 79 +++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx4/en_port.c  |  5 +-
 drivers/net/ethernet/mellanox/mlx4/en_rx.c    |  4 +-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h  |  1 +
 5 files changed, 101 insertions(+), 2 deletions(-)

-- 
2.25.1


* [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat
  2024-04-26 18:33 [PATCH net-next v2 0/3] mlx4: Add support for netdev-genl API Joe Damato
@ 2024-04-26 18:33 ` Joe Damato
  2024-04-26 20:00   ` Jakub Kicinski
  2024-04-26 18:33 ` [PATCH net-next v2 2/3] net/mlx4: link NAPI instances to queues and IRQs Joe Damato
  2024-04-26 18:33 ` [PATCH net-next v2 3/3] net/mlx4: support per-queue statistics via netlink Joe Damato
  2 siblings, 1 reply; 12+ messages in thread
From: Joe Damato @ 2024-04-26 18:33 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

mlx4_en_alloc_frags currently returns -ENOMEM when mlx4_alloc_page
fails but does not increment a stat field when this occurs.

struct mlx4_en_rx_ring has a dropped field, which is tabulated in
mlx4_en_DUMP_ETH_STATS but never incremented by the driver.

This change modifies mlx4_en_alloc_frags to increment mlx4_en_rx_ring's
dropped field for the -ENOMEM case.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
---
 drivers/net/ethernet/mellanox/mlx4/en_port.c | 5 ++++-
 drivers/net/ethernet/mellanox/mlx4/en_rx.c   | 4 +++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_port.c b/drivers/net/ethernet/mellanox/mlx4/en_port.c
index 532997eba698..47541f7666f2 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
@@ -151,7 +151,7 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_en_dev *mdev = priv->mdev;
-	unsigned long packets, bytes;
+	unsigned long packets, bytes, dropped;
 	int i;
 
 	if (!priv->port_up || mlx4_is_master(mdev->dev))
@@ -159,14 +159,17 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
 
 	packets = 0;
 	bytes = 0;
+	dropped = 0;
 	for (i = 0; i < priv->rx_ring_num; i++) {
 		const struct mlx4_en_rx_ring *ring = priv->rx_ring[i];
 
 		packets += READ_ONCE(ring->packets);
 		bytes   += READ_ONCE(ring->bytes);
+		dropped += READ_ONCE(ring->dropped);
 	}
 	dev->stats.rx_packets = packets;
 	dev->stats.rx_bytes = bytes;
+	dev->stats.rx_missed_errors = dropped;
 
 	packets = 0;
 	bytes = 0;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 8328df8645d5..573ae10300c7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -82,8 +82,10 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
 
 	for (i = 0; i < priv->num_frags; i++, frags++) {
 		if (!frags->page) {
-			if (mlx4_alloc_page(priv, frags, gfp))
+			if (mlx4_alloc_page(priv, frags, gfp)) {
+				ring->dropped++;
 				return -ENOMEM;
+			}
 			ring->rx_alloc_pages++;
 		}
 		rx_desc->data[i].addr = cpu_to_be64(frags->dma +
-- 
2.25.1


* [PATCH net-next v2 2/3] net/mlx4: link NAPI instances to queues and IRQs
  2024-04-26 18:33 [PATCH net-next v2 0/3] mlx4: Add support for netdev-genl API Joe Damato
  2024-04-26 18:33 ` [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
@ 2024-04-26 18:33 ` Joe Damato
  2024-04-26 20:01   ` Jakub Kicinski
  2024-04-26 18:33 ` [PATCH net-next v2 3/3] net/mlx4: support per-queue statistics via netlink Joe Damato
  2 siblings, 1 reply; 12+ messages in thread
From: Joe Damato @ 2024-04-26 18:33 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

Make mlx4 compatible with the newly added netlink queue GET APIs.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
---
 drivers/net/ethernet/mellanox/mlx4/en_cq.c   | 14 ++++++++++++++
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index 1184ac5751e1..461cc2c79c71 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -126,6 +126,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
 		cq_idx = cq_idx % priv->rx_ring_num;
 		rx_cq = priv->rx_cq[cq_idx];
 		cq->vector = rx_cq->vector;
+		irq = mlx4_eq_get_irq(mdev->dev, cq->vector);
 	}
 
 	if (cq->type == RX)
@@ -142,18 +143,23 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
 	if (err)
 		goto free_eq;
 
+	cq->cq_idx = cq_idx;
 	cq->mcq.event = mlx4_en_cq_event;
 
 	switch (cq->type) {
 	case TX:
 		cq->mcq.comp = mlx4_en_tx_irq;
 		netif_napi_add_tx(cq->dev, &cq->napi, mlx4_en_poll_tx_cq);
+		netif_napi_set_irq(&cq->napi, irq);
 		napi_enable(&cq->napi);
+		netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_TX, &cq->napi);
 		break;
 	case RX:
 		cq->mcq.comp = mlx4_en_rx_irq;
 		netif_napi_add(cq->dev, &cq->napi, mlx4_en_poll_rx_cq);
+		netif_napi_set_irq(&cq->napi, irq);
 		napi_enable(&cq->napi);
+		netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_RX, &cq->napi);
 		break;
 	case TX_XDP:
 		/* nothing regarding napi, it's shared with rx ring */
@@ -189,6 +195,14 @@ void mlx4_en_destroy_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq **pcq)
 void mlx4_en_deactivate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq)
 {
 	if (cq->type != TX_XDP) {
+		enum netdev_queue_type qtype;
+
+		if (cq->type == RX)
+			qtype = NETDEV_QUEUE_TYPE_RX;
+		else
+			qtype = NETDEV_QUEUE_TYPE_TX;
+
+		netif_queue_set_napi(cq->dev, cq->cq_idx, qtype, NULL);
 		napi_disable(&cq->napi);
 		netif_napi_del(&cq->napi);
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index efe3f97b874f..896f985549a4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -379,6 +379,7 @@ struct mlx4_en_cq {
 #define MLX4_EN_OPCODE_ERROR	0x1e
 
 	const struct cpumask *aff_mask;
+	int cq_idx;
 };
 
 struct mlx4_en_port_profile {
-- 
2.25.1


* [PATCH net-next v2 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-26 18:33 [PATCH net-next v2 0/3] mlx4: Add support for netdev-genl API Joe Damato
  2024-04-26 18:33 ` [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
  2024-04-26 18:33 ` [PATCH net-next v2 2/3] net/mlx4: link NAPI instances to queues and IRQs Joe Damato
@ 2024-04-26 18:33 ` Joe Damato
  2024-04-26 20:01   ` Jakub Kicinski
  2 siblings, 1 reply; 12+ messages in thread
From: Joe Damato @ 2024-04-26 18:33 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

Make mlx4 compatible with the newly added netlink queue stats API.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
---
 .../net/ethernet/mellanox/mlx4/en_netdev.c    | 79 +++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 5d3fde63b273..6875f8c5103a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -43,6 +43,7 @@
 #include <net/vxlan.h>
 #include <net/devlink.h>
 #include <net/rps.h>
+#include <net/netdev_queues.h>
 
 #include <linux/mlx4/driver.h>
 #include <linux/mlx4/device.h>
@@ -3099,6 +3100,83 @@ void mlx4_en_set_stats_bitmap(struct mlx4_dev *dev,
 	last_i += NUM_PHY_STATS;
 }
 
+static void mlx4_get_queue_stats_rx(struct net_device *dev, int i,
+				    struct netdev_queue_stats_rx *stats)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+	const struct mlx4_en_rx_ring *ring;
+
+	spin_lock_bh(&priv->stats_lock);
+
+	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
+		goto out_unlock;
+
+	if (i < 0 || i >= priv->rx_ring_num)
+		goto out_unlock;
+
+	ring = priv->rx_ring[i];
+	stats->packets = READ_ONCE(ring->packets);
+	stats->bytes   = READ_ONCE(ring->bytes);
+	stats->alloc_fail = READ_ONCE(ring->dropped);
+
+out_unlock:
+	spin_unlock_bh(&priv->stats_lock);
+}
+
+static void mlx4_get_queue_stats_tx(struct net_device *dev, int i,
+				    struct netdev_queue_stats_tx *stats)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+	const struct mlx4_en_tx_ring *ring;
+
+	spin_lock_bh(&priv->stats_lock);
+
+	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
+		goto out_unlock;
+
+	if (i < 0 || i >= priv->tx_ring_num[TX])
+		goto out_unlock;
+
+	ring = priv->tx_ring[TX][i];
+	stats->packets = READ_ONCE(ring->packets);
+	stats->bytes   = READ_ONCE(ring->bytes);
+
+out_unlock:
+	spin_unlock_bh(&priv->stats_lock);
+}
+
+static void mlx4_get_base_stats(struct net_device *dev,
+				struct netdev_queue_stats_rx *rx,
+				struct netdev_queue_stats_tx *tx)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+
+	spin_lock_bh(&priv->stats_lock);
+
+	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
+		goto out_unlock;
+
+	if (priv->rx_ring_num) {
+		rx->packets = 0;
+		rx->bytes = 0;
+		rx->alloc_fail = 0;
+	}
+
+	if (priv->tx_ring_num[TX]) {
+		tx->packets = 0;
+		tx->bytes = 0;
+	}
+
+out_unlock:
+	spin_unlock_bh(&priv->stats_lock);
+}
+
+static const struct netdev_stat_ops mlx4_stat_ops = {
+	.get_queue_stats_rx     = mlx4_get_queue_stats_rx,
+	.get_queue_stats_tx     = mlx4_get_queue_stats_tx,
+	.get_base_stats         = mlx4_get_base_stats,
+};
+
 int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 			struct mlx4_en_port_profile *prof)
 {
@@ -3262,6 +3340,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 	netif_set_real_num_tx_queues(dev, priv->tx_ring_num[TX]);
 	netif_set_real_num_rx_queues(dev, priv->rx_ring_num);
 
+	dev->stat_ops = &mlx4_stat_ops;
 	dev->ethtool_ops = &mlx4_en_ethtool_ops;
 
 	/*
-- 
2.25.1


* Re: [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat
  2024-04-26 18:33 ` [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
@ 2024-04-26 20:00   ` Jakub Kicinski
  2024-04-26 23:43     ` Joe Damato
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2024-04-26 20:00 UTC (permalink / raw)
  To: Joe Damato
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Fri, 26 Apr 2024 18:33:53 +0000 Joe Damato wrote:
> --- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
> @@ -151,7 +151,7 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
>  {
>  	struct mlx4_en_priv *priv = netdev_priv(dev);
>  	struct mlx4_en_dev *mdev = priv->mdev;
> -	unsigned long packets, bytes;
> +	unsigned long packets, bytes, dropped;
>  	int i;
>  
>  	if (!priv->port_up || mlx4_is_master(mdev->dev))
> @@ -159,14 +159,17 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
>  
>  	packets = 0;
>  	bytes = 0;
> +	dropped = 0;
>  	for (i = 0; i < priv->rx_ring_num; i++) {
>  		const struct mlx4_en_rx_ring *ring = priv->rx_ring[i];
>  
>  		packets += READ_ONCE(ring->packets);
>  		bytes   += READ_ONCE(ring->bytes);
> +		dropped += READ_ONCE(ring->dropped);
>  	}
>  	dev->stats.rx_packets = packets;
>  	dev->stats.rx_bytes = bytes;
> +	dev->stats.rx_missed_errors = dropped;

I'd drop this chunk, there's a slight but meaningful difference in
definition of rx_missed vs alloc-fail:

 * @rx_missed_errors: Count of packets missed by the host.
 *   Folded into the "drop" counter in `/proc/net/dev`.
 *
 *   Counts number of packets dropped by the device due to lack
 *   of buffer space. This usually indicates that the host interface
 *   is slower than the network interface, or host is not keeping up
 *   with the receive packet rate.
---
        name: rx-alloc-fail
        doc: |
          Number of times skb or buffer allocation failed on the Rx datapath.
          Allocation failure may, or may not result in a packet drop, depending
          on driver implementation and whether system recovers quickly.

tl;dr "packets dropped" vs "may, or may not result in a packet drop"

In case of mlx4 looks like the buffer refill is "async", the driver
tries to refill the buffers to max, but if it fails the next NAPI poll
will try again. Allocation failures are not directly tied to packet
drops. In case of bnxt if "replacement" buffer can't be allocated -
packet is dropped and old buffer gets returned to the ring (although 
if I'm 100% honest bnxt may be off by a couple, too, as the OOM stat
gets incremented on ifup pre-fill failures).

* Re: [PATCH net-next v2 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-26 18:33 ` [PATCH net-next v2 3/3] net/mlx4: support per-queue statistics via netlink Joe Damato
@ 2024-04-26 20:01   ` Jakub Kicinski
  2024-04-27  0:05     ` Joe Damato
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2024-04-26 20:01 UTC (permalink / raw)
  To: Joe Damato
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Fri, 26 Apr 2024 18:33:55 +0000 Joe Damato wrote:
> Make mlx4 compatible with the newly added netlink queue stats API.
> 
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>

Not sure what the "master" and "port_up" things are :) 
but the rest looks good:

Reviewed-by: Jakub Kicinski <kuba@kernel.org>

* Re: [PATCH net-next v2 2/3] net/mlx4: link NAPI instances to queues and IRQs
  2024-04-26 18:33 ` [PATCH net-next v2 2/3] net/mlx4: link NAPI instances to queues and IRQs Joe Damato
@ 2024-04-26 20:01   ` Jakub Kicinski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2024-04-26 20:01 UTC (permalink / raw)
  To: Joe Damato
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Fri, 26 Apr 2024 18:33:54 +0000 Joe Damato wrote:
> Make mlx4 compatible with the newly added netlink queue GET APIs.
> 
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>

Acked-by: Jakub Kicinski <kuba@kernel.org>

* Re: [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat
  2024-04-26 20:00   ` Jakub Kicinski
@ 2024-04-26 23:43     ` Joe Damato
  2024-04-26 23:52       ` Jakub Kicinski
  0 siblings, 1 reply; 12+ messages in thread
From: Joe Damato @ 2024-04-26 23:43 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Fri, Apr 26, 2024 at 01:00:17PM -0700, Jakub Kicinski wrote:
> On Fri, 26 Apr 2024 18:33:53 +0000 Joe Damato wrote:
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
> > @@ -151,7 +151,7 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
> >  {
> >  	struct mlx4_en_priv *priv = netdev_priv(dev);
> >  	struct mlx4_en_dev *mdev = priv->mdev;
> > -	unsigned long packets, bytes;
> > +	unsigned long packets, bytes, dropped;
> >  	int i;
> >  
> >  	if (!priv->port_up || mlx4_is_master(mdev->dev))
> > @@ -159,14 +159,17 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
> >  
> >  	packets = 0;
> >  	bytes = 0;
> > +	dropped = 0;
> >  	for (i = 0; i < priv->rx_ring_num; i++) {
> >  		const struct mlx4_en_rx_ring *ring = priv->rx_ring[i];
> >  
> >  		packets += READ_ONCE(ring->packets);
> >  		bytes   += READ_ONCE(ring->bytes);
> > +		dropped += READ_ONCE(ring->dropped);
> >  	}
> >  	dev->stats.rx_packets = packets;
> >  	dev->stats.rx_bytes = bytes;
> > +	dev->stats.rx_missed_errors = dropped;
> 
> I'd drop this chunk, there's a slight but meaningful difference in
> definition of rx_missed vs alloc-fail:
> 
>  * @rx_missed_errors: Count of packets missed by the host.
>  *   Folded into the "drop" counter in `/proc/net/dev`.
>  *
>  *   Counts number of packets dropped by the device due to lack
>  *   of buffer space. This usually indicates that the host interface
>  *   is slower than the network interface, or host is not keeping up
>  *   with the receive packet rate.
> ---
>         name: rx-alloc-fail
>         doc: |
>           Number of times skb or buffer allocation failed on the Rx datapath.
>           Allocation failure may, or may not result in a packet drop, depending
>           on driver implementation and whether system recovers quickly.
> 
> tl;dr "packets dropped" vs "may, or may not result in a packet drop"
> 
> In case of mlx4 looks like the buffer refill is "async", the driver
> tries to refill the buffers to max, but if it fails the next NAPI poll
> will try again. Allocation failures are not directly tied to packet
> drops. In case of bnxt if "replacement" buffer can't be allocated -
> packet is dropped and old buffer gets returned to the ring (although 
> if I'm 100% honest bnxt may be off by a couple, too, as the OOM stat
> gets incremented on ifup pre-fill failures).

Yes, I see that now. I'll drop this patch entirely from v3 and just leave
the other two and remove alloc_fail from the queue stats patch.

Thanks for the careful review.

* Re: [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat
  2024-04-26 23:43     ` Joe Damato
@ 2024-04-26 23:52       ` Jakub Kicinski
  2024-04-27  0:28         ` Joe Damato
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2024-04-26 23:52 UTC (permalink / raw)
  To: Joe Damato
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Fri, 26 Apr 2024 16:43:53 -0700 Joe Damato wrote:
> > In case of mlx4 looks like the buffer refill is "async", the driver
> > tries to refill the buffers to max, but if it fails the next NAPI poll
> > will try again. Allocation failures are not directly tied to packet
> > drops. In case of bnxt if "replacement" buffer can't be allocated -
> > packet is dropped and old buffer gets returned to the ring (although 
> > if I'm 100% honest bnxt may be off by a couple, too, as the OOM stat
> > gets incremented on ifup pre-fill failures).  
> 
> Yes, I see that now. I'll drop this patch entirely from v3 and just leave
> the other two and remove alloc_fail from the queue stats patch.

Up to you, but I'd keep alloc_fail itself.
If mlx4 gets page pool support one day it will be useful to run this:
https://lore.kernel.org/all/20240426232400.624864-1-kuba@kernel.org/

And I think it's useful to be able to check in case there are Rx
discards whether the system was also under transient memory pressure 
or not.

* Re: [PATCH net-next v2 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-26 20:01   ` Jakub Kicinski
@ 2024-04-27  0:05     ` Joe Damato
  0 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2024-04-27  0:05 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Fri, Apr 26, 2024 at 01:01:16PM -0700, Jakub Kicinski wrote:
> On Fri, 26 Apr 2024 18:33:55 +0000 Joe Damato wrote:
> > Make mlx4 compatible with the newly added netlink queue stats API.
> > 
> > Signed-off-by: Joe Damato <jdamato@fastly.com>
> > Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
> 
> Not sure what the "master" and "port_up" things are :) 
> but the rest looks good:

So in mlx4_en_DUMP_ETH_STATS, the driver calls mlx4_en_fold_software_stats
which does the same "port_up" / "master" check and bails out... so I figured
for these stats I should do the same.

Was hoping Mellanox would give us a hint, but glancing at the code where
the MLX4_FLAG_MASTER bit is set, it looks like sriov? Maybe "master"
means "pf" and "slave" means "vf"?

Not sure why the stats code bails on is_master but not is_slave, though.

> Reviewed-by: Jakub Kicinski <kuba@kernel.org>

I'll add your reviewed-by to my v3 and wait until sometime mid next week to
send the v3. Hopefully we'll hear back from the Mellanox folks by then if they
have thoughts/opinions on the stats code.

* Re: [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat
  2024-04-26 23:52       ` Jakub Kicinski
@ 2024-04-27  0:28         ` Joe Damato
  2024-04-27  0:34           ` Jakub Kicinski
  0 siblings, 1 reply; 12+ messages in thread
From: Joe Damato @ 2024-04-27  0:28 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Fri, Apr 26, 2024 at 04:52:13PM -0700, Jakub Kicinski wrote:
> On Fri, 26 Apr 2024 16:43:53 -0700 Joe Damato wrote:
> > > In case of mlx4 looks like the buffer refill is "async", the driver
> > > tries to refill the buffers to max, but if it fails the next NAPI poll
> > > will try again. Allocation failures are not directly tied to packet
> > > drops. In case of bnxt if "replacement" buffer can't be allocated -
> > > packet is dropped and old buffer gets returned to the ring (although 
> > > if I'm 100% honest bnxt may be off by a couple, too, as the OOM stat
> > > gets incremented on ifup pre-fill failures).  
> > 
> > Yes, I see that now. I'll drop this patch entirely from v3 and just leave
> > the other two and remove alloc_fail from the queue stats patch.
> 
> Up to you, but I'd keep alloc_fail itself.
> If mlx4 gets page pool support one day it will be useful to run this:
> https://lore.kernel.org/all/20240426232400.624864-1-kuba@kernel.org/
> 
> And I think it's useful to be able to check in case there are Rx
> discards whether the system was also under transient memory pressure 
> or not.

Ah, maybe I read what you wrote incorrectly in your previous message.

I think you were saying that I should drop just the

  dev->stats.rx_missed_errors = dropped;

due to the definition of rx_missed_errors, but that by the definition of
rx-alloc-fail:

  alloc_fail = ring->dropped;

is still valid and can stay.

Is that right or am I just totally off?

* Re: [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures in a stat
  2024-04-27  0:28         ` Joe Damato
@ 2024-04-27  0:34           ` Jakub Kicinski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2024-04-27  0:34 UTC (permalink / raw)
  To: Joe Damato
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Fri, 26 Apr 2024 17:28:03 -0700 Joe Damato wrote:
> Ah, maybe I read what you wrote incorrectly in your previous message.
> 
> I think you were saying that I should drop just the
> 
>   dev->stats.rx_missed_errors = dropped;
> 
> due to the definition of rx_missed_errors, but that by the definition of
> rx-alloc-fail:
> 
>   alloc_fail = ring->dropped;
> 
> is still valid and can stay.

That's right, yes.
