* [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015
@ 2015-06-17 15:26 Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 01/12] net/mlx4_en: Fix off-by-four in ethtool Or Gerlitz
                   ` (11 more replies)
  0 siblings, 12 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Amir Vadai, Tal Alon, Or Gerlitz

Hi Dave,

This series has two fixes from Eran to his recent SRIOV counters work in
mlx4 and a few more updates from Saeed and Achiad to the mlx5 Ethernet
code. All fixes here relate to net-next code, so there is no need for -stable.

Or.

Achiad Shochat (7):
  net/mlx5e: Poll rx cq before tx cq to improve round-trip latency
  net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
  net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
  net/mlx5e: Avoid TX CQE generation if more xmit packets expected
  net/mlx5e: Remove extra spaces
  net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
  net/mlx5e: Pop cq outside mlx5e_get_cqe

Eran Ben Elisha (2):
  net/mlx4_en: Fix off-by-four in ethtool
  net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device

Saeed Mahameed (3):
  net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
  net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them
  net/mlx5e: Prefetch skb data on RX

 drivers/net/ethernet/mellanox/mlx4/en_port.c      | 14 ++++
 drivers/net/ethernet/mellanox/mlx4/mlx4_stats.h   |  5 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |  6 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 45 ++++++-----
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |  5 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   | 92 +++++++++++------------
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 10 +--
 7 files changed, 96 insertions(+), 81 deletions(-)

-- 
2.3.7


* [PATCH net-next 01/12] net/mlx4_en: Fix off-by-four in ethtool
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 02/12] net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device Or Gerlitz
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Amir Vadai, Tal Alon, Eran Ben Elisha, Or Gerlitz

From: Eran Ben Elisha <eranbe@mellanox.com>

The four new PF entries were added to NUM_FLOW_STATS instead of
NUM_ALL_STATS, which caused an off-by-four for all counters below
pf_*_*. Fix it by counting them in NUM_ALL_STATS.
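
As an illustration only, here is a minimal user-space sketch of how such a
miscount can shift later entries. The group sizes, the group order and the
helper name are made up for the example and are not the driver's real
layout. The point it tries to show is that the grand total is the same in
both variants, while any code that walks the flow group by its own count
overshoots by four.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group sizes, chosen only for illustration. */
enum { N_MAIN = 3, N_PF = 4, N_FLOW = 2, N_PERF = 1 };

/* Buggy variant: the four PF entries are folded into the flow group. */
#define N_FLOW_BUGGY (N_FLOW + N_PF)
#define N_ALL_BUGGY  (N_MAIN + N_FLOW_BUGGY + N_PERF)

/* Fixed variant: each group counts only itself, PF is added to the total. */
#define N_ALL_FIXED  (N_MAIN + N_PF + N_FLOW + N_PERF)

/* The total number of entries is identical, so the bug is easy to miss. */
static_assert(N_ALL_BUGGY == N_ALL_FIXED, "totals match in both variants");

/*
 * Index of the first entry after the flow group, for an assumed layout of
 * MAIN, PF, FLOW, PERF and a given belief about the flow group size.
 */
static size_t first_index_after_flow(size_t flow_count)
{
	return N_MAIN + N_PF + flow_count;
}
```

With these made-up numbers, first_index_after_flow(N_FLOW) is 9 while
first_index_after_flow(N_FLOW_BUGGY) is 13, i.e. everything after the flow
group is looked up four slots too far.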

Fixes: b42de4d01264 ('net/mlx4_en: Show PF own statistics via ethtool')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/mlx4_stats.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_stats.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_stats.h
index c5c1de9..7fd466c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_stats.h
@@ -79,8 +79,7 @@ struct mlx4_en_flow_stats_tx {
 
 #define NUM_FLOW_STATS (NUM_FLOW_STATS_RX + NUM_FLOW_STATS_TX + \
 			NUM_FLOW_PRIORITY_STATS_TX + \
-			NUM_FLOW_PRIORITY_STATS_RX + \
-			NUM_PF_STATS)
+			NUM_FLOW_PRIORITY_STATS_RX)
 
 struct mlx4_en_stat_out_flow_control_mbox {
 	/* Total number of PAUSE frames received from the far-end port */
@@ -108,7 +107,7 @@ enum {
 };
 
 #define NUM_ALL_STATS	(NUM_MAIN_STATS + NUM_PORT_STATS + NUM_PKT_STATS + \
-			 NUM_FLOW_STATS + NUM_PERF_STATS)
+			 NUM_FLOW_STATS + NUM_PERF_STATS + NUM_PF_STATS)
 
 #define MLX4_FIND_NETDEV_STAT(n) (offsetof(struct net_device_stats, n) / \
 				  sizeof(((struct net_device_stats *)0)->n))
-- 
2.3.7


* [PATCH net-next 02/12] net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 01/12] net/mlx4_en: Fix off-by-four in ethtool Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 03/12] net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues Or Gerlitz
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Amir Vadai, Tal Alon, Eran Ben Elisha, Or Gerlitz

From: Eran Ben Elisha <eranbe@mellanox.com>

Under SRIOV, the port rx/tx bytes/packets statistics should be read
from the HW instead of using the PF netdevice SW accounting. This is
needed in order to get the full port statistics and not just the PF's
own ones.
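
A rough user-space sketch of the idea, assuming the device reports one
big-endian 64-bit counter per priority and the port-wide value is their sum.
The record layout and all names below (prio_rec, port_total_pkts, and so on)
are hypothetical and only stand in for the real HW statistics structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-priority record as a device might report it:
 * raw big-endian 64-bit counters, packets first, then octets.
 */
struct prio_rec {
	unsigned char pkts_be[8];
	unsigned char octets_be[8];
};

/* Encode v as 8 big-endian bytes (used here only to build sample data). */
static void store_be64(unsigned char *p, uint64_t v)
{
	for (int i = 7; i >= 0; i--) {
		p[i] = (unsigned char)(v & 0xff);
		v >>= 8;
	}
}

/* Decode 8 big-endian bytes into a host-order 64-bit value. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Port-wide packet count: the sum of the per-priority counters. */
static uint64_t port_total_pkts(const struct prio_rec *recs, size_t num_prio)
{
	uint64_t total = 0;

	assert(recs != NULL || num_prio == 0);
	for (size_t i = 0; i < num_prio; i++)
		total += load_be64(recs[i].pkts_be);
	return total;
}
```

Because the sum covers every priority on the port, it includes traffic of
all functions sharing the port, which per-netdevice SW accounting cannot see.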

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_port.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_port.c b/drivers/net/ethernet/mellanox/mlx4/en_port.c
index 73f6277..ee99e67 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
@@ -203,6 +203,20 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 port, u8 reset)
 		priv->port_stats.tso_packets       += ring->tso_packets;
 		priv->port_stats.xmit_more         += ring->xmit_more;
 	}
+	if (mlx4_is_master(mdev->dev)) {
+		stats->rx_packets = en_stats_adder(&mlx4_en_stats->RTOT_prio_0,
+						   &mlx4_en_stats->RTOT_prio_1,
+						   NUM_PRIORITIES);
+		stats->tx_packets = en_stats_adder(&mlx4_en_stats->TTOT_prio_0,
+						   &mlx4_en_stats->TTOT_prio_1,
+						   NUM_PRIORITIES);
+		stats->rx_bytes = en_stats_adder(&mlx4_en_stats->ROCT_prio_0,
+						 &mlx4_en_stats->ROCT_prio_1,
+						 NUM_PRIORITIES);
+		stats->tx_bytes = en_stats_adder(&mlx4_en_stats->TOCT_prio_0,
+						 &mlx4_en_stats->TOCT_prio_1,
+						 NUM_PRIORITIES);
+	}
 
 	/* net device stats */
 	stats->rx_errors = be64_to_cpu(mlx4_en_stats->PCS) +
-- 
2.3.7


* [PATCH net-next 03/12] net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 01/12] net/mlx4_en: Fix off-by-four in ethtool Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 02/12] net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 04/12] net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them Or Gerlitz
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Amir Vadai, Tal Alon, Saeed Mahameed, Or Gerlitz

From: Saeed Mahameed <saeedm@mellanox.com>

To save per-packet calculations, we use the following static mappings:
1) priv {channel, tc} to netdev txq (used @mlx5e_select_queue())
2) netdev txq to priv sq (used @mlx5e_xmit())

Thanks to these static mappings, there is no longer a need for a separate
implementation of ndo_start_xmit when multiple TCs are configured.
We believe the performance improvement of such separation would be negligible, if any.
The previous way of dynamically calculating the above mappings required
allocating more TX queues than actually used (@alloc_etherdev_mqs()),
which is no longer needed.
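
The two mappings can be sketched in plain C roughly as follows. This is a
simplified user-space model, not the driver code: struct sq, struct txq_map
and the txq_* helpers are hypothetical names, and only the index formula
(channel + tc * num_channels) is taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>

struct sq {                      /* stand-in for a send queue */
	int channel_ix;
	int tc;
};

struct txq_map {
	int num_channels;
	int num_tc;
	struct sq *sqs;          /* num_channels * num_tc send queues */
	struct sq **txq_to_sq;   /* mapping 2: txq index -> send queue */
};

/* Mapping 1: {channel, tc} -> txq index, computed once at setup time. */
static int txq_index(const struct txq_map *m, int channel_ix, int tc)
{
	return channel_ix + tc * m->num_channels;
}

static int txq_map_init(struct txq_map *m, int num_channels, int num_tc)
{
	size_t n = (size_t)num_channels * (size_t)num_tc;

	m->num_channels = num_channels;
	m->num_tc = num_tc;
	m->sqs = calloc(n, sizeof(*m->sqs));
	m->txq_to_sq = calloc(n, sizeof(*m->txq_to_sq));
	if (!m->sqs || !m->txq_to_sq) {
		free(m->sqs);
		free(m->txq_to_sq);
		return -1;
	}

	for (int tc = 0; tc < num_tc; tc++) {
		for (int c = 0; c < num_channels; c++) {
			int ix = txq_index(m, c, tc);

			m->sqs[ix].channel_ix = c;
			m->sqs[ix].tc = tc;
			m->txq_to_sq[ix] = &m->sqs[ix];
		}
	}
	return 0;
}

/* Per-packet path: a single table lookup, no shifts or masks. */
static struct sq *txq_lookup(const struct txq_map *m, int txq)
{
	assert(txq >= 0 && txq < m->num_channels * m->num_tc);
	return m->txq_to_sq[txq];
}

static void txq_map_destroy(struct txq_map *m)
{
	free(m->txq_to_sq);
	free(m->sqs);
}
```

Since the index is dense, exactly num_channels * num_tc entries are needed
and nothing has to be rounded up to a power of two.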

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |  5 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 41 +++++++++++++++--------
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   | 18 ++--------
 3 files changed, 31 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index e14120e..1706979 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -316,6 +316,7 @@ struct mlx5e_channel {
 	__be32                     mkey_be;
 	u8                         num_tc;
 	unsigned long              flags;
+	int                        tc_to_txq_map[MLX5E_MAX_NUM_TC];
 
 	/* control */
 	struct mlx5e_priv         *priv;
@@ -379,10 +380,9 @@ struct mlx5e_flow_table {
 
 struct mlx5e_priv {
 	/* priv data path fields - start */
-	int                        order_base_2_num_channels;
-	int                        queue_mapping_channel_mask;
 	int                        num_tc;
 	int                        default_vlan_prio;
+	struct mlx5e_sq            **txq_to_sq_map;
 	/* priv data path fields - end */
 
 	unsigned long              state;
@@ -460,7 +460,6 @@ void mlx5e_send_nop(struct mlx5e_sq *sq, bool notify_hw);
 u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
 		       void *accel_priv, select_queue_fallback_t fallback);
 netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
-netdev_tx_t mlx5e_xmit_multi_tc(struct sk_buff *skb, struct net_device *dev);
 
 void mlx5e_completion_event(struct mlx5_core_cq *mcq);
 void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 9a48d8e..f5f5eb9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -496,6 +496,7 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
 
 	void *sqc = param->sqc;
 	void *sqc_wq = MLX5_ADDR_OF(sqc, sqc, wq);
+	int txq_ix;
 	int err;
 
 	err = mlx5_alloc_map_uar(mdev, &sq->uar);
@@ -515,14 +516,15 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
 	if (err)
 		goto err_sq_wq_destroy;
 
-	sq->txq = netdev_get_tx_queue(priv->netdev,
-				      c->ix + tc * priv->params.num_channels);
+	txq_ix = c->ix + tc * priv->params.num_channels;
+	sq->txq = netdev_get_tx_queue(priv->netdev, txq_ix);
 
 	sq->pdev    = c->pdev;
 	sq->mkey_be = c->mkey_be;
 	sq->channel = c;
 	sq->tc      = tc;
 	sq->edge    = (sq->wq.sz_m1 + 1) - MLX5_SEND_WQE_MAX_WQEBBS;
+	priv->txq_to_sq_map[txq_ix] = sq;
 
 	return 0;
 
@@ -902,6 +904,15 @@ static void mlx5e_close_sqs(struct mlx5e_channel *c)
 		mlx5e_close_sq(&c->sq[tc]);
 }
 
+static void mlx5e_build_tc_to_txq_map(struct mlx5e_channel *c,
+				      int num_channels)
+{
+	int i;
+
+	for (i = 0; i < MLX5E_MAX_NUM_TC; i++)
+		c->tc_to_txq_map[i] = c->ix + i * num_channels;
+}
+
 static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 			      struct mlx5e_channel_param *cparam,
 			      struct mlx5e_channel **cp)
@@ -923,6 +934,8 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 	c->mkey_be  = cpu_to_be32(priv->mr.key);
 	c->num_tc   = priv->num_tc;
 
+	mlx5e_build_tc_to_txq_map(c, priv->params.num_channels);
+
 	netif_napi_add(netdev, &c->napi, mlx5e_napi_poll, 64);
 
 	err = mlx5e_open_tx_cqs(c, cparam);
@@ -1050,14 +1063,18 @@ static void mlx5e_build_channel_param(struct mlx5e_priv *priv,
 static int mlx5e_open_channels(struct mlx5e_priv *priv)
 {
 	struct mlx5e_channel_param cparam;
-	int err;
+	int err = -ENOMEM;
 	int i;
 	int j;
 
 	priv->channel = kcalloc(priv->params.num_channels,
 				sizeof(struct mlx5e_channel *), GFP_KERNEL);
-	if (!priv->channel)
-		return -ENOMEM;
+
+	priv->txq_to_sq_map = kcalloc(priv->params.num_channels * priv->num_tc,
+				      sizeof(struct mlx5e_sq *), GFP_KERNEL);
+
+	if (!priv->channel || !priv->txq_to_sq_map)
+		goto err_free_txq_to_sq_map;
 
 	mlx5e_build_channel_param(priv, &cparam);
 	for (i = 0; i < priv->params.num_channels; i++) {
@@ -1078,6 +1095,8 @@ err_close_channels:
 	for (i--; i >= 0; i--)
 		mlx5e_close_channel(priv->channel[i]);
 
+err_free_txq_to_sq_map:
+	kfree(priv->txq_to_sq_map);
 	kfree(priv->channel);
 
 	return err;
@@ -1090,6 +1109,7 @@ static void mlx5e_close_channels(struct mlx5e_priv *priv)
 	for (i = 0; i < priv->params.num_channels; i++)
 		mlx5e_close_channel(priv->channel[i]);
 
+	kfree(priv->txq_to_sq_map);
 	kfree(priv->channel);
 }
 
@@ -1384,8 +1404,7 @@ int mlx5e_open_locked(struct net_device *netdev)
 	int num_txqs;
 	int err;
 
-	num_txqs = roundup_pow_of_two(priv->params.num_channels) *
-		   priv->params.num_tc;
+	num_txqs = priv->params.num_channels * priv->params.num_tc;
 	netif_set_real_num_tx_queues(netdev, num_txqs);
 	netif_set_real_num_rx_queues(netdev, priv->params.num_channels);
 
@@ -1693,9 +1712,6 @@ static void mlx5e_build_netdev_priv(struct mlx5_core_dev *mdev,
 	priv->mdev                         = mdev;
 	priv->netdev                       = netdev;
 	priv->params.num_channels          = num_comp_vectors;
-	priv->order_base_2_num_channels    = order_base_2(num_comp_vectors);
-	priv->queue_mapping_channel_mask   =
-		roundup_pow_of_two(num_comp_vectors) - 1;
 	priv->num_tc                       = priv->params.num_tc;
 	priv->default_vlan_prio            = priv->params.default_vlan_prio;
 
@@ -1723,7 +1739,6 @@ static void mlx5e_build_netdev(struct net_device *netdev)
 
 	if (priv->num_tc > 1) {
 		mlx5e_netdev_ops.ndo_select_queue = mlx5e_select_queue;
-		mlx5e_netdev_ops.ndo_start_xmit   = mlx5e_xmit_multi_tc;
 	}
 
 	netdev->netdev_ops        = &mlx5e_netdev_ops;
@@ -1793,9 +1808,7 @@ static void *mlx5e_create_netdev(struct mlx5_core_dev *mdev)
 	if (mlx5e_check_required_hca_cap(mdev))
 		return NULL;
 
-	netdev = alloc_etherdev_mqs(sizeof(struct mlx5e_priv),
-				    roundup_pow_of_two(ncv) * MLX5E_MAX_NUM_TC,
-				    ncv);
+	netdev = alloc_etherdev_mqs(sizeof(struct mlx5e_priv), ncv, ncv);
 	if (!netdev) {
 		mlx5_core_err(mdev, "alloc_etherdev_mqs() failed\n");
 		return NULL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index bac268a..471babd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -106,7 +106,7 @@ u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
 		 priv->default_vlan_prio;
 	int tc = netdev_get_prio_tc_map(dev, up);
 
-	return (tc << priv->order_base_2_num_channels) | channel_ix;
+	return priv->channel[channel_ix]->tc_to_txq_map[tc];
 }
 
 static inline u16 mlx5e_get_inline_hdr_size(struct mlx5e_sq *sq,
@@ -250,21 +250,7 @@ dma_unmap_wqe_err:
 netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
-	int ix = skb->queue_mapping;
-	int tc = 0;
-	struct mlx5e_channel *c = priv->channel[ix];
-	struct mlx5e_sq *sq = &c->sq[tc];
-
-	return mlx5e_sq_xmit(sq, skb);
-}
-
-netdev_tx_t mlx5e_xmit_multi_tc(struct sk_buff *skb, struct net_device *dev)
-{
-	struct mlx5e_priv *priv = netdev_priv(dev);
-	int ix = skb->queue_mapping & priv->queue_mapping_channel_mask;
-	int tc = skb->queue_mapping >> priv->order_base_2_num_channels;
-	struct mlx5e_channel *c = priv->channel[ix];
-	struct mlx5e_sq *sq = &c->sq[tc];
+	struct mlx5e_sq *sq = priv->txq_to_sq_map[skb_get_queue_mapping(skb)];
 
 	return mlx5e_sq_xmit(sq, skb);
 }
-- 
2.3.7


* [PATCH net-next 04/12] net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
                   ` (2 preceding siblings ...)
  2015-06-17 15:26 ` [PATCH net-next 03/12] net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 05/12] net/mlx5e: Poll rx cq before tx cq to improve round-trip latency Or Gerlitz
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Amir Vadai, Tal Alon, Saeed Mahameed, Or Gerlitz

From: Saeed Mahameed <saeedm@mellanox.com>

Instead of counting the number of gso fragments by hand, we can use
skb_shinfo(skb)->gso_segs.
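
For reference, the arithmetic involved can be written out as a small
stand-alone sketch. The helper names are hypothetical, and it assumes that
the segment count provided by the stack equals the hand-computed
ceil(payload_len / gso_size) that the removed code used.

```c
#include <assert.h>
#include <stdint.h>

/* Segment count as the removed code computed it: ceil(payload / mss). */
static uint32_t count_segs(uint32_t payload_len, uint32_t gso_size)
{
	assert(gso_size != 0);
	return payload_len / gso_size + !!(payload_len % gso_size);
}

/*
 * Bytes on the wire for a segmented packet: skb_len already contains the
 * headers once, and every additional segment repeats them (ihs bytes each).
 */
static uint32_t wire_bytes(uint32_t skb_len, uint32_t ihs, uint32_t segs)
{
	assert(segs >= 1);
	return skb_len + (segs - 1) * ihs;
}
```

For example, with 66 bytes of headers, a 4000 byte payload and a gso_size of
1448, there are 3 segments and 4066 + 2 * 66 = 4198 bytes on the wire.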

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 471babd..c0566b6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -143,16 +143,13 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 
 	if (skb_is_gso(skb)) {
 		u32 payload_len;
-		int num_pkts;
 
 		eseg->mss    = cpu_to_be16(skb_shinfo(skb)->gso_size);
 		opcode       = MLX5_OPCODE_LSO;
 		ihs          = skb_transport_offset(skb) + tcp_hdrlen(skb);
 		payload_len  = skb->len - ihs;
-		num_pkts     =    (payload_len / skb_shinfo(skb)->gso_size) +
-				!!(payload_len % skb_shinfo(skb)->gso_size);
 		MLX5E_TX_SKB_CB(skb)->num_bytes = skb->len +
-						  (num_pkts - 1) * ihs;
+					(skb_shinfo(skb)->gso_segs - 1) * ihs;
 		sq->stats.tso_packets++;
 		sq->stats.tso_bytes += payload_len;
 	} else {
-- 
2.3.7


* [PATCH net-next 05/12] net/mlx5e: Poll rx cq before tx cq to improve round-trip latency
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
                   ` (3 preceding siblings ...)
  2015-06-17 15:26 ` [PATCH net-next 04/12] net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-21 17:21   ` David Miller
  2015-06-17 15:26 ` [PATCH net-next 06/12] net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq() Or Gerlitz
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Amir Vadai, Tal Alon, Achiad Shochat, Saeed Mahameed,
	Or Gerlitz

From: Achiad Shochat <achiad@mellanox.com>

For better round trip latency, handle rx completions before
tx completions.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 088bc42..9f31572 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -60,13 +60,13 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 
 	clear_bit(MLX5E_CHANNEL_NAPI_SCHED, &c->flags);
 
-	for (i = 0; i < c->num_tc; i++)
-		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq);
-
 	busy |= mlx5e_poll_rx_cq(&c->rq.cq, budget);
 
 	busy |= mlx5e_post_rx_wqes(c->rq.cq.sqrq);
 
+	for (i = 0; i < c->num_tc; i++)
+		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq);
+
 	if (busy)
 		return budget;
 
-- 
2.3.7


* [PATCH net-next 06/12] net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
                   ` (4 preceding siblings ...)
  2015-06-17 15:26 ` [PATCH net-next 05/12] net/mlx5e: Poll rx cq before tx cq to improve round-trip latency Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 07/12] net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion Or Gerlitz
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Amir Vadai, Tal Alon, Achiad Shochat, Saeed Mahameed,
	Or Gerlitz

From: Achiad Shochat <achiad@mellanox.com>

The wq type is already assigned in mlx5e_build_rq_param().

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index f5f5eb9..45dc8c2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -345,7 +345,6 @@ static int mlx5e_enable_rq(struct mlx5e_rq *rq, struct mlx5e_rq_param *param)
 	MLX5_SET(rqc,  rqc, cqn,		c->rq.cq.mcq.cqn);
 	MLX5_SET(rqc,  rqc, state,		MLX5_RQC_STATE_RST);
 	MLX5_SET(rqc,  rqc, flush_in_error_en,	1);
-	MLX5_SET(wq,   wq,  wq_type,		MLX5_WQ_TYPE_LINKED_LIST);
 	MLX5_SET(wq,   wq,  log_wq_pg_sz,	rq->wq_ctrl.buf.page_shift -
 						PAGE_SHIFT);
 	MLX5_SET64(wq, wq,  dbr_addr,		rq->wq_ctrl.db.dma);
-- 
2.3.7


* [PATCH net-next 07/12] net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
                   ` (5 preceding siblings ...)
  2015-06-17 15:26 ` [PATCH net-next 06/12] net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq() Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 08/12] net/mlx5e: Avoid TX CQE generation if more xmit packets expected Or Gerlitz
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Amir Vadai, Tal Alon, Achiad Shochat, Saeed Mahameed,
	Or Gerlitz

From: Achiad Shochat <achiad@mellanox.com>

NOP completion SKBs are always NULL, so there is nothing to free for them.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index c0566b6..f5c7d78 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -294,7 +294,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
 		if (unlikely(!skb)) { /* nop */
 			sq->stats.nop++;
 			sqcc++;
-			goto free_skb;
+			continue;
 		}
 
 		for (j = 0; j < MLX5E_TX_SKB_CB(skb)->num_dma; j++) {
@@ -309,8 +309,6 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
 		npkts++;
 		nbytes += MLX5E_TX_SKB_CB(skb)->num_bytes;
 		sqcc += MLX5E_TX_SKB_CB(skb)->num_wqebbs;
-
-free_skb:
 		dev_kfree_skb(skb);
 	}
 
-- 
2.3.7


* [PATCH net-next 08/12] net/mlx5e: Avoid TX CQE generation if more xmit packets expected
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
                   ` (6 preceding siblings ...)
  2015-06-17 15:26 ` [PATCH net-next 07/12] net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 09/12] net/mlx5e: Remove extra spaces Or Gerlitz
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Amir Vadai, Tal Alon, Achiad Shochat, Saeed Mahameed,
	Or Gerlitz

From: Achiad Shochat <achiad@mellanox.com>

In order to save PCI BW consumed by TX CQEs and to reduce the amount of
CPU cache misses caused by TX CQE reading, we request TX CQE generation
only when skb->xmit_more=0.

As a consequence of the above, a single TX CQE may now indicate the
transmission completion of multiple TX SKBs.

This also handles a problem introduced in commit b1b8105ebf41 "net/mlx5e:
Support NETIF_F_SG" where we didn't ask for NOP completions while the
driver didn't have the proper code to handle this case.
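
The "one CQE completes several SKBs" handling can be modelled in user space
roughly like this. It is a simplified sketch under stated assumptions: the
ring size, struct slot and complete_up_to() are hypothetical, and the caller
must pass a wqe_counter that is the start index of a descriptor at or after
the current consumer counter.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u             /* power of two, illustrative only */

struct slot {
	int      in_use;         /* 0 models a NOP entry (no packet) */
	uint16_t num_wqebbs;     /* ring slots this descriptor occupies */
};

/*
 * Release every descriptor from consumer counter *cc up to and including
 * the one that starts at wqe_counter, which is what a single completion
 * reports. Returns the number of packets completed.
 */
static unsigned complete_up_to(struct slot *ring, uint16_t *cc,
			       uint16_t wqe_counter)
{
	unsigned npkts = 0;
	int last;

	do {
		struct slot *s = &ring[*cc & (RING_SIZE - 1)];

		last = (*cc == wqe_counter);

		if (!s->in_use) {        /* NOP: one slot, nothing to free */
			(*cc)++;
			continue;        /* jumps to the loop condition */
		}

		assert(s->num_wqebbs > 0);
		npkts++;
		*cc = (uint16_t)(*cc + s->num_wqebbs);
		s->in_use = 0;           /* stands in for freeing the packet */
	} while (!last);

	return npkts;
}
```

With descriptors at slots 0 (two slots), 2 (one slot), a NOP at slot 3 and a
descriptor at slot 4 (two slots), a completion reporting counter 4 releases
three packets and leaves the consumer counter at 6.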

Fixes: b1b8105ebf41 ('net/mlx5e: Support NETIF_F_SG')
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 57 +++++++++++++++----------
 1 file changed, 34 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index f5c7d78..a45d751 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -210,7 +210,6 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 
 	cseg->opmod_idx_opcode	= cpu_to_be32((sq->pc << 8) | opcode);
 	cseg->qpn_ds		= cpu_to_be32((sq->sqn << 8) | ds_cnt);
-	cseg->fm_ce_se		= MLX5_WQE_CTRL_CQ_UPDATE;
 
 	sq->skb[pi] = skb;
 
@@ -225,8 +224,10 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 		sq->stats.stopped++;
 	}
 
-	if (!skb->xmit_more || netif_xmit_stopped(sq->txq))
+	if (!skb->xmit_more || netif_xmit_stopped(sq->txq)) {
+		cseg->fm_ce_se = MLX5_WQE_CTRL_CQ_UPDATE;
 		mlx5e_tx_notify_hw(sq, wqe);
+	}
 
 	/* fill sq edge with nops to avoid wqe wrap around */
 	while ((sq->pc & wq->sz_m1) > sq->edge)
@@ -280,36 +281,46 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
 
 	for (i = 0; i < MLX5E_TX_CQ_POLL_BUDGET; i++) {
 		struct mlx5_cqe64 *cqe;
-		struct sk_buff *skb;
-		u16 ci;
-		int j;
+		u16 wqe_counter;
+		bool last_wqe;
 
 		cqe = mlx5e_get_cqe(cq);
 		if (!cqe)
 			break;
 
-		ci = sqcc & sq->wq.sz_m1;
-		skb = sq->skb[ci];
+		wqe_counter = be16_to_cpu(cqe->wqe_counter);
+
+		do {
+			struct sk_buff *skb;
+			u16 ci;
+			int j;
+
+			last_wqe = (sqcc == wqe_counter);
+
+			ci = sqcc & sq->wq.sz_m1;
+			skb = sq->skb[ci];
 
-		if (unlikely(!skb)) { /* nop */
-			sq->stats.nop++;
-			sqcc++;
-			continue;
-		}
+			if (unlikely(!skb)) { /* nop */
+				sq->stats.nop++;
+				sqcc++;
+				continue;
+			}
 
-		for (j = 0; j < MLX5E_TX_SKB_CB(skb)->num_dma; j++) {
-			dma_addr_t addr;
-			u32 size;
+			for (j = 0; j < MLX5E_TX_SKB_CB(skb)->num_dma; j++) {
+				dma_addr_t addr;
+				u32 size;
 
-			mlx5e_dma_get(sq, dma_fifo_cc, &addr, &size);
-			dma_fifo_cc++;
-			dma_unmap_single(sq->pdev, addr, size, DMA_TO_DEVICE);
-		}
+				mlx5e_dma_get(sq, dma_fifo_cc, &addr, &size);
+				dma_fifo_cc++;
+				dma_unmap_single(sq->pdev, addr, size,
+						 DMA_TO_DEVICE);
+			}
 
-		npkts++;
-		nbytes += MLX5E_TX_SKB_CB(skb)->num_bytes;
-		sqcc += MLX5E_TX_SKB_CB(skb)->num_wqebbs;
-		dev_kfree_skb(skb);
+			npkts++;
+			nbytes += MLX5E_TX_SKB_CB(skb)->num_bytes;
+			sqcc += MLX5E_TX_SKB_CB(skb)->num_wqebbs;
+			dev_kfree_skb(skb);
+		} while (!last_wqe);
 	}
 
 	mlx5_cqwq_update_db_record(&cq->wq);
-- 
2.3.7


* [PATCH net-next 09/12] net/mlx5e: Remove extra spaces
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
                   ` (7 preceding siblings ...)
  2015-06-17 15:26 ` [PATCH net-next 08/12] net/mlx5e: Avoid TX CQE generation if more xmit packets expected Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 10/12] net/mlx5e: Remove mlx5e_cq.sqrq back-pointer Or Gerlitz
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Amir Vadai, Tal Alon, Achiad Shochat, Or Gerlitz

From: Achiad Shochat <achiad@mellanox.com>

Coding Style fix, remove extra spaces.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index a45d751..67493ab 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -153,7 +153,7 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 		sq->stats.tso_packets++;
 		sq->stats.tso_bytes += payload_len;
 	} else {
-		ihs             = mlx5e_get_inline_hdr_size(sq, skb);
+		ihs = mlx5e_get_inline_hdr_size(sq, skb);
 		MLX5E_TX_SKB_CB(skb)->num_bytes = max_t(unsigned int, skb->len,
 							ETH_ZLEN);
 	}
@@ -161,7 +161,7 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 	skb_copy_from_linear_data(skb, eseg->inline_hdr_start, ihs);
 	skb_pull_inline(skb, ihs);
 
-	eseg->inline_hdr_sz	= cpu_to_be16(ihs);
+	eseg->inline_hdr_sz = cpu_to_be16(ihs);
 
 	ds_cnt  = sizeof(*wqe) / MLX5_SEND_WQE_DS;
 	ds_cnt += DIV_ROUND_UP(ihs - sizeof(eseg->inline_hdr_start),
@@ -208,8 +208,8 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 
 	ds_cnt += MLX5E_TX_SKB_CB(skb)->num_dma;
 
-	cseg->opmod_idx_opcode	= cpu_to_be32((sq->pc << 8) | opcode);
-	cseg->qpn_ds		= cpu_to_be32((sq->sqn << 8) | ds_cnt);
+	cseg->opmod_idx_opcode = cpu_to_be32((sq->pc << 8) | opcode);
+	cseg->qpn_ds           = cpu_to_be32((sq->sqn << 8) | ds_cnt);
 
 	sq->skb[pi] = skb;
 
-- 
2.3.7


* [PATCH net-next 10/12] net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
                   ` (8 preceding siblings ...)
  2015-06-17 15:26 ` [PATCH net-next 09/12] net/mlx5e: Remove extra spaces Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 11/12] net/mlx5e: Pop cq outside mlx5e_get_cqe Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 12/12] net/mlx5e: Prefetch skb data on RX Or Gerlitz
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Amir Vadai, Tal Alon, Achiad Shochat, Saeed Mahameed,
	Or Gerlitz

From: Achiad Shochat <achiad@mellanox.com>

Use container_of() instead.
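
For readers less familiar with the idiom, a minimal user-space sketch: when
the cq is embedded in its owner, the owner can be recovered from the cq
pointer by subtracting the member offset, so no stored back-pointer is
needed. The macro below is a simplified version without the kernel's type
checking, and struct cq / struct rq are reduced stand-ins for the real ones.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of(): no type checking, illustration only. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct cq {
	unsigned long flags;
};

struct rq {
	int       id;
	struct cq cq;            /* embedded, not a pointer */
};

/* Recover the owning rq from a pointer to its embedded cq. */
static struct rq *rq_from_cq(struct cq *cq)
{
	assert(cq != NULL);
	return container_of(cq, struct rq, cq);
}
```

This only works because the cq is a member of the rq (or sq) structure. It
would not be valid for a cq that is allocated on its own.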

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      | 1 -
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 ---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 2 +-
 5 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 1706979..3d23bd6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -208,7 +208,6 @@ enum cq_flags {
 struct mlx5e_cq {
 	/* data path - accessed per cqe */
 	struct mlx5_cqwq           wq;
-	void                      *sqrq;
 	unsigned long              flags;
 
 	/* data path - accessed per napi poll */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 45dc8c2..40206da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -853,8 +853,6 @@ static int mlx5e_open_tx_cqs(struct mlx5e_channel *c,
 				    priv->params.tx_cq_moderation_pkts);
 		if (err)
 			goto err_close_tx_cqs;
-
-		c->sq[tc].cq.sqrq = &c->sq[tc];
 	}
 
 	return 0;
@@ -946,7 +944,6 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 			    priv->params.rx_cq_moderation_pkts);
 	if (err)
 		goto err_close_tx_cqs;
-	c->rq.cq.sqrq = &c->rq;
 
 	napi_enable(&c->napi);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 06e7c74..4a25957 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -191,7 +191,7 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 
 bool mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
 {
-	struct mlx5e_rq *rq = cq->sqrq;
+	struct mlx5e_rq *rq = container_of(cq, struct mlx5e_rq, cq);
 	int i;
 
 	/* avoid accessing cq (dma coherent memory) if not needed */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 67493ab..c789619 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -266,7 +266,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
 	if (!test_and_clear_bit(MLX5E_CQ_HAS_CQES, &cq->flags))
 		return false;
 
-	sq = cq->sqrq;
+	sq = container_of(cq, struct mlx5e_sq, cq);
 
 	npkts = 0;
 	nbytes = 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 9f31572..8f7cbac 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -62,7 +62,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 
 	busy |= mlx5e_poll_rx_cq(&c->rq.cq, budget);
 
-	busy |= mlx5e_post_rx_wqes(c->rq.cq.sqrq);
+	busy |= mlx5e_post_rx_wqes(&c->rq);
 
 	for (i = 0; i < c->num_tc; i++)
 		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq);
-- 
2.3.7
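
Note on the container_of() conversion above: it is possible because the CQ is embedded by value in its owning queue (the diff passes &c->rq.cq and &c->sq[i].cq), so the owner's address is the CQ's address minus a fixed offset. Each poll function already knows which owner type it serves, which is why the untyped void pointer can go. The following is a minimal userspace sketch of the idiom, not driver code: struct demo_cq, struct demo_rq and demo_cq_to_rq() are hypothetical stand-ins, and the macro is a simplified version of the kernel's container_of() without its type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(): step back from a
 * pointer to a member to the structure that embeds that member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified stand-in for struct mlx5e_cq. */
struct demo_cq {
	unsigned long flags;
};

/* Hypothetical stand-in for struct mlx5e_rq. */
struct demo_rq {
	int ix;
	struct demo_cq cq;	/* embedded by value, not a pointer */
};

/* Recover the owning queue from its embedded CQ without a back-pointer. */
static struct demo_rq *demo_cq_to_rq(struct demo_cq *cq)
{
	struct demo_rq *rq = container_of(cq, struct demo_rq, cq);

	/* Sanity check: the recovered owner must embed exactly this CQ. */
	assert(&rq->cq == cq);
	return rq;
}
```

The trade-off is that the caller must pass a CQ that really is embedded in the named type. Passing a standalone struct demo_cq would yield a bogus pointer, which the old back-pointer scheme would not have done.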


* [PATCH net-next 11/12] net/mlx5e: Pop cq outside mlx5e_get_cqe
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
                   ` (9 preceding siblings ...)
  2015-06-17 15:26 ` [PATCH net-next 10/12] net/mlx5e: Remove mlx5e_cq.sqrq back-pointer Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  2015-06-17 15:26 ` [PATCH net-next 12/12] net/mlx5e: Prefetch skb data on RX Or Gerlitz
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Amir Vadai, Tal Alon, Achiad Shochat, Saeed Mahameed,
	Or Gerlitz

From: Achiad Shochat <achiad@mellanox.com>

Separate mlx5e_get_cqe() from mlx5_cqwq_pop(). This improves code
readability and CQ buffer management.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 2 --
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 4a25957..760b3ef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -209,6 +209,8 @@ bool mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
 		if (!cqe)
 			break;
 
+		mlx5_cqwq_pop(&cq->wq);
+
 		wqe_counter_be = cqe->wqe_counter;
 		wqe_counter    = be16_to_cpu(wqe_counter_be);
 		wqe            = mlx5_wq_ll_get_wqe(&rq->wq, wqe_counter);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index c789619..03f28f4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -288,6 +288,8 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
 		if (!cqe)
 			break;
 
+		mlx5_cqwq_pop(&cq->wq);
+
 		wqe_counter = be16_to_cpu(cqe->wqe_counter);
 
 		do {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 8f7cbac..0406fae 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -43,8 +43,6 @@ struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq)
 	if (cqe_ownership_bit != sw_ownership_val)
 		return NULL;
 
-	mlx5_cqwq_pop(wq);
-
 	/* ensure cqe content is read after cqe ownership bit */
 	rmb();
 
-- 
2.3.7
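
Note on the get/pop split above: after this patch, mlx5e_get_cqe() only peeks at the next completion and the caller advances the consumer index explicitly. The sketch below models that split on a toy ring in userspace. Every demo_* name is hypothetical, the ownership-bit convention (entries start owned by "hardware" and the expected value flips on each pass over the ring) is an assumption made for illustration, and the read barrier that the real code issues after the ownership check is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CQ_SIZE 4u		/* must be a power of two */

struct demo_cqe {
	uint16_t wqe_counter;
	uint8_t  owner;		/* written by the producer, flips each pass */
};

struct demo_cq {
	struct demo_cqe ring[DEMO_CQ_SIZE];
	uint32_t cc;		/* consumer counter, never wrapped */
};

/* Mark every entry as not yet valid for the first pass. */
static void demo_cq_init(struct demo_cq *cq)
{
	for (uint32_t i = 0; i < DEMO_CQ_SIZE; i++)
		cq->ring[i].owner = 1;
	cq->cc = 0;
}

/* Producer side ("hardware"): publish a completion at producer counter pc. */
static void demo_hw_write(struct demo_cq *cq, uint32_t pc, uint16_t wqe_counter)
{
	struct demo_cqe *cqe = &cq->ring[pc & (DEMO_CQ_SIZE - 1)];

	cqe->wqe_counter = wqe_counter;
	cqe->owner = (pc / DEMO_CQ_SIZE) & 1;
}

/* Peek only: return the next valid entry, or NULL. Does not advance cc. */
static struct demo_cqe *demo_get_cqe(struct demo_cq *cq)
{
	struct demo_cqe *cqe = &cq->ring[cq->cc & (DEMO_CQ_SIZE - 1)];
	uint8_t sw_owner = (cq->cc / DEMO_CQ_SIZE) & 1;

	return cqe->owner == sw_owner ? cqe : NULL;
}

/* Consume: advance past the entry that was just peeked. */
static void demo_cq_pop(struct demo_cq *cq)
{
	cq->cc++;
}

/* Poll loop in the post-patch shape: peek, then pop in the caller. */
static int demo_poll(struct demo_cq *cq, int budget)
{
	int done = 0;

	assert(budget >= 0);
	while (done < budget) {
		struct demo_cqe *cqe = demo_get_cqe(cq);

		if (!cqe)
			break;
		demo_cq_pop(cq);
		done++;
	}
	return done;
}
```

Keeping the peek free of side effects means a caller can inspect a completion and decide separately when to consume it.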


* [PATCH net-next 12/12] net/mlx5e: Prefetch skb data on RX
  2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
                   ` (10 preceding siblings ...)
  2015-06-17 15:26 ` [PATCH net-next 11/12] net/mlx5e: Pop cq outside mlx5e_get_cqe Or Gerlitz
@ 2015-06-17 15:26 ` Or Gerlitz
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-17 15:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Amir Vadai, Tal Alon, Saeed Mahameed, Or Gerlitz

From: Saeed Mahameed <saeedm@mellanox.com>

Prefetch the first cache line of the buffer pointed to by the skb
linear data.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 760b3ef..9a93741 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -215,6 +215,7 @@ bool mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
 		wqe_counter    = be16_to_cpu(wqe_counter_be);
 		wqe            = mlx5_wq_ll_get_wqe(&rq->wq, wqe_counter);
 		skb            = rq->skb[wqe_counter];
+		prefetch(skb->data);
 		rq->skb[wqe_counter] = NULL;
 
 		dma_unmap_single(rq->pdev,
-- 
2.3.7
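
Note on the prefetch above: the hint is issued as soon as the skb pointer is known, so the cache-line fetch can overlap with the bookkeeping that follows (clearing the ring slot, DMA unmap) before the packet data is actually read. A rough userspace sketch of the same pattern follows. It assumes GCC or Clang, whose __builtin_prefetch() stands in here for the kernel's prefetch(), and struct demo_skb and demo_rx_poll() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb: just a pointer to linear data. */
struct demo_skb {
	uint8_t *data;
	size_t   len;
};

/* Walk a ring of received buffers and sum the first byte of each one. */
static uint32_t demo_rx_poll(struct demo_skb **ring, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		struct demo_skb *skb = ring[i];

		assert(skb && skb->data);

		/* Hint only: start pulling the first cache line of the
		 * payload; program results are the same without it. */
		__builtin_prefetch(skb->data);

		/* Bookkeeping that can overlap with the fetch. */
		ring[i] = NULL;

		/* First real access to the payload. */
		sum += skb->data[0];
	}
	return sum;
}
```

Whether the hint pays off depends on how much work sits between the prefetch and the first access, so any gain has to be measured on the target hardware.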


* Re: [PATCH net-next 05/12] net/mlx5e: Poll rx cq before tx cq to improve round-trip latency
  2015-06-17 15:26 ` [PATCH net-next 05/12] net/mlx5e: Poll rx cq before tx cq to improve round-trip latency Or Gerlitz
@ 2015-06-21 17:21   ` David Miller
  2015-06-21 21:35     ` achiad shochat
  0 siblings, 1 reply; 17+ messages in thread
From: David Miller @ 2015-06-21 17:21 UTC (permalink / raw)
  To: ogerlitz; +Cc: netdev, amirv, talal, achiad, saeedm

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Wed, 17 Jun 2015 18:26:22 +0300

> From: Achiad Shochat <achiad@mellanox.com>
> 
> For better round trip latency, handle rx completions before
> tx completions.
> 
> Signed-off-by: Achiad Shochat <achiad@mellanox.com>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>

I completely disagree with this change.

TX completions should always be handled first because they free up resources
and therefore increase the likelihood that RX processing will not fail due
to lack of resources (memory, etc.).


* Re: [PATCH net-next 05/12] net/mlx5e: Poll rx cq before tx cq to improve round-trip latency
  2015-06-21 17:21   ` David Miller
@ 2015-06-21 21:35     ` achiad shochat
  2015-06-22 13:35       ` David Miller
  0 siblings, 1 reply; 17+ messages in thread
From: achiad shochat @ 2015-06-21 21:35 UTC (permalink / raw)
  To: David Miller; +Cc: ogerlitz, netdev, amirv, talal, saeedm

Hello Dave,

In mlx5 the RX processing is broken down into two stages:
1) Hand to kernel SKBs of completed RX packets - @mlx5e_poll_rx_cq()
2) Allocate and post to HW new RX buffers - @mlx5e_post_rx_wqes()

Would handling of TX completions in between stages (1) and (2) be OK?

On 21 June 2015 at 20:21, David Miller <davem@davemloft.net> wrote:
> From: Or Gerlitz <ogerlitz@mellanox.com>
> Date: Wed, 17 Jun 2015 18:26:22 +0300
>
>> From: Achiad Shochat <achiad@mellanox.com>
>>
>> For better round trip latency, handle rx completions before
>> tx completions.
>>
>> Signed-off-by: Achiad Shochat <achiad@mellanox.com>
>> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
>
> I completely disagree with this change.
>
> TX completions should always be handled first because they free up resources
> and therefore increase the likelihood that RX processing will not fail due
> to lack of resources (memory, etc.).

* Re: [PATCH net-next 05/12] net/mlx5e: Poll rx cq before tx cq to improve round-trip latency
  2015-06-21 21:35     ` achiad shochat
@ 2015-06-22 13:35       ` David Miller
  2015-06-23 14:12         ` Or Gerlitz
  0 siblings, 1 reply; 17+ messages in thread
From: David Miller @ 2015-06-22 13:35 UTC (permalink / raw)
  To: achiad.mellanox; +Cc: ogerlitz, netdev, amirv, talal, saeedm

From: achiad shochat <achiad.mellanox@gmail.com>
Date: Mon, 22 Jun 2015 00:35:37 +0300

> Hello Dave,
> 
> In mlx5 the RX processing is broken down into two stages:
> 1) Hand to kernel SKBs of completed RX packets - @mlx5e_poll_rx_cq()
> 2) Allocate and post to HW new RX buffers - @mlx5e_post_rx_wqes()
> 
> Would handling of TX completions in between stages (1) and (2) be OK?

I would do all of TX processing first and synchronously.  It's very
cheap and makes lots of resources available for RX processing.


* Re: [PATCH net-next 05/12] net/mlx5e: Poll rx cq before tx cq to improve round-trip latency
  2015-06-22 13:35       ` David Miller
@ 2015-06-23 14:12         ` Or Gerlitz
  0 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2015-06-23 14:12 UTC (permalink / raw)
  To: David Miller, achiad.mellanox; +Cc: netdev, amirv, talal, saeedm

On 6/22/2015 4:35 PM, David Miller wrote:
> From: achiad shochat <achiad.mellanox@gmail.com>
> Date: Mon, 22 Jun 2015 00:35:37 +0300
>
>> Hello Dave,
>>
>> In mlx5 the RX processing is broken down into two stages:
>> 1) Hand to kernel SKBs of completed RX packets - @mlx5e_poll_rx_cq()
>> 2) Allocate and post to HW new RX buffers - @mlx5e_post_rx_wqes()
>>
>> Would handling of TX completions in between stages (1) and (2) be OK?
> I would do all of TX processing first and synchronously.  It's very
> cheap and makes lots of resources available for RX processing.

So we'll drop this patch.

Or.
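
Note on the ordering argument in this sub-thread: the objection is that TX completions release resources that RX refill may need in the same poll. The toy model below illustrates only that argument. It assumes a single fixed buffer pool shared by both directions, which is a deliberate simplification and not how the driver or the kernel allocator works, and every demo_* name is hypothetical.

```c
#include <assert.h>

/* Toy channel state, assuming one shared pool of buffers. */
struct demo_chan {
	int pool_free;	/* buffers currently available for allocation */
	int tx_done;	/* completed TX packets whose buffers can be freed */
	int rx_needed;	/* RX buffers that must be allocated and posted */
};

/* TX completion processing: return the finished buffers to the pool. */
static void demo_poll_tx(struct demo_chan *c)
{
	c->pool_free += c->tx_done;
	c->tx_done = 0;
}

/* RX refill: take what the pool can give; return how many are missing. */
static int demo_refill_rx(struct demo_chan *c)
{
	int got = c->rx_needed < c->pool_free ? c->rx_needed : c->pool_free;

	c->pool_free -= got;
	c->rx_needed -= got;
	return c->rx_needed;
}

/* Run one poll on a copy of the state; return failed RX allocations. */
static int demo_poll(struct demo_chan c, int tx_first)
{
	int missing;

	assert(c.pool_free >= 0 && c.tx_done >= 0 && c.rx_needed >= 0);
	if (tx_first) {
		demo_poll_tx(&c);
		missing = demo_refill_rx(&c);
	} else {
		missing = demo_refill_rx(&c);
		demo_poll_tx(&c);
	}
	return missing;
}
```

With 2 free buffers, 6 completed TX packets and 5 RX buffers to post, the TX-first order leaves nothing missing, while the RX-first order comes up 3 short in that poll.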


Thread overview: 17+ messages
2015-06-17 15:26 [PATCH net-next 00/12] Mellanox NIC drivers update, Jun 17 2015 Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 01/12] net/mlx4_en: Fix off-by-four in ethtool Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 02/12] net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 03/12] net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 04/12] net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 05/12] net/mlx5e: Poll rx cq before tx cq to improve round-trip latency Or Gerlitz
2015-06-21 17:21   ` David Miller
2015-06-21 21:35     ` achiad shochat
2015-06-22 13:35       ` David Miller
2015-06-23 14:12         ` Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 06/12] net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq() Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 07/12] net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 08/12] net/mlx5e: Avoid TX CQE generation if more xmit packets expected Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 09/12] net/mlx5e: Remove extra spaces Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 10/12] net/mlx5e: Remove mlx5e_cq.sqrq back-pointer Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 11/12] net/mlx5e: Pop cq outside mlx5e_get_cqe Or Gerlitz
2015-06-17 15:26 ` [PATCH net-next 12/12] net/mlx5e: Prefetch skb data on RX Or Gerlitz
