From: Andy Gospodarek <andrew.gospodarek@broadcom.com>
To: David Wei <dw@davidwei.uk>
Cc: Andy Gospodarek <andrew.gospodarek@broadcom.com>,
	Vadim Fedorenko <vadim.fedorenko@linux.dev>,
	Ajit Khaparde <ajit.khaparde@broadcom.com>,
	Wei Huang <wei.huang2@amd.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, netdev@vger.kernel.org,
	bhelgaas@google.com, corbet@lwn.net, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	alex.williamson@redhat.com, michael.chan@broadcom.com,
	manoj.panicker2@amd.com, Eric.VanTassell@amd.com
Subject: Re: [PATCH V1 8/9] bnxt_en: Add TPH support in BNXT driver
Date: Fri, 10 May 2024 16:33:14 -0400
Message-ID: <Zj6EimbBR9hp_ILT@C02YVCJELVCG.dhcp.broadcom.net>
In-Reply-To: <0ef50183-42d4-4abd-adeb-bd92b030fe6a@davidwei.uk>

On Fri, May 10, 2024 at 01:03:50PM -0700, David Wei wrote:
> On 2024-05-10 08:23, Andy Gospodarek wrote:
> > On Fri, May 10, 2024 at 11:35:35AM +0100, Vadim Fedorenko wrote:
> >> On 10.05.2024 04:55, Ajit Khaparde wrote:
> >>> On Thu, May 9, 2024 at 2:50 PM Vadim Fedorenko
> >>> <vadim.fedorenko@linux.dev> wrote:
> >>>>
> >>>> On 09/05/2024 17:27, Wei Huang wrote:
> >>>>> From: Manoj Panicker <manoj.panicker2@amd.com>
> >>>>>
> >>>>> As a usage example, this patch implements TPH support in Broadcom BNXT
> >>>>> device driver by invoking pcie_tph_set_st() function when interrupt
> >>>>> affinity is changed.
> >>>>>
> >>>>> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> >>>>> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
> >>>>> Reviewed-by: Wei Huang <wei.huang2@amd.com>
> >>>>> Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
> >>>>> ---
> >>>>>    drivers/net/ethernet/broadcom/bnxt/bnxt.c | 51 +++++++++++++++++++++++
> >>>>>    drivers/net/ethernet/broadcom/bnxt/bnxt.h |  4 ++
> >>>>>    2 files changed, 55 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> >>>>> index 2c2ee79c4d77..be9c17566fb4 100644
> >>>>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> >>>>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> >>>>> @@ -55,6 +55,7 @@
> >>>>>    #include <net/page_pool/helpers.h>
> >>>>>    #include <linux/align.h>
> >>>>>    #include <net/netdev_queues.h>
> >>>>> +#include <linux/pci-tph.h>
> >>>>>
> >>>>>    #include "bnxt_hsi.h"
> >>>>>    #include "bnxt.h"
> >>>>> @@ -10491,6 +10492,7 @@ static void bnxt_free_irq(struct bnxt *bp)
> >>>>>                                free_cpumask_var(irq->cpu_mask);
> >>>>>                                irq->have_cpumask = 0;
> >>>>>                        }
> >>>>> +                     irq_set_affinity_notifier(irq->vector, NULL);
> >>>>>                        free_irq(irq->vector, bp->bnapi[i]);
> >>>>>                }
> >>>>>
> >>>>> @@ -10498,6 +10500,45 @@ static void bnxt_free_irq(struct bnxt *bp)
> >>>>>        }
> >>>>>    }
> >>>>>
> >>>>> +static void bnxt_rtnl_lock_sp(struct bnxt *bp);
> >>>>> +static void bnxt_rtnl_unlock_sp(struct bnxt *bp);
> >>>>> +static void bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
> >>>>> +                                  const cpumask_t *mask)
> >>>>> +{
> >>>>> +     struct bnxt_irq *irq;
> >>>>> +
> >>>>> +     irq = container_of(notify, struct bnxt_irq, affinity_notify);
> >>>>> +     cpumask_copy(irq->cpu_mask, mask);
> >>>>> +
> >>>>> +     if (!pcie_tph_set_st(irq->bp->pdev, irq->msix_nr,
> >>>>> +                          cpumask_first(irq->cpu_mask),
> >>>>> +                          TPH_MEM_TYPE_VM, PCI_TPH_REQ_TPH_ONLY))
> >>>>> +             pr_err("error in configuring steering tag\n");
> >>>>> +
> >>>>> +     if (netif_running(irq->bp->dev)) {
> >>>>> +             rtnl_lock();
> >>>>> +             bnxt_close_nic(irq->bp, false, false);
> >>>>> +             bnxt_open_nic(irq->bp, false, false);
> >>>>> +             rtnl_unlock();
> >>>>> +     }
> >>>>
> >>>> Is it really needed? It will cause a link flap and a pause in traffic
> >>>> for the device. Why does the device need a full restart in this case?
> >>>
> >>> In that sequence only the rings are recreated for the hardware to sync
> >>> up the tags.
> >>>
> >>> Actually it's not a full restart. There is no link reinit or other
> >>> heavy lifting in this sequence.
> >>> The pause in traffic may be momentary. Do IRQ/CPU affinities change frequently?
> >>> Probably not?
> >>
> >> From what I can see in bnxt_en, proper validation of the link_re_init
> >> parameter is not (yet?) implemented; __bnxt_open_nic() will unconditionally
> >> call netif_carrier_off(), which will be treated as loss of carrier, with
> >> counters incremented and the corresponding events posted. Changes to CPU
> >> affinities were non-disruptive before the patch, but now they may break
> >> user-space assumptions.
> > 
> > From my testing the link should not flap.  I just fired up a recent net-next
> > and confirmed the same by calling $ ethtool -G ens7f0np0 rx 1024 which does a
> > similar bnxt_close_nic(bp, false, false)/bnxt_open_nic(bp, false, false) as
> > this patch.  Link remained up, even with a non-Broadcom link-partner.
> > 
> >> Does FW need full rings re-init to update target value, which is one u32 write?
> >> It looks like overkill TBH.
> > 
> > Full rings do not, but the initialization of that particular ring associated
> > with this irq does need to be done.  On my list of things we need to do in
> > bnxt_en is implement the new ndo_queue_stop/start and ndo_queue_mem_alloc/free
> > operations and once those are done we could make a switch as that may be less
> > disruptive.
> 
> Hi Andy, I have an implementation of the new ndo_queue_stop/start() API
> [1] and would appreciate comments. I've been trying to test it, but
> to no avail due to FW issues.
> 
> [1]: https://lore.kernel.org/netdev/20240502045410.3524155-1-dw@davidwei.uk/
> 

David, I will take a look at those in more detail over the weekend or on Monday
(they are sitting in my inbox).

The overall structure looks good, but I do have at least one concern that is
related to what would need to be done in the hardware pipeline to be sure it is
safe to free packet buffers.

> > 
> >> And yes, affinities can be changed on the fly according to changes in the
> >> workload on the host.
> >>
> >>>>
> >>>>
> >>>>> +}
> >>>>> +
> >>>>> +static void bnxt_irq_affinity_release(struct kref __always_unused *ref)
> >>>>> +{
> >>>>> +}
> >>>>> +
> >>>>> +static inline void __bnxt_register_notify_irqchanges(struct bnxt_irq *irq)
> >>>>
> >>>> No inlines in .c files, please. Let compiler decide what to inline.
> >>>>
> >>>>> +{
> >>>>> +     struct irq_affinity_notify *notify;
> >>>>> +
> >>>>> +     notify = &irq->affinity_notify;
> >>>>> +     notify->irq = irq->vector;
> >>>>> +     notify->notify = bnxt_irq_affinity_notify;
> >>>>> +     notify->release = bnxt_irq_affinity_release;
> >>>>> +
> >>>>> +     irq_set_affinity_notifier(irq->vector, notify);
> >>>>> +}
> >>>>> +
> >>>>>    static int bnxt_request_irq(struct bnxt *bp)
> >>>>>    {
> >>>>>        int i, j, rc = 0;
> >>>>> @@ -10543,6 +10584,7 @@ static int bnxt_request_irq(struct bnxt *bp)
> >>>>>                        int numa_node = dev_to_node(&bp->pdev->dev);
> >>>>>
> >>>>>                        irq->have_cpumask = 1;
> >>>>> +                     irq->msix_nr = map_idx;
> >>>>>                        cpumask_set_cpu(cpumask_local_spread(i, numa_node),
> >>>>>                                        irq->cpu_mask);
> >>>>>                        rc = irq_set_affinity_hint(irq->vector, irq->cpu_mask);
> >>>>> @@ -10552,6 +10594,15 @@ static int bnxt_request_irq(struct bnxt *bp)
> >>>>>                                            irq->vector);
> >>>>>                                break;
> >>>>>                        }
> >>>>> +
> >>>>> +                     if (!pcie_tph_set_st(bp->pdev, i,
> >>>>> +                                          cpumask_first(irq->cpu_mask),
> >>>>> +                                          TPH_MEM_TYPE_VM, PCI_TPH_REQ_TPH_ONLY)) {
> >>>>> +                             netdev_err(bp->dev, "error in setting steering tag\n");
> >>>>> +                     } else {
> >>>>> +                             irq->bp = bp;
> >>>>> +                             __bnxt_register_notify_irqchanges(irq);
> >>>>> +                     }
> >>>>>                }
> >>>>>        }
> >>>>>        return rc;
> >>>>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> >>>>> index dd849e715c9b..0d3442590bb4 100644
> >>>>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> >>>>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> >>>>> @@ -1195,6 +1195,10 @@ struct bnxt_irq {
> >>>>>        u8              have_cpumask:1;
> >>>>>        char            name[IFNAMSIZ + 2];
> >>>>>        cpumask_var_t   cpu_mask;
> >>>>> +
> >>>>> +     int             msix_nr;
> >>>>> +     struct bnxt     *bp;
> >>>>> +     struct irq_affinity_notify affinity_notify;
> >>>>>    };
> >>>>>
> >>>>>    #define HWRM_RING_ALLOC_TX  0x1
> >>>>
> >>
