Linux-PCI Archive mirror
 help / color / mirror / Atom feed
From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
To: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Wei Huang <wei.huang2@amd.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, netdev@vger.kernel.org,
	bhelgaas@google.com, corbet@lwn.net, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	alex.williamson@redhat.com, gospo@broadcom.com,
	michael.chan@broadcom.com, manoj.panicker2@amd.com,
	Eric.VanTassell@amd.com
Subject: Re: [PATCH V1 8/9] bnxt_en: Add TPH support in BNXT driver
Date: Fri, 10 May 2024 11:35:35 +0100	[thread overview]
Message-ID: <4c6a8b86-6544-4c99-a0f2-030e2ec4e98f@linux.dev> (raw)
In-Reply-To: <CACZ4nhuBMOX8s1ODcJOvvCKp-VsOPHShEUHAsPvB75Yv2823qA@mail.gmail.com>

On 10.05.2024 04:55, Ajit Khaparde wrote:
> On Thu, May 9, 2024 at 2:50 PM Vadim Fedorenko
> <vadim.fedorenko@linux.dev> wrote:
>>
>> On 09/05/2024 17:27, Wei Huang wrote:
>>> From: Manoj Panicker <manoj.panicker2@amd.com>
>>>
>>> As a usage example, this patch implements TPH support in Broadcom BNXT
>>> device driver by invoking pcie_tph_set_st() function when interrupt
>>> affinity is changed.
>>>
>>> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
>>> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
>>> Reviewed-by: Wei Huang <wei.huang2@amd.com>
>>> Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
>>> ---
>>>    drivers/net/ethernet/broadcom/bnxt/bnxt.c | 51 +++++++++++++++++++++++
>>>    drivers/net/ethernet/broadcom/bnxt/bnxt.h |  4 ++
>>>    2 files changed, 55 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> index 2c2ee79c4d77..be9c17566fb4 100644
>>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> @@ -55,6 +55,7 @@
>>>    #include <net/page_pool/helpers.h>
>>>    #include <linux/align.h>
>>>    #include <net/netdev_queues.h>
>>> +#include <linux/pci-tph.h>
>>>
>>>    #include "bnxt_hsi.h"
>>>    #include "bnxt.h"
>>> @@ -10491,6 +10492,7 @@ static void bnxt_free_irq(struct bnxt *bp)
>>>                                free_cpumask_var(irq->cpu_mask);
>>>                                irq->have_cpumask = 0;
>>>                        }
>>> +                     irq_set_affinity_notifier(irq->vector, NULL);
>>>                        free_irq(irq->vector, bp->bnapi[i]);
>>>                }
>>>
>>> @@ -10498,6 +10500,45 @@ static void bnxt_free_irq(struct bnxt *bp)
>>>        }
>>>    }
>>>
>>> +static void bnxt_rtnl_lock_sp(struct bnxt *bp);
>>> +static void bnxt_rtnl_unlock_sp(struct bnxt *bp);
>>> +static void bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
>>> +                                  const cpumask_t *mask)
>>> +{
>>> +     struct bnxt_irq *irq;
>>> +
>>> +     irq = container_of(notify, struct bnxt_irq, affinity_notify);
>>> +     cpumask_copy(irq->cpu_mask, mask);
>>> +
>>> +     if (!pcie_tph_set_st(irq->bp->pdev, irq->msix_nr,
>>> +                          cpumask_first(irq->cpu_mask),
>>> +                          TPH_MEM_TYPE_VM, PCI_TPH_REQ_TPH_ONLY))
>>> +             pr_err("error in configuring steering tag\n");
>>> +
>>> +     if (netif_running(irq->bp->dev)) {
>>> +             rtnl_lock();
>>> +             bnxt_close_nic(irq->bp, false, false);
>>> +             bnxt_open_nic(irq->bp, false, false);
>>> +             rtnl_unlock();
>>> +     }
>>
>> Is it really needed? It will cause link flap and pause in the traffic
>> service for the device. Why the device needs full restart in this case?
> 
> In that sequence only the rings are recreated for the hardware to sync
> up the tags.
> 
> Actually its not a full restart. There is no link reinit or other
> heavy lifting in this sequence.
> The pause in traffic may be momentary. Do IRQ/CPU affinities change frequently?
> Probably not?

 From what I can see in bnxt_en, proper validation of link_re_init parameter is
not (yet?) implemented, __bnxt_open_nic will unconditionally call 
netif_carrier_off() which will be treated as loss of carrier with counters
increment and proper events posted. Changes to CPU affinities were 
non-distruptive before the patch, but now it may break user-space assumptions.

Does FW need full rings re-init to update target value, which is one u32 write?
It looks like overkill TBH.

And yes, affinities can be change on fly according to the changes of the
workload on the host.

>>
>>
>>> +}
>>> +
>>> +static void bnxt_irq_affinity_release(struct kref __always_unused *ref)
>>> +{
>>> +}
>>> +
>>> +static inline void __bnxt_register_notify_irqchanges(struct bnxt_irq *irq)
>>
>> No inlines in .c files, please. Let compiler decide what to inline.
>>
>>> +{
>>> +     struct irq_affinity_notify *notify;
>>> +
>>> +     notify = &irq->affinity_notify;
>>> +     notify->irq = irq->vector;
>>> +     notify->notify = bnxt_irq_affinity_notify;
>>> +     notify->release = bnxt_irq_affinity_release;
>>> +
>>> +     irq_set_affinity_notifier(irq->vector, notify);
>>> +}
>>> +
>>>    static int bnxt_request_irq(struct bnxt *bp)
>>>    {
>>>        int i, j, rc = 0;
>>> @@ -10543,6 +10584,7 @@ static int bnxt_request_irq(struct bnxt *bp)
>>>                        int numa_node = dev_to_node(&bp->pdev->dev);
>>>
>>>                        irq->have_cpumask = 1;
>>> +                     irq->msix_nr = map_idx;
>>>                        cpumask_set_cpu(cpumask_local_spread(i, numa_node),
>>>                                        irq->cpu_mask);
>>>                        rc = irq_set_affinity_hint(irq->vector, irq->cpu_mask);
>>> @@ -10552,6 +10594,15 @@ static int bnxt_request_irq(struct bnxt *bp)
>>>                                            irq->vector);
>>>                                break;
>>>                        }
>>> +
>>> +                     if (!pcie_tph_set_st(bp->pdev, i,
>>> +                                          cpumask_first(irq->cpu_mask),
>>> +                                          TPH_MEM_TYPE_VM, PCI_TPH_REQ_TPH_ONLY)) {
>>> +                             netdev_err(bp->dev, "error in setting steering tag\n");
>>> +                     } else {
>>> +                             irq->bp = bp;
>>> +                             __bnxt_register_notify_irqchanges(irq);
>>> +                     }
>>>                }
>>>        }
>>>        return rc;
>>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
>>> index dd849e715c9b..0d3442590bb4 100644
>>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
>>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
>>> @@ -1195,6 +1195,10 @@ struct bnxt_irq {
>>>        u8              have_cpumask:1;
>>>        char            name[IFNAMSIZ + 2];
>>>        cpumask_var_t   cpu_mask;
>>> +
>>> +     int             msix_nr;
>>> +     struct bnxt     *bp;
>>> +     struct irq_affinity_notify affinity_notify;
>>>    };
>>>
>>>    #define HWRM_RING_ALLOC_TX  0x1
>>


  reply	other threads:[~2024-05-10 10:36 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-09 16:27 [PATCH V1 0/9] PCIe TPH and cache direct injection support Wei Huang
2024-05-09 16:27 ` [PATCH V1 1/9] PCI: Introduce PCIe TPH support framework Wei Huang
2024-05-09 16:27 ` [PATCH V1 2/9] PCI: Add TPH related register definition Wei Huang
2024-05-09 16:27 ` [PATCH V1 3/9] PCI/TPH: Implement a command line option to disable TPH Wei Huang
2024-05-09 16:27 ` [PATCH V1 4/9] PCI/TPH: Implement a command line option to force No ST Mode Wei Huang
2024-05-09 16:27 ` [PATCH V1 5/9] PCI/TPH: Introduce API functions to get/set steering tags Wei Huang
2024-05-10  3:07   ` kernel test robot
2024-05-11 20:15   ` Simon Horman
2024-05-13 13:29     ` Wei Huang
2024-05-09 16:27 ` [PATCH V1 6/9] PCI/TPH: Retrieve steering tag from ACPI _DSM Wei Huang
2024-05-10  4:20   ` kernel test robot
2024-05-10  5:24   ` kernel test robot
2024-05-09 16:27 ` [PATCH V1 7/9] PCI/TPH: Add TPH documentation Wei Huang
2024-05-15 12:11   ` Bagas Sanjaya
2024-05-09 16:27 ` [PATCH V1 8/9] bnxt_en: Add TPH support in BNXT driver Wei Huang
2024-05-09 21:50   ` Vadim Fedorenko
2024-05-10  3:55     ` Ajit Khaparde
2024-05-10 10:35       ` Vadim Fedorenko [this message]
2024-05-10 15:23         ` Andy Gospodarek
2024-05-10 20:03           ` David Wei
2024-05-10 20:33             ` Andy Gospodarek
2024-05-10 20:33           ` Vadim Fedorenko
2024-05-10 20:37             ` Andy Gospodarek
2024-05-10  3:10   ` Somnath Kotur
2024-05-09 16:27 ` [PATCH V1 9/9] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4c6a8b86-6544-4c99-a0f2-030e2ec4e98f@linux.dev \
    --to=vadim.fedorenko@linux.dev \
    --cc=Eric.VanTassell@amd.com \
    --cc=ajit.khaparde@broadcom.com \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=corbet@lwn.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=gospo@broadcom.com \
    --cc=kuba@kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=manoj.panicker2@amd.com \
    --cc=michael.chan@broadcom.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=wei.huang2@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).