From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A53E5C25B08 for ; Wed, 10 Aug 2022 08:41:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231761AbiHJIlJ (ORCPT ); Wed, 10 Aug 2022 04:41:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37008 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231781AbiHJIku (ORCPT ); Wed, 10 Aug 2022 04:40:50 -0400 Received: from mailout-taastrup.gigahost.dk (mailout-taastrup.gigahost.dk [46.183.139.199]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE45E2BB0C; Wed, 10 Aug 2022 01:40:47 -0700 (PDT) Received: from mailout.gigahost.dk (mailout.gigahost.dk [89.186.169.112]) by mailout-taastrup.gigahost.dk (Postfix) with ESMTP id 3CB0B188494E; Wed, 10 Aug 2022 08:40:46 +0000 (UTC) Received: from smtp.gigahost.dk (smtp.gigahost.dk [89.186.169.109]) by mailout.gigahost.dk (Postfix) with ESMTP id 344BB25032B7; Wed, 10 Aug 2022 08:40:46 +0000 (UTC) Received: by smtp.gigahost.dk (Postfix, from userid 1000) id 2B1A7A1A004D; Wed, 10 Aug 2022 08:40:46 +0000 (UTC) X-Screener-Id: 413d8c6ce5bf6eab4824d0abaab02863e8e3f662 MIME-Version: 1.0 Date: Wed, 10 Aug 2022 10:40:45 +0200 From: netdev@kapio-technology.com To: Ido Schimmel Cc: Vladimir Oltean , davem@davemloft.net, kuba@kernel.org, netdev@vger.kernel.org, Andrew Lunn , Vivien Didelot , Florian Fainelli , Eric Dumazet , Paolo Abeni , Jiri Pirko , Ivan Vecera , Roopa Prabhu , Nikolay Aleksandrov , Shuah Khan , Daniel Borkmann , linux-kernel@vger.kernel.org, bridge@lists.linux-foundation.org, linux-kselftest@vger.kernel.org Subject: Re: [PATCH v4 net-next 3/6] drivers: net: dsa: add locked fdb entry flag to drivers In-Reply-To: References: <20220708091550.2qcu3tyqkhgiudjg@skbuf> <20220708115624.rrjzjtidlhcqczjv@skbuf> <723e2995314b41ff323272536ef27341@kapio-technology.com> <79683d9cf122e22b66b5da3bbbb0ee1f@kapio-technology.com> <6c6fe135ce7b5b118289dc370135b0d3@kapio-technology.com> User-Agent: Gigahost Webmail Message-ID: <2491232d5c017d94ca3213197a3fb283@kapio-technology.com> X-Sender: netdev@kapio-technology.com Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2022-08-10 09:21, Ido Schimmel wrote: > On Tue, Aug 09, 2022 at 10:00:49PM +0200, netdev@kapio-technology.com > wrote: >> On 2022-08-09 11:20, Ido Schimmel wrote: >> > On Mon, Aug 01, 2022 at 05:33:49PM +0200, netdev@kapio-technology.com >> > wrote: >> > > On 2022-07-13 14:39, Ido Schimmel wrote: >> > > >> > > > >> > > > What are "Storm Prevention" and "zero-DPV" FDB entries? >> > > > >> > > >> > > For the zero-DPV entries, I can summarize: >> > > >> > > Since a CPU can become saturated from constant SA Miss Violations >> > > from a >> > > denied source, source MAC address are masked by loading a zero-DPV >> > > (Destination Port Vector) entry in the ATU. As the address now >> > > appears in >> > > the database it will not cause more Miss Violations. ANY port trying >> > > to send >> > > a frame to this unauthorized address is discarded. Any locked port >> > > trying to >> > > use this unauthorized address has its frames discarded too (as the >> > > ports SA >> > > bit is not set in the ATU entry). >> > >> > What happens to unlocked ports that have learning enabled and are trying >> > to use this address as SMAC? AFAICT, at least in the bridge driver, the >> > locked entry will roam, but will keep the "locked" flag, which is >> > probably not what we want. Let's see if we can agree on these semantics >> > for a "locked" entry: >> >> The next version of this will block forwarding to locked entries in >> the >> bridge, so they will behave like the zero-DPV entries. > > I'm talking about roaming, not forwarding. Let's say you have a locked > entry with MAC X pointing to port Y. Now you get a packet with SMAC X > from port Z which is unlocked. Will the FDB entry roam to port Z? I > think it should, but at least in current implementation it seems that > the "locked" flag will not be reset and having locked entries pointing > to an unlocked port looks like a bug. > Remember that zero-DPV entries blackhole (mask) the MAC, so whenever a packet appears with the same MAC on another port it is just dropped in the HW, so there is no possibility of doing any CPU processing in this case. Only after the timeout (5 min) can the MAC get a normal ATU on an open port. For the bridge to do what you suggest, a FDB search would be needed afaics, and this might be in a process sensitive part of the code, thus leading to too heavy a cost. >> >> > >> > 1. It discards packets with matching DMAC, regardless of ingress port. I >> > read the document [1] you linked to in a different reply and could not >> > find anything against this approach, so this might be fine or at least >> > not very significant. >> > >> > Note that this means that "locked" entries need to be notified to device >> > drivers so that they will install a matching entry in the HW FDB. >> >> Okay, so as V4 does (just without the error noted). >> >> > >> > 2. It is not refreshed and has ageing enabled. That is, after initial >> > installation it will be removed by the bridge driver after configured >> > ageing time unless converted to a regular (unlocked) entry. >> > >> > I assume this allows you to remove the timer implementation from your >> > driver and let the bridge driver notify you about the removal of this >> > entry. >> >> Okay, but only if the scheme is not so that the driver creates the >> locked >> entries itself, unless you indicate that the driver notifies the >> bridge, >> which then notifies back to the driver and installs the zero-DPV >> entry? If >> not I think the current implementation for the mv88e6xxx is fine. > > I don't see a problem in having the driver notifying the bridge about > the installation of this entry and the bridge notifying the driver that > the entry needs to be removed. It removes complexity from device > drivers > like mv88e6xxx and doesn't add extra complexity to the bridge driver. > > Actually, there is one complication, 'SWITCHDEV_FDB_ADD_TO_BRIDGE' will > add the locked entry as externally learned, which means the bridge will > not age it. Might need something like this: > > diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c > index e7f4fccb6adb..5f73d0b44ed9 100644 > --- a/net/bridge/br_fdb.c > +++ b/net/bridge/br_fdb.c > @@ -530,7 +530,8 @@ void br_fdb_cleanup(struct work_struct *work) > unsigned long this_timer = f->updated + delay; > > if (test_bit(BR_FDB_STATIC, &f->flags) || > - test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags)) { > + (test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags) && > + !test_bit(BR_FDB_ENTRY_LOCKED, &f->flags))) { > if (test_bit(BR_FDB_NOTIFY, &f->flags)) { > if (time_after(this_timer, now)) > work_delay = min(work_delay, > There is a case of ownership of the FDB/ATU entry, which if I remember correctly, will point to the current implementation being the right way to do it, thus having the driver keeping ownership of the entry and thereby also ageing it, but I think Vladimir should have his say here. >> >> > >> > 3. With regards to roaming, the entry cannot roam between locked ports >> > (they need to have learning disabled anyway), but can roam to an >> > unlocked port, in which case it becomes a regular entry that can roam >> > and age. >> > >> > If we agree on these semantics, then I can try to verify that at least >> > Spectrum can support them (it seems mv88e6xxx can). >> >> The consensus here is that at least for the mv88e6xxx, learning should >> be on >> and link local learning should be blocked by the userspace setting you >> pointed to earlier. > > Why learning needs to be on in the bridge (not mv88e6xxx) driver? I think it is seen as 'cheating' to enable learning only in the driver behind the scenes, so kind of hackish. E.g. 'bridge -d link show' will then report 'learning off', while learning is on in the driver. And learning needs to be on for the driver as discussed earlier, which only gives rise to the link local learning problem, which is then solved by the user space setting. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 72F6E60E49 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 1451660BAB MIME-Version: 1.0 Date: Wed, 10 Aug 2022 10:40:45 +0200 From: netdev@kapio-technology.com In-Reply-To: References: <20220708091550.2qcu3tyqkhgiudjg@skbuf> <20220708115624.rrjzjtidlhcqczjv@skbuf> <723e2995314b41ff323272536ef27341@kapio-technology.com> <79683d9cf122e22b66b5da3bbbb0ee1f@kapio-technology.com> <6c6fe135ce7b5b118289dc370135b0d3@kapio-technology.com> Message-ID: <2491232d5c017d94ca3213197a3fb283@kapio-technology.com> Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Bridge] [PATCH v4 net-next 3/6] drivers: net: dsa: add locked fdb entry flag to drivers List-Id: Linux Ethernet Bridging List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Ido Schimmel Cc: Ivan Vecera , Andrew Lunn , Florian Fainelli , Jiri Pirko , Daniel Borkmann , netdev@vger.kernel.org, Nikolay Aleksandrov , bridge@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Vivien Didelot , Eric Dumazet , Paolo Abeni , linux-kselftest@vger.kernel.org, Roopa Prabhu , kuba@kernel.org, Vladimir Oltean , Shuah Khan , davem@davemloft.net On 2022-08-10 09:21, Ido Schimmel wrote: > On Tue, Aug 09, 2022 at 10:00:49PM +0200, netdev@kapio-technology.com > wrote: >> On 2022-08-09 11:20, Ido Schimmel wrote: >> > On Mon, Aug 01, 2022 at 05:33:49PM +0200, netdev@kapio-technology.com >> > wrote: >> > > On 2022-07-13 14:39, Ido Schimmel wrote: >> > > >> > > > >> > > > What are "Storm Prevention" and "zero-DPV" FDB entries? >> > > > >> > > >> > > For the zero-DPV entries, I can summarize: >> > > >> > > Since a CPU can become saturated from constant SA Miss Violations >> > > from a >> > > denied source, source MAC address are masked by loading a zero-DPV >> > > (Destination Port Vector) entry in the ATU. As the address now >> > > appears in >> > > the database it will not cause more Miss Violations. ANY port trying >> > > to send >> > > a frame to this unauthorized address is discarded. Any locked port >> > > trying to >> > > use this unauthorized address has its frames discarded too (as the >> > > ports SA >> > > bit is not set in the ATU entry). >> > >> > What happens to unlocked ports that have learning enabled and are trying >> > to use this address as SMAC? AFAICT, at least in the bridge driver, the >> > locked entry will roam, but will keep the "locked" flag, which is >> > probably not what we want. Let's see if we can agree on these semantics >> > for a "locked" entry: >> >> The next version of this will block forwarding to locked entries in >> the >> bridge, so they will behave like the zero-DPV entries. > > I'm talking about roaming, not forwarding. Let's say you have a locked > entry with MAC X pointing to port Y. Now you get a packet with SMAC X > from port Z which is unlocked. Will the FDB entry roam to port Z? I > think it should, but at least in current implementation it seems that > the "locked" flag will not be reset and having locked entries pointing > to an unlocked port looks like a bug. > Remember that zero-DPV entries blackhole (mask) the MAC, so whenever a packet appears with the same MAC on another port it is just dropped in the HW, so there is no possibility of doing any CPU processing in this case. Only after the timeout (5 min) can the MAC get a normal ATU on an open port. For the bridge to do what you suggest, a FDB search would be needed afaics, and this might be in a process sensitive part of the code, thus leading to too heavy a cost. >> >> > >> > 1. It discards packets with matching DMAC, regardless of ingress port. I >> > read the document [1] you linked to in a different reply and could not >> > find anything against this approach, so this might be fine or at least >> > not very significant. >> > >> > Note that this means that "locked" entries need to be notified to device >> > drivers so that they will install a matching entry in the HW FDB. >> >> Okay, so as V4 does (just without the error noted). >> >> > >> > 2. It is not refreshed and has ageing enabled. That is, after initial >> > installation it will be removed by the bridge driver after configured >> > ageing time unless converted to a regular (unlocked) entry. >> > >> > I assume this allows you to remove the timer implementation from your >> > driver and let the bridge driver notify you about the removal of this >> > entry. >> >> Okay, but only if the scheme is not so that the driver creates the >> locked >> entries itself, unless you indicate that the driver notifies the >> bridge, >> which then notifies back to the driver and installs the zero-DPV >> entry? If >> not I think the current implementation for the mv88e6xxx is fine. > > I don't see a problem in having the driver notifying the bridge about > the installation of this entry and the bridge notifying the driver that > the entry needs to be removed. It removes complexity from device > drivers > like mv88e6xxx and doesn't add extra complexity to the bridge driver. > > Actually, there is one complication, 'SWITCHDEV_FDB_ADD_TO_BRIDGE' will > add the locked entry as externally learned, which means the bridge will > not age it. Might need something like this: > > diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c > index e7f4fccb6adb..5f73d0b44ed9 100644 > --- a/net/bridge/br_fdb.c > +++ b/net/bridge/br_fdb.c > @@ -530,7 +530,8 @@ void br_fdb_cleanup(struct work_struct *work) > unsigned long this_timer = f->updated + delay; > > if (test_bit(BR_FDB_STATIC, &f->flags) || > - test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags)) { > + (test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags) && > + !test_bit(BR_FDB_ENTRY_LOCKED, &f->flags))) { > if (test_bit(BR_FDB_NOTIFY, &f->flags)) { > if (time_after(this_timer, now)) > work_delay = min(work_delay, > There is a case of ownership of the FDB/ATU entry, which if I remember correctly, will point to the current implementation being the right way to do it, thus having the driver keeping ownership of the entry and thereby also ageing it, but I think Vladimir should have his say here. >> >> > >> > 3. With regards to roaming, the entry cannot roam between locked ports >> > (they need to have learning disabled anyway), but can roam to an >> > unlocked port, in which case it becomes a regular entry that can roam >> > and age. >> > >> > If we agree on these semantics, then I can try to verify that at least >> > Spectrum can support them (it seems mv88e6xxx can). >> >> The consensus here is that at least for the mv88e6xxx, learning should >> be on >> and link local learning should be blocked by the userspace setting you >> pointed to earlier. > > Why learning needs to be on in the bridge (not mv88e6xxx) driver? I think it is seen as 'cheating' to enable learning only in the driver behind the scenes, so kind of hackish. E.g. 'bridge -d link show' will then report 'learning off', while learning is on in the driver. And learning needs to be on for the driver as discussed earlier, which only gives rise to the link local learning problem, which is then solved by the user space setting.