* mac80211 and multiple RX queues (with RSS hashing)
@ 2015-06-12  8:31 Johannes Berg
  2015-06-16 13:28 ` Johannes Berg
  0 siblings, 1 reply; 2+ messages in thread
From: Johannes Berg @ 2015-06-12  8:31 UTC (permalink / raw)
  To: linux-wireless

Hi,

We're currently investigating how to best add RSS hashing to the
wireless stack.

Let's say, for the sake of discussion, that we'll add a function to
mac80211 called

ieee80211_rx_napi_mq(struct hw *, struct napi_struct *, struct sk_buff *);

and (depending on the outcome of the design) the driver would have to
set the skb's queue_mapping to the queue this packet was received on.

For context, there are a few requirements for callers of this function -
here are the ones I'm planning:

 * It would only support (non-null) data frames; management frames must
   not be passed to the new API, but must be passed to the regular
   ieee80211_rx() function.
 * I'm not going to support ieee80211_rx_irqsafe() with this API; it
   would be quite pointless (or a lot of code to have per-queue
   tasklets?)
 * Won't support software crypto (PN checking can't really work in
   parallel)
 * Won't support monitor mode, and probably a few other similar things
   (TBD)
 * Won't support defragmentation (fragmented frames must anyway either
   be reassembled for hashing or somehow not hashed perhaps)
 * Won't support client powersave in mac80211 - uh ... just say no to
   that!
 * For now, I'm not planning to support mesh on this, maybe not even
   IBSS.
 * Duplicate detection must be handled by hardware/firmware or similar
   (possibly could be done with HW assist in driver, but clearly that
   cannot be handled generically in mac80211)
 * RX aggregation reordering done by hardware/firmware or similar
   (same here)
 * require a NAPI struct? (not really sure yet how the stack treats
   parallel RX)
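
As a rough illustration of the frame-level restrictions listed above, here is a minimal userspace sketch of a check a driver might run before handing a frame to the multi-queue path. The bit masks follow the 802.11 frame control and sequence control layout; the struct, its fields and the function name are hypothetical and not part of mac80211.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Frame control / sequence control bits per the 802.11 frame format
 * (host byte order here; on the air the fields are little endian). */
#define FCTL_FTYPE      0x000c  /* frame type mask */
#define FCTL_MOREFRAGS  0x0400  /* more fragments follow */
#define FTYPE_DATA      0x0008  /* type == data */
#define STYPE_NODATA    0x0040  /* subtype bit set on null/CF-only frames */
#define SCTL_FRAG       0x000f  /* fragment number mask */

/* Hypothetical, trimmed-down view of a received frame. */
struct rx_frame {
	uint16_t frame_control;
	uint16_t seq_ctrl;
	bool crypto_done;    /* assumption: no software crypto needed */
	bool hw_reordered;   /* assumption: dedup + reordering done by HW/FW */
};

/* Returns true if the frame may take the multi-queue path; anything
 * else would be passed to the regular ieee80211_rx() path instead. */
static bool mq_rx_eligible(const struct rx_frame *f)
{
	assert(f != NULL);

	/* Only data frames that actually carry data (no null frames). */
	if ((f->frame_control & (FCTL_FTYPE | STYPE_NODATA)) != FTYPE_DATA)
		return false;

	/* No defragmentation: reject any fragment of a fragmented frame. */
	if ((f->frame_control & FCTL_MOREFRAGS) || (f->seq_ctrl & SCTL_FRAG))
		return false;

	/* No software crypto, duplicate detection or reordering. */
	return f->crypto_done && f->hw_reordered;
}
```

Mode-level restrictions (monitor, mesh, IBSS, client powersave) are per-interface rather than per-frame, so they are left out of this sketch.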

Of course, these all seem fairly natural so far, given the nature of
using multiple queues.

There are a few other areas that are of more concern:
 a) Statistics
    Obviously, we can no longer have single counters. Using atomic
    counters would kill much of the benefit of having multiple queues
    to start with, so in some way we need to have multiple counters.
 b) handling AP/GO powersaving clients
    With RSS, we can end up with various races - right now we say TX
    status and RX must be serialized by the driver, but clearly that
    can no longer be guaranteed with multiple RX queues.
    [also need to check if there are *other* things that require
    serialization]
 c) aggregation session timeout/reorder timer handling
    There's a single field/timer (for each of these) per session;
    obviously it's not a great idea to hit these from multiple CPUs.

Let's take these one by one:

a) This is probably the biggest one. We have a LOT of statistics that we
keep, and they all rely on RX being serialized. For example, per-station
RX packet and byte counters. Making all of these atomic would be
correct, but would obviously kill much of the performance benefit of
RSS.

As a consequence, I see two possible solutions here:

a1) Just make this the driver's concern, change sta_set_sinfo() to not
provide any (with a few exceptions) mac80211 statistics when multi-queue
RX is used. This could mean the same kind of statistics code is in each
driver, if the driver supports the statistics at all - or it could mean
that we cause a lot of divergence with statistics between drivers.

a2) Alternatively, drivers could tell mac80211 before-hand how many
queues they'll use, pass a queue identifier to mac80211 for each packet
(e.g. in skb's queue_mapping) and have mac80211 gather per-queue
statistics that get combined when read. This means allocating separate
statistics/queue arrays for stations etc. in mac80211, and then using
u64_stats_update_begin() etc. to get a consistent reading like we do
with per-cpu netdev stats already (since my fairly recent patch.)
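
To make option a2 more concrete, here is a minimal userspace sketch of per-queue counters that are only summed when read. All type and function names are hypothetical. The kernel version would wrap the update in u64_stats_update_begin()/u64_stats_update_end() and have readers retry via u64_stats_fetch_begin()/u64_stats_fetch_retry(), which this single-threaded sketch only mentions in comments.

```c
#include <assert.h>
#include <stdint.h>

#define MQ_MAX_QUEUES 16  /* hypothetical upper bound */

/* One slot per RX queue; only that queue's context ever writes it. */
struct mq_rx_stats {
	uint64_t packets;
	uint64_t bytes;
};

/* Hypothetical per-station container. */
struct mq_sta_stats {
	unsigned int num_queues;              /* announced by the driver */
	struct mq_rx_stats q[MQ_MAX_QUEUES];
};

static void mq_stats_init(struct mq_sta_stats *s, unsigned int num_queues)
{
	assert(num_queues >= 1 && num_queues <= MQ_MAX_QUEUES);
	*s = (struct mq_sta_stats){ .num_queues = num_queues };
}

/* RX path: 'queue' would come from the skb's queue_mapping. */
static void mq_stats_rx(struct mq_sta_stats *s, unsigned int queue,
			uint32_t len)
{
	assert(queue < s->num_queues);
	/* Kernel: u64_stats_update_begin() ... u64_stats_update_end(). */
	s->q[queue].packets++;
	s->q[queue].bytes += len;
}

/* Read side: fold all per-queue slots into one total. */
static struct mq_rx_stats mq_stats_read(const struct mq_sta_stats *s)
{
	struct mq_rx_stats sum = { 0, 0 };

	for (unsigned int i = 0; i < s->num_queues; i++) {
		/* Kernel: loop on u64_stats_fetch_begin()/_retry() here
		 * to get a consistent snapshot of each slot. */
		sum.packets += s->q[i].packets;
		sum.bytes += s->q[i].bytes;
	}
	return sum;
}
```

Because each queue writes only its own slot, the RX path needs no atomics; the cost moves to the (rare) read side. In real code each slot would probably also be cache-line aligned to avoid false sharing between CPUs.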

Clearly this cannot support the "average" values like "average signal
strength", and the "last packet signal strength" might not always be
really the very last packet (which doesn't really matter though). Those
would still have to be excluded and generated by the driver, but it
could mean a bit more code sharing (if more than our driver ends up
using this facility) and more consistency. The downside might be that if
drivers want to do statistics in the firmware or so, they'd waste the
extra cycles on the host. I don't think we're planning to do that for
now though.


b) I think this one is pretty simple - just require the driver to set
AP_LINK_PS and if necessary call ieee80211_sta_ps_transition(). However,
it might require adding more logic for U-APSD support depending on the
hardware design.


c) I'm not really sure about this. I think it really needs hardware
assist.


Does anyone else have any thoughts?

johannes



* Re: mac80211 and multiple RX queues (with RSS hashing)
  2015-06-12  8:31 mac80211 and multiple RX queues (with RSS hashing) Johannes Berg
@ 2015-06-16 13:28 ` Johannes Berg
  0 siblings, 0 replies; 2+ messages in thread
From: Johannes Berg @ 2015-06-16 13:28 UTC (permalink / raw)
  To: linux-wireless

On Fri, 2015-06-12 at 10:31 +0200, Johannes Berg wrote:

>  * It would only support (non-null) data frames, management frames must
> not be
>    passed to the new API, but must be passed to the regular
> ieee80211_rx()
>    function.

Sorry about the line breaking - not sure how that happened.

Let me recap these points:
> * only real data frames allowed in "MQ RX" API
> * no _irqsafe variant
> * no software crypto
> * no monitor mode (TBD)

This makes no sense, I think I'll leave monitor mode.

> * no defragmentation
> * no client powersave support
> * no mesh, perhaps no IBSS
> * aggregation reorder already done
> * require NAPI struct (TBD)

Another item we might add:
 * no TDLS ethertype frames

Those are more control than data, so they shouldn't really go here. It
wouldn't be hard to support them going there, it just introduces race
conditions.

>  b) handling AP/GO powersaving clients
>     With RSS, we can end up with various races - right now we say TX
>     status and RX must be serialized by the driver, but clearly that
>     can no longer be guaranteed with multiple RX queues.

I'm just going to ignore this for now and disallow AP mode unless the HW
offload is enabled.

>     [also need to check if there are *other* things that require
>     serialization]

TBD

>  c) aggregation session timeout/reorder timer handling
>     There's a single field/timer (for each of these) per session;
>     obviously it's not a great idea to hit these from multiple CPUs.

The reorder timeout is moot since we require already well-ordered
frames. Therefore, we can't even have it. Driver developers will have to
think about how to implement that though.

The aggregation session timeout handling actually doesn't really change,
just the data structure needs to be 'exploded' into per-queue structures
instead.
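
One possible reading of "exploding" the session timeout state into per-queue structures, as a minimal userspace sketch: every queue records its own last-RX timestamp, and the timer path looks at the most recent one. The names and layout are hypothetical, and timestamps are plain integers from an assumed monotonic clock rather than jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MQ_MAX_QUEUES 16  /* hypothetical upper bound */

/* Hypothetical per-session state: one "last RX" timestamp per queue
 * instead of a single shared field. */
struct mq_agg_session {
	unsigned int num_queues;
	uint64_t timeout;                  /* same units as 'now' */
	uint64_t last_rx[MQ_MAX_QUEUES];   /* written only by its own queue */
};

static void mq_agg_init(struct mq_agg_session *s, unsigned int num_queues,
			uint64_t timeout, uint64_t now)
{
	assert(num_queues >= 1 && num_queues <= MQ_MAX_QUEUES);
	s->num_queues = num_queues;
	s->timeout = timeout;
	for (unsigned int i = 0; i < MQ_MAX_QUEUES; i++)
		s->last_rx[i] = now;
}

/* RX path: each queue touches only its own slot. */
static void mq_agg_rx(struct mq_agg_session *s, unsigned int queue,
		      uint64_t now)
{
	assert(queue < s->num_queues);
	s->last_rx[queue] = now;
}

/* Timer path: the session has expired only if no queue has seen
 * traffic within the timeout. */
static bool mq_agg_expired(const struct mq_agg_session *s, uint64_t now)
{
	uint64_t latest = 0;

	for (unsigned int i = 0; i < s->num_queues; i++)
		if (s->last_rx[i] > latest)
			latest = s->last_rx[i];

	return now >= latest && now - latest >= s->timeout;
}
```

The timer logic itself stays the same as with a single field; only the "when did we last receive something" question now takes the maximum over all queues.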

> a2) Alternatively, drivers could tell mac80211 before-hand how many
> queues they'll use, pass a queue identifier to mac80211 for each packet
> (e.g. in skb's queue_mapping) and have mac80211 gather per-queue
> statistics that get combined when read. This means allocating separate
> statistics/queue arrays for stations etc. in mac80211, and then using
> u64_stats_update_begin() etc. to get a consistent reading like we do
> with per-cpu netdev stats already (since my fairly recent patch.)

I'm going to go with this one, but block a few things like 'average
signal' and require drivers to implement those if needed.

johannes

