* [RFC V3] net: don't wait for order-3 page allocation
From: Shaohua Li @ 2015-06-11 23:50 UTC
  To: netdev; +Cc: davem, Kernel-team, clm, linux-mm, dbavatar, Eric Dumazet

We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and adds latency. Commit 5640f7685831e0
introduced the order-3 allocation. According to its changelog, the order-3
allocation is not a must-have; it is only there to improve performance. Direct
memory compaction, however, has high overhead, and the benefit of the order-3
allocation does not compensate for it.

This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the allocation will still succeed, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction is not triggered and skb_page_frag_refill falls
back to order-0 immediately, so the direct memory compaction overhead is
avoided. In the allocation failure case, kswapd is woken up to do
compaction, so chances are the allocation will succeed next time.

alloc_skb_with_frags is changed in the same way.
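
The fallback described above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not kernel code: every MOCK_*
flag, struct mock_allocator, mock_alloc_pages() and mock_frag_refill() is a
hypothetical stand-in, the flag values are arbitrary, and the allocator model
is deliberately simplified (order-0 always succeeds, and direct compaction
always succeeds when the caller is allowed to wait).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
typedef unsigned int mock_gfp_t;
#define MOCK_GFP_WAIT    0x01u	/* caller may sleep: direct compaction allowed */
#define MOCK_GFP_COMP    0x02u
#define MOCK_GFP_NOWARN  0x04u
#define MOCK_GFP_NORETRY 0x08u

#define MOCK_FRAG_PAGE_ORDER 3

/* Simplified allocator state (an assumption of this sketch). */
struct mock_allocator {
	bool fragmented;			/* no free high-order block available */
	unsigned int direct_compactions;	/* synchronous compactions paid by callers */
	unsigned int kswapd_wakeups;		/* background compaction requests */
};

struct mock_frag {
	int order;	/* order of the block backing the fragment */
};

/* Model of a page allocation; returns true on success. */
static bool mock_alloc_pages(struct mock_allocator *a, mock_gfp_t gfp, int order)
{
	if (order == 0 || !a->fragmented)
		return true;		/* fast path, nothing to compact */

	a->kswapd_wakeups++;		/* background work is requested either way */
	if (gfp & MOCK_GFP_WAIT) {
		a->direct_compactions++;	/* the expensive synchronous path */
		return true;
	}
	return false;			/* atomic request fails right away */
}

/*
 * Model of the refill logic. With patched == true the wait bit is cleared
 * for the high-order attempt only; the order-0 fallback keeps the caller's
 * original mask.
 */
static bool mock_frag_refill(struct mock_allocator *a, struct mock_frag *f,
			     mock_gfp_t gfp, bool patched)
{
	mock_gfp_t high = patched ? (gfp & ~MOCK_GFP_WAIT) : gfp;

	high |= MOCK_GFP_COMP | MOCK_GFP_NOWARN | MOCK_GFP_NORETRY;
	if (mock_alloc_pages(a, high, MOCK_FRAG_PAGE_ORDER)) {
		f->order = MOCK_FRAG_PAGE_ORDER;
		return true;
	}
	if (mock_alloc_pages(a, gfp, 0)) {
		f->order = 0;
		return true;
	}
	return false;
}
```

In this model, a refill with MOCK_GFP_WAIT set on fragmented memory ends up
with an order-0 block and no direct compaction when patched is true, while
patched == false gets the order-3 block at the cost of one direct compaction.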

The Mellanox driver does a similar thing; if this is accepted, we must fix
the driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet <edumazet@google.com>
Cc: Chris Mason <clm@fb.com>
Cc: Debabrata Banerjee <dbavatar@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
 net/core/skbuff.c | 2 +-
 net/core/sock.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3cfff2a..41ec022 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4398,7 +4398,7 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
 
 		while (order) {
 			if (npages >= 1 << order) {
-				page = alloc_pages(gfp_mask |
+				page = alloc_pages((gfp_mask & ~__GFP_WAIT) |
 						   __GFP_COMP |
 						   __GFP_NOWARN |
 						   __GFP_NORETRY,
diff --git a/net/core/sock.c b/net/core/sock.c
index 292f422..e9855a4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1883,7 +1883,7 @@ bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)
 
 	pfrag->offset = 0;
 	if (SKB_FRAG_PAGE_ORDER) {
-		pfrag->page = alloc_pages(gfp | __GFP_COMP |
+		pfrag->page = alloc_pages((gfp & ~__GFP_WAIT) | __GFP_COMP |
 					  __GFP_NOWARN | __GFP_NORETRY,
 					  SKB_FRAG_PAGE_ORDER);
 		if (likely(pfrag->page)) {
-- 
1.8.1

