From: Michal Hocko <mhocko@suse.cz>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Eric Dumazet <edumazet@google.com>,
	David Rientjes <rientjes@google.com>, Shaohua Li <shli@fb.com>,
	netdev <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	kernel-team <Kernel-team@fb.com>,
	clm@fb.com, linux-mm@kvack.org, dbavatar@gmail.com
Subject: Re: [RFC V3] net: don't wait for order-3 page allocation
Date: Thu, 18 Jun 2015 17:47:16 +0200
Message-ID: <20150618154716.GH5858@dhcp22.suse.cz>
In-Reply-To: <5582E240.8080704@suse.cz>

On Thu 18-06-15 17:22:40, Vlastimil Babka wrote:
> On 06/18/2015 04:43 PM, Michal Hocko wrote:
> >On Thu 18-06-15 07:35:53, Eric Dumazet wrote:
> >>On Thu, Jun 18, 2015 at 7:30 AM, Michal Hocko <mhocko@suse.cz> wrote:
> >>
> >>>Abusing __GFP_NO_KSWAPD is a wrong way to go IMHO. It is true that the
> >>>_current_ implementation of the allocator has this nasty and very subtle
> >>>side effect but that doesn't mean it should be abused outside of the mm
> >>>proper. Why shouldn't this path wake the kswapd and let it compact
> >>>memory on the background to increase the success rate for the later
> >>>high order allocations?
> >>
> >>I kind of agree.
> >>
> >>If kswapd is a problem (is it ???) we should fix it, instead of adding
> >>yet another flag to some random locations attempting
> >>memory allocations.
> >
> >No, kswapd is not a problem. The problem is ~__GFP_WAIT allocation can
> >access some portion of the memory reserves (see gfp_to_alloc_flags resp.
> >__zone_watermark_ok and ALLOC_HARDER). __GFP_NO_KSWAPD is just a dirty
> >hack to not give that access which was introduced for THP AFAIR.
> >
> >The implicit access to memory reserves for non sleeping allocation has
> >been there for ages and it might be not suitable for this particular
> >path but that doesn't mean another gfp flag with a different side effect
> >should be hijacked. We should either stop doing that implicit access to
> >memory reserves and give __GFP_RESERVE or add the __GFP_NORESERVE. But
> >that is a problem to be solved in the mm proper. Spreading subtle
> >dependencies outside of mm will just make situation worse.
>
> So you are not proposing to use these __GFP_RESERVE/NORESERVE flag outside
> of mm, right? (besides, we distinguish several kinds of reserves, so what
> exactly would the flag do?)

That is to be discussed. Most allocations already express their interest
in memory reserves by __GFP_HIGH directly or by GFP_ATOMIC indirectly.
So maybe we do not need any additional flag here. There are not that
many ~__GFP_WAIT allocations and most of them seem to require it _only_
because the context doesn't allow for sleeping (e.g. to prevent
deadlocks).

> As that would be also subtle dependency. The
> general problem I think is that we should want the mm users to specify
> higher-level intentions (such as GFP_KERNEL) which would map to specific
> directions (__GFP_*) for the allocator, and currently it's rather a mess of
> both kinds of flags.

I agree. So I think that maybe we should drop that implicit access to
memory reserves for ~__GFP_WAIT allocations and let it do what it is
documented to do.

> Clearly the intention here is "opportunistic allocation that should
> not reclaim/compact, use reserves, wake up kswapd (?) because it's
> better to fall back to smaller pages than wait") and we don't seem to
> have a GFP_OPPORTUNISTIC flag for that. The allocation has to then
> mask out __GFP_WAIT which however looks like an atomic allocation to
> the allocator and give access to reserves, etc...

I think simply dropping __GFP_WAIT is a good way to express that. The
fact that the current implementation gives access to memory reserves
implicitly is just a detail and the user of the allocator shouldn't care
about that.

--
Michal Hocko
SUSE Labs
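For readers who want to see the side effect discussed above in concrete terms, here is a minimal userspace sketch of it. It is not the kernel code: the bit values are made up for illustration, the function names model_alloc_flags() and model_effective_min() are hypothetical, and the reserve fractions (one half for ALLOC_HIGH, a further quarter for ALLOC_HARDER) are an assumption about how __zone_watermark_ok behaved in kernels of that period. It only models the point made in the thread: an allocation without __GFP_WAIT is treated as atomic and gets ALLOC_HARDER, unless __GFP_NO_KSWAPD is also set.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real kernel definitions differ. */
#define GFP_WAIT_BIT       0x01u  /* stands in for __GFP_WAIT */
#define GFP_HIGH_BIT       0x02u  /* stands in for __GFP_HIGH */
#define GFP_NO_KSWAPD_BIT  0x04u  /* stands in for __GFP_NO_KSWAPD */

#define ALLOC_HIGH    0x01u
#define ALLOC_HARDER  0x02u

/* Hypothetical, simplified model of the gfp-to-alloc-flags mapping. */
static unsigned int model_alloc_flags(unsigned int gfp_mask)
{
	unsigned int alloc_flags = 0;
	/* "atomic": cannot sleep and did not opt out via the kswapd flag */
	bool atomic = !(gfp_mask & (GFP_WAIT_BIT | GFP_NO_KSWAPD_BIT));

	if (gfp_mask & GFP_HIGH_BIT)
		alloc_flags |= ALLOC_HIGH;   /* explicit request for reserves */
	if (atomic)
		alloc_flags |= ALLOC_HARDER; /* the implicit reserve access */
	return alloc_flags;
}

/* Assumed fractions: how far below the min watermark each flag may dip. */
static long model_effective_min(long min_wmark, unsigned int alloc_flags)
{
	long min = min_wmark;

	assert(min_wmark >= 0);
	if (alloc_flags & ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & ALLOC_HARDER)
		min -= min / 4;
	return min;
}
```

In this model, a caller that merely masks out the wait bit ends up with a lower effective watermark than a sleeping caller, while adding the no-kswapd bit restores the full watermark. That is the subtle dependency the thread argues should not leak outside of mm.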