Date: Thu, 18 Jun 2015 17:47:16 +0200
From: Michal Hocko
Subject: Re: [RFC V3] net: don't wait for order-3 page allocation
Message-ID: <20150618154716.GH5858@dhcp22.suse.cz>
References: <0099265406c32b9b9057de100404a4148d602cdd.1434066549.git.shli@fb.com> <557AA834.8070503@suse.cz> <20150618143019.GE5858@dhcp22.suse.cz> <20150618144311.GF5858@dhcp22.suse.cz> <5582E240.8080704@suse.cz>
In-Reply-To: <5582E240.8080704@suse.cz>
To: Vlastimil Babka
Cc: Eric Dumazet, David Rientjes, Shaohua Li, netdev, David Miller, kernel-team, clm@fb.com, linux-mm@kvack.org, dbavatar@gmail.com

On Thu 18-06-15 17:22:40, Vlastimil Babka wrote:
> On 06/18/2015 04:43 PM, Michal Hocko wrote:
> >On Thu 18-06-15 07:35:53, Eric Dumazet wrote:
> >>On Thu, Jun 18, 2015 at 7:30 AM, Michal Hocko wrote:
> >>
> >>>Abusing __GFP_NO_KSWAPD is a wrong way to go IMHO. It is true that the
> >>>_current_ implementation of the allocator has this nasty and very subtle
> >>>side effect but that doesn't mean it should be abused outside of the mm
> >>>proper. Why shouldn't this path wake the kswapd and let it compact
> >>>memory on the background to increase the success rate for the later
> >>>high order allocations?
> >>
> >>I kind of agree.
> >>
> >>If kswapd is a problem (is it ???)
> >>we should fix it, instead of adding
> >>yet another flag to some random locations attempting
> >>memory allocations.
> >
> >No, kswapd is not a problem. The problem is ~__GFP_WAIT allocation can
> >access some portion of the memory reserves (see gfp_to_alloc_flags resp.
> >__zone_watermark_ok and ALLOC_HARDER). __GFP_NO_KSWAPD is just a dirty
> >hack to not give that access which was introduced for THP AFAIR.
> >
> >The implicit access to memory reserves for non sleeping allocation has
> >been there for ages and it might be not suitable for this particular
> >path but that doesn't mean another gfp flag with a different side effect
> >should be hijacked. We should either stop doing that implicit access to
> >memory reserves and give __GFP_RESERVE or add the __GFP_NORESERVE. But
> >that is a problem to be solved in the mm proper. Spreading subtle
> >dependencies outside of mm will just make situation worse.
>
> So you are not proposing to use these __GFP_RESERVE/NORESERVE flag outside
> of mm, right? (besides, we distinguish several kinds of reserves, so what
> exactly would the flag do?)

That is to be discussed. Most allocations already express their interest
in memory reserves by __GFP_HIGH directly or by GFP_ATOMIC indirectly.
So maybe we do not need any additional flag here. There are not that
many ~__GFP_WAIT users and most of them seem to require it _only_ because
the context doesn't allow for sleeping (e.g. to prevent deadlocks).

> As that would be also subtle dependency. The
> general problem I think is that we should want the mm users to specify
> higher-level intentions (such as GFP_KERNEL) which would map to specific
> directions (__GFP_*) for the allocator, and currently it's rather a mess of
> both kinds of flags.

I agree. So I think that maybe we should drop that implicit access to
memory reserves for ~__GFP_WAIT allocations and let it do what it is
documented to do.
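To make the side effect under discussion concrete, here is a rough userspace model of what gfp_to_alloc_flags and __zone_watermark_ok do with these flags. It is only a sketch: the M_* bit values and the model_* function names are made up for illustration, the real functions handle many more cases (cpusets, rt tasks, __GFP_NOMEMALLOC, per-order checks), and the 1/2 and 1/4 reductions are my reading of the allocator at the time of this thread rather than a stable interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; they do not match the kernel's. */
#define M_GFP_WAIT      0x1u
#define M_GFP_HIGH      0x2u
#define M_GFP_NO_KSWAPD 0x4u

#define M_ALLOC_HIGH    0x1u
#define M_ALLOC_HARDER  0x2u

/* Hypothetical model of the gfp -> alloc_flags translation. */
static unsigned int model_gfp_to_alloc_flags(unsigned int gfp)
{
	unsigned int alloc_flags = 0;
	/* "atomic" means: cannot wait AND did not set __GFP_NO_KSWAPD */
	bool atomic = !(gfp & (M_GFP_WAIT | M_GFP_NO_KSWAPD));

	if (gfp & M_GFP_HIGH)
		alloc_flags |= M_ALLOC_HIGH;
	if (atomic)
		alloc_flags |= M_ALLOC_HARDER; /* the implicit reserve access */
	return alloc_flags;
}

/* Hypothetical model of how the min watermark is lowered per alloc flag. */
static long model_effective_min(long min, unsigned int alloc_flags)
{
	assert(min >= 0);
	if (alloc_flags & M_ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & M_ALLOC_HARDER)
		min -= min / 4;
	return min;
}
```

In this model, clearing only M_GFP_WAIT lowers the effective watermark by a quarter, while additionally setting M_GFP_NO_KSWAPD leaves it untouched. That is the subtle dependency: a flag named after kswapd ends up controlling reserve access.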
> Clearly the intention here is "opportunistic allocation that should
> not reclaim/compact, use reserves, wake up kswapd (?) because it's
> better to fall back to smaller pages than wait") and we don't seem to
> have a GFP_OPPORTUNISTIC flag for that. The allocation has to then
> mask out __GFP_WAIT which however looks like an atomic allocation to
> the allocator and give access to reserves, etc...

I think simply dropping __GFP_WAIT is a good way to express that. The fact
that the current implementation gives access to memory reserves
implicitly is just a detail and the user of the allocator shouldn't care
about that.
-- 
Michal Hocko
SUSE Labs
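For illustration, the "opportunistic" pattern discussed in this thread (try the high-order allocation without waiting, fall back to a single page otherwise) can be modelled in userspace roughly as follows. Everything here is hypothetical: struct model_allocator, model_alloc_pages, model_frag_refill and the M_* constants are invented for the sketch and are not the kernel's API, and the model deliberately ignores memory reserves and kswapd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define M_GFP_WAIT   0x1u /* illustrative value, not the kernel's */
#define M_FRAG_ORDER 3    /* the order-3 attempt from the patch subject */

/* Hypothetical allocator state for the model. */
struct model_allocator {
	unsigned int largest_free_order; /* biggest contiguous block available now */
	unsigned int stalls;             /* times a caller waited for reclaim/compaction */
};

/* Hypothetical stand-in for the page allocator. */
static bool model_alloc_pages(struct model_allocator *a, unsigned int gfp,
			      unsigned int order)
{
	if (order <= a->largest_free_order)
		return true;
	if (!(gfp & M_GFP_WAIT))
		return false; /* caller refused to wait: fail fast */
	a->stalls++;          /* model: waiting eventually produces the block */
	a->largest_free_order = order;
	return true;
}

/* Returns the order actually obtained, or -1 if nothing was available. */
static int model_frag_refill(struct model_allocator *a, unsigned int gfp)
{
	assert(a != NULL);
	/* Opportunistic attempt: mask out the "may wait" bit. */
	if (model_alloc_pages(a, gfp & ~M_GFP_WAIT, M_FRAG_ORDER))
		return M_FRAG_ORDER;
	/* Fallback: a single page with the caller's original flags. */
	if (model_alloc_pages(a, gfp, 0))
		return 0;
	return -1;
}
```

With largest_free_order below 3, the refill returns order 0 and the stall counter stays at zero, which is the behaviour the caller wants to express by dropping the wait bit. Whether that same bit should also imply anything about memory reserves is the allocator's business, not the caller's.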