From: Tim Chen <tim.c.chen@linux.intel.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Tim Chen <tim.c.chen@linux.intel.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@intel.com>,
Ying Huang <ying.huang@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
David Rientjes <rientjes@google.com>,
Shakeel Butt <shakeelb@google.com>,
linux-mm@kvack.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory
Date: Mon, 5 Apr 2021 10:08:24 -0700
Message-ID: <cover.1617642417.git.tim.c.chen@linux.intel.com>
Traditionally, all memory is DRAM. Some DRAM might be closer/faster than
other DRAM NUMA-wise, but a byte of media has about the same cost whether it
is close or far. But with new memory tiers such as Persistent Memory
(PMEM), there is a choice between fast/expensive DRAM and slow/cheap
PMEM.
The fast/expensive memory lives in the top tier of the memory hierarchy.
Previously, the patchset
[PATCH 00/10] [v7] Migrate Pages in lieu of discard
https://lore.kernel.org/linux-mm/20210401183216.443C4443@viggo.jf.intel.com/
provides a mechanism to demote cold pages from a DRAM node into PMEM.
And the patchset
[PATCH 0/6] [RFC v6] NUMA balancing: optimize memory placement for memory tiering system
https://lore.kernel.org/linux-mm/20210311081821.138467-1-ying.huang@intel.com/
provides a mechanism to promote hot pages in PMEM to the DRAM node
leveraging autonuma.
The two patchsets together keep the hot pages in DRAM and colder pages
in PMEM.
To make fine-grained, cgroup-based management of the precious top tier
DRAM memory possible, this patchset adds a few new features:
1. Provides memory monitors on the amount of top tier memory used per cgroup
and by the system as a whole.
2. Applies soft limits on the top tier memory each cgroup uses.
3. Enables kswapd to demote top tier pages from cgroups with excess top
tier memory usage.
This allows us to provision a different amount of top tier memory to each
cgroup according to the cgroup's latency needs.
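The three features above can be pictured with a small user-space model. This is only a sketch, not kernel code: the `Cgroup` class, its field names, and the rule of picking the cgroup with the largest excess over its soft limit are assumptions made for illustration, loosely modeled on how the existing cgroup v1 soft limit logic orders cgroups by excess usage.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Cgroup:
    """Hypothetical stand-in for a memory cgroup (not a kernel structure)."""
    name: str
    toptier_usage: int       # pages currently charged in top tier memory
    toptier_soft_limit: int  # soft limit on top tier pages

    def excess(self) -> int:
        # Pages above the soft limit; zero when the cgroup is within it.
        return max(0, self.toptier_usage - self.toptier_soft_limit)


def total_toptier_usage(cgroups: Iterable[Cgroup]) -> int:
    """Feature 1: system-wide view built from the per-cgroup counters."""
    return sum(cg.toptier_usage for cg in cgroups)


def pick_demotion_target(cgroups: Iterable[Cgroup]) -> Optional[Cgroup]:
    """Features 2 and 3: choose the cgroup furthest over its soft limit.

    Returns None when no cgroup exceeds its soft limit, in which case
    there is nothing to demote on behalf of the soft limit.
    """
    over = [cg for cg in cgroups if cg.excess() > 0]
    if not over:
        return None
    return max(over, key=lambda cg: cg.excess())
```

For example, with a latency-sensitive cgroup at 900 of 1000 pages and a batch cgroup at 700 of 200 pages, `pick_demotion_target()` returns the batch cgroup, so its pages would be demoted first while the latency-sensitive cgroup keeps its top tier memory.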
The patchset is based on the cgroup v1 interface. One shortcoming of the v1
interface is that the limit on the cgroup is a soft limit, so a cgroup can
exceed the limit by quite a bit before reclaim via page demotion reins
it in.
We are also working on a cgroup v2 control interface that will
have a max limit on the top tier memory per cgroup, but it requires much
additional logic to fall back and allocate from non top tier memory when a
cgroup reaches the maximum limit. This simpler cgroup v1 implementation,
with all its warts, is used to illustrate the concept of cgroup-based
top tier memory management and serves as a starting point for discussion.
The soft limit and soft reclaim logic in this patchset will be similar to what
we would do for a cgroup v2 interface when a cgroup reaches the high watermark
for top tier usage.
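The difference between a soft limit and a max limit can be illustrated with a toy simulation. It is a sketch under stated assumptions, not a description of the kernel implementation: the `simulate()` function, its parameters, the fixed allocation rate, and the periodic demotion pass standing in for kswapd are all hypothetical.

```python
def simulate(limit, alloc_per_tick, ticks, reclaim_every, demote_batch, hard=False):
    """Return (peak top tier usage, pages that ended up in the lower tier).

    Soft mode: allocations always land in the top tier; every
    `reclaim_every` ticks a demotion pass moves up to `demote_batch`
    pages of the excess into the lower tier.
    Hard mode: allocations that would exceed `limit` fall back to the
    lower tier at allocation time, so no demotion pass is needed.
    """
    toptier = 0
    lower = 0
    peak = 0
    for tick in range(1, ticks + 1):
        if hard:
            # Place what fits under the limit, fall back for the rest.
            room = max(0, limit - toptier)
            placed = min(room, alloc_per_tick)
            toptier += placed
            lower += alloc_per_tick - placed
        else:
            # A soft limit never refuses a top tier allocation.
            toptier += alloc_per_tick
        peak = max(peak, toptier)
        if not hard and tick % reclaim_every == 0:
            # Periodic demotion trims only part of the excess per pass.
            demoted = min(demote_batch, max(0, toptier - limit))
            toptier -= demoted
            lower += demoted
    return peak, lower
```

With `limit=100`, `alloc_per_tick=30`, `ticks=8`, `reclaim_every=4` and `demote_batch=50`, the soft mode peaks at 220 top tier pages, more than twice the limit, while the hard mode never goes above 100. The price of the hard mode is the fallback allocation path, which is the additional logic mentioned above.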
This patchset is applied on top of
[PATCH 00/10] [v7] Migrate Pages in lieu of discard
and
[PATCH 0/6] [RFC v6] NUMA balancing: optimize memory placement for memory tiering system
It is part of a larger patchset. You can play with the complete set of patches
using the tree:
https://git.kernel.org/pub/scm/linux/kernel/git/vishal/tiering.git/log/?h=tiering-0.71
Tim Chen (11):
mm: Define top tier memory node mask
mm: Add soft memory limit for mem cgroup
mm: Account the top tier memory usage per cgroup
mm: Report top tier memory usage in sysfs
mm: Add soft_limit_top_tier tree for mem cgroup
mm: Handle top tier memory in cgroup soft limit memory tree utilities
mm: Account the total top tier memory in use
mm: Add toptier option for mem_cgroup_soft_limit_reclaim()
mm: Use kswapd to demote pages when toptier memory is tight
mm: Set toptier_scale_factor via sysctl
mm: Wakeup kswapd if toptier memory need soft reclaim
Documentation/admin-guide/sysctl/vm.rst | 12 +
drivers/base/node.c | 2 +
include/linux/memcontrol.h | 20 +-
include/linux/mm.h | 4 +
include/linux/mmzone.h | 7 +
include/linux/nodemask.h | 1 +
include/linux/vmstat.h | 18 ++
kernel/sysctl.c | 10 +
mm/memcontrol.c | 303 +++++++++++++++++++-----
mm/memory_hotplug.c | 3 +
mm/migrate.c | 1 +
mm/page_alloc.c | 36 ++-
mm/vmscan.c | 73 +++++-
mm/vmstat.c | 22 +-
14 files changed, 444 insertions(+), 68 deletions(-)
--
2.20.1
Thread overview: 34+ messages
2021-04-05 17:08 Tim Chen [this message]
2021-04-05 17:08 ` [RFC PATCH v1 01/11] mm: Define top tier memory node mask Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 02/11] mm: Add soft memory limit for mem cgroup Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 03/11] mm: Account the top tier memory usage per cgroup Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 04/11] mm: Report top tier memory usage in sysfs Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 05/11] mm: Add soft_limit_top_tier tree for mem cgroup Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 06/11] mm: Handle top tier memory in cgroup soft limit memory tree utilities Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 07/11] mm: Account the total top tier memory in use Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 08/11] mm: Add toptier option for mem_cgroup_soft_limit_reclaim() Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 09/11] mm: Use kswapd to demote pages when toptier memory is tight Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 10/11] mm: Set toptier_scale_factor via sysctl Tim Chen
2021-04-05 17:08 ` [RFC PATCH v1 11/11] mm: Wakeup kswapd if toptier memory need soft reclaim Tim Chen
2021-04-06 9:08 ` [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory Michal Hocko
2021-04-07 22:33 ` Tim Chen
2021-04-08 11:52 ` Michal Hocko
2021-04-09 23:26 ` Tim Chen
2021-04-12 19:20 ` Shakeel Butt
2021-04-14 8:59 ` Jonathan Cameron
2021-04-15 0:42 ` Tim Chen
2021-04-13 2:15 ` Huang, Ying
2021-04-13 8:33 ` Michal Hocko
2021-04-12 14:03 ` Shakeel Butt
2021-04-08 17:18 ` Shakeel Butt
2021-04-08 18:00 ` Yang Shi
2021-04-08 20:29 ` Shakeel Butt
2021-04-08 20:50 ` Yang Shi
2021-04-12 14:03 ` Shakeel Butt
2021-04-09 7:24 ` Michal Hocko
2021-04-15 22:31 ` Tim Chen
2021-04-16 6:38 ` Michal Hocko
2021-04-14 23:22 ` Tim Chen
2021-04-09 2:58 ` Huang, Ying
2021-04-09 20:50 ` Yang Shi
2021-04-15 22:25 ` Tim Chen