From: David Rientjes <rientjes@google.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
Michal Hocko <mhocko@suse.com>,
Dan Williams <dan.j.williams@intel.com>,
John Hubbard <jhubbard@nvidia.com>, Zi Yan <ziy@nvidia.com>,
Bharata B Rao <bharata@amd.com>,
Dave Jiang <dave.jiang@intel.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Alistair Popple <apopple@nvidia.com>,
Christoph Lameter <cl@gentwo.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Mel Gorman <mgorman@suse.de>, Jon Grimm <jon.grimm@amd.com>,
Gregory Price <gourry.memverge@gmail.com>,
Wei Xu <weixugc@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
SeongJae Park <sj@kernel.org>,
David Hildenbrand <david@redhat.com>,
Davidlohr Bueso <dave@stgolabs.net>,
Yuanchu Xie <yuanchu@google.com>
Subject: Re: [LSF/MM/BPF TOPIC] Locally attached memory tiering
Date: Thu, 9 May 2024 20:10:07 -0700 (PDT)
Message-ID: <4e326d6a-b4d2-0617-97fe-e2d8c6458c68@google.com>
In-Reply-To: <87msp1kkj2.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Wed, 8 May 2024, Huang, Ying wrote:
> > Hi all,
> >
> > I think it would be very worthwhile to have a block set aside for
> > discussion on locally attached memory tiering extensions at LSF/MM/BPF
> > 2024.
> >
> > Primarily interested in discussing Linux enlightenment for CXL 1.1 and
> > later type-3 memory expansion devices (CXL.mem). I think we could touch
> > on CXL 2.0 and later memory pooling architectures if we have time and
> > there is interest, but the primary focus here would be locally attached.
> >
> > Based on the premise for a Memory Tiering Working Group[1], there is
> > widespread interest in the foundational topics for generally useful Linux
> > enlightenment:
> >
> > - Decoupling CPU balancing from memory balancing (or obsoleting CPU
> > balancing entirely)
> >
> > + John Hubbard notes this would be useful for GPUs:
> >
> > a) GPUs have their own processors that are invisible to the kernel's
> > NUMA "which tasks are active on which NUMA nodes" calculations,
> > and
> >
> > b) Similar to where CXL is generally going, we have already built
> > fully memory-coherent hardware, which include memory-only NUMA
> > nodes.
> >
> > - In-kernel hot memory abstraction, informed by hardware hinting drivers
> > (incl some architectures like Power10), usable as a NUMA Balancing
> > backend for promotion and other areas of the kernel like transparent
> > hugepage utilization
> >
> > - NUMA and memory tiering enlightenment for accelerators, such as for
> > optimal use of GPU memory, extremely important for a cloud provider
> > (hint hint :)
> >
> > - Asynchronous memory promotion independent of task_numa_fault() while
> > considering the cost of page migration (due to identifying cold memory)
> >
> > - What role userspace plays in this decision-making and how we can
> > extend the default policy and mechanisms in the kernel to allow for it
> > if necessary
> >
> > Additional topics that you find interesting are also very helpful!
>
> In addition to the hot memory identification and promotion, I think that
> we should consider the cold memory identification and demotion too as a
> full solution. The existing method based on the page table accessed bit
> may be good enough, but we still need to consider the full solution in
> the context of the general NUMA balancing.
>
I think that's a great suggestion! We'll be able to cover the approach
taken by workingset reporting[*], which is quite powerful for proactive
reclaim through memory.reclaim and would also be very useful for
identifying cold memory for the purposes of demotion.

[*] https://lore.kernel.org/linux-mm/20240504073011.4000534-1-yuanchu@google.com/T/
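To make the migration-cost question above a bit more concrete: one way to frame promotion is as a cost/benefit estimate per page. The sketch below is a userspace toy model only, not kernel code, and every constant in it is an illustrative assumption rather than a measured value:

```python
# Toy cost/benefit model for tiered-memory page promotion.
# All constants below are illustrative assumptions, not measured values.

SLOW_TIER_LATENCY_NS = 350     # assumed CXL-attached memory access latency
FAST_TIER_LATENCY_NS = 100     # assumed local DRAM access latency
MIGRATION_COST_NS = 2_000_000  # assumed one-time cost to migrate a 4 KiB page

def should_promote(accesses_per_sec: float, horizon_sec: float = 1.0) -> bool:
    """Promote a page only if the expected latency savings over the
    evaluation horizon outweigh the one-time migration cost."""
    saved_per_access = SLOW_TIER_LATENCY_NS - FAST_TIER_LATENCY_NS
    expected_savings = accesses_per_sec * horizon_sec * saved_per_access
    return expected_savings > MIGRATION_COST_NS

# A rarely touched page is not worth migrating; a hot one is.
print(should_promote(10))      # cold-ish page -> False
print(should_promote(50_000))  # hot page -> True
```

The real policy question, of course, is where the access-rate signal comes from (accessed-bit scanning, hardware hinting, workingset reporting) and how to amortize the scanning cost itself, which is exactly what the discussion would need to settle.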
> > I'm biased toward a generally useful solution that would leverage the
> > kernel as the ultimate source of truth for page hotness that can be
> > extended for multiple use cases, one of which is memory tiering support.
> > But certainly if there are other approaches, we can discuss that as well.
> >
> > A few main goals from this discussion:
> >
> > - Ensure that proposals address, or can be extended to address, the
> > emerging needs of the various use cases that users may have
> >
> > - Surface any constraints that stakeholders may find to be prohibitive
> > for support in the core MM subsystem
> >
> > - Alignment and division of work for developers who are actively looking
> > to contribute to this area
> >
> > As I'm just one of many stakeholders for this discussion, I'd nominate
> > Michal Hocko to moderate it if he's willing to do so. If he's so willing,
> > we'd be in good hands :)
> >
> > [1] https://lore.kernel.org/linux-mm/45d850ec-623b-7c07-c266-e948cdbf1f62@linux.com/T/
>
> --
> Best Regards,
> Huang, Ying
>
Thread overview: 11+ messages
2024-05-07 3:37 [LSF/MM/BPF TOPIC] Locally attached memory tiering David Rientjes
2024-05-07 11:52 ` Michal Hocko
2024-05-07 20:09 ` David Rientjes
2024-05-08 4:14 ` Huang, Ying
2024-05-10 3:10 ` David Rientjes [this message]
2024-05-08 21:39 ` Davidlohr Bueso
2024-05-09 1:42 ` Huang, Ying
2024-05-13 1:49 ` Davidlohr Bueso
2024-05-13 3:28 ` Bharata B Rao
2024-05-13 7:48 ` Huang, Ying
[not found] ` <CGME20240509173529uscas1p1b6e43b169514d36915cd2bc8aabc4200@uscas1p1.samsung.com>
2024-05-09 17:35 ` Adam Manzanares