From: David Rientjes <rientjes@google.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
Michal Hocko <mhocko@suse.com>,
Dan Williams <dan.j.williams@intel.com>,
John Hubbard <jhubbard@nvidia.com>, Zi Yan <ziy@nvidia.com>,
Bharata B Rao <bharata@amd.com>,
Dave Jiang <dave.jiang@intel.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Alistair Popple <apopple@nvidia.com>,
Christoph Lameter <cl@gentwo.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Mel Gorman <mgorman@suse.de>, Jon Grimm <jon.grimm@amd.com>,
Gregory Price <gourry.memverge@gmail.com>,
Wei Xu <weixugc@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
SeongJae Park <sj@kernel.org>,
David Hildenbrand <david@redhat.com>,
Davidlohr Bueso <dave@stgolabs.net>,
Yuanchu Xie <yuanchu@google.com>
Subject: Re: [LSF/MM/BPF TOPIC] Locally attached memory tiering
Date: Thu, 9 May 2024 20:10:07 -0700 (PDT)
Message-ID: <4e326d6a-b4d2-0617-97fe-e2d8c6458c68@google.com>
In-Reply-To: <87msp1kkj2.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Wed, 8 May 2024, Huang, Ying wrote:
> > Hi all,
> >
> > I think it would be very worthwhile to have a block set aside for
> > discussion on locally attached memory tiering extensions at LSF/MM/BPF
> > 2024.
> >
> > Primarily interested in discussing Linux enlightenment for CXL 1.1 and
> > later type-3 memory expansion devices (CXL.mem). I think we could touch
> > on CXL 2.0 and later memory pooling architectures if we have time and
> > there is interest, but the primary focus here would be locally attached.
> >
> > Based on the premise for a Memory Tiering Working Group[1], there is
> > widespread interest in the foundational topics for generally useful Linux
> > enlightenment:
> >
> > - Decoupling CPU balancing from memory balancing (or obsoleting CPU
> > balancing entirely)
> >
> > + John Hubbard notes this would be useful for GPUs:
> >
> > a) GPUs have their own processors that are invisible to the kernel's
> > NUMA "which tasks are active on which NUMA nodes" calculations,
> > and
> >
> > b) Similar to where CXL is generally going, we have already built
> > fully memory-coherent hardware, which include memory-only NUMA
> > nodes.
> >
> > - In-kernel hot memory abstraction, informed by hardware hinting drivers
> > (incl some architectures like Power10), usable as a NUMA Balancing
> > backend for promotion and other areas of the kernel like transparent
> > hugepage utilization
> >
> > - NUMA and memory tiering enlightenment for accelerators, such as for
> > optimal use of GPU memory, extremely important for a cloud provider
> > (hint hint :)
> >
> > - Asynchronous memory promotion independent of task_numa_fault() while
> > considering the cost of page migration (due to identifying cold memory)
> >
> > - What role userspace plays in this decision-making and how we can
> > extend the default policy and mechanisms in the kernel to allow for it
> > if necessary
> >
> > Additional topics that you find interesting are also very helpful!
>
> In addition to the hot memory identification and promotion, I think that
> we should consider the cold memory identification and demotion too as a
> full solution. The existing method based on the page table accessed bit
> may be good enough, but we still need to consider the full solution in
> the context of the general NUMA balancing.
>
I think that's a great suggestion! We'll be able to cover the approach
taken by workingset reporting[*], which is quite powerful for proactive
reclaim through memory.reclaim and would also be very useful for
identifying cold memory for the purposes of demotion.

[*] https://lore.kernel.org/linux-mm/20240504073011.4000534-1-yuanchu@google.com/T/
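To make the migration-cost question above a bit more concrete: one way to frame promotion is as a cost/benefit estimate per page. The sketch below is a userspace toy model only, not kernel code, and every constant in it is an illustrative assumption rather than a measured value:

```python
# Toy cost/benefit model for tiered-memory page promotion.
# All constants below are illustrative assumptions, not measured values.

SLOW_TIER_LATENCY_NS = 350     # assumed CXL-attached memory access latency
FAST_TIER_LATENCY_NS = 100     # assumed local DRAM access latency
MIGRATION_COST_NS = 2_000_000  # assumed one-time cost to migrate a 4 KiB page

def should_promote(accesses_per_sec: float, horizon_sec: float = 1.0) -> bool:
    """Promote a page only if the expected latency savings over the
    evaluation horizon outweigh the one-time migration cost."""
    saved_per_access = SLOW_TIER_LATENCY_NS - FAST_TIER_LATENCY_NS
    expected_savings = accesses_per_sec * horizon_sec * saved_per_access
    return expected_savings > MIGRATION_COST_NS

# A rarely touched page is not worth migrating; a hot one is.
print(should_promote(10))      # cold-ish page -> False
print(should_promote(50_000))  # hot page -> True
```

The real policy question, of course, is where the access-rate signal comes from (accessed-bit scanning, hardware hinting, workingset reporting) and how to amortize the scanning cost itself, which is exactly what the discussion would need to settle.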
> > I'm biased toward a generally useful solution that would leverage the
> > kernel as the ultimate source of truth for page hotness that can be
> > extended for multiple use cases, one of which is memory tiering support.
> > But certainly if there are other approaches, we can discuss that as well.
> >
> > A few main goals from this discussion:
> >
> > - Ensure that proposals address, or can be extended to address, the
> > emerging needs of the various use cases that users may have
> >
> > - Surface any constraints that stakeholders may find to be prohibitive
> > for support in the core MM subsystem
> >
> > - Alignment and division of work for developers who are actively looking
> > to contribute to this area
> >
> > As I'm just one of many stakeholders for this discussion, I'd nominate
> > Michal Hocko to moderate it if he's willing to do so. If he's so willing,
> > we'd be in good hands :)
> >
> > [1] https://lore.kernel.org/linux-mm/45d850ec-623b-7c07-c266-e948cdbf1f62@linux.com/T/
>
> --
> Best Regards,
> Huang, Ying
>
Thread overview: 11+ messages
2024-05-07 3:37 [LSF/MM/BPF TOPIC] Locally attached memory tiering David Rientjes
2024-05-07 11:52 ` Michal Hocko
2024-05-07 20:09 ` David Rientjes
2024-05-08 4:14 ` Huang, Ying
2024-05-10 3:10 ` David Rientjes [this message]
2024-05-08 21:39 ` Davidlohr Bueso
2024-05-09 1:42 ` Huang, Ying
2024-05-13 1:49 ` Davidlohr Bueso
2024-05-13 3:28 ` Bharata B Rao
2024-05-13 7:48 ` Huang, Ying
[not found] ` <CGME20240509173529uscas1p1b6e43b169514d36915cd2bc8aabc4200@uscas1p1.samsung.com>
2024-05-09 17:35 ` Adam Manzanares