NVDIMM Device and Persistent Memory development
From: Dan Williams <dan.j.williams@intel.com>
To: Jane Chu <jane.chu@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>,
	<vishal.l.verma@intel.com>, <nvdimm@lists.linux.dev>
Cc: Linux-MM <linux-mm@kvack.org>,
	Joao Martins <joao.m.martins@oracle.com>,
	Jane Chu <jane.chu@oracle.com>
Subject: Re: Barlopass nvdimm as MemoryMode question
Date: Thu, 7 Mar 2024 18:53:16 -0800	[thread overview]
Message-ID: <65ea7d9c710c0_142a129493@dwillia2-mobl3.amr.corp.intel.com.notmuch> (raw)
In-Reply-To: <0660501f-ae9f-48a3-a1c0-f19be8ca5f02@oracle.com>

Jane Chu wrote:
> On 3/7/2024 5:42 PM, Jane Chu wrote:
> 
> > On 3/7/2024 4:49 PM, Dan Williams wrote:
> >
> >> Jane Chu wrote:
> >>> Add Joao.
> >>>
> >>> On 3/7/2024 1:05 PM, Dan Williams wrote:
> >>>
> >>>> Jane Chu wrote:
> >>>>> Hi, Dan and Vishal,
> >>>>>
> >>>>> What kind of NUMAness is visible to the kernel w.r.t. a SysRAM
> >>>>> region backed by Barlopass nvdimms configured in MemoryMode by
> >>>>> ipmctl?
> >>>> As always, the NUMA description is a property of the platform, not
> >>>> the media type / DIMM. The ACPI HMAT describes the details of
> >>>> memory-side caches. See "5.2.27.2 Memory Side Cache Overview" in
> >>>> ACPI 6.4.
> >>> Thanks!  So, compared to dax_kmem, which assigns a NUMA node to a
> >>> newly converted pmem/SysRAM region,
> >> ...to be clear, dax_kmem is not creating a new NUMA node, it is just
> >> potentially onlining a proximity domain that was fully described by ACPI
> >> SRAT but offline.
> >>
> >>> w.r.t. pmem in MemoryMode, is there any clue that the kernel exposes
> >>> (or could expose) to userland about the extra latency, such that
> >>> userland may treat these memory regions differently?
> >> Userland should be able to interrogate the memory_side_cache/ property
> >> in NUMA sysfs:
> >>
> >> https://docs.kernel.org/admin-guide/mm/numaperf.html?#numa-cache
> >>
> >> Otherwise I believe SRAT and SLIT for that node only reflect the
> >> performance of the DDR fronting the PMEM. So if you have a DDR node
> >> and a DDR+PMEM cache node, they may look the same from the ACPI SLIT
> >> perspective, but the ACPI HMAT contains the details of the backing
> >> memory. The Linux NUMA performance sysfs interface gets populated by
> >> ACPI HMAT.
> >
> > Thanks Dan.
> >
> > Please correct me if I'm mistaken: if I configure some barlowpass
> > nvdimms to MemoryMode and reboot, those regions of memory are
> > automatically two-level with DDR as the front cache, so hmat_init() is
> > expected to create the memory_side_cache/indexN interface, and if I
> > see multiple indexN layers, that would be a sign that pmem in
> > MemoryMode is present, right?
> >
> > I've yet to grab hold of a system to confirm this, but apparently with 
> > only DDR memory, memory_side_cache/ doesn't exist.
> 
> On each CPU socket node, we have
> 
> | |-memory_side_cache
> | | |-uevent
> | | |-power
> | | |-index1
> | | | |-uevent
> | | | |-power
> | | | |-line_size
> | | | |-write_policy
> | | | |-size
> | | | |-indexing
> 
> where 'indexing' = 0 means a direct-mapped cache? So is that a clue
> that slower/far memory is behind the cache?

Correct.

Note that the ACPI HMAT may also populate data about the performance of
the memory range on a cache miss (see ACPI 6.4 Table 5.129: System
Locality Latency and Bandwidth Information Structure), but the Linux
enabling does not export that information.
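
For reference, a minimal sketch of how userland could walk that sysfs
interface (paths per Documentation/admin-guide/mm/numaperf.rst; untested,
for illustration only):

#!/usr/bin/env python3
# Minimal, illustrative sketch: report memory-side caches and the
# HMAT-derived access attributes per NUMA node, using the sysfs ABI
# documented in Documentation/admin-guide/mm/numaperf.rst.
from pathlib import Path

def read(attr):
    # Return the attribute contents, or None if the file is absent.
    try:
        return attr.read_text().strip()
    except OSError:
        return None

for node in sorted(Path("/sys/devices/system/node").glob("node[0-9]*")):
    print(node.name)

    # memory_side_cache/ only exists when the ACPI HMAT describes a
    # memory-side cache (e.g. DDR fronting PMEM in MemoryMode).
    for level in sorted((node / "memory_side_cache").glob("index[0-9]*")):
        attrs = {name: read(level / name)
                 for name in ("size", "line_size", "indexing",
                              "write_policy")}
        # indexing == 0 means the cache is direct-mapped.
        print("  %s: %s" % (level.name, attrs))

    # Performance of this node as seen by its access class 0 initiators,
    # populated from the ACPI HMAT (latency in ns, bandwidth in MB/s).
    for name in ("read_latency", "write_latency",
                 "read_bandwidth", "write_bandwidth"):
        val = read(node / "access0" / "initiators" / name)
        if val is not None:
            print("  access0 %s: %s" % (name, val))

If memory_side_cache/ is absent for a node, the HMAT simply did not
describe a cache for it, which matches the DDR-only observation above.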

Thread overview: 6+ messages
     [not found] <a639e29f-fa99-4fec-a148-c235e5f90071@oracle.com>
2024-03-07 21:05 ` Barlopass nvdimm as MemoryMode question Dan Williams
2024-03-08  0:30   ` Jane Chu
2024-03-08  0:49     ` Dan Williams
2024-03-08  1:42       ` Jane Chu
2024-03-08  1:57         ` Jane Chu
2024-03-08  2:53           ` Dan Williams [this message]
