NVDIMM Device and Persistent Memory development
 help / color / mirror / Atom feed
From: "Huang, Ying" <ying.huang@intel.com>
To: Alistair Popple <apopple@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,  <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>,  <linux-cxl@vger.kernel.org>,
	<nvdimm@lists.linux.dev>,  <linux-acpi@vger.kernel.org>,
	 "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	 Wei Xu <weixugc@google.com>,
	 Dan Williams <dan.j.williams@intel.com>,
	 Dave Hansen <dave.hansen@intel.com>,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	 "Jonathan Cameron" <Jonathan.Cameron@huawei.com>,
	Michal Hocko <mhocko@kernel.org>,  Yang Shi <shy828301@gmail.com>,
	Rafael J Wysocki <rafael.j.wysocki@intel.com>,
	 Dave Jiang <dave.jiang@intel.com>
Subject: Re: [PATCH RESEND 1/4] memory tiering: add abstract distance calculation algorithms management
Date: Thu, 27 Jul 2023 12:02:25 +0800	[thread overview]
Message-ID: <87edkuvw6m.fsf@yhuang6-desk2.ccr.corp.intel.com> (raw)
In-Reply-To: <87351axbk6.fsf@nvdebian.thelocal> (Alistair Popple's message of "Thu, 27 Jul 2023 13:42:15 +1000")

Alistair Popple <apopple@nvidia.com> writes:

> "Huang, Ying" <ying.huang@intel.com> writes:
>
>>>> The other way (suggested by this series) is to make dax/kmem call a
>>>> notifier chain, then CXL CDAT or ACPI HMAT can identify the type of
>>>> device and calculate the distance if the type is correct for them.  I
>>>> don't think that it's good to make dax/kem to know every possible
>>>> types of memory devices.
>>>
>>> Do we expect there to be lots of different types of memory devices
>>> sharing a common dax/kmem driver though? Must admit I'm coming from a
>>> GPU background where we'd expect each type of device to have it's own
>>> driver anyway so wasn't expecting different types of memory devices to
>>> be handled by the same driver.
>>
>> Now, dax/kmem.c is used for
>>
>> - PMEM (Optane DCPMM, or AEP)
>> - CXL.mem
>> - HBM (attached to CPU)
>
> Thanks a lot for the background! I will admit to having a faily narrow
> focus here.
>
>>>> And, I don't think that we are forced to use the general notifier
>>>> chain interface in all memory device drivers.  If the memory device
>>>> driver has better understanding of the memory device, it can use other
>>>> way to determine abstract distance.  For example, a CXL memory device
>>>> driver can identify abstract distance by itself.  While other memory
>>>> device drivers can use the general notifier chain interface at the
>>>> same time.
>>>
>>> Whilst I think personally I would find that flexibility useful I am
>>> concerned it means every driver will just end up divining it's own
>>> distance rather than ensuring data in HMAT/CDAT/etc. is correct. That
>>> would kind of defeat the purpose of it all then.
>>
>> But we have no way to enforce that too.
>
> Enforce that HMAT/CDAT/etc. is correct? Agree we can't enforce it, but
> we can influence it. If drivers can easily ignore the notifier chain and
> do their own thing that's what will happen.

IMHO, both enforce HMAT/CDAT/etc is correct and enforce drivers to use
general interface we provided.  Anyway, we should try to make HMAT/CDAT
works well, so drivers want to use them :-)

>>>> While other memory device drivers can use the general notifier chain
>>>> interface at the same time.
>
> How would that work in practice though? The abstract distance as far as
> I can tell doesn't have any meaning other than establishing preferences
> for memory demotion order. Therefore all calculations are relative to
> the rest of the calculations on the system. So if a driver does it's own
> thing how does it choose a sensible distance? IHMO the value here is in
> coordinating all that through a standard interface, whether that is HMAT
> or something else.

Only if different algorithms follow the same basic principle.  For
example, the abstract distance of default DRAM nodes are fixed
(MEMTIER_ADISTANCE_DRAM).  The abstract distance of the memory device is
in linear direct proportion to the memory latency and inversely
proportional to the memory bandwidth.  Use the memory latency and
bandwidth of default DRAM nodes as base.

HMAT and CDAT report the raw memory latency and bandwidth.  If there are
some other methods to report the raw memory latency and bandwidth, we
can use them too.

--
Best Regards,
Huang, Ying

  reply	other threads:[~2023-07-27  4:04 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-21  1:29 [PATCH RESEND 0/4] memory tiering: calculate abstract distance based on ACPI HMAT Huang Ying
2023-07-21  1:29 ` [PATCH RESEND 1/4] memory tiering: add abstract distance calculation algorithms management Huang Ying
2023-07-25  2:13   ` Alistair Popple
2023-07-25  3:14     ` Huang, Ying
2023-07-25  8:26       ` Alistair Popple
2023-07-26  7:33         ` Huang, Ying
2023-07-27  3:42           ` Alistair Popple
2023-07-27  4:02             ` Huang, Ying [this message]
2023-07-27  4:07               ` Alistair Popple
2023-07-27  5:41                 ` Huang, Ying
2023-07-28  1:20                   ` Alistair Popple
2023-08-11  3:51                     ` Huang, Ying
2023-08-21 11:26                       ` Alistair Popple
2023-08-21 22:50                         ` Huang, Ying
2023-08-21 23:52                           ` Alistair Popple
2023-08-22  0:58                             ` Huang, Ying
2023-08-22  7:11                               ` Alistair Popple
2023-08-23  5:56                                 ` Huang, Ying
2023-08-25  5:41                                   ` Alistair Popple
2023-07-21  1:29 ` [PATCH RESEND 2/4] acpi, hmat: refactor hmat_register_target_initiators() Huang Ying
2023-07-25  2:44   ` Alistair Popple
2023-08-07 16:55   ` Jonathan Cameron
2023-08-11  1:13     ` Huang, Ying
2023-07-21  1:29 ` [PATCH RESEND 3/4] acpi, hmat: calculate abstract distance with HMAT Huang Ying
2023-07-25  2:45   ` Alistair Popple
2023-07-25  6:47     ` Huang, Ying
2023-08-21 11:53       ` Alistair Popple
2023-08-21 23:28         ` Huang, Ying
2023-07-21  1:29 ` [PATCH RESEND 4/4] dax, kmem: calculate abstract distance with general interface Huang Ying
2023-07-25  3:11   ` Alistair Popple
2023-07-25  7:02     ` Huang, Ying
2023-08-21 12:03       ` Alistair Popple
2023-08-21 23:33         ` Huang, Ying
2023-08-22  7:36           ` Alistair Popple
2023-08-23  2:13             ` Huang, Ying
2023-08-25  6:00               ` Alistair Popple
2023-07-21  4:15 ` [PATCH RESEND 0/4] memory tiering: calculate abstract distance based on ACPI HMAT Alistair Popple
2023-07-24 17:58   ` Andrew Morton
2023-08-01  2:35     ` Bharata B Rao
2023-08-11  6:26       ` Huang, Ying
2023-08-11  7:49         ` Bharata B Rao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87edkuvw6m.fsf@yhuang6-desk2.ccr.corp.intel.com \
    --to=ying.huang@intel.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=apopple@nvidia.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=hannes@cmpxchg.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=rafael.j.wysocki@intel.com \
    --cc=shy828301@gmail.com \
    --cc=weixugc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).