From: Dan Williams <dan.j.williams@intel.com>
To: Chris Browy <cbrowy@avery-design.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	Vishal L Verma <vishal.l.verma@intel.com>,
	linux-cxl@vger.kernel.org
Subject: Re: [RFC PATCH 00/25] Introduce CXL 2.0 Emulation
Date: Thu, 3 Dec 2020 21:55:11 -0800	[thread overview]
Message-ID: <CAPcyv4gk5AcsOhzZEu57NWr9CBXQONyOkzXfSNZiFXP4Kj2G9g@mail.gmail.com> (raw)
In-Reply-To: <A0F02678-46AD-4AC7-92C3-9BEF3B3284DE@avery-design.com>

[ add linux-cxl for the Linux driver question ]


On Thu, Dec 3, 2020 at 9:10 PM Chris Browy <cbrowy@avery-design.com> wrote:
[..]

I'll let Ben address the other questions...

> 4. What are the userspace system APIs for targeting CXL HDM address domain?
>    Usually you can mmap a SPA if you know how to look it up.

tl;dr: /proc/iomem and sysfs

There are a lot of moving pieces to this question. Once a CXL.mem
device is mapped the driver will arrange for it to appear in
/proc/iomem. If it is persistent memory it will be accessed through a
/dev/pmem or /dev/dax device, and if it is volatile it will show up as
just another "System RAM" range. The System RAM range may end up with
a unique Linux NUMA node id. Alternatively, if the BIOS maps the
device then the BIOS will set up the EFI memory map + ACPI
SRAT/SLIT/HMAT for Linux to interpret.
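For illustration only, a minimal userspace sketch of mapping a device-dax
instance (the /dev/dax0.0 path and the 2MiB length are made-up values, and
this assumes a dax device has already been created for the range):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
            /* illustrative device path and mapping length */
            int fd = open("/dev/dax0.0", O_RDWR);
            if (fd < 0) {
                    perror("open");
                    return 1;
            }

            /* device-dax requires MAP_SHARED and device-aligned extents */
            void *p = mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
            if (p == MAP_FAILED) {
                    perror("mmap");
                    close(fd);
                    return 1;
            }

            /* loads/stores through 'p' now target the mapped memory range */
            ((volatile char *)p)[0] = 1;

            munmap(p, 2UL << 20);
            close(fd);
            return 0;
    }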

Anything that shows up in /proc/iomem was traditionally accessible via
/dev/mem. These days, though, /dev/mem is slowly being deprecated due
to security (see CONFIG_LOCK_DOWN_KERNEL*) and stability (see
CONFIG_IO_STRICT_DEVMEM) concerns, so depending on your kernel
configuration, tools may not be able to use /dev/mem to map the SPA
directly and will need to go through the driver-provided interface.
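As a sketch of the lookup side, something like the following can walk
/proc/iomem for memory ranges (reading real addresses needs privilege;
unprivileged readers see zeroed values, and the resource names matched
here are just the generic ones, not a committed CXL naming scheme):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            FILE *f = fopen("/proc/iomem", "r");
            char line[256];

            if (!f) {
                    perror("fopen");
                    return 1;
            }

            while (fgets(line, sizeof(line), f)) {
                    /* print System RAM and Persistent Memory ranges */
                    if (strstr(line, "System RAM") ||
                        strstr(line, "Persistent Memory"))
                            fputs(line, stdout);
            }

            fclose(f);
            return 0;
    }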

The driver needs to be able to coordinate coherent HDM settings across
multiple devices in an interleave, all switches in the path, and the
root bridge(s). So I am expecting a sysfs interface similar to the
nvdimm-subsystem and device-dax-subsystem where there will be
sysfs-files to enumerate the mappings and a provisioning mechanism to
reconfigure HDMs with various interleave settings. For a flavor of
what this might look like see the description of NVDIMM "Region"
devices:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/driver-api/nvdimm/nvdimm.rst#n469

The difference will be that unlike the nvdimm case where regions are
platform-firmware-defined and static, CXL regions are runtime dynamic.
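Purely as a guess at what enumeration could look like (the sysfs paths
and attribute names below are hypothetical, modeled on the nvdimm region
layout, not a committed ABI):

    #include <stdio.h>

    /* Read one sysfs attribute and print it if present. */
    static void show(const char *path)
    {
            char buf[128];
            FILE *f = fopen(path, "r");

            if (!f)
                    return; /* attribute absent on this kernel */
            if (fgets(buf, sizeof(buf), f))
                    printf("%s: %s", path, buf);
            fclose(f);
    }

    int main(void)
    {
            /* hypothetical region attributes: size, interleave, SPA base */
            show("/sys/bus/cxl/devices/region0/size");
            show("/sys/bus/cxl/devices/region0/interleave_ways");
            show("/sys/bus/cxl/devices/region0/resource");
            return 0;
    }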
