NVDIMM Device and Persistent Memory development
From: fan <nifan.cxl@gmail.com>
To: alison.schofield@intel.com
Cc: Vishal Verma <vishal.l.verma@intel.com>,
	nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
	Dave Jiang <dave.jiang@intel.com>
Subject: Re: [ndctl PATCH v11 1/7] libcxl: add interfaces for GET_POISON_LIST mailbox commands
Date: Mon, 18 Mar 2024 10:51:13 -0700
Message-ID: <Zfh_EYPNeRJl8Qio@debian>
In-Reply-To: <c43e12c5bafca30d3194ebb11d9817b9a05eaad0.1710386468.git.alison.schofield@intel.com>

On Wed, Mar 13, 2024 at 09:05:17PM -0700, alison.schofield@intel.com wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> CXL devices maintain a list of locations that are poisoned or result
> in poison if the addresses are accessed by the host.
> 
> Per the spec (CXL 3.1 8.2.9.9.4.1), the device returns the Poison
> List as a set of Media Error Records that include the source of the
> error, the starting device physical address and length.
> 
> Trigger the retrieval of the poison list by writing to the memory
> device sysfs attribute: trigger_poison_list. The CXL driver only
> offers triggering per memdev, so the trigger by region interface
> offered here is a convenience API that triggers a poison list
> retrieval for each memdev contributing to a region.
> 
> int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev);
> int cxl_region_trigger_poison_list(struct cxl_region *region);
> 
> The resulting poison records are logged as kernel trace events
> named 'cxl_poison'.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  cxl/lib/libcxl.c   | 47 ++++++++++++++++++++++++++++++++++++++++++++++
>  cxl/lib/libcxl.sym |  2 ++
>  cxl/libcxl.h       |  2 ++
>  3 files changed, 51 insertions(+)
> 
> diff --git a/cxl/lib/libcxl.c b/cxl/lib/libcxl.c
> index ff27cdf7c44a..73db8f15c704 100644
> --- a/cxl/lib/libcxl.c
> +++ b/cxl/lib/libcxl.c
> @@ -1761,6 +1761,53 @@ CXL_EXPORT int cxl_memdev_disable_invalidate(struct cxl_memdev *memdev)
>  	return 0;
>  }
>  
> +CXL_EXPORT int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev)
> +{
> +	struct cxl_ctx *ctx = cxl_memdev_get_ctx(memdev);
> +	char *path = memdev->dev_buf;
> +	int len = memdev->buf_len, rc;
> +
> +	if (snprintf(path, len, "%s/trigger_poison_list",
> +		     memdev->dev_path) >= len) {
> +		err(ctx, "%s: buffer too small\n",
> +		    cxl_memdev_get_devname(memdev));
> +		return -ENXIO;
> +	}
> +	rc = sysfs_write_attr(ctx, path, "1\n");
> +	if (rc < 0) {
> +		fprintf(stderr,
> +			"%s: Failed write sysfs attr trigger_poison_list\n",
> +			cxl_memdev_get_devname(memdev));

Should we use err() instead of fprintf() here, to match the snprintf()
failure path above?
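
To illustrate the question, here is a minimal, untested sketch of routing
the failure message through a context-aware logger instead of writing to
stderr directly. Everything prefixed demo_ is a hypothetical stand-in so
that the snippet compiles on its own; it is not libcxl code. In the patch
itself, the existing err(ctx, ...) helper already used in the snprintf()
failure branch would presumably take its place.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the library context; the real struct cxl_ctx differs. */
struct demo_ctx {
	int log_enabled;	/* nonzero: error messages are emitted */
	void (*log_fn)(const char *fmt, va_list args);
	int errors_logged;	/* counts messages actually emitted */
};

/* Default sink: forwards the formatted message to stderr. */
static void demo_log_stderr(const char *fmt, va_list args)
{
	vfprintf(stderr, fmt, args);
}

/* Hypothetical err()-style helper: honors the context's logging settings. */
static void demo_err(struct demo_ctx *ctx, const char *fmt, ...)
{
	va_list args;

	if (!ctx->log_enabled || !ctx->log_fn)
		return;
	va_start(args, fmt);
	ctx->log_fn(fmt, args);
	va_end(args);
	ctx->errors_logged++;
}

/*
 * Mirrors the failure branch under review. The result of the sysfs write
 * is passed in as rc so the sketch does not need real sysfs access.
 */
static int demo_report_trigger(struct demo_ctx *ctx, const char *devname, int rc)
{
	if (rc < 0) {
		demo_err(ctx, "%s: failed to write sysfs attr trigger_poison_list\n",
			 devname);
		return rc;
	}
	return 0;
}
```

The point of going through the context is that a library caller who has
redirected or silenced logging keeps control over this message as well,
which a direct fprintf(stderr, ...) would bypass.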

Fan

> +		return rc;
> +	}
> +	return 0;
> +}
> +
> +CXL_EXPORT int cxl_region_trigger_poison_list(struct cxl_region *region)
> +{
> +	struct cxl_memdev_mapping *mapping;
> +	int rc;
> +
> +	cxl_mapping_foreach(region, mapping) {
> +		struct cxl_decoder *decoder;
> +		struct cxl_memdev *memdev;
> +
> +		decoder = cxl_mapping_get_decoder(mapping);
> +		if (!decoder)
> +			continue;
> +
> +		memdev = cxl_decoder_get_memdev(decoder);
> +		if (!memdev)
> +			continue;
> +
> +		rc = cxl_memdev_trigger_poison_list(memdev);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	return 0;
> +}
> +
>  CXL_EXPORT int cxl_memdev_enable(struct cxl_memdev *memdev)
>  {
>  	struct cxl_ctx *ctx = cxl_memdev_get_ctx(memdev);
> diff --git a/cxl/lib/libcxl.sym b/cxl/lib/libcxl.sym
> index de2cd84b2960..3f709c60db3d 100644
> --- a/cxl/lib/libcxl.sym
> +++ b/cxl/lib/libcxl.sym
> @@ -280,4 +280,6 @@ global:
>  	cxl_memdev_get_pmem_qos_class;
>  	cxl_memdev_get_ram_qos_class;
>  	cxl_region_qos_class_mismatch;
> +	cxl_memdev_trigger_poison_list;
> +	cxl_region_trigger_poison_list;
>  } LIBCXL_6;
> diff --git a/cxl/libcxl.h b/cxl/libcxl.h
> index a6af3fb04693..29165043ca3f 100644
> --- a/cxl/libcxl.h
> +++ b/cxl/libcxl.h
> @@ -467,6 +467,8 @@ enum cxl_setpartition_mode {
>  
>  int cxl_cmd_partition_set_mode(struct cxl_cmd *cmd,
>  		enum cxl_setpartition_mode mode);
> +int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev);
> +int cxl_region_trigger_poison_list(struct cxl_region *region);
>  
>  int cxl_cmd_alert_config_set_life_used_prog_warn_threshold(struct cxl_cmd *cmd,
>  							   int threshold);
> -- 
> 2.37.3
> 
