From: Dave Jiang <dave.jiang@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: linux-cxl@vger.kernel.org, dan.j.williams@intel.com,
	ira.weiny@intel.com, vishal.l.verma@intel.com,
	alison.schofield@intel.com, dave@stgolabs.net
Subject: Re: [PATCH 2/2] cxl: Add checks to access_coordinate calculation to fail missing data
Date: Tue, 5 Mar 2024 17:18:52 -0700
Message-ID: <ec7a3108-ba31-4ef6-b873-7762258cd727@intel.com>
In-Reply-To: <b6b8ba4f-0911-4c69-84ab-32027fffbdf4@intel.com>



On 3/5/24 3:36 PM, Dave Jiang wrote:
> 
> 
> On 2/29/24 10:44 AM, Jonathan Cameron wrote:
>> On Wed, 28 Feb 2024 17:25:42 -0700
>> Dave Jiang <dave.jiang@intel.com> wrote:
>>
>>> Jonathan noted that the coordinates for the host bridge and switches
>>> can be 0s if no actual data are retrieved, yet the calculation continues.
>>> The resulting number would be inaccurate. Add checks to ensure that the
>>> calculation completes only if the numbers are valid.
>>>
>>> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> Hi Dave,
>>
>> Whilst I think the fix is right, it is getting hard to read. Maybe
>> a rethink is needed on how that iteration works?
>>
>>> ---
>>>  drivers/cxl/core/port.c | 31 +++++++++++++++++++++++++++----
>>>  1 file changed, 27 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>>> index e1d30a885700..2c82fe24b789 100644
>>> --- a/drivers/cxl/core/port.c
>>> +++ b/drivers/cxl/core/port.c
>>> @@ -2110,6 +2110,20 @@ static void combine_coordinates(struct access_coordinate *c1,
>>>  		c1->read_latency += c2->read_latency;
>>>  }
>>>  
>>> +static bool coordinates_invalid(struct access_coordinate *c)
>>> +{
>>> +	if (!c->read_bandwidth && !c->write_bandwidth &&
>>> +	    !c->read_latency && !c->write_latency)
>>> +		return true;
>>> +
>>> +	return false;
>>> +}
>>> +
>>> +static bool parent_port_is_cxl_root(struct cxl_port *port)
>>> +{
>>> +	return is_cxl_root(to_cxl_port(port->dev.parent));
>>> +}
>>> +
>>>  /**
>>>   * cxl_endpoint_get_perf_coordinates - Retrieve performance numbers stored in dports
>>>   *				   of CXL path
>>> @@ -2142,16 +2156,25 @@ int cxl_endpoint_get_perf_coordinates(struct cxl_port *port,
>>>  	 * port each iteration. If the parent is cxl root then there is
>>>  	 * nothing to gather.
>>>  	 */
>>> -	while (!is_cxl_root(to_cxl_port(iter->dev.parent))) {
>>> -		combine_coordinates(&c, &dport->sw_coord);
>>> +	while (!parent_port_is_cxl_root(iter)) {
>>> +		iter = to_cxl_port(iter->dev.parent);
>>> +
>>> +		/* There's no CDAT for the host bridge, so skip if so. */
>>
>> Comment refers to skipping whereas code is 'doing more' for the other case
>> so this is confusing to me.
>>
>> The inverse of this only occurs on the last iteration I think.
>>
>> Possibly a do / while instead of a while will do it.
>> I'm far from confident though as all the levels of look up have
>> me too confused.
> 
> So this is somewhat tricky. For example:
> devices/pci0000:35/0000:35:01.0/0000:37:00.0/mem5
> 
> In this case the endpoint is attached to the host bridge without any switches. The endpoint is 0000:37:00.0 and the host bridge downstream port is 0000:35:01.0. There is no switch and therefore no switch CDAT, but there is a valid dport. So we skip dport->sw_coord, but we still need to pick up the link_latency between the endpoint and the downstream port. We therefore spend one iteration in the loop and skip dport->sw_coord.
> 
> devices/pci0000:bf/0000:bf:00.0/0000:c0:00.0/0000:c1:00.0/0000:c2:00.0/mem8
> 
> Now in this case there's a CXL switch in between. In the first iteration we pick up dport->sw_coord, and in the second iteration we skip it. However, we add two link_latency values: one for the link between 0000:c2:00.0 (endpoint) and 0000:c1:00.0 (switch downstream port), and one for the link between 0000:c0:00.0 (switch upstream port) and 0000:bf:00.0 (host bridge downstream port). Therefore we can't move the link_latency sum outside of the loop.
> 
> Not sure how much better this is:
> 
> 	do {
> 		struct cxl_port *parent_port = to_cxl_port(iter->dev.parent);
> 
> 		dport = iter->parent_dport;
> 		if (!parent_port_is_cxl_root(parent_port)) {
> 			if (coordinates_invalid(&dport->sw_coord))
> 				return -EINVAL;
> 
> 			combine_coordinates(&c, &dport->sw_coord);
> 		}
> 
> 		c.write_latency += dport->link_latency;
> 		c.read_latency += dport->link_latency;
> 		iter = to_cxl_port(iter->dev.parent);
> 	} while (!parent_port_is_cxl_root(iter));
> 
> or:
> 
> 	do {
> 		struct cxl_port *parent_port = to_cxl_port(iter->dev.parent);
> 
> 		dport = iter->parent_dport;
> 		c.write_latency += dport->link_latency;
> 		c.read_latency += dport->link_latency;
> 
> 		if (parent_port_is_cxl_root(parent_port))
> 			break;
> 
> 		if (coordinates_invalid(&dport->sw_coord))
> 			return -EINVAL;
> 
> 		combine_coordinates(&c, &dport->sw_coord);
> 		iter = to_cxl_port(iter->dev.parent);
> 	} while (!parent_port_is_cxl_root(iter));

Actually, I think if we just use a single dport->coord instead of separate dport->sw_coord and dport->hb_coord, we can remove the check and everything should work out properly.
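
A rough standalone sketch of that idea, as a toy model and not the driver code. The toy_* types, the walk_* helpers and all the numbers are made up for illustration. It assumes every dport on the path (including the host bridge one) carries its own coord, that bandwidth combines as a minimum and latency as a sum, and that "the check" being dropped is the per-iteration cxl root special case:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (illustration only). */
struct access_coordinate {
	unsigned int read_bandwidth;
	unsigned int write_bandwidth;
	unsigned int read_latency;
	unsigned int write_latency;
};

struct toy_dport {
	struct access_coordinate coord;	/* assumed single, unified field */
	unsigned int link_latency;
};

struct toy_port {
	struct toy_port *parent;	/* NULL marks the (toy) cxl root */
	struct toy_dport *parent_dport;	/* dport this port hangs off */
};

static bool coordinates_invalid(const struct access_coordinate *c)
{
	return !c->read_bandwidth && !c->write_bandwidth &&
	       !c->read_latency && !c->write_latency;
}

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Assumption: bandwidth is limited by the slowest hop, latency accumulates. */
static void combine_coordinates(struct access_coordinate *c1,
				const struct access_coordinate *c2)
{
	c1->read_bandwidth = min_uint(c1->read_bandwidth, c2->read_bandwidth);
	c1->write_bandwidth = min_uint(c1->write_bandwidth, c2->write_bandwidth);
	c1->read_latency += c2->read_latency;
	c1->write_latency += c2->write_latency;
}

/* Walk from the endpoint up to the port whose parent is the root. */
static int toy_get_perf_coordinates(struct toy_port *endpoint,
				    struct access_coordinate *c)
{
	struct toy_port *iter = endpoint;

	assert(endpoint && endpoint->parent && c);
	do {
		struct toy_dport *dport = iter->parent_dport;

		if (coordinates_invalid(&dport->coord))
			return -EINVAL;

		/* Same handling for every hop: no host bridge special case. */
		combine_coordinates(c, &dport->coord);
		c->read_latency += dport->link_latency;
		c->write_latency += dport->link_latency;
		iter = iter->parent;
	} while (iter->parent->parent);	/* stop once iter's parent is root */

	return 0;
}

/* root <- host bridge <- endpoint */
static int walk_direct_attach(struct access_coordinate *out)
{
	struct toy_port root = { 0 };
	struct toy_port hb = { .parent = &root };
	struct toy_dport hb_dport = { { 1800, 1800, 30, 30 }, 10 };
	struct toy_port ep = { .parent = &hb, .parent_dport = &hb_dport };

	*out = (struct access_coordinate){ 2000, 1500, 100, 120 };
	return toy_get_perf_coordinates(&ep, out);
}

/* root <- host bridge <- switch <- endpoint */
static int walk_through_switch(struct access_coordinate *out, bool sw_data_missing)
{
	struct toy_port root = { 0 };
	struct toy_port hb = { .parent = &root };
	struct toy_dport hb_dport = { { 1800, 1800, 30, 30 }, 15 };
	struct toy_port sw = { .parent = &hb, .parent_dport = &hb_dport };
	struct toy_dport sw_dport = { { 1200, 1600, 20, 25 }, 10 };
	struct toy_port ep = { .parent = &sw, .parent_dport = &sw_dport };

	if (sw_data_missing)
		sw_dport.coord = (struct access_coordinate){ 0 };

	*out = (struct access_coordinate){ 2000, 1500, 100, 120 };
	return toy_get_perf_coordinates(&ep, out);
}
```

In this model the direct-attach walk takes one iteration and the switch walk takes two, each adding its own link_latency, and a dport with all-zero coordinates fails the whole walk with -EINVAL.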

> 
> 
>>
>>
>> 	do {
>> 		if (coordinates_invalid(&dport->sw_coord))
>> 			return -EINVAL;
>>
>> 		combine_coordinates(&c, &dport->sw_coord);
>> 		iter = to_cxl_port(iter->dev.parent);
>> 		dport = iter->parent_dport;
>> 	} while (!parent_port_is_cxl_root(iter));
>> 	/* Do final link updates */
>> 	c.write_latency += dport->link_latency;
>> 	c.read_latency += dport->link_latency;
>>
>>> +		if (!parent_port_is_cxl_root(iter)) {
>>> +			if (coordinates_invalid(&dport->sw_coord))
>>> +				return -EINVAL;
>>> +
>>> +			combine_coordinates(&c, &dport->sw_coord);
>>> +		}
>>> +
>>>  		c.write_latency += dport->link_latency;
>>>  		c.read_latency += dport->link_latency;
>>> -
>>> -		iter = to_cxl_port(iter->dev.parent);
>>>  		dport = iter->parent_dport;
>>>  	}
>>>  
>>>  	/* Augment with the generic port (host bridge) perf data */
>>> +	if (coordinates_invalid(&dport->hb_coord))
>>> +		return -EINVAL;
>>>  	combine_coordinates(&c, &dport->hb_coord);
>>>  
>>>  	/* Get the calculated PCI paths bandwidth */
>>
> 
