From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE0513BBCC for ; Wed, 1 May 2024 05:00:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.15 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714539607; cv=none; b=V3sFB0wKtWsCTrUp89lIisx0cV7eGLQvDoYCpKX+yy+441YZYecDIuH4RxQZlf9sk+Oe3cmjYaLVr3jQchz95JkF1JJVGiuNTWwnJpk2WJ4POE3+3VYSutGfjJNX3rHinoBAnpPN6betkAuXISMaiPzR9lSv/mo95vLLP4D5YII= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714539607; c=relaxed/simple; bh=5UFGOuOlFDtxfNPx5Rv0YnLHi9WncpO+3O+AfldGTzI=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=dsjhXaBkpPKOAByFHHCYpHV2UOsSqoUQyaM33BF6ckm6KcbVRO2JiTbWrlN3oSi0AaSYT8exyDMenGhk9Vt+I9EtatQ5dK+FP6vw1633Ca1bez3hy5mp3yQNWK87RNKWoqlocIw7ZdFHHnxBMFXRneeHYEx4xyBeJK/TlG419GQ= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=itKdeDTX; arc=none smtp.client-ip=198.175.65.15 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="itKdeDTX" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1714539604; x=1746075604; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=5UFGOuOlFDtxfNPx5Rv0YnLHi9WncpO+3O+AfldGTzI=; b=itKdeDTXuRcI87jWujj3mSTpdKAA7yiAuc1CJRECjC81W3TRuExYb0CC TXrDc0H6SeynWDaeqGqhH/l5OJ97gaiIfMWsJapopBIuhWSIhZ+dvvScD SAc/P0APJmmRa5+nsPBb68f7X2CwIR2MhjnCHZ/GRpk2U6cAhIAvKmL54 VV2PciGN03NTqukXzDZap1V8ZjNf2+jhsXEeeNSriALwdPs/VChMOqUtB 3pCwfgDi1EZK+fVC5huew05pW824UiJ9LwQDB20UfEteR+xV3QXm2OOOA dc/qnragTU2j+G9N+jeMqkn9Lsvxek76leRmth1pDVE6KOBJE2kSr70cp Q==; X-CSE-ConnectionGUID: zqWfcQRVRg+SALEKBl6aMw== X-CSE-MsgGUID: R1YnrokaT9uQZRADjk+SNw== X-IronPort-AV: E=McAfee;i="6600,9927,11060"; a="14046414" X-IronPort-AV: E=Sophos;i="6.07,244,1708416000"; d="scan'208";a="14046414" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orvoesa107.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Apr 2024 22:00:03 -0700 X-CSE-ConnectionGUID: VQa5RoGlTyKdH6awXY7sTw== X-CSE-MsgGUID: 39JG/BPwS2q2b25hzZ/86Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,244,1708416000"; d="scan'208";a="57570650" Received: from aschofie-mobl2.amr.corp.intel.com (HELO aschofie-mobl2) ([10.212.209.59]) by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Apr 2024 22:00:04 -0700 Date: Tue, 30 Apr 2024 22:00:02 -0700 From: Alison Schofield To: Dan Williams Cc: Davidlohr Bueso , Jonathan Cameron , Dave Jiang , Vishal Verma , Ira Weiny , linux-cxl@vger.kernel.org Subject: Re: [PATCH 1/3] cxl/acpi: Restore XOR'd position bits during address translation Message-ID: References: <662c4fef695d6_b6e029460@dwillia2-mobl3.amr.corp.intel.com.notmuch> Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: On Tue, Apr 30, 2024 at 08:41:19PM -0700, Alison Schofield wrote: > On Fri, Apr 26, 2024 at 06:07:59PM -0700, Dan Williams wrote: > > alison.schofield@ wrote: > > > From: Alison Schofield > > > > > > When a CXL region is created in a CXL Window (CFMWS) that uses XOR > > > interleave arithmetic XOR maps are applied during the HPA->DPA > > > translation. The XOR function changes the interleave selector > > > bit (aka position bit) in the HPA thereby varying which host bridge > > > services an HPA. The purpose is to minimize hot spots thereby > > > improving performance. > > > > > > When a device reports a DPA in events such as poison, general_media, > > > and dram, the driver translates that DPA back to an HPA. Presently, > > > the CXL driver translation only considers the modulo position and > > > will report the wrong HPA for XOR configured CFMWS's. > > > > > > Add a helper function that restores the XOR'd bits during DPA->HPA > > > address translation. Plumb a root decoder callback to the new helper > > > when XOR interleave arithmetic is in use. For MODULO arithmetic, just > > > let the callback be NULL - as in no extra work required. > > > > > > Fixes: 28a3ae4ff66c ("cxl/trace: Add an HPA to cxl_poison trace events") > > > Signed-off-by: Alison Schofield > > > --- > > > drivers/cxl/acpi.c | 49 +++++++++++++++++++++++++++++++++++++--- > > > drivers/cxl/core/port.c | 5 +++- > > > drivers/cxl/core/trace.c | 5 ++++ > > > drivers/cxl/cxl.h | 6 ++++- > > > 4 files changed, 60 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c > > > index af5cb818f84d..519e933b5a4b 100644 > > > --- a/drivers/cxl/acpi.c > > > +++ b/drivers/cxl/acpi.c > > > @@ -74,6 +74,44 @@ static struct cxl_dport *cxl_hb_xor(struct cxl_root_decoder *cxlrd, int pos) > > > return cxlrd->cxlsd.target[n]; > > > } > > > > > > +static u64 restore_xor_pos(u64 hpa, u64 map) > > > +{ > > > + int restore_value, restore_pos = 0; > > > + > > > + /* > > > + * Restore the position bit to its value before the > > > + * xormap was applied at HPA->DPA translation. > > > + * > > > + * restore_pos is the lowest set bit in the map > > > + * restore_value is the XORALLBITS in (hpa AND map) > > > > Might be worth finally clarifying why there is no "XOR" operation in > > this xor_pos routine, i.e. that XORALLBITS is identical to asking if the > > hweight of the (hpa & map) is odd or even. > > > > Thanks for the review Dan! > > Well, I would except that a few lines below, you suggest I use > an XOR operand ;) but I know what you mean. Edited in v2. > > Actually, I've abandoned this separate function in v2. Since > the LOC have been whittled down, seems useless. > > > > + */ > > > + > > > + while ((map & (1ULL << restore_pos)) == 0) > > > + restore_pos++; > > > > This is just open coded ffs()? > > > So it is! Using ffs() in v2. > > > > + > > > + restore_value = (hweight64(hpa & map) & 1); > > > + if (restore_value) > > > + hpa |= (1ULL << restore_pos); > > > + else > > > + hpa &= ~(1ULL << restore_pos); > > > > It feels like this conditional mask / set can just be an xor operation? > > > > hpa ^= ((hweight64(hpa & map) & 1) << restore_pos); > > I've taken the XOR operand piece, but didn't collapse into the > one-liner you suggest. See what you think in v1 please. > > > > > Otherwise I question the & and | operations relative to the HPA bit > > position already being 0 or 1. > > > > They are gone in next rev, but FWIW here's the truth about those ops: > > if restore_value == 1, using |= always sets the bit at restore_pos > like this: > restore_value current_value |= value > 1 0 1 > 1 1 1 > > if restore_value == 0, using &= always clears the bit at restore_pos > like this: > restore_value current_value |= value ^^ should be &= > 0 0 0 > 0 1 0 > I guess my own typo proves ^= is cleaner ;) > > > + > > > + return hpa; > > > +} > > > + > > > +static u64 cxl_xor_trans(struct cxl_root_decoder *cxlrd, u64 hpa, int iw) > > > > Ok, so the driver has cxl_trace_hpa() and now cxl_xor_trans() and > > "addr_trans". Can these all just be called "translate" because "trace" > > feels like tracing, "trans" is only saving 4 characters, and "addr" is > > redundant as nothing else needs translating besides addresses in CXL > > land. > > I think i've got it: > > Existing: > cxl_trace_hpa() -> cxl_translate() > > Introduced in this set: > cxl_addr_trans_fn -> cxl_translate_fn > addr_trans -> translate > cxl_xor_trans() -> cxl_xor_translate() > > > > > [..] > > > diff --git a/drivers/cxl/core/trace.c b/drivers/cxl/core/trace.c > > > index d0403dc3c8ab..a7ea4a256036 100644 > > > --- a/drivers/cxl/core/trace.c > > > +++ b/drivers/cxl/core/trace.c > > > @@ -36,6 +36,7 @@ static bool cxl_is_hpa_in_range(u64 hpa, struct cxl_region *cxlr, int pos) > > > static u64 cxl_dpa_to_hpa(u64 dpa, struct cxl_region *cxlr, > > > struct cxl_endpoint_decoder *cxled) > > > { > > > + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent); > > > u64 dpa_offset, hpa_offset, bits_upper, mask_upper, hpa; > > > struct cxl_region_params *p = &cxlr->params; > > > int pos = cxled->pos; > > > @@ -75,6 +76,10 @@ static u64 cxl_dpa_to_hpa(u64 dpa, struct cxl_region *cxlr, > > > /* Apply the hpa_offset to the region base address */ > > > hpa = hpa_offset + p->res->start; > > > > > > + /* An addr_trans helper is defined for XOR math */ > > > > Rather then calling out XOR math since that is an ACPI'ism I would just > > say something like: "root decoder overrides typical modulo decode" > > Got it. > > -- Alison >