From: "Wang, Xiao W" <xiao.w.wang@intel.com>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "paul.walmsley@sifive.com" <paul.walmsley@sifive.com>,
"palmer@dabbelt.com" <palmer@dabbelt.com>,
"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
"ardb@kernel.org" <ardb@kernel.org>,
"anup@brainfault.org" <anup@brainfault.org>,
"Li, Haicheng" <haicheng.li@intel.com>,
"ajones@ventanamicro.com" <ajones@ventanamicro.com>,
"Liu, Yujie" <yujie.liu@intel.com>,
"charlie@rivosinc.com" <charlie@rivosinc.com>,
"linux-riscv@lists.infradead.org"
<linux-riscv@lists.infradead.org>,
"linux-efi@vger.kernel.org" <linux-efi@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH v5 2/2] riscv: Optimize bitops with Zbb extension
Date: Sun, 12 Nov 2023 07:05:11 +0000
Message-ID: <DM8PR11MB575167A40A568B75E58E940AB8ACA@DM8PR11MB5751.namprd11.prod.outlook.com>
In-Reply-To: <CAMuHMdUQGtenM=_sNntW4mQ0K-7G=5_OhxG-AgQffMbR276W1w@mail.gmail.com>

Hi Geert,

> -----Original Message-----
> From: Geert Uytterhoeven <geert@linux-m68k.org>
> Sent: Friday, November 10, 2023 5:25 PM
> To: Wang, Xiao W <xiao.w.wang@intel.com>
> Cc: paul.walmsley@sifive.com; palmer@dabbelt.com;
> aou@eecs.berkeley.edu; ardb@kernel.org; anup@brainfault.org; Li, Haicheng
> <haicheng.li@intel.com>; ajones@ventanamicro.com; Liu, Yujie
> <yujie.liu@intel.com>; charlie@rivosinc.com; linux-riscv@lists.infradead.org;
> linux-efi@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v5 2/2] riscv: Optimize bitops with Zbb extension
>
> Hi Xiao,
>
> On Tue, Oct 31, 2023 at 7:37 AM Xiao Wang <xiao.w.wang@intel.com> wrote:
> > This patch leverages the alternative mechanism to dynamically optimize
> > bitops (including __ffs, __fls, ffs, fls) with Zbb instructions. When
> > Zbb ext is not supported by the runtime CPU, legacy implementation is
> > used. If Zbb is supported, then the optimized variants will be selected
> > via alternative patching.
> >
> > The legacy bitops support is taken from the generic C implementation as
> > fallback.
> >
> > If the parameter is a build-time constant, we leverage compiler builtin to
> > calculate the result directly, this approach is inspired by x86 bitops
> > implementation.
> >
> > EFI stub runs before the kernel, so alternative mechanism should not be
> > used there, this patch introduces a macro NO_ALTERNATIVE for this
> > purpose.
> >
> > Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
> > Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
>
> Thanks for your patch, which is now commit 457926b253200bd9 ("riscv:
> Optimize bitops with Zbb extension") in riscv/for-next.
>
> > --- a/arch/riscv/include/asm/bitops.h
> > +++ b/arch/riscv/include/asm/bitops.h
> > @@ -15,13 +15,261 @@
> > #include <asm/barrier.h>
> > #include <asm/bitsperlong.h>
> >
> > +#if !defined(CONFIG_RISCV_ISA_ZBB) || defined(NO_ALTERNATIVE)
> > #include <asm-generic/bitops/__ffs.h>
> > -#include <asm-generic/bitops/ffz.h>
> > -#include <asm-generic/bitops/fls.h>
> > #include <asm-generic/bitops/__fls.h>
> > +#include <asm-generic/bitops/ffs.h>
> > +#include <asm-generic/bitops/fls.h>
> > +
> > +#else
> > +#include <asm/alternative-macros.h>
> > +#include <asm/hwcap.h>
> > +
> > +#if (BITS_PER_LONG == 64)
> > +#define CTZW "ctzw "
> > +#define CLZW "clzw "
> > +#elif (BITS_PER_LONG == 32)
> > +#define CTZW "ctz "
> > +#define CLZW "clz "
> > +#else
> > +#error "Unexpected BITS_PER_LONG"
> > +#endif
> > +
> > +static __always_inline unsigned long variable__ffs(unsigned long word)
> > +{
> > + int num;
> > +
> > + asm_volatile_goto(ALTERNATIVE("j %l[legacy]", "nop", 0,
> > + RISCV_ISA_EXT_ZBB, 1)
> > + : : : : legacy);
> > +
> > + asm volatile (".option push\n"
> > + ".option arch,+zbb\n"
> > + "ctz %0, %1\n"
> > + ".option pop\n"
> > + : "=r" (word) : "r" (word) :);
> > +
> > + return word;
> > +
> > +legacy:
> > + num = 0;
> > +#if BITS_PER_LONG == 64
> > + if ((word & 0xffffffff) == 0) {
> > + num += 32;
> > + word >>= 32;
> > + }
> > +#endif
> > + if ((word & 0xffff) == 0) {
> > + num += 16;
> > + word >>= 16;
> > + }
> > + if ((word & 0xff) == 0) {
> > + num += 8;
> > + word >>= 8;
> > + }
> > + if ((word & 0xf) == 0) {
> > + num += 4;
> > + word >>= 4;
> > + }
> > + if ((word & 0x3) == 0) {
> > + num += 2;
> > + word >>= 2;
> > + }
> > + if ((word & 0x1) == 0)
> > + num += 1;
> > + return num;
> > +}
>
> Surely we can do better than duplicating include/asm-generic/bitops/__ffs.h?
>
> E.g. rename the generic implementation to generic___ffs():
>
> -static __always_inline unsigned long __ffs(unsigned long word)
> +static __always_inline unsigned long generic__ffs(unsigned long word)
> {
> ...
> }
>
> +#ifndef __ffs
> +#define __ffs(x) generic__ffs(x)
> +#endif
>
> and explicitly calling the generic one here?
>
> Same comment for the other functions.

Thanks for the suggestion. I tried your example above and got a build error:
"__ffs" is redefined. I think we can change the macro condition to:

+#ifndef __HAVE_ARCH___FFS
+#define __ffs(word) generic___ffs(word)
+#endif

Also, adding a "generic_" prefix to __ffs gives "generic___ffs" (three
underscores). There are similar API names in
include/asm-generic/bitops/generic-non-atomic.h.

I will send a patch for this.

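For illustration only, here is a minimal user-space sketch of the layering discussed above. It is not the patch itself. It assumes that everything lives in one file (in the kernel the generic part would stay in include/asm-generic/bitops/__ffs.h), that variable__ffs() is a placeholder with no Zbb assembly, and that the arch __ffs() macro folds constants with the GCC/Clang builtins __builtin_constant_p() and __builtin_ctzl().

```c
#include <assert.h>
#include <limits.h>

/* Arch side (hypothetical placement): announce an arch-specific __ffs()
 * before the generic code is seen, so the generic code does not define one. */
#define __HAVE_ARCH___FFS

/* Generic side: stand-in for asm-generic/bitops/__ffs.h after the rename.
 * Binary search for the lowest set bit; the result is undefined for word == 0. */
static inline unsigned long generic___ffs(unsigned long word)
{
	unsigned long num = 0;

#if ULONG_MAX > 0xffffffffUL
	if ((word & 0xffffffffUL) == 0) {
		num += 32;
		word >>= 32;
	}
#endif
	if ((word & 0xffff) == 0) {
		num += 16;
		word >>= 16;
	}
	if ((word & 0xff) == 0) {
		num += 8;
		word >>= 8;
	}
	if ((word & 0xf) == 0) {
		num += 4;
		word >>= 4;
	}
	if ((word & 0x3) == 0) {
		num += 2;
		word >>= 2;
	}
	if ((word & 0x1) == 0)
		num += 1;
	return num;
}

/* Generic side: only provide __ffs() when the arch did not claim it. */
#ifndef __HAVE_ARCH___FFS
#define __ffs(word) generic___ffs(word)
#endif

/* Arch side: placeholder for the runtime-patched variant. A real version
 * would try the Zbb "ctz" instruction first; this sketch only falls back. */
static inline unsigned long variable__ffs(unsigned long word)
{
	return generic___ffs(word);
}

/* Arch side: compile-time constants are folded by the compiler builtin,
 * everything else goes through variable__ffs(). */
#define __ffs(word)					\
	(__builtin_constant_p(word) ?			\
	 (unsigned long)__builtin_ctzl(word) :		\
	 variable__ffs(word))
```

With the guard in place the generic code never defines __ffs(), so the arch macro is the only definition and the fallback is reached by calling generic___ffs() explicitly instead of duplicating it.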
BRs,
Xiao
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
Thread overview: 6+ messages
2023-10-31 6:45 [PATCH v5 0/2] riscv: Optimize bitops with Zbb extension Xiao Wang
2023-10-31 6:45 ` [PATCH v5 1/2] riscv: Rearrange hwcap.h and cpufeature.h Xiao Wang
2023-10-31 6:45 ` [PATCH v5 2/2] riscv: Optimize bitops with Zbb extension Xiao Wang
2023-11-10 9:24 ` Geert Uytterhoeven
2023-11-12 7:05 ` Wang, Xiao W [this message]
2023-11-09 22:40 ` [PATCH v5 0/2] " patchwork-bot+linux-riscv