Date: Sun, 28 Nov 2021 19:05:14 +0100
From: Michał Mirosław
To: Yury Norov
Cc: linux-kernel@vger.kernel.org, "James E.J. Bottomley",
	"Martin K. Petersen", "Paul E. McKenney", "Rafael J. Wysocki",
	Alexander Shishkin, Alexey Klimov, Amitkumar Karwar, Andi Kleen,
	Andrew Lunn, Andrew Morton, Andy Gross, Andy Lutomirski,
	Andy Shevchenko, Anup Patel, Ard Biesheuvel,
	Arnaldo Carvalho de Melo, Arnd Bergmann, Borislav Petkov,
	Catalin Marinas, Christoph Hellwig, Christoph Lameter,
	Daniel Vetter, Dave Hansen, David Airlie, David Laight,
	Dennis Zhou, Dinh Nguyen, Geetha sowjanya, Geert Uytterhoeven,
	Greg Kroah-Hartman, Guo Ren, Hans de Goede, Heiko Carstens,
	Ian Rogers, Ingo Molnar, Jakub Kicinski, Jason Wessel, Jens Axboe,
	Jiri Olsa, Jonathan Cameron, Juri Lelli, Kalle Valo, Kees Cook,
	Krzysztof Kozlowski, Lee Jones, Marc Zyngier, Marcin Wojtas,
	Mark Gross, Mark Rutland, Matti Vaittinen, Mauro Carvalho Chehab,
	Mel Gorman, Michael Ellerman, Mike Marciniszyn, Nicholas Piggin,
	Palmer Dabbelt, Peter Zijlstra, Petr Mladek, Randy Dunlap,
	Rasmus Villemoes, Roy Pledge, Russell King, Saeed Mahameed,
	Sagi Grimberg, Sergey Senozhatsky, Solomon Peachy, Stephen Boyd,
	Stephen Rothwell, Steven Rostedt, Subbaraya Sundeep, Sudeep Holla,
	Sunil Goutham, Tariq Toukan, Tejun Heo, Thomas Bogendoerfer,
	Thomas Gleixner, Ulf Hansson, Vincent Guittot, Vineet Gupta,
	Viresh Kumar, Vivien Didelot, Vlastimil Babka, Will Deacon,
	bcm-kernel-feedback-list@broadcom.com, kvm@vger.kernel.org,
	linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-crypto@vger.kernel.org, linux-csky@vger.kernel.org,
	linux-ia64@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-mm@kvack.org, linux-perf-users@vger.kernel.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	linux-snps-arc@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage

On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> In many cases people use bitmap_weight()-based functions like this:
>
> 	if (num_present_cpus() > 1)
> 		do_something();
>
> This may take considerable amount of time on many-cpus machines because
> num_present_cpus() will traverse every word of underlying cpumask
> unconditionally.
>
> We can significantly improve on it for many real cases if stop traversing
> the mask as soon as we count present cpus to any number greater than 1:
>
> 	if (num_present_cpus_gt(1))
> 		do_something();
>
> To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> functions together with corresponding wrappers in cpumask and nodemask.

Having slept on it, I have more structured thoughts:

First, I like substituting bitmap_empty/full where possible - I think
that change stands on its own, so it could be split out and sent as is.

I don't like the proposed API very much. One problem is that it hides
the comparison operator and makes call sites less readable:

	bitmap_weight(...) > N

becomes:

	bitmap_weight_gt(..., N)

and:

	bitmap_weight(...) <= N

becomes:

	bitmap_weight_lt(..., N+1)

or:

	!bitmap_weight_gt(..., N)

I'd rather see something resembling the memcmp() API, which is well
known enough to be easier to grasp. For the above examples:

	bitmap_weight_cmp(..., N) > 0
	bitmap_weight_cmp(..., N) <= 0
	...

This would also make the implementation easier, as the code would not
have to be copied and pasted three times. It could also use a simple
optimization that reduces code size:

	#include <linux/bitops.h>
	#include <linux/overflow.h>

	int bitmap_weight_cmp(const unsigned long *bits, size_t nbits, size_t cmp)
	{
		/* cmp counts down; an underflow means weight > cmp */
		for (size_t i = 0; i < nbits / BITS_PER_LONG; ++i, ++bits)
			if (check_sub_overflow(cmp, hweight_long(*bits), &cmp))
				return 1;

		/* partial last word: count only the low nbits bits */
		nbits %= BITS_PER_LONG;
		if (nbits && check_sub_overflow(cmp,
				hweight_long(*bits & GENMASK(nbits - 1, 0)), &cmp))
			return 1;

		return cmp ? -1 : 0;
	}

Best Regards
Michał Mirosław
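
For experimenting with the memcmp()-style idea outside the kernel, here is a minimal user-space sketch. It is an illustration under assumptions, not the kernel implementation: the local BITS_PER_LONG macro and the GCC/Clang builtin __builtin_popcountl() stand in for the kernel helpers, and an explicit comparison replaces check_sub_overflow(). The function name is reused from the mail only for readability.

```c
#include <limits.h>
#include <stddef.h>

/* Local stand-in for the kernel macro (assumption for this sketch). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Compare the number of set bits in the first nbits of a bitmap with num,
 * memcmp()-style: returns 1 if weight > num, 0 if equal, -1 if weight < num.
 * Stops scanning as soon as the weight is known to exceed num.
 */
int bitmap_weight_cmp(const unsigned long *bits, size_t nbits, size_t num)
{
	size_t full = nbits / BITS_PER_LONG;	/* number of whole words */
	size_t rest = nbits % BITS_PER_LONG;	/* bits used in the last word */
	size_t i, w;

	for (i = 0; i < full; i++) {
		w = (size_t)__builtin_popcountl(bits[i]);
		if (w > num)
			return 1;	/* early exit: weight already exceeds num */
		num -= w;		/* num is the remaining budget */
	}

	if (rest) {
		/* rest < BITS_PER_LONG here, so the shift is well defined */
		unsigned long mask = (1UL << rest) - 1;

		w = (size_t)__builtin_popcountl(bits[full] & mask);
		if (w > num)
			return 1;
		num -= w;
	}

	return num ? -1 : 0;
}
```

With this shape, a check such as "weight > 1" reads as bitmap_weight_cmp(bits, nbits, 1) > 0, and "weight <= 1" as bitmap_weight_cmp(bits, nbits, 1) <= 0, so the comparison operator stays visible at the call site.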