Subject: Re: [PATCH v5 21/23] powerpc: Simplify test in __dma_sync()
To: Denis Kirjanov
References: <42d5343703a9e67b5a2d94c8877bc0098448f71b.1454538980.git.christophe.leroy@c-s.fr> <56B35537.3050708@c-s.fr>
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
From: Christophe Leroy
Message-ID: <56B455C3.6020702@c-s.fr>
Date: Fri, 5 Feb 2016 08:56:51 +0100

On 05/02/2016 08:52, Denis Kirjanov wrote:
> On 2/4/16, Christophe Leroy wrote:
>>
>> On 04/02/2016 12:37, Denis Kirjanov wrote:
>>> On 2/4/16, Christophe Leroy wrote:
>>>> This simplification helps the compiler. We now have only one test
>>>> instead of two, so it reduces the number of branches.
>>>>
>>>> Signed-off-by: Christophe Leroy
>>>> ---
>>>> v2: new
>>>> v3: no change
>>>> v4: no change
>>>> v5: no change
>>>>
>>>>  arch/powerpc/mm/dma-noncoherent.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
>>>> index 169aba4..2dc74e5 100644
>>>> --- a/arch/powerpc/mm/dma-noncoherent.c
>>>> +++ b/arch/powerpc/mm/dma-noncoherent.c
>>>> @@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
>>>>  		 * invalidate only when cache-line aligned otherwise there is
>>>>  		 * the potential for discarding uncommitted data from the cache
>>>>  		 */
>>>> -		if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
>>>> +		if ((start | end) & (L1_CACHE_BYTES - 1))
>>>>  			flush_dcache_range(start, end);
>>>>  		else
>>>>  			invalidate_dcache_range(start, end);
>>> The previous version of the cache-line alignment check reads perfectly
>>> fine.
>>> What's the benefit of this micro optimization?
>> With this optimisation we avoid one unnecessary test and two associated
>> jumps. Taking into account that __dma_sync() is one of the top ten CPU
>> consumers, I believe it is worth it:
>>
>>
> Yeah, looks better. Did you compile the kernel with default compiler flags?
>
> Thanks!

Yes I did

Christophe
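
For readers following the thread, here is a minimal standalone sketch of why the old and new tests in the patch should select the same branch. It is not kernel code. It assumes that `end` equals `start + size` in `__dma_sync()` and that the cache line size is a power of two. `DEMO_L1_CACHE_BYTES`, `needs_flush_old()`, `needs_flush_new()` and `count_mismatches()` are hypothetical names used only for this illustration.

```c
#include <assert.h>	/* only needed for the usage check described below */
#include <stdbool.h>
#include <stddef.h>

/* Assumption: a 32-byte cache line, picked only for this demo. */
#define DEMO_L1_CACHE_BYTES 32UL

/* Old test: two separate conditions joined by a short-circuit OR. */
static bool needs_flush_old(unsigned long start, size_t size)
{
	return (start & (DEMO_L1_CACHE_BYTES - 1)) ||
	       (size & (DEMO_L1_CACHE_BYTES - 1));
}

/* New test: one mask applied to the OR of both range bounds. */
static bool needs_flush_new(unsigned long start, size_t size)
{
	/* Assumption: end is computed as start + size. */
	unsigned long end = start + size;

	return ((start | end) & (DEMO_L1_CACHE_BYTES - 1)) != 0;
}

/* Count the (start, size) pairs on which the two tests disagree. */
static unsigned long count_mismatches(unsigned long max_start, size_t max_size)
{
	unsigned long mismatches = 0;

	for (unsigned long start = 0; start < max_start; start++)
		for (size_t size = 0; size < max_size; size++)
			if (needs_flush_old(start, size) !=
			    needs_flush_new(start, size))
				mismatches++;

	return mismatches;
}
```

Calling `assert(count_mismatches(256, 256) == 0);` from a test harness checks the equivalence over a small range. The reasoning is that when `start` is unaligned both tests are true, and when `start` is aligned the low bits of `end` are exactly the low bits of `size`. The single-mask form avoids the second conditional branch that the short-circuit `||` may otherwise generate.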