All the mail mirrored from lore.kernel.org
* [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9
@ 2015-06-11 11:52 Thomas Petazzoni
  2015-06-15 10:42 ` Dirk Behme
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Thomas Petazzoni @ 2015-06-11 11:52 UTC (permalink / raw)
  To: linux-arm-kernel

The Cortex-A9 has an L1 prefetch capability documented at
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0388e/Chdejhgd.html:

  The Cortex-A9 data cache implements an automatic prefetcher that
  monitors cache misses done by the processor. This unit can monitor
  and prefetch two independent data streams. It can be activated in
  software using a CP15 Auxiliary Control Register bit. See Auxiliary
  Control Register.

This commit enables this L1 prefetch feature unconditionally on all
Cortex-A9 by setting bit 2 in the Auxiliary Control CP15
register. Note that since this bit only exists on Cortex-A9 but not on
Cortex-A5 or Cortex-R7, we separate the handling of Cortex-A9 from the
one of those two other cores.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 arch/arm/mm/proc-v7.S | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 3d1054f..106ea4d 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -257,8 +257,11 @@ ENDPROC(cpu_pj4b_do_resume)
  *	It is assumed that:
  *	- cache type register is implemented
  */
-__v7_ca5mp_setup:
 __v7_ca9mp_setup:
+	mov	r10, #(1 << 0)			@ Cache/TLB ops broadcasting
+	orr	r10, r10, #(1 << 2)		@ L1 prefetch
+	b	1f
+__v7_ca5mp_setup:
 __v7_cr7mp_setup:
 	mov	r10, #(1 << 0)			@ Cache/TLB ops broadcasting
 	b	1f
-- 
2.1.0
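
For readers who do not read ARM assembly, the selection made by the patch can be sketched in C. This is only an illustration of the logic, not kernel code: the enum, the macro names and the function name are hypothetical, and the bit positions are taken from the comments in the diff above (bit 0 for cache/TLB ops broadcasting, bit 2 for L1 prefetch).

```c
#include <stdint.h>

/* Bit positions as described in the patch comments (names are hypothetical). */
#define ACTLR_FW          (UINT32_C(1) << 0)  /* Cache/TLB ops broadcasting */
#define ACTLR_L1_PREFETCH (UINT32_C(1) << 2)  /* L1 prefetch, Cortex-A9 only */

/* Hypothetical core identifiers, used only for this illustration. */
enum v7_core { CORE_CA5MP, CORE_CA9MP, CORE_CR7MP };

/* Return the CPU-specific bits that the setup path places in r10. */
static uint32_t actlr_cpu_bits(enum v7_core core)
{
	uint32_t bits = ACTLR_FW;           /* all three cores get bit 0 */

	if (core == CORE_CA9MP)
		bits |= ACTLR_L1_PREFETCH;  /* only the A9 also gets bit 2 */

	return bits;
}
```

Under these assumptions the Cortex-A9 path yields 0x5, while the Cortex-A5 and Cortex-R7 paths keep the previous value of 0x1.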


* [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9
  2015-06-11 11:52 [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9 Thomas Petazzoni
@ 2015-06-15 10:42 ` Dirk Behme
  2015-06-15 14:56   ` Thomas Petazzoni
  2015-06-15 11:11 ` Russell King - ARM Linux
  2015-06-16 13:47 ` Rob Herring
  2 siblings, 1 reply; 9+ messages in thread
From: Dirk Behme @ 2015-06-15 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

On 11.06.2015 13:52, Thomas Petazzoni wrote:
> The Cortex-A9 has an L1 prefetch capability documented at
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0388e/Chdejhgd.html:
>
>    The Cortex-A9 data cache implements an automatic prefetcher that
>    monitors cache misses done by the processor. This unit can monitor
>    and prefetch two independent data streams. It can be activated in
>    software using a CP15 Auxiliary Control Register bit. See Auxiliary
>    Control Register.
>
> This commit enables this L1 prefetch feature unconditionally on all
> Cortex-A9 by setting bit 2 in the Auxiliary Control CP15
> register. Note that since this bit only exists on Cortex-A9 but not on
> Cortex-A5 or Cortex-R7, we separate the handling of Cortex-A9 from the
> one of those two other cores.


Have you observed or measured any performance improvements or changes 
using this change?

Best regards

Dirk


> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>   arch/arm/mm/proc-v7.S | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 3d1054f..106ea4d 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -257,8 +257,11 @@ ENDPROC(cpu_pj4b_do_resume)
>    *	It is assumed that:
>    *	- cache type register is implemented
>    */
> -__v7_ca5mp_setup:
>   __v7_ca9mp_setup:
> +	mov	r10, #(1 << 0)			@ Cache/TLB ops broadcasting
> +	orr	r10, r10, #(1 << 2)		@ L1 prefetch
> +	b	1f
> +__v7_ca5mp_setup:
>   __v7_cr7mp_setup:
>   	mov	r10, #(1 << 0)			@ Cache/TLB ops broadcasting
>   	b	1f


* [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9
  2015-06-11 11:52 [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9 Thomas Petazzoni
  2015-06-15 10:42 ` Dirk Behme
@ 2015-06-15 11:11 ` Russell King - ARM Linux
  2015-06-15 14:57   ` Thomas Petazzoni
  2015-06-16 13:47 ` Rob Herring
  2 siblings, 1 reply; 9+ messages in thread
From: Russell King - ARM Linux @ 2015-06-15 11:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 11, 2015 at 01:52:30PM +0200, Thomas Petazzoni wrote:
> The Cortex-A9 has an L1 prefetch capability documented at
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0388e/Chdejhgd.html:
> 
>   The Cortex-A9 data cache implements an automatic prefetcher that
>   monitors cache misses done by the processor. This unit can monitor
>   and prefetch two independent data streams. It can be activated in
>   software using a CP15 Auxiliary Control Register bit. See Auxiliary
>   Control Register.
> 
> This commit enables this L1 prefetch feature unconditionally on all
> Cortex-A9 by setting bit 2 in the Auxiliary Control CP15
> register. Note that since this bit only exists on Cortex-A9 but not on
> Cortex-A5 or Cortex-R7, we separate the handling of Cortex-A9 from the
> one of those two other cores.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

I'd prefer not to take this until after the next merge window, because I
don't want to deal with conflicts that this may cause with other branches
in my tree.

We're at -rc8 now, only a week away from -final, now is not really the
time to be taking new code into git trees anyway.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


* [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9
  2015-06-15 10:42 ` Dirk Behme
@ 2015-06-15 14:56   ` Thomas Petazzoni
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Petazzoni @ 2015-06-15 14:56 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Dirk Behme,

On Mon, 15 Jun 2015 12:42:29 +0200, Dirk Behme wrote:

> > This commit enables this L1 prefetch feature unconditionally on all
> > Cortex-A9 by setting bit 2 in the Auxiliary Control CP15
> > register. Note that since this bit only exists on Cortex-A9 but not on
> > Cortex-A5 or Cortex-R7, we separate the handling of Cortex-A9 from the
> > one of those two other cores.
> 
> Have you observed or measured any performance improvements or changes 
> using this change?

No, I haven't done any measurement myself, I merely wanted to propagate
some of the optimizations that were implemented in the Marvell BSP and
open the discussion around enabling the L1 prefetching feature of the
A9.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


* [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9
  2015-06-15 11:11 ` Russell King - ARM Linux
@ 2015-06-15 14:57   ` Thomas Petazzoni
  2015-06-15 15:05     ` Russell King - ARM Linux
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Petazzoni @ 2015-06-15 14:57 UTC (permalink / raw)
  To: linux-arm-kernel

Russell,

On Mon, 15 Jun 2015 12:11:03 +0100, Russell King - ARM Linux wrote:

> I'd prefer not to take this until after the next merge window, because I
> don't want to deal with conflicts that this may cause with other branches
> in my tree.
> 
> We're at -rc8 now, only a week away from -final, now is not really the
> time to be taking new code into git trees anyway.

I should have mentioned that explicitly, but I clearly didn't expect
this patch to be taken for the upcoming 4.2 merge window as it's way
too late. Targeting 4.3 is perfectly fine.

However, could you consider my "ARM: smp_scu: enable coherent speculative
linefills" patch, which was sent almost three months ago:

  http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/332106.html

Thanks,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


* [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9
  2015-06-15 14:57   ` Thomas Petazzoni
@ 2015-06-15 15:05     ` Russell King - ARM Linux
  2015-06-15 15:15       ` Thomas Petazzoni
  0 siblings, 1 reply; 9+ messages in thread
From: Russell King - ARM Linux @ 2015-06-15 15:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 15, 2015 at 04:57:53PM +0200, Thomas Petazzoni wrote:
> Russell,
> 
> On Mon, 15 Jun 2015 12:11:03 +0100, Russell King - ARM Linux wrote:
> 
> > I'd prefer not to take this until after the next merge window, because I
> > don't want to deal with conflicts that this may cause with other branches
> > in my tree.
> > 
> > We're at -rc8 now, only a week away from -final, now is not really the
> > time to be taking new code into git trees anyway.
> 
> I should have mentioned that explicitly, but I clearly didn't expect
> this patch to be taken for the upcoming 4.2 merge window as it's way
> too late. Targeting 4.3 is perfectly fine.
> 
> However, can you consider my "ARM: smp_scu: enable coherent speculative
> linefills", which has been sent almost three months ago:
> 
>   http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/332106.html

I looked at that, and decided I didn't have enough information to know
whether that's a good idea to apply or not for all the different variants
we have out there.  It's something that needs to be merged early in the
cycle to give it enough time to be (hopefully) tested by people.

A lot of these "enable optimisation X" patches need to go through that
treatment.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


* [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9
  2015-06-15 15:05     ` Russell King - ARM Linux
@ 2015-06-15 15:15       ` Thomas Petazzoni
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Petazzoni @ 2015-06-15 15:15 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Mon, 15 Jun 2015 16:05:22 +0100, Russell King - ARM Linux wrote:

> > However, can you consider my "ARM: smp_scu: enable coherent speculative
> > linefills", which has been sent almost three months ago:
> > 
> >   http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/332106.html
> 
> I looked at that, and decided I didn't have enough information to know
> whether that's a good idea to apply or not for all the different variants
> we have out there.  It's something that needs to be merged early in the
> cycle to give it enough time to be (hopefully) tested by people.
> 
> A lot of these "enable optimisation X" patches need to go through that
> treatment.

Makes perfect sense. Can we get these "enable optimization X" patches
merged early enough in the 4.3 cycle, i.e. right after 4.2-rc1 is
released?

Thanks a lot,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


* [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9
  2015-06-11 11:52 [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9 Thomas Petazzoni
  2015-06-15 10:42 ` Dirk Behme
  2015-06-15 11:11 ` Russell King - ARM Linux
@ 2015-06-16 13:47 ` Rob Herring
  2015-06-16 14:10   ` Thomas Petazzoni
  2 siblings, 1 reply; 9+ messages in thread
From: Rob Herring @ 2015-06-16 13:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 11, 2015 at 6:52 AM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> The Cortex-A9 has an L1 prefetch capability documented at
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0388e/Chdejhgd.html:
>
>   The Cortex-A9 data cache implements an automatic prefetcher that
>   monitors cache misses done by the processor. This unit can monitor
>   and prefetch two independent data streams. It can be activated in
>   software using a CP15 Auxiliary Control Register bit. See Auxiliary
>   Control Register.
>
> This commit enables this L1 prefetch feature unconditionally on all
> Cortex-A9 by setting bit 2 in the Auxiliary Control CP15
> register. Note that since this bit only exists on Cortex-A9 but not on
> Cortex-A5 or Cortex-R7, we separate the handling of Cortex-A9 from the
> one of those two other cores.

Does this work in non-secure mode? Perhaps it is just ignored. If so,
it deserves a comment along the lines of "This bit should be set by
the bootloader and setting it has no effect in non-secure mode."

Rob

> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  arch/arm/mm/proc-v7.S | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 3d1054f..106ea4d 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -257,8 +257,11 @@ ENDPROC(cpu_pj4b_do_resume)
>   *     It is assumed that:
>   *     - cache type register is implemented
>   */
> -__v7_ca5mp_setup:
>  __v7_ca9mp_setup:
> +       mov     r10, #(1 << 0)                  @ Cache/TLB ops broadcasting
> +       orr     r10, r10, #(1 << 2)             @ L1 prefetch
> +       b       1f
> +__v7_ca5mp_setup:
>  __v7_cr7mp_setup:
>         mov     r10, #(1 << 0)                  @ Cache/TLB ops broadcasting
>         b       1f
> --
> 2.1.0
>
>


* [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9
  2015-06-16 13:47 ` Rob Herring
@ 2015-06-16 14:10   ` Thomas Petazzoni
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Petazzoni @ 2015-06-16 14:10 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Rob Herring,

On Tue, 16 Jun 2015 08:47:33 -0500, Rob Herring wrote:

> > This commit enables this L1 prefetch feature unconditionally on all
> > Cortex-A9 by setting bit 2 in the Auxiliary Control CP15
> > register. Note that since this bit only exists on Cortex-A9 but not on
> > Cortex-A5 or Cortex-R7, we separate the handling of Cortex-A9 from the
> > one of those two other cores.
> 
> Does this work in non-secure mode? Perhaps it is just ignored. If so,
> it deserves a comment along the lines of "This bit should be set by
> the bootloader and setting it has no effect in non-secure mode."

According to
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0388i/CIHCHFCG.html,
bit 0 (which was already set by the code before my patch) and bit 2
don't have any difference in terms of secure/non-secure access.

The CP15 register as a whole is read-only in non-secure state if
NSACR.NS_SMP = 0. If NSACR.NS_SMP = 1, all bits are write-ignored in
non-secure state, except the SMP bit (bit 6).

So I assume that since this piece of code already sets bit 0, setting
bit 2 in the same place is also OK. However, writing bit 0 or bit 2 is
probably going to raise an undefined instruction exception in non-secure
state when NSACR.NS_SMP = 0, but that is not a behavior introduced by my
patch.
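
To make the access rules above easier to follow, here is a small C model of them. It is a sketch of my reading of the documentation, not something verified on hardware: the struct, the function and the macro names are hypothetical, the fault in the NS_SMP = 0 case is only the guess stated above, and the model assumes that a secure write updates every bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define ACTLR_SMP (UINT32_C(1) << 6)  /* SMP bit, bit 6 as stated above */

/* Hypothetical result type for this illustration. */
struct actlr_write_result {
	uint32_t value;    /* register contents after the attempted write */
	bool     faulted;  /* true if the write is expected to trap */
};

/* Model a write of 'requested' to a register currently holding 'old'. */
static struct actlr_write_result
actlr_model_write(uint32_t old, uint32_t requested, bool secure, bool ns_smp)
{
	struct actlr_write_result r = { old, false };

	if (secure) {
		/* Assumption: a secure write updates all bits. */
		r.value = requested;
	} else if (!ns_smp) {
		/* Read-only in non-secure state; guessed to raise an exception. */
		r.faulted = true;
	} else {
		/* Only the SMP bit is writable; other bits are write-ignored. */
		r.value = (old & ~ACTLR_SMP) | (requested & ACTLR_SMP);
	}
	return r;
}
```

In this model a non-secure write of 0x45 with NS_SMP = 1 only changes bit 6, so bits 0 and 2 are handled identically, which is the point made above.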

Thanks for the review!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


Thread overview: 9+ messages
2015-06-11 11:52 [PATCH] ARM: mm: enable L1 prefetch on Cortex-A9 Thomas Petazzoni
2015-06-15 10:42 ` Dirk Behme
2015-06-15 14:56   ` Thomas Petazzoni
2015-06-15 11:11 ` Russell King - ARM Linux
2015-06-15 14:57   ` Thomas Petazzoni
2015-06-15 15:05     ` Russell King - ARM Linux
2015-06-15 15:15       ` Thomas Petazzoni
2015-06-16 13:47 ` Rob Herring
2015-06-16 14:10   ` Thomas Petazzoni
