* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
@ 2015-07-08 15:51 dinguyen at opensource.altera.com
  2015-07-08 16:51 ` Russell King - ARM Linux
  0 siblings, 1 reply; 13+ messages in thread
From: dinguyen at opensource.altera.com @ 2015-07-08 15:51 UTC (permalink / raw)
  To: linux-arm-kernel

From: Dinh Nguyen <dinguyen@opensource.altera.com>

The commit "02b4e2756e01 ARM: v7 setup function should invalidate L1 cache"
caused the SoCFPGA to not boot reliably. About 20% of the time (roughly
1 in 5 boots), booting the platform would cause this kernel panic:

CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted
4.1.0-rc8-next-20150617-00002-gdd1f624 #1
Hardware name: Altera SOCFPGA
task: eecaeac0 ti: eecce000 task.ti: eecce000
PC is at vfp_notifier+0x58/0x12c
LR is at notifier_call_chain+0x44/0x84
pc : [<c000a6bc>]    lr : [<c003d134>]    psr: 80000193
sp : eeccff48  ip : c06563c8  fp : eeccffd4
r10: eecaef80  r9 : ef1f1300  r8 : 00000002
r7 : eecd0000  r6 : c0656bc0  r5 : 00000000  r4 : eecd0000
r3 : c000a664  r2 : eecd0000  r1 : 00000002  r0 : c06563c8
Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5387d  Table: 0000404a  DAC: 00000015
Process swapper/1 (pid: 0, stack limit = 0xeecce218)
Stack: (0xeeccff48 to 0xeecd0000)
ff40:                   c000a664 ffffffff 00000000 c003d134 eecd0018 eecaeac0
ff60: c06648e0 0b52d2f9 c048cfa8 c003d18c 00000000 f0002100 00000001 c003d1ac
ff80: 00000000 eecaeac0 c064f300 c001369c c064b304 c0013140 00000000 ef1ed328
ffa0: eeccffe8 c001e760 c0486ec4 2eba2000 c06957c0 c06524dc 00000015 c06957c0
ffc0: c048c778 c064b304 c06957c0 00000000 eeccffdc c0486ec4 eeccffe4 c0487138
ffe0: 00000001 c00544e8 c0009494 c0697bc0 00000000 000094ac 7ef5bffd 3f39b3f8
[<c000a6bc>] (vfp_notifier) from [<c003d134>] (notifier_call_chain+0x44/0x84)
[<c003d134>] (notifier_call_chain) from [<c003d18c>]
(__atomic_notifier_call_chain+0x18/0x20)
[<c003d18c>] (__atomic_notifier_call_chain) from [<c003d1ac>]
(atomic_notifier_call_chain+0x18/0x20)
[<c003d1ac>] (atomic_notifier_call_chain) from [<c001369c>]
(__switch_to+0x34/0x58)
Code: e3a03002 e5843208 e3a00000 e8bd8038 (eef85a10)
---[ end trace 9eaea9661b3b550a ]---
Kernel panic - not syncing: Attempted to kill the idle task!
SMP: failed to stop secondary CPUs
---[ end Kernel panic - not syncing: Attempted to kill the idle task!

So this patch puts back the call to v7_invalidate_l1 in the secondary_startup
path, and the platform is now able to boot up reliably.

Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
---
 arch/arm/mach-socfpga/core.h    |    1 +
 arch/arm/mach-socfpga/headsmp.S |    5 +++++
 arch/arm/mach-socfpga/platsmp.c |    2 +-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-socfpga/core.h b/arch/arm/mach-socfpga/core.h
index 7259c37..30d7988 100644
--- a/arch/arm/mach-socfpga/core.h
+++ b/arch/arm/mach-socfpga/core.h
@@ -33,6 +33,7 @@
 
 #define RSTMGR_MPUMODRST_CPU1		0x2     /* CPU1 Reset */
 
+extern void socfpga_secondary_startup(void);
 extern void socfpga_init_clocks(void);
 extern void socfpga_sysmgr_init(void);
 
diff --git a/arch/arm/mach-socfpga/headsmp.S b/arch/arm/mach-socfpga/headsmp.S
index 5d94b7a..a521413 100644
--- a/arch/arm/mach-socfpga/headsmp.S
+++ b/arch/arm/mach-socfpga/headsmp.S
@@ -33,3 +33,8 @@ ARM_BE8(rev	r4, r4)
 1:	.long	.
 	.long	socfpga_cpu1start_addr
 ENTRY(secondary_trampoline_end)
+
+ENTRY(socfpga_secondary_startup)
+	bl      v7_invalidate_l1
+	b       secondary_startup
+ENDPROC(socfpga_secondary_startup)
diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach-socfpga/platsmp.c
index c6f1df8..7ed6127 100644
--- a/arch/arm/mach-socfpga/platsmp.c
+++ b/arch/arm/mach-socfpga/platsmp.c
@@ -40,7 +40,7 @@ static int socfpga_boot_secondary(unsigned int cpu, struct task_struct *idle)
 
 		memcpy(phys_to_virt(0), &secondary_trampoline, trampoline_size);
 
-		writel(virt_to_phys(secondary_startup),
+		writel(virt_to_phys(socfpga_secondary_startup),
 		       sys_manager_base_addr + (socfpga_cpu1start_addr & 0x000000ff));
 
 		flush_cache_all();
-- 
1.7.9.5


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-08 15:51 [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup dinguyen at opensource.altera.com
@ 2015-07-08 16:51 ` Russell King - ARM Linux
  2015-07-08 19:13   ` Dinh Nguyen
  0 siblings, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2015-07-08 16:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 08, 2015 at 10:51:51AM -0500, dinguyen at opensource.altera.com wrote:
> From: Dinh Nguyen <dinguyen@opensource.altera.com>
> 
> The commit "02b4e2756e01 ARM: v7 setup function should invalidate L1 cache"
> caused the SoCFPGA to not boot reliably. About 20% of the time or roughly
> (1 in 5), booting the platform would cause this kernel panic:
> 
> CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
> Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
> Modules linked in:
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted
> 4.1.0-rc8-next-20150617-00002-gdd1f624 #1
> Hardware name: Altera SOCFPGA
> task: eecaeac0 ti: eecce000 task.ti: eecce000
> PC is at vfp_notifier+0x58/0x12c
> LR is at notifier_call_chain+0x44/0x84
> pc : [<c000a6bc>]    lr : [<c003d134>]    psr: 80000193
> sp : eeccff48  ip : c06563c8  fp : eeccffd4
> r10: eecaef80  r9 : ef1f1300  r8 : 00000002
> r7 : eecd0000  r6 : c0656bc0  r5 : 00000000  r4 : eecd0000
> r3 : c000a664  r2 : eecd0000  r1 : 00000002  r0 : c06563c8
> Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
> Control: 10c5387d  Table: 0000404a  DAC: 00000015
> Process swapper/1 (pid: 0, stack limit = 0xeecce218)
> Stack: (0xeeccff48 to 0xeecd0000)
> ff40:                   c000a664 ffffffff 00000000 c003d134 eecd0018 eecaeac0
> ff60: c06648e0 0b52d2f9 c048cfa8 c003d18c 00000000 f0002100 00000001 c003d1ac
> ff80: 00000000 eecaeac0 c064f300 c001369c c064b304 c0013140 00000000 ef1ed328
> ffa0: eeccffe8 c001e760 c0486ec4 2eba2000 c06957c0 c06524dc 00000015 c06957c0
> ffc0: c048c778 c064b304 c06957c0 00000000 eeccffdc c0486ec4 eeccffe4 c0487138
> ffe0: 00000001 c00544e8 c0009494 c0697bc0 00000000 000094ac 7ef5bffd 3f39b3f8
> [<c000a6bc>] (vfp_notifier) from [<c003d134>] (notifier_call_chain+0x44/0x84)
> [<c003d134>] (notifier_call_chain) from [<c003d18c>]
> (__atomic_notifier_call_chain+0x18/0x20)
> [<c003d18c>] (__atomic_notifier_call_chain) from [<c003d1ac>]
> (atomic_notifier_call_chain+0x18/0x20)
> [<c003d1ac>] (atomic_notifier_call_chain) from [<c001369c>]
> (__switch_to+0x34/0x58)
> Code: e3a03002 e5843208 e3a00000 e8bd8038 (eef85a10)
> ---[ end trace 9eaea9661b3b550a ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> SMP: failed to stop secondary CPUs
> ---[ end Kernel panic - not syncing: Attempted to kill the idle task!
> 
> So this patch puts back the call to v7_invalidate_l1 in the secondary_startup
> path, and the platform is now able to boot up reliably.
> 
> Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>

Can you print the value of CPACR in the oops and reproduce please?
I suspect, somehow, the CPACR is not allowing CPU1 access to the VFP,
but this is not controlled by data as such (it can only go wrong if
the hotplug notifier call chain misses calling into the VFP code.)

We ought to understand what the cause of this is before reverting
this part of the patch.  Not having a SoCFPGA board means that I can't
do any investigation of this myself.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-08 16:51 ` Russell King - ARM Linux
@ 2015-07-08 19:13   ` Dinh Nguyen
  2015-07-08 21:07     ` Russell King - ARM Linux
  0 siblings, 1 reply; 13+ messages in thread
From: Dinh Nguyen @ 2015-07-08 19:13 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/08/2015 11:51 AM, Russell King - ARM Linux wrote:
> On Wed, Jul 08, 2015 at 10:51:51AM -0500, dinguyen at opensource.altera.com wrote:
>> From: Dinh Nguyen <dinguyen@opensource.altera.com>
>>
>> The commit "02b4e2756e01 ARM: v7 setup function should invalidate L1 cache"
>> caused the SoCFPGA to not boot reliably. About 20% of the time or roughly
>> (1 in 5), booting the platform would cause this kernel panic:
>>
>> CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
>> Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
>> Modules linked in:
>> CPU: 1 PID: 0 Comm: swapper/1 Not tainted
>> 4.1.0-rc8-next-20150617-00002-gdd1f624 #1
>> Hardware name: Altera SOCFPGA
>> task: eecaeac0 ti: eecce000 task.ti: eecce000
>> PC is at vfp_notifier+0x58/0x12c
>> LR is at notifier_call_chain+0x44/0x84
>> pc : [<c000a6bc>]    lr : [<c003d134>]    psr: 80000193
>> sp : eeccff48  ip : c06563c8  fp : eeccffd4
>> r10: eecaef80  r9 : ef1f1300  r8 : 00000002
>> r7 : eecd0000  r6 : c0656bc0  r5 : 00000000  r4 : eecd0000
>> r3 : c000a664  r2 : eecd0000  r1 : 00000002  r0 : c06563c8
>> Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
>> Control: 10c5387d  Table: 0000404a  DAC: 00000015
>> Process swapper/1 (pid: 0, stack limit = 0xeecce218)
>> Stack: (0xeeccff48 to 0xeecd0000)
>> ff40:                   c000a664 ffffffff 00000000 c003d134 eecd0018 eecaeac0
>> ff60: c06648e0 0b52d2f9 c048cfa8 c003d18c 00000000 f0002100 00000001 c003d1ac
>> ff80: 00000000 eecaeac0 c064f300 c001369c c064b304 c0013140 00000000 ef1ed328
>> ffa0: eeccffe8 c001e760 c0486ec4 2eba2000 c06957c0 c06524dc 00000015 c06957c0
>> ffc0: c048c778 c064b304 c06957c0 00000000 eeccffdc c0486ec4 eeccffe4 c0487138
>> ffe0: 00000001 c00544e8 c0009494 c0697bc0 00000000 000094ac 7ef5bffd 3f39b3f8
>> [<c000a6bc>] (vfp_notifier) from [<c003d134>] (notifier_call_chain+0x44/0x84)
>> [<c003d134>] (notifier_call_chain) from [<c003d18c>]
>> (__atomic_notifier_call_chain+0x18/0x20)
>> [<c003d18c>] (__atomic_notifier_call_chain) from [<c003d1ac>]
>> (atomic_notifier_call_chain+0x18/0x20)
>> [<c003d1ac>] (atomic_notifier_call_chain) from [<c001369c>]
>> (__switch_to+0x34/0x58)
>> Code: e3a03002 e5843208 e3a00000 e8bd8038 (eef85a10)
>> ---[ end trace 9eaea9661b3b550a ]---
>> Kernel panic - not syncing: Attempted to kill the idle task!
>> SMP: failed to stop secondary CPUs
>> ---[ end Kernel panic - not syncing: Attempted to kill the idle task!
>>
>> So this patch puts back the call to v7_invalidate_l1 in the secondary_startup
>> path, and the platform is now able to boot up reliably.
>>
>> Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
> 
> Can you print the value of CPACR in the oops and reproduce please?
> I suspect, somehow, the CPACR is not allowing CPU1 access to the VFP,
> but this is not controlled by data as such (it can only go wrong if
> the hotplug notifier call chain misses calling into the VFP code.)
> 

The value of CPACR is 0x00F00000, so cp10 and cp11 both allow privileged
and user mode access.

Dinh


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-08 19:13   ` Dinh Nguyen
@ 2015-07-08 21:07     ` Russell King - ARM Linux
  2015-07-08 21:55       ` Dinh Nguyen
                         ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Russell King - ARM Linux @ 2015-07-08 21:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 08, 2015 at 02:13:32PM -0500, Dinh Nguyen wrote:
> The value of CPACR is 0x00F00000. So cp11 and cp10 are privileged and
> user mode access.

Hmm.

I think what you've found is a(nother) latent bug in the CPU bring up
code.

For SMP CPUs, the sequence we're following during early initialisation is:

1. Enable SMP coherency.
2. Invalidate the caches.

If the cache contains rubbish, enabling SMP coherency before invalidating
the cache is plainly an absurd thing to do.

Can you try the patch below - not tested in any way, so you may need to
tweak it, but it should allow us to prove that point.

diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 0716bbe19872..db5137fc297d 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -275,6 +275,10 @@ __v7_b15mp_setup:
 __v7_ca17mp_setup:
 	mov	r10, #0
 1:
+	adr	r12, __v7_setup_stack		@ the local stack
+	stmia	r12, {r0-r5, r7, r9-r11, lr}
+	bl      v7_invalidate_l1
+	ldmia	r12, {r0-r5, r7, r9-r11, lr}
 #ifdef CONFIG_SMP
 	ALT_SMP(mrc	p15, 0, r0, c1, c0, 1)
 	ALT_UP(mov	r0, #(1 << 6))		@ fake it for UP
@@ -283,7 +287,7 @@ __v7_ca17mp_setup:
 	orreq	r0, r0, r10			@ Enable CPU-specific SMP bits
 	mcreq	p15, 0, r0, c1, c0, 1
 #endif
-	b	__v7_setup
+	b	__v7_setup_cont
 
 /*
  * Errata:
@@ -417,6 +421,7 @@ __v7_setup:
 	bl      v7_invalidate_l1
 	ldmia	r12, {r0-r5, r7, r9, r11, lr}
 
+__v7_setup_cont:
 	and	r0, r9, #0xff000000		@ ARM?
 	teq	r0, #0x41000000
 	bne	__errata_finish
@@ -480,7 +485,7 @@ ENDPROC(__v7_setup)
 
 	.align	2
 __v7_setup_stack:
-	.space	4 * 11				@ 11 registers
+	.space	4 * 12				@ 12 registers
 
 	__INITDATA
 
-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-08 21:07     ` Russell King - ARM Linux
@ 2015-07-08 21:55       ` Dinh Nguyen
  2015-07-09  3:52       ` Jisheng Zhang
  2015-07-15 19:23       ` Dinh Nguyen
  2 siblings, 0 replies; 13+ messages in thread
From: Dinh Nguyen @ 2015-07-08 21:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/08/2015 04:07 PM, Russell King - ARM Linux wrote:
> On Wed, Jul 08, 2015 at 02:13:32PM -0500, Dinh Nguyen wrote:
>> The value of CPACR is 0x00F00000. So cp11 and cp10 are privileged and
>> user mode access.
> 
> Hmm.
> 
> I think what you've found is a(nother) latent bug in the CPU bring up
> code.
> 
> For SMP CPUs, the sequence we're following during early initialisation is:
> 
> 1. Enable SMP coherency.
> 2. Invalidate the caches.
> 
> If the cache contains rubbish, enabling SMP coherency before invalidating
> the cache is plainly an absurd thing to do.
> 
> Can you try the patch below - not tested in any way, so you may need to
> tweak it, but it should allow us to prove that point.
> 
> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 0716bbe19872..db5137fc297d 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -275,6 +275,10 @@ __v7_b15mp_setup:
>  __v7_ca17mp_setup:
>  	mov	r10, #0
>  1:
> +	adr	r12, __v7_setup_stack		@ the local stack
> +	stmia	r12, {r0-r5, r7, r9-r11, lr}
> +	bl      v7_invalidate_l1
> +	ldmia	r12, {r0-r5, r7, r9-r11, lr}
>  #ifdef CONFIG_SMP
>  	ALT_SMP(mrc	p15, 0, r0, c1, c0, 1)
>  	ALT_UP(mov	r0, #(1 << 6))		@ fake it for UP
> @@ -283,7 +287,7 @@ __v7_ca17mp_setup:
>  	orreq	r0, r0, r10			@ Enable CPU-specific SMP bits
>  	mcreq	p15, 0, r0, c1, c0, 1
>  #endif
> -	b	__v7_setup
> +	b	__v7_setup_cont
>  
>  /*
>   * Errata:
> @@ -417,6 +421,7 @@ __v7_setup:
>  	bl      v7_invalidate_l1
>  	ldmia	r12, {r0-r5, r7, r9, r11, lr}
>  
> +__v7_setup_cont:
>  	and	r0, r9, #0xff000000		@ ARM?
>  	teq	r0, #0x41000000
>  	bne	__errata_finish
> @@ -480,7 +485,7 @@ ENDPROC(__v7_setup)
>  
>  	.align	2
>  __v7_setup_stack:
> -	.space	4 * 11				@ 11 registers
> +	.space	4 * 12				@ 12 registers
>  
>  	__INITDATA
>  
> 


This patch seems to have fixed the issue. The SoCFPGA platform is now
booting/rebooting reliably. Also, the patch applied as-is.

Also, I went back and studied up on the CPACR register. I misquoted the
value in my previous response: I was looking at CPACR on CPU0. For CPU1,
when the error happens, the value of CPACR is 0x0, so CP10 and CP11 were
set to "Access denied".

Dinh


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-08 21:07     ` Russell King - ARM Linux
  2015-07-08 21:55       ` Dinh Nguyen
@ 2015-07-09  3:52       ` Jisheng Zhang
  2015-07-09  7:57         ` Russell King - ARM Linux
  2015-07-15 19:23       ` Dinh Nguyen
  2 siblings, 1 reply; 13+ messages in thread
From: Jisheng Zhang @ 2015-07-09  3:52 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Russell,

On Wed, 8 Jul 2015 22:07:34 +0100
Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:

> On Wed, Jul 08, 2015 at 02:13:32PM -0500, Dinh Nguyen wrote:
> > The value of CPACR is 0x00F00000. So cp11 and cp10 are privileged and
> > user mode access.
> 
> Hmm.
> 
> I think what you've found is a(nother) latent bug in the CPU bring up
> code.
> 
> For SMP CPUs, the sequence we're following during early initialisation is:
> 
> 1. Enable SMP coherency.
> 2. Invalidate the caches.
> 
> If the cache contains rubbish, enabling SMP coherency before invalidating
> the cache is plainly an absurd thing to do.
> 
> Can you try the patch below - not tested in any way, so you may need to
> tweak it, but it should allow us to prove that point.
> 
> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 0716bbe19872..db5137fc297d 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -275,6 +275,10 @@ __v7_b15mp_setup:
>  __v7_ca17mp_setup:
>  	mov	r10, #0
>  1:
> +	adr	r12, __v7_setup_stack		@ the local stack
> +	stmia	r12, {r0-r5, r7, r9-r11, lr}
> +	bl      v7_invalidate_l1
> +	ldmia	r12, {r0-r5, r7, r9-r11, lr}

Some CPUs, such as the CA7, need the SMP bit enabled before any cache
maintenance.

The CA7 TRM says this about the SMP bit:
"You must ensure this bit is set to 1 before the caches and MMU are enabled,
or any cache and TLB maintenance operations are performed."

So it seems we need to use a different path for different CPUs.

Also, the CA7 invalidates L1 automatically on reset. Can we remove the
invalidate op in the CA7 case?

I'm not sure I understand the code correctly; criticism is welcome.

Thanks,
Jisheng

>  #ifdef CONFIG_SMP
>  	ALT_SMP(mrc	p15, 0, r0, c1, c0, 1)
>  	ALT_UP(mov	r0, #(1 << 6))		@ fake it for UP
> @@ -283,7 +287,7 @@ __v7_ca17mp_setup:
>  	orreq	r0, r0, r10			@ Enable CPU-specific SMP bits
>  	mcreq	p15, 0, r0, c1, c0, 1
>  #endif
> -	b	__v7_setup
> +	b	__v7_setup_cont
>  
>  /*
>   * Errata:
> @@ -417,6 +421,7 @@ __v7_setup:
>  	bl      v7_invalidate_l1
>  	ldmia	r12, {r0-r5, r7, r9, r11, lr}
>  
> +__v7_setup_cont:
>  	and	r0, r9, #0xff000000		@ ARM?
>  	teq	r0, #0x41000000
>  	bne	__errata_finish
> @@ -480,7 +485,7 @@ ENDPROC(__v7_setup)
>  
>  	.align	2
>  __v7_setup_stack:
> -	.space	4 * 11				@ 11 registers
> +	.space	4 * 12				@ 12 registers
>  
>  	__INITDATA
>  


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-09  3:52       ` Jisheng Zhang
@ 2015-07-09  7:57         ` Russell King - ARM Linux
  2015-07-09  8:17           ` Jisheng Zhang
  0 siblings, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2015-07-09  7:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 09, 2015 at 11:52:49AM +0800, Jisheng Zhang wrote:
> Dear Russell,
> 
> On Wed, 8 Jul 2015 22:07:34 +0100
> Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> 
> > On Wed, Jul 08, 2015 at 02:13:32PM -0500, Dinh Nguyen wrote:
> > > The value of CPACR is 0x00F00000. So cp11 and cp10 are privileged and
> > > user mode access.
> > 
> > Hmm.
> > 
> > I think what you've found is a(nother) latent bug in the CPU bring up
> > code.
> > 
> > For SMP CPUs, the sequence we're following during early initialisation is:
> > 
> > 1. Enable SMP coherency.
> > 2. Invalidate the caches.
> > 
> > If the cache contains rubbish, enabling SMP coherency before invalidating
> > the cache is plainly an absurd thing to do.
> > 
> > Can you try the patch below - not tested in any way, so you may need to
> > tweak it, but it should allow us to prove that point.
> > 
> > diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> > index 0716bbe19872..db5137fc297d 100644
> > --- a/arch/arm/mm/proc-v7.S
> > +++ b/arch/arm/mm/proc-v7.S
> > @@ -275,6 +275,10 @@ __v7_b15mp_setup:
> >  __v7_ca17mp_setup:
> >  	mov	r10, #0
> >  1:
> > +	adr	r12, __v7_setup_stack		@ the local stack
> > +	stmia	r12, {r0-r5, r7, r9-r11, lr}
> > +	bl      v7_invalidate_l1
> > +	ldmia	r12, {r0-r5, r7, r9-r11, lr}
> 
> Some CPUs such as CA7 need enable SMP before any cache maintenance.
> 
> CA7 TRM says something about SMP bit:
> "You must ensure this bit is set to 1 before the caches and MMU are enabled,
> or any cache and TLB maintenance operations are performed."

Frankly, that's wrong for two reasons.  Think about it for a moment...

If the cache contains crap - in other words, it contains random
uninitialised data in the cache lines at random locations, some of
which are marked valid and some of which are marked dirty - then
enabling the SMP bit puts the caches into coherent mode, and they
join the coherent cluster.

That means those cache lines containing crap become visible to other
CPUs in the cluster, and can be migrated to other CPUs, and the crap
data in them becomes visible to other CPUs.  This leads to state
corruption on other CPUs in the cluster.

Moreover, the cache invalidation of the local L1 cache is broadcast
to other CPUs in the cluster, and _their_ caches are also invalidated,
again, leading to state corruption on already running CPUs.  We don't
want the invalidation of the incoming CPU to be broadcast to the other
CPUs.

This is all round a very bad thing.

> Also CA7 would invalidate L1 automatically once reset, can we remove the
> invalidate op in CA7 case?

No, because we enter this path from multiple different situations, eg,
after the decompressor has run, after the boot loader has run which
may have enabled caches and not properly invalidated them prior to
calling the kernel.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-09  7:57         ` Russell King - ARM Linux
@ 2015-07-09  8:17           ` Jisheng Zhang
  2015-07-14 12:15             ` Dinh Nguyen
  0 siblings, 1 reply; 13+ messages in thread
From: Jisheng Zhang @ 2015-07-09  8:17 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Russell,

On Thu, 9 Jul 2015 08:57:17 +0100
Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:

> On Thu, Jul 09, 2015 at 11:52:49AM +0800, Jisheng Zhang wrote:
> > Dear Russell,
> > 
> > On Wed, 8 Jul 2015 22:07:34 +0100
> > Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> > 
> > > On Wed, Jul 08, 2015 at 02:13:32PM -0500, Dinh Nguyen wrote:
> > > > The value of CPACR is 0x00F00000. So cp11 and cp10 are privileged and
> > > > user mode access.
> > > 
> > > Hmm.
> > > 
> > > I think what you've found is a(nother) latent bug in the CPU bring up
> > > code.
> > > 
> > > For SMP CPUs, the sequence we're following during early initialisation is:
> > > 
> > > 1. Enable SMP coherency.
> > > 2. Invalidate the caches.
> > > 
> > > If the cache contains rubbish, enabling SMP coherency before invalidating
> > > the cache is plainly an absurd thing to do.
> > > 
> > > Can you try the patch below - not tested in any way, so you may need to
> > > tweak it, but it should allow us to prove that point.
> > > 
> > > diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> > > index 0716bbe19872..db5137fc297d 100644
> > > --- a/arch/arm/mm/proc-v7.S
> > > +++ b/arch/arm/mm/proc-v7.S
> > > @@ -275,6 +275,10 @@ __v7_b15mp_setup:
> > >  __v7_ca17mp_setup:
> > >  	mov	r10, #0
> > >  1:
> > > +	adr	r12, __v7_setup_stack		@ the local stack
> > > +	stmia	r12, {r0-r5, r7, r9-r11, lr}
> > > +	bl      v7_invalidate_l1
> > > +	ldmia	r12, {r0-r5, r7, r9-r11, lr}
> > 
> > Some CPUs such as CA7 need enable SMP before any cache maintenance.
> > 
> > CA7 TRM says something about SMP bit:
> > "You must ensure this bit is set to 1 before the caches and MMU are enabled,
> > or any cache and TLB maintenance operations are performed."
> 
> Frankly, that's wrong for two reasons.  Think about it for a moment...
> 
> If the cache contains crap - in other words, it contains random
> uninitialised data in the cache lines at random locations, some of
> which are marked valid and some of which are marked dirty - then
> enabling the SMP bit puts the caches into coherent mode, and they
> join the coherent cluster.
> 
> That means those cache lines containing crap become visible to other
> CPUs in the cluster, and can be migrated to other CPUs, and the crap
> data in them becomes visible to other CPUs.  This leads to state
> corruption on other CPUs in the cluster.
> 
> Moreover, the cache invalidation of the local L1 cache is broadcast
> to other CPUs in the cluster, and _their_ caches are also invalidated,
> again, leading to state corruption on already running CPUs.  We don't
> want the invalidation of the incoming CPU to be broadcast to the other
> CPUs.
> 
> This is all round a very bad thing.
> 
> > Also CA7 would invalidate L1 automatically once reset, can we remove the
> > invalidate op in CA7 case?
> 
> No, because we enter this path from multiple different situations, eg,
> after the decompressor has run, after the boot loader has run which
> may have enabled caches and not properly invalidated them prior to
> calling the kernel.
> 

Got it. Thanks very much for your detailed explanation!


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-09  8:17           ` Jisheng Zhang
@ 2015-07-14 12:15             ` Dinh Nguyen
  2015-07-15 19:04               ` Russell King - ARM Linux
  0 siblings, 1 reply; 13+ messages in thread
From: Dinh Nguyen @ 2015-07-14 12:15 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On 7/9/15 3:17 AM, Jisheng Zhang wrote:
> Dear Russell,
> 
> On Thu, 9 Jul 2015 08:57:17 +0100
> Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> 
>> On Thu, Jul 09, 2015 at 11:52:49AM +0800, Jisheng Zhang wrote:
>>> Dear Russell,
>>>
>>> On Wed, 8 Jul 2015 22:07:34 +0100
>>> Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
>>>
>>>> On Wed, Jul 08, 2015 at 02:13:32PM -0500, Dinh Nguyen wrote:
>>>>> The value of CPACR is 0x00F00000. So cp11 and cp10 are privileged and
>>>>> user mode access.
>>>>
>>>> Hmm.
>>>>
>>>> I think what you've found is a(nother) latent bug in the CPU bring up
>>>> code.
>>>>
>>>> For SMP CPUs, the sequence we're following during early initialisation is:
>>>>
>>>> 1. Enable SMP coherency.
>>>> 2. Invalidate the caches.
>>>>
>>>> If the cache contains rubbish, enabling SMP coherency before invalidating
>>>> the cache is plainly an absurd thing to do.
>>>>
>>>> Can you try the patch below - not tested in any way, so you may need to
>>>> tweak it, but it should allow us to prove that point.
>>>>
>>>> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
>>>> index 0716bbe19872..db5137fc297d 100644
>>>> --- a/arch/arm/mm/proc-v7.S
>>>> +++ b/arch/arm/mm/proc-v7.S
>>>> @@ -275,6 +275,10 @@ __v7_b15mp_setup:
>>>>  __v7_ca17mp_setup:
>>>>  	mov	r10, #0
>>>>  1:
>>>> +	adr	r12, __v7_setup_stack		@ the local stack
>>>> +	stmia	r12, {r0-r5, r7, r9-r11, lr}
>>>> +	bl      v7_invalidate_l1
>>>> +	ldmia	r12, {r0-r5, r7, r9-r11, lr}
>>>
>>> Some CPUs such as CA7 need enable SMP before any cache maintenance.
>>>
>>> CA7 TRM says something about SMP bit:
>>> "You must ensure this bit is set to 1 before the caches and MMU are enabled,
>>> or any cache and TLB maintenance operations are performed."
>>
>> Frankly, that's wrong for two reasons.  Think about it for a moment...
>>
>> If the cache contains crap - in other words, it contains random
>> uninitialised data in the cache lines at random locations, some of
>> which are marked valid and some of which are marked dirty - then
>> enabling the SMP bit puts the caches into coherent mode, and they
>> join the coherent cluster.
>>
>> That means those cache lines containing crap become visible to other
>> CPUs in the cluster, and can be migrated to other CPUs, and the crap
>> data in them becomes visible to other CPUs.  This leads to state
>> corruption on other CPUs in the cluster.
>>
>> Moreover, the cache invalidation of the local L1 cache is broadcast
>> to other CPUs in the cluster, and _their_ caches are also invalidated,
>> again, leading to state corruption on already running CPUs.  We don't
>> want the invalidation of the incoming CPU to be broadcast to the other
>> CPUs.
>>
>> This is all round a very bad thing.
>>
>>> Also CA7 would invalidate L1 automatically once reset, can we remove the
>>> invalidate op in CA7 case?
>>
>> No, because we enter this path from multiple different situations, eg,
>> after the decompressor has run, after the boot loader has run which
>> may have enabled caches and not properly invalidated them prior to
>> calling the kernel.
>>
> 
> Got it. Thanks very much for your detailed explanation!
> 

Just wondering if you are still planning to send this patch and if you
need me to do anything to help?

Thanks,
Dinh


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-14 12:15             ` Dinh Nguyen
@ 2015-07-15 19:04               ` Russell King - ARM Linux
  2015-07-15 20:11                 ` Dinh Nguyen
  0 siblings, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2015-07-15 19:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 14, 2015 at 07:15:29AM -0500, Dinh Nguyen wrote:
> Hi Russell,
> 
> Just wondering if you are still planning to send this patch and if you
> need me to do anything to help?

Giving a tested-by tag would help. Thanks. :)

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-08 21:07     ` Russell King - ARM Linux
  2015-07-08 21:55       ` Dinh Nguyen
  2015-07-09  3:52       ` Jisheng Zhang
@ 2015-07-15 19:23       ` Dinh Nguyen
  2015-07-16 16:11         ` Steffen Trumtrar
  2 siblings, 1 reply; 13+ messages in thread
From: Dinh Nguyen @ 2015-07-15 19:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/08/2015 04:07 PM, Russell King - ARM Linux wrote:
> On Wed, Jul 08, 2015 at 02:13:32PM -0500, Dinh Nguyen wrote:
>> The value of CPACR is 0x00F00000. So cp11 and cp10 are privileged and
>> user mode access.
> 
> Hmm.
> 
> I think what you've found is a(nother) latent bug in the CPU bring up
> code.
> 
> For SMP CPUs, the sequence we're following during early initialisation is:
> 
> 1. Enable SMP coherency.
> 2. Invalidate the caches.
> 
> If the cache contains rubbish, enabling SMP coherency before invalidating
> the cache is plainly an absurd thing to do.
> 
> Can you try the patch below - not tested in any way, so you may need to
> tweak it, but it should allow us to prove that point.
> 
> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 0716bbe19872..db5137fc297d 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -275,6 +275,10 @@ __v7_b15mp_setup:
>  __v7_ca17mp_setup:
>  	mov	r10, #0
>  1:
> +	adr	r12, __v7_setup_stack		@ the local stack
> +	stmia	r12, {r0-r5, r7, r9-r11, lr}
> +	bl      v7_invalidate_l1
> +	ldmia	r12, {r0-r5, r7, r9-r11, lr}
>  #ifdef CONFIG_SMP
>  	ALT_SMP(mrc	p15, 0, r0, c1, c0, 1)
>  	ALT_UP(mov	r0, #(1 << 6))		@ fake it for UP
> @@ -283,7 +287,7 @@ __v7_ca17mp_setup:
>  	orreq	r0, r0, r10			@ Enable CPU-specific SMP bits
>  	mcreq	p15, 0, r0, c1, c0, 1
>  #endif
> -	b	__v7_setup
> +	b	__v7_setup_cont
>  
>  /*
>   * Errata:
> @@ -417,6 +421,7 @@ __v7_setup:
>  	bl      v7_invalidate_l1
>  	ldmia	r12, {r0-r5, r7, r9, r11, lr}
>  
> +__v7_setup_cont:
>  	and	r0, r9, #0xff000000		@ ARM?
>  	teq	r0, #0x41000000
>  	bne	__errata_finish
> @@ -480,7 +485,7 @@ ENDPROC(__v7_setup)
>  
>  	.align	2
>  __v7_setup_stack:
> -	.space	4 * 11				@ 11 registers
> +	.space	4 * 12				@ 12 registers
>  
>  	__INITDATA
>  
> 

For this patch, please feel free to add:

Tested-by: Dinh Nguyen <dinguyen@opensource.altera.com>

Thanks,
Dinh
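
For readers following the CPACR value quoted above, here is a minimal sketch of how 0x00F00000 decodes. It assumes the ARMv7-A layout in which coprocessor N occupies bits [2N+1:2N] of CPACR; the helper names are illustrative and do not come from the kernel.

```python
# Meaning of each 2-bit CPACR access field (ARMv7-A encoding).
ACCESS = {
    0b00: "access denied",
    0b01: "privileged access only",
    0b10: "reserved",
    0b11: "privileged and user mode access",
}


def cpacr_field(cpacr: int, cp: int) -> int:
    """Return the 2-bit access field for coprocessor number cp."""
    return (cpacr >> (2 * cp)) & 0b11


def describe_vfp_access(cpacr: int) -> dict:
    """Describe cp10 and cp11, the coprocessors used by VFP/NEON."""
    return {f"cp{n}": ACCESS[cpacr_field(cpacr, n)] for n in (10, 11)}


if __name__ == "__main__":
    # Value reported earlier in this thread.
    for name, access in describe_vfp_access(0x00F00000).items():
        print(f"{name}: {access}")
```

Both fields read 0b11, which matches the statement that cp10 and cp11 are accessible from privileged and user mode.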


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-15 19:04               ` Russell King - ARM Linux
@ 2015-07-15 20:11                 ` Dinh Nguyen
  0 siblings, 0 replies; 13+ messages in thread
From: Dinh Nguyen @ 2015-07-15 20:11 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/15/2015 02:04 PM, Russell King - ARM Linux wrote:
> On Tue, Jul 14, 2015 at 07:15:29AM -0500, Dinh Nguyen wrote:
>> Hi Russell,
>>
>> Just wondering if you are still planning to send this patch and if you
>> need me to do anything to help?
> 
> Giving a tested-by tag would help. Thanks. :)
> 

Okay, I've given my Tested-by in a follow-up response that has the patch
still in the email.

Thanks,
Dinh


* [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup
  2015-07-15 19:23       ` Dinh Nguyen
@ 2015-07-16 16:11         ` Steffen Trumtrar
  0 siblings, 0 replies; 13+ messages in thread
From: Steffen Trumtrar @ 2015-07-16 16:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi, Russell!

On Wed, Jul 15, 2015 at 02:23:52PM -0500, Dinh Nguyen wrote:
> On 07/08/2015 04:07 PM, Russell King - ARM Linux wrote:
> > On Wed, Jul 08, 2015 at 02:13:32PM -0500, Dinh Nguyen wrote:
> >> The value of CPACR is 0x00F00000, so cp10 and cp11 both allow privileged
> >> and user mode access.
> > 
> > Hmm.
> > 
> > I think what you've found is a(nother) latent bug in the CPU bring-up
> > code.
> > 
> > For SMP CPUs, the sequence we're following during early initialisation is:
> > 
> > 1. Enable SMP coherency.
> > 2. Invalidate the caches.
> > 
> > If the cache contains rubbish, enabling SMP coherency before invalidating
> > the cache is plainly an absurd thing to do.
> > 
> > Can you try the patch below - not tested in any way, so you may need to
> > tweak it, but it should allow us to prove that point.
> > 
> > diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> > index 0716bbe19872..db5137fc297d 100644
> > --- a/arch/arm/mm/proc-v7.S
> > +++ b/arch/arm/mm/proc-v7.S
> > @@ -275,6 +275,10 @@ __v7_b15mp_setup:
> >  __v7_ca17mp_setup:
> >  	mov	r10, #0
> >  1:
> > +	adr	r12, __v7_setup_stack		@ the local stack
> > +	stmia	r12, {r0-r5, r7, r9-r11, lr}
> > +	bl      v7_invalidate_l1
> > +	ldmia	r12, {r0-r5, r7, r9-r11, lr}
> >  #ifdef CONFIG_SMP
> >  	ALT_SMP(mrc	p15, 0, r0, c1, c0, 1)
> >  	ALT_UP(mov	r0, #(1 << 6))		@ fake it for UP
> > @@ -283,7 +287,7 @@ __v7_ca17mp_setup:
> >  	orreq	r0, r0, r10			@ Enable CPU-specific SMP bits
> >  	mcreq	p15, 0, r0, c1, c0, 1
> >  #endif
> > -	b	__v7_setup
> > +	b	__v7_setup_cont
> >  
> >  /*
> >   * Errata:
> > @@ -417,6 +421,7 @@ __v7_setup:
> >  	bl      v7_invalidate_l1
> >  	ldmia	r12, {r0-r5, r7, r9, r11, lr}
> >  
> > +__v7_setup_cont:
> >  	and	r0, r9, #0xff000000		@ ARM?
> >  	teq	r0, #0x41000000
> >  	bne	__errata_finish
> > @@ -480,7 +485,7 @@ ENDPROC(__v7_setup)
> >  
> >  	.align	2
> >  __v7_setup_stack:
> > -	.space	4 * 11				@ 11 registers
> > +	.space	4 * 12				@ 12 registers
> >  
> >  	__INITDATA
> >  
> > 
> 
> For this patch, please feel free to add:
> 
> Tested-by: Dinh Nguyen <dinguyen@opensource.altera.com>
> 

I just ran into the same problem as Dinh. This patch fixed it for me, too.
Without it, 4.2-rc2 (at least) is pretty broken for me :-(
So you may also add

	Tested-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>

when you send a proper patch.

Thanks,
Steffen Trumtrar

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
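
As a side check on the quoted patch, the sketch below counts the words that each stmia/ldmia register list transfers and compares the result with the space reserved at __v7_setup_stack. The parser is a hypothetical helper written for this note, not kernel code, and it assumes each listed register takes one 32-bit word.

```python
def count_registers(reglist: str) -> int:
    """Count registers in an ARM list such as 'r0-r5, r7, lr'."""
    total = 0
    for item in reglist.split(","):
        item = item.strip()
        if "-" in item:
            # A range like r9-r11 covers both endpoints.
            low, high = item.split("-")
            total += int(high[1:]) - int(low[1:]) + 1
        else:
            total += 1
    return total


def fits(reglist: str, reserved_words: int) -> bool:
    """True if the saved registers fit in the reserved word count."""
    return count_registers(reglist) <= reserved_words


if __name__ == "__main__":
    new_list = "r0-r5, r7, r9-r11, lr"   # list added by the patch
    old_list = "r0-r5, r7, r9, r11, lr"  # list already in __v7_setup
    print(count_registers(new_list), fits(new_list, 12))
    print(count_registers(old_list), fits(old_list, 12))
```

Under that assumption both lists fit within the 4 * 12 bytes the patch reserves.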


end of thread, other threads:[~2015-07-16 16:11 UTC | newest]

Thread overview: 13+ messages
2015-07-08 15:51 [PATCH] ARM: socfpga: put back v7_invalidate_l1 in socfpga_secondary_startup dinguyen at opensource.altera.com
2015-07-08 16:51 ` Russell King - ARM Linux
2015-07-08 19:13   ` Dinh Nguyen
2015-07-08 21:07     ` Russell King - ARM Linux
2015-07-08 21:55       ` Dinh Nguyen
2015-07-09  3:52       ` Jisheng Zhang
2015-07-09  7:57         ` Russell King - ARM Linux
2015-07-09  8:17           ` Jisheng Zhang
2015-07-14 12:15             ` Dinh Nguyen
2015-07-15 19:04               ` Russell King - ARM Linux
2015-07-15 20:11                 ` Dinh Nguyen
2015-07-15 19:23       ` Dinh Nguyen
2015-07-16 16:11         ` Steffen Trumtrar
