All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* crash in perf_event_read
@ 2017-01-31  7:04 Alexei Starovoitov
  2017-01-31 10:27 ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Alexei Starovoitov @ 2017-01-31  7:04 UTC (permalink / raw
  To: Peter Zijlstra, linux-kernel

Hi Peter,

rarely I'm seeing the following crash:
[40196.164255] BUG: unable to handle kernel paging request at 000000000000a11a
[40196.179636] IP: perf_event_read+0xd3/0x1a0
[40196.188669] PGD 82e93a067
[40196.188670] PUD 7e1ddf067
[40196.194629] PMD 0
[40196.200589]
[40196.208284] Oops: 0000 [#1] SMP
[40196.208285] Modules linked in:
[40196.208299] CPU: 24 PID: 4423 Comm: dynoKernelMon Not tainted 4.10.0-rc5-01189-gc6e0ad0ee5b0 #599
[40196.208300] Hardware name: Quanta Mono Lake-M.2 SATA 20F20BU0270/Mono Lake-M.2 SATA, BIOS F20_3A12 10/24/2016
[40196.208301] task: ffff8807e3b65580 task.stack: ffffc90009748000
[40196.208302] RIP: 0010:perf_event_read+0xd3/0x1a0
[40196.208303] RSP: 0018:ffffc9000974bd48 EFLAGS: 00010202
[40196.208304] RAX: 000000000000a040 RBX: ffff8807b79fd000 RCX: 0000000000000000
[40196.208304] RDX: 0000000000000018 RSI: 0000000000000000 RDI: 00000000ffffffff
[40196.208305] RBP: ffffc9000974bd80 R08: 0000000000000000 R09: 0000000000000000
[40196.208305] R10: ffff8807cf8c7038 R11: 0000000000000000 R12: ffffc9000974bde8
[40196.208306] R13: 0000000000000000 R14: ffff8807b79fd000 R15: 00000000000004e0
[40196.208307] FS:  00007ff1b45ff700(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
[40196.208307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40196.208308] CR2: 000000000000a11a CR3: 0000000850298000 CR4: 00000000003406e0
[40196.208308] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[40196.208309] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[40196.208309] Call Trace:
[40196.208313]  ? __might_sleep+0x4a/0x80
[40196.208316]  perf_event_read_value+0x45/0x130
[40196.208318]  perf_read+0x84/0x2d0
[40196.208322]  __vfs_read+0x28/0x110
[40196.208325]  ? security_file_permission+0x9b/0xc0
[40196.208327]  vfs_read+0xa5/0x170
[40196.208327]  SyS_read+0x46/0xa0
[40196.208329]  do_syscall_64+0x4d/0xb0
[40196.208332]  entry_SYSCALL64_slow_path+0x25/0x25
[40196.208333] RIP: 0033:0x7ff1b6f5716d
[40196.208334] RSP: 002b:00007ff1b45fd080 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
[40196.208335] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1b6f5716d
[40196.208335] RDX: 0000000000000020 RSI: 00007ff1b45fd090 RDI: 0000000000000054
[40196.208336] RBP: 00007ff1b45fd0e0 R08: 00007ff1b65e45c0 R09: 00007ff1b45fc9b1
[40196.208336] R10: 0000000000000020 R11: 0000000000000293 R12: 0000000000000000
[40196.208337] R13: 00007ff1b5687000 R14: 00000004a817c7fe R15: 00000000000004e0
[40196.208337] Code: 60 02 00 00 74 30 48 63 cf 48 c7 c0 40 a0 00 00 48 8b 34 cd c0 93 d0 81 48 63 ca 48 8b 0c cd c0 93 d0 81 0f b7 8c 08 da 00 00 00 <66> 39 8c 30 da 00 00 00 0f 44 fa 48 8d 55 d0 b9 01 00 00 00 48
[40196.208353] RIP: perf_event_read+0xd3/0x1a0 RSP: ffffc9000974bd48
[40196.208353] CR2: 000000000000a11a

The RIP points to this asm:
   0xffffffff8115fc9b <+155>:   mov    %gs:0x7eeaa486(%rip),%edx        # 0xa128 <cpu_number>
   0xffffffff8115fca2 <+162>:   testb  $0x2,0x68(%rbx)
   0xffffffff8115fca6 <+166>:   mov    0x260(%rdi),%edi
   0xffffffff8115fcac <+172>:   je     0xffffffff8115fcde <perf_event_read+222>
   0xffffffff8115fcae <+174>:   movslq %edi,%rcx
   0xffffffff8115fcb1 <+177>:   mov    $0xa040,%rax
   0xffffffff8115fcb8 <+184>:   mov    -0x7e2f6c40(,%rcx,8),%rsi
   0xffffffff8115fcc0 <+192>:   movslq %edx,%rcx
   0xffffffff8115fcc3 <+195>:   mov    -0x7e2f6c40(,%rcx,8),%rcx
   0xffffffff8115fccb <+203>:   movzwl 0xda(%rax,%rcx,1),%ecx
   0xffffffff8115fcd3 <+211>:   cmp    %cx,0xda(%rax,%rsi,1)

which is this C code:
perf_event_read():
  local_cpu = get_cpu();
  cpu_to_read = find_cpu_to_read(event, local_cpu);
  put_cpu();

find_cpu_to_read():
  event_pkg =  topology_physical_package_id(event_cpu);
  local_pkg =  topology_physical_package_id(local_cpu);
  if (event_pkg == local_pkg)

If I read the asm correctly at the time of the crash
event_cpu == RDI == 00000000ffffffff
or in other words event->oncpu == -1
which I think is technically possible here.

Any suggestions how to fix this?
Happy to test any patches, though I don't know how to reproduce reliably.

Thanks

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: crash in perf_event_read
  2017-01-31  7:04 crash in perf_event_read Alexei Starovoitov
@ 2017-01-31 10:27 ` Peter Zijlstra
  2017-01-31 18:50   ` Alexei Starovoitov
                     ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Peter Zijlstra @ 2017-01-31 10:27 UTC (permalink / raw
  To: Alexei Starovoitov; +Cc: linux-kernel, Ingo Molnar, eranian, davidcc

On Mon, Jan 30, 2017 at 11:04:08PM -0800, Alexei Starovoitov wrote:
> Hi Peter,
> 
> rarely I'm seeing the following crash:
> [40196.164255] BUG: unable to handle kernel paging request at 000000000000a11a
> [40196.179636] IP: perf_event_read+0xd3/0x1a0
> [40196.188669] PGD 82e93a067
> [40196.188670] PUD 7e1ddf067
> [40196.194629] PMD 0
> [40196.200589]
> [40196.208284] Oops: 0000 [#1] SMP
> [40196.208285] Modules linked in:
> [40196.208299] CPU: 24 PID: 4423 Comm: dynoKernelMon Not tainted 4.10.0-rc5-01189-gc6e0ad0ee5b0 #599
> [40196.208300] Hardware name: Quanta Mono Lake-M.2 SATA 20F20BU0270/Mono Lake-M.2 SATA, BIOS F20_3A12 10/24/2016
> [40196.208301] task: ffff8807e3b65580 task.stack: ffffc90009748000
> [40196.208302] RIP: 0010:perf_event_read+0xd3/0x1a0
> [40196.208303] RSP: 0018:ffffc9000974bd48 EFLAGS: 00010202
> [40196.208304] RAX: 000000000000a040 RBX: ffff8807b79fd000 RCX: 0000000000000000
> [40196.208304] RDX: 0000000000000018 RSI: 0000000000000000 RDI: 00000000ffffffff
> [40196.208305] RBP: ffffc9000974bd80 R08: 0000000000000000 R09: 0000000000000000
> [40196.208305] R10: ffff8807cf8c7038 R11: 0000000000000000 R12: ffffc9000974bde8
> [40196.208306] R13: 0000000000000000 R14: ffff8807b79fd000 R15: 00000000000004e0
> [40196.208307] FS:  00007ff1b45ff700(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
> [40196.208307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [40196.208308] CR2: 000000000000a11a CR3: 0000000850298000 CR4: 00000000003406e0
> [40196.208308] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [40196.208309] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [40196.208309] Call Trace:
> [40196.208313]  ? __might_sleep+0x4a/0x80
> [40196.208316]  perf_event_read_value+0x45/0x130
> [40196.208318]  perf_read+0x84/0x2d0
> [40196.208322]  __vfs_read+0x28/0x110
> [40196.208325]  ? security_file_permission+0x9b/0xc0
> [40196.208327]  vfs_read+0xa5/0x170
> [40196.208327]  SyS_read+0x46/0xa0
> [40196.208329]  do_syscall_64+0x4d/0xb0
> [40196.208332]  entry_SYSCALL64_slow_path+0x25/0x25
> [40196.208333] RIP: 0033:0x7ff1b6f5716d
> [40196.208334] RSP: 002b:00007ff1b45fd080 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
> [40196.208335] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1b6f5716d
> [40196.208335] RDX: 0000000000000020 RSI: 00007ff1b45fd090 RDI: 0000000000000054
> [40196.208336] RBP: 00007ff1b45fd0e0 R08: 00007ff1b65e45c0 R09: 00007ff1b45fc9b1
> [40196.208336] R10: 0000000000000020 R11: 0000000000000293 R12: 0000000000000000
> [40196.208337] R13: 00007ff1b5687000 R14: 00000004a817c7fe R15: 00000000000004e0
> [40196.208337] Code: 60 02 00 00 74 30 48 63 cf 48 c7 c0 40 a0 00 00 48 8b 34 cd c0 93 d0 81 48 63 ca 48 8b 0c cd c0 93 d0 81 0f b7 8c 08 da 00 00 00 <66> 39 8c 30 da 00 00 00 0f 44 fa 48 8d 55 d0 b9 01 00 00 00 48
> [40196.208353] RIP: perf_event_read+0xd3/0x1a0 RSP: ffffc9000974bd48
> [40196.208353] CR2: 000000000000a11a
> 
> The RIP points to this asm:
>    0xffffffff8115fc9b <+155>:   mov    %gs:0x7eeaa486(%rip),%edx        # 0xa128 <cpu_number>
>    0xffffffff8115fca2 <+162>:   testb  $0x2,0x68(%rbx)
>    0xffffffff8115fca6 <+166>:   mov    0x260(%rdi),%edi
>    0xffffffff8115fcac <+172>:   je     0xffffffff8115fcde <perf_event_read+222>
>    0xffffffff8115fcae <+174>:   movslq %edi,%rcx
>    0xffffffff8115fcb1 <+177>:   mov    $0xa040,%rax
>    0xffffffff8115fcb8 <+184>:   mov    -0x7e2f6c40(,%rcx,8),%rsi
>    0xffffffff8115fcc0 <+192>:   movslq %edx,%rcx
>    0xffffffff8115fcc3 <+195>:   mov    -0x7e2f6c40(,%rcx,8),%rcx
>    0xffffffff8115fccb <+203>:   movzwl 0xda(%rax,%rcx,1),%ecx
>    0xffffffff8115fcd3 <+211>:   cmp    %cx,0xda(%rax,%rsi,1)
> 
> which is this C code:
> perf_event_read():
>   local_cpu = get_cpu();
>   cpu_to_read = find_cpu_to_read(event, local_cpu);
>   put_cpu();
> 
> find_cpu_to_read():
>   event_pkg =  topology_physical_package_id(event_cpu);
>   local_pkg =  topology_physical_package_id(local_cpu);
>   if (event_pkg == local_pkg)
> 
> If I read the asm correctly at the time of the crash
> event_cpu == RDI == 00000000ffffffff
> or in other words event->oncpu == -1
> which I think is technically possible here.
> 
> Any suggestions how to fix this?
> Happy to test any patches, though I don't know how to reproduce reliably.

Right you are, that's buggered.

Something like the below (compile tested only) ought to cure things I
think.

---
Subject: perf: Fix crash in perf_event_read()

Alexei had his box explode because doing read() on a package
(rapl/uncore) event that isn't currently scheduled in ends up doing an
out-of-bounds load.

Rework the code to more explicitly deal with event->oncpu being -1.

Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 88676ff98c0f..8d479be0c9a9 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3538,14 +3538,15 @@ struct perf_read_data {
 	int ret;
 };
 
-static int find_cpu_to_read(struct perf_event *event, int local_cpu)
+static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
 {
-	int event_cpu = event->oncpu;
 	u16 local_pkg, event_pkg;
 
 	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
-		event_pkg =  topology_physical_package_id(event_cpu);
-		local_pkg =  topology_physical_package_id(local_cpu);
+		int local_cpu = smp_processor_id();
+
+		event_pkg = topology_physical_package_id(event_cpu);
+		local_pkg = topology_physical_package_id(local_cpu);
 
 		if (event_pkg == local_pkg)
 			return local_cpu;
@@ -3675,7 +3676,7 @@ u64 perf_event_read_local(struct perf_event *event)
 
 static int perf_event_read(struct perf_event *event, bool group)
 {
-	int ret = 0, cpu_to_read, local_cpu;
+	int event_cpu, ret = 0;
 
 	/*
 	 * If event is enabled and currently active on a CPU, update the
@@ -3688,9 +3689,12 @@ static int perf_event_read(struct perf_event *event, bool group)
 			.ret = 0,
 		};
 
-		local_cpu = get_cpu();
-		cpu_to_read = find_cpu_to_read(event, local_cpu);
-		put_cpu();
+		event_cpu = READ_ONCE(event->oncpu);
+		if ((unsigned)event_cpu >= nr_cpu_ids)
+			return 0;
+
+		preempt_disable();
+		event_cpu = __perf_event_read_cpu(event, event_cpu);
 
 		/*
 		 * Purposely ignore the smp_call_function_single() return
@@ -3702,7 +3706,8 @@ static int perf_event_read(struct perf_event *event, bool group)
 		 * Therefore, either way, we'll have an up-to-date event count
 		 * after this.
 		 */
-		(void)smp_call_function_single(cpu_to_read, __perf_event_read, &data, 1);
+		(void)smp_call_function_single(event_cpu, __perf_event_read, &data, 1);
+		preempt_enable();
 		ret = data.ret;
 	} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
 		struct perf_event_context *ctx = event->ctx;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: crash in perf_event_read
  2017-01-31 10:27 ` Peter Zijlstra
@ 2017-01-31 18:50   ` Alexei Starovoitov
  2017-02-02  7:43   ` David Carrillo-Cisneros
  2017-02-10  8:32   ` [tip:perf/urgent] perf/core: Fix crash in perf_event_read() tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 5+ messages in thread
From: Alexei Starovoitov @ 2017-01-31 18:50 UTC (permalink / raw
  To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, eranian, davidcc

On Tue, Jan 31, 2017 at 11:27:10AM +0100, Peter Zijlstra wrote:
> Subject: perf: Fix crash in perf_event_read()
> 
> Alexei had his box explode because doing read() on a package
> (rapl/uncore) event that isn't currently scheduled in ends up doing an
> out-of-bounds load.
> 
> Rework the code to more explicitly deal with event->oncpu being -1.
> 
> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Carrillo-Cisneros <davidcc@google.com>
> Cc: Stephane Eranian <eranian@google.com>
> Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Thanks for the quick fix!
Everything is clean after several hours.
Tested-by: Alexei Starovoitov <ast@kernel.org>

>  		/*
>  		 * Purposely ignore the smp_call_function_single() return
> @@ -3702,7 +3706,8 @@ static int perf_event_read(struct perf_event *event, bool group)
>  		 * Therefore, either way, we'll have an up-to-date event count
>  		 * after this.
>  		 */

in that comment above (not shown in this diff) there is a line:
"If event->oncpu isn't a valid CPU it means..."
it doesn't match the code due to merge conflict resolution in 2cc538412a.
Probably makes sense to fix it s/event->oncpu/event_cpu/ as part of this diff.

> -		(void)smp_call_function_single(cpu_to_read, __perf_event_read, &data, 1);
> +		(void)smp_call_function_single(event_cpu, __perf_event_read, &data, 1);
> +		preempt_enable();
>  		ret = data.ret;
>  	} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
>  		struct perf_event_context *ctx = event->ctx;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: crash in perf_event_read
  2017-01-31 10:27 ` Peter Zijlstra
  2017-01-31 18:50   ` Alexei Starovoitov
@ 2017-02-02  7:43   ` David Carrillo-Cisneros
  2017-02-10  8:32   ` [tip:perf/urgent] perf/core: Fix crash in perf_event_read() tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 5+ messages in thread
From: David Carrillo-Cisneros @ 2017-02-02  7:43 UTC (permalink / raw
  To: Peter Zijlstra
  Cc: Alexei Starovoitov, linux-kernel, Ingo Molnar, Stephane Eranian

On Tue, Jan 31, 2017 at 2:27 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, Jan 30, 2017 at 11:04:08PM -0800, Alexei Starovoitov wrote:
>> Hi Peter,
>>
>> rarely I'm seeing the following crash:
>> [40196.164255] BUG: unable to handle kernel paging request at 000000000000a11a
>> [40196.179636] IP: perf_event_read+0xd3/0x1a0
>> [40196.188669] PGD 82e93a067
>> [40196.188670] PUD 7e1ddf067
>> [40196.194629] PMD 0
>> [40196.200589]
>> [40196.208284] Oops: 0000 [#1] SMP
>> [40196.208285] Modules linked in:
>> [40196.208299] CPU: 24 PID: 4423 Comm: dynoKernelMon Not tainted 4.10.0-rc5-01189-gc6e0ad0ee5b0 #599
>> [40196.208300] Hardware name: Quanta Mono Lake-M.2 SATA 20F20BU0270/Mono Lake-M.2 SATA, BIOS F20_3A12 10/24/2016
>> [40196.208301] task: ffff8807e3b65580 task.stack: ffffc90009748000
>> [40196.208302] RIP: 0010:perf_event_read+0xd3/0x1a0
>> [40196.208303] RSP: 0018:ffffc9000974bd48 EFLAGS: 00010202
>> [40196.208304] RAX: 000000000000a040 RBX: ffff8807b79fd000 RCX: 0000000000000000
>> [40196.208304] RDX: 0000000000000018 RSI: 0000000000000000 RDI: 00000000ffffffff
>> [40196.208305] RBP: ffffc9000974bd80 R08: 0000000000000000 R09: 0000000000000000
>> [40196.208305] R10: ffff8807cf8c7038 R11: 0000000000000000 R12: ffffc9000974bde8
>> [40196.208306] R13: 0000000000000000 R14: ffff8807b79fd000 R15: 00000000000004e0
>> [40196.208307] FS:  00007ff1b45ff700(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
>> [40196.208307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [40196.208308] CR2: 000000000000a11a CR3: 0000000850298000 CR4: 00000000003406e0
>> [40196.208308] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [40196.208309] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [40196.208309] Call Trace:
>> [40196.208313]  ? __might_sleep+0x4a/0x80
>> [40196.208316]  perf_event_read_value+0x45/0x130
>> [40196.208318]  perf_read+0x84/0x2d0
>> [40196.208322]  __vfs_read+0x28/0x110
>> [40196.208325]  ? security_file_permission+0x9b/0xc0
>> [40196.208327]  vfs_read+0xa5/0x170
>> [40196.208327]  SyS_read+0x46/0xa0
>> [40196.208329]  do_syscall_64+0x4d/0xb0
>> [40196.208332]  entry_SYSCALL64_slow_path+0x25/0x25
>> [40196.208333] RIP: 0033:0x7ff1b6f5716d
>> [40196.208334] RSP: 002b:00007ff1b45fd080 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
>> [40196.208335] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1b6f5716d
>> [40196.208335] RDX: 0000000000000020 RSI: 00007ff1b45fd090 RDI: 0000000000000054
>> [40196.208336] RBP: 00007ff1b45fd0e0 R08: 00007ff1b65e45c0 R09: 00007ff1b45fc9b1
>> [40196.208336] R10: 0000000000000020 R11: 0000000000000293 R12: 0000000000000000
>> [40196.208337] R13: 00007ff1b5687000 R14: 00000004a817c7fe R15: 00000000000004e0
>> [40196.208337] Code: 60 02 00 00 74 30 48 63 cf 48 c7 c0 40 a0 00 00 48 8b 34 cd c0 93 d0 81 48 63 ca 48 8b 0c cd c0 93 d0 81 0f b7 8c 08 da 00 00 00 <66> 39 8c 30 da 00 00 00 0f 44 fa 48 8d 55 d0 b9 01 00 00 00 48
>> [40196.208353] RIP: perf_event_read+0xd3/0x1a0 RSP: ffffc9000974bd48
>> [40196.208353] CR2: 000000000000a11a
>>
>> The RIP points to this asm:
>>    0xffffffff8115fc9b <+155>:   mov    %gs:0x7eeaa486(%rip),%edx        # 0xa128 <cpu_number>
>>    0xffffffff8115fca2 <+162>:   testb  $0x2,0x68(%rbx)
>>    0xffffffff8115fca6 <+166>:   mov    0x260(%rdi),%edi
>>    0xffffffff8115fcac <+172>:   je     0xffffffff8115fcde <perf_event_read+222>
>>    0xffffffff8115fcae <+174>:   movslq %edi,%rcx
>>    0xffffffff8115fcb1 <+177>:   mov    $0xa040,%rax
>>    0xffffffff8115fcb8 <+184>:   mov    -0x7e2f6c40(,%rcx,8),%rsi
>>    0xffffffff8115fcc0 <+192>:   movslq %edx,%rcx
>>    0xffffffff8115fcc3 <+195>:   mov    -0x7e2f6c40(,%rcx,8),%rcx
>>    0xffffffff8115fccb <+203>:   movzwl 0xda(%rax,%rcx,1),%ecx
>>    0xffffffff8115fcd3 <+211>:   cmp    %cx,0xda(%rax,%rsi,1)
>>
>> which is this C code:
>> perf_event_read():
>>   local_cpu = get_cpu();
>>   cpu_to_read = find_cpu_to_read(event, local_cpu);
>>   put_cpu();
>>
>> find_cpu_to_read():
>>   event_pkg =  topology_physical_package_id(event_cpu);
>>   local_pkg =  topology_physical_package_id(local_cpu);
>>   if (event_pkg == local_pkg)
>>
>> If I read the asm correctly at the time of the crash
>> event_cpu == RDI == 00000000ffffffff
>> or in other words event->oncpu == -1
>> which I think is technically possible here.
>>
>> Any suggestions how to fix this?
>> Happy to test any patches, though I don't know how to reproduce reliably.
>
> Right you are, that's buggered.
>
> Something like the below (compile tested only) ought to cure things I
> think.
>
> ---
> Subject: perf: Fix crash in perf_event_read()
>
> Alexei had his box explode because doing read() on a package
> (rapl/uncore) event that isn't currently scheduled in ends up doing an
> out-of-bounds load.
>
> Rework the code to more explicitly deal with event->oncpu being -1.
>
Tested-by: David Carrillo-Cisneros <davidcc@google.com>
> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: David Carrillo-Cisneros <davidcc@google.com>
> Cc: Stephane Eranian <eranian@google.com>
> Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/events/core.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 88676ff98c0f..8d479be0c9a9 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3538,14 +3538,15 @@ struct perf_read_data {
>         int ret;
>  };
>
> -static int find_cpu_to_read(struct perf_event *event, int local_cpu)
> +static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
>  {
> -       int event_cpu = event->oncpu;
>         u16 local_pkg, event_pkg;
>
>         if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
> -               event_pkg =  topology_physical_package_id(event_cpu);
> -               local_pkg =  topology_physical_package_id(local_cpu);
> +               int local_cpu = smp_processor_id();
> +
> +               event_pkg = topology_physical_package_id(event_cpu);
> +               local_pkg = topology_physical_package_id(local_cpu);
>
>                 if (event_pkg == local_pkg)
>                         return local_cpu;
> @@ -3675,7 +3676,7 @@ u64 perf_event_read_local(struct perf_event *event)
>
>  static int perf_event_read(struct perf_event *event, bool group)
>  {
> -       int ret = 0, cpu_to_read, local_cpu;
> +       int event_cpu, ret = 0;
>
>         /*
>          * If event is enabled and currently active on a CPU, update the
> @@ -3688,9 +3689,12 @@ static int perf_event_read(struct perf_event *event, bool group)
>                         .ret = 0,
>                 };
>
> -               local_cpu = get_cpu();
> -               cpu_to_read = find_cpu_to_read(event, local_cpu);
> -               put_cpu();
> +               event_cpu = READ_ONCE(event->oncpu);
> +               if ((unsigned)event_cpu >= nr_cpu_ids)
> +                       return 0;

Yeah, I missed that one. Thanks for fixing it.

> +
> +               preempt_disable();
> +               event_cpu = __perf_event_read_cpu(event, event_cpu);
>
>                 /*
>                  * Purposely ignore the smp_call_function_single() return
> @@ -3702,7 +3706,8 @@ static int perf_event_read(struct perf_event *event, bool group)
>                  * Therefore, either way, we'll have an up-to-date event count
>                  * after this.
>                  */
> -               (void)smp_call_function_single(cpu_to_read, __perf_event_read, &data, 1);
> +               (void)smp_call_function_single(event_cpu, __perf_event_read, &data, 1);
> +               preempt_enable();
>                 ret = data.ret;
>         } else if (event->state == PERF_EVENT_STATE_INACTIVE) {
>                 struct perf_event_context *ctx = event->ctx;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip:perf/urgent] perf/core: Fix crash in perf_event_read()
  2017-01-31 10:27 ` Peter Zijlstra
  2017-01-31 18:50   ` Alexei Starovoitov
  2017-02-02  7:43   ` David Carrillo-Cisneros
@ 2017-02-10  8:32   ` tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 5+ messages in thread
From: tip-bot for Peter Zijlstra @ 2017-02-10  8:32 UTC (permalink / raw
  To: linux-tip-commits
  Cc: alexei.starovoitov, peterz, linux-kernel, torvalds, davidcc,
	mingo, hpa, tglx, ast

Commit-ID:  451d24d1e5f40bad000fa9abe36ddb16fc9928cb
Gitweb:     http://git.kernel.org/tip/451d24d1e5f40bad000fa9abe36ddb16fc9928cb
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Tue, 31 Jan 2017 11:27:10 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 10 Feb 2017 09:04:50 +0100

perf/core: Fix crash in perf_event_read()

Alexei had his box explode because doing read() on a package
(rapl/uncore) event that isn't currently scheduled in ends up doing an
out-of-bounds load.

Rework the code to more explicitly deal with event->oncpu being -1.

Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
Link: http://lkml.kernel.org/r/20170131102710.GL6515@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index e5aaa80..e235bb9 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3487,14 +3487,15 @@ struct perf_read_data {
 	int ret;
 };
 
-static int find_cpu_to_read(struct perf_event *event, int local_cpu)
+static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
 {
-	int event_cpu = event->oncpu;
 	u16 local_pkg, event_pkg;
 
 	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
-		event_pkg =  topology_physical_package_id(event_cpu);
-		local_pkg =  topology_physical_package_id(local_cpu);
+		int local_cpu = smp_processor_id();
+
+		event_pkg = topology_physical_package_id(event_cpu);
+		local_pkg = topology_physical_package_id(local_cpu);
 
 		if (event_pkg == local_pkg)
 			return local_cpu;
@@ -3624,7 +3625,7 @@ u64 perf_event_read_local(struct perf_event *event)
 
 static int perf_event_read(struct perf_event *event, bool group)
 {
-	int ret = 0, cpu_to_read, local_cpu;
+	int event_cpu, ret = 0;
 
 	/*
 	 * If event is enabled and currently active on a CPU, update the
@@ -3637,21 +3638,25 @@ static int perf_event_read(struct perf_event *event, bool group)
 			.ret = 0,
 		};
 
-		local_cpu = get_cpu();
-		cpu_to_read = find_cpu_to_read(event, local_cpu);
-		put_cpu();
+		event_cpu = READ_ONCE(event->oncpu);
+		if ((unsigned)event_cpu >= nr_cpu_ids)
+			return 0;
+
+		preempt_disable();
+		event_cpu = __perf_event_read_cpu(event, event_cpu);
 
 		/*
 		 * Purposely ignore the smp_call_function_single() return
 		 * value.
 		 *
-		 * If event->oncpu isn't a valid CPU it means the event got
+		 * If event_cpu isn't a valid CPU it means the event got
 		 * scheduled out and that will have updated the event count.
 		 *
 		 * Therefore, either way, we'll have an up-to-date event count
 		 * after this.
 		 */
-		(void)smp_call_function_single(cpu_to_read, __perf_event_read, &data, 1);
+		(void)smp_call_function_single(event_cpu, __perf_event_read, &data, 1);
+		preempt_enable();
 		ret = data.ret;
 	} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
 		struct perf_event_context *ctx = event->ctx;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-02-10  8:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-31  7:04 crash in perf_event_read Alexei Starovoitov
2017-01-31 10:27 ` Peter Zijlstra
2017-01-31 18:50   ` Alexei Starovoitov
2017-02-02  7:43   ` David Carrillo-Cisneros
2017-02-10  8:32   ` [tip:perf/urgent] perf/core: Fix crash in perf_event_read() tip-bot for Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.