From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751526AbbFLDJq (ORCPT ); Thu, 11 Jun 2015 23:09:46 -0400
Received: from mail-qg0-f41.google.com ([209.85.192.41]:33872 "EHLO mail-qg0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750808AbbFLDJp (ORCPT ); Thu, 11 Jun 2015 23:09:45 -0400
From: Vince Weaver
X-Google-Original-From: Vince Weaver
Date: Thu, 11 Jun 2015 23:15:25 -0400 (EDT)
To: linux-kernel@vger.kernel.org
cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian
Subject: perf: aux area related crash and warnings
Message-ID:
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

The fuzzer turned these up (this is 4.1-rc7 with the fasync patch applied)
on a Haswell.

I'm listing the crash first, but the warning happened earlier, not sure if
it is related.

[36298.986117] BUG: spinlock recursion on CPU#4, perf_fuzzer/3410
[36298.992915]  lock: 0xffff88011edf7cd0, .magic: dead4ead, .owner: perf_fuzzer/3410, .owner_cpu: 4
[36299.002919] CPU: 4 PID: 3410 Comm: perf_fuzzer Tainted: G W 4.1.0-rc7+ #155
[36299.012152] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[36299.020606]  ffff88011edf7cd0 ffff88011eb059a0 ffffffff816d7229 0000000000000054
[36299.029199]  ffff8800c2f4ac50 ffff88011eb059c0 ffffffff810c2895 ffff88011edf7cd0
[36299.037796]  ffffffff81a1e481 ffff88011eb059e0 ffffffff810c2916 ffff88011edf7cd0
[36299.046338] Call Trace:
[36299.049501]  [] dump_stack+0x45/0x57
[36299.056284]  [] spin_dump+0x85/0xe0
[36299.062282]  [] spin_bug+0x26/0x30
[36299.068111]  [] do_raw_spin_lock+0x13f/0x180
[36299.074897]  [] _raw_spin_lock+0x39/0x40
[36299.081276]  [] ? free_pcppages_bulk+0x39/0x620
[36299.088340]  [] free_pcppages_bulk+0x39/0x620
[36299.095182]  [] ? free_pages_prepare+0x3a4/0x550
[36299.102291]  [] ? kfree_debugcheck+0x16/0x40
[36299.108987]  [] free_hot_cold_page+0x178/0x1a0
[36299.115850]  [] __free_pages+0x37/0x50
[36299.121991]  [] rb_free_aux+0xba/0xf0
[36299.128034]  [] perf_aux_output_end+0xb7/0xf0
[36299.134793]  [] intel_bts_interrupt+0x8e/0xd0
[36299.141543]  [] intel_pmu_handle_irq+0x4f/0x450
[36299.148482]  [] ? check_chain_key+0x128/0x1e0
[36299.155249]  [] perf_event_nmi_handler+0x2b/0x50
[36299.162273]  [] nmi_handle+0xa0/0x150
[36299.168278]  [] ? nmi_handle+0x5/0x150
[36299.174377]  [] default_do_nmi+0x4a/0x140
[36299.180735]  [] do_nmi+0x98/0xe0
[36299.186219]  [] end_repeat_nmi+0x1e/0x2e
[36299.192501]  [] ? __lock_acquire.isra.31+0x27e/0x1000
[36299.199951]  [] ? __lock_acquire.isra.31+0x27e/0x1000
[36299.207410]  [] ? __lock_acquire.isra.31+0x27e/0x1000
[36299.214898]  <> [] ? __lock_acquire.isra.31+0x3b9/0x1000

and while I was trying to cut and paste that, the locked haswell just
took down the network switch so I can't get the rest until tomorrow.

The warning was

[27716.785131] WARNING: CPU: 2 PID: 17655 at kernel/events/ring_buffer.c:282 perf_aux_output_begin+0x1ce/0x1f0()

which corresponds to

	/*
	 * Nesting is not supported for AUX area, make sure nested
	 * writers are caught early
	 */
	if (WARN_ON_ONCE(local_xchg(&rb->aux_nest, 1)))
		goto err_put;

again just lost access to the machine with the serial console, for the
full backtrace it will have to wait until I'm not remote.

Vince