From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751526AbbFLDJq (ORCPT <rfc822;w@1wt.eu>);
	Thu, 11 Jun 2015 23:09:46 -0400
Received: from mail-qg0-f41.google.com ([209.85.192.41]:33872 "EHLO
	mail-qg0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750808AbbFLDJp (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 11 Jun 2015 23:09:45 -0400
From: Vince Weaver <vincent.weaver@maine.edu>
X-Google-Original-From: Vince Weaver <vince@vincent-weaver-1.umelst.maine.edu>
Date: Thu, 11 Jun 2015 23:15:25 -0400 (EDT)
To: linux-kernel@vger.kernel.org
cc: Peter Zijlstra <a.p.zijlstra@chello.nl>, Ingo Molnar <mingo@redhat.com>,
        Arnaldo Carvalho de Melo <acme@kernel.org>,
        Stephane Eranian <eranian@gmail.com>
Subject: perf: aux area related crash and warnings
Message-ID: <alpine.DEB.2.20.1506112305510.2757@vincent-weaver-1.umelst.maine.edu>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


The fuzzer turned these up on a Haswell machine (this is 4.1-rc7 with the
fasync patch applied).  I'm listing the crash first, but the warning
happened earlier; I'm not sure whether the two are related.

[36298.986117] BUG: spinlock recursion on CPU#4, perf_fuzzer/3410
[36298.992915]  lock: 0xffff88011edf7cd0, .magic: dead4ead, .owner: perf_fuzzer/3410, .owner_cpu: 4
[36299.002919] CPU: 4 PID: 3410 Comm: perf_fuzzer Tainted: G        W       4.1.0-rc7+ #155
[36299.012152] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[36299.020606]  ffff88011edf7cd0 ffff88011eb059a0 ffffffff816d7229 0000000000000054
[36299.029199]  ffff8800c2f4ac50 ffff88011eb059c0 ffffffff810c2895 ffff88011edf7cd0
[36299.037796]  ffffffff81a1e481 ffff88011eb059e0 ffffffff810c2916 ffff88011edf7cd0
[36299.046338] Call Trace:
[36299.049501]  <NMI>  [<ffffffff816d7229>] dump_stack+0x45/0x57
[36299.056284]  [<ffffffff810c2895>] spin_dump+0x85/0xe0
[36299.062282]  [<ffffffff810c2916>] spin_bug+0x26/0x30
[36299.068111]  [<ffffffff810c2acf>] do_raw_spin_lock+0x13f/0x180
[36299.074897]  [<ffffffff816de6e9>] _raw_spin_lock+0x39/0x40
[36299.081276]  [<ffffffff8117a039>] ? free_pcppages_bulk+0x39/0x620
[36299.088340]  [<ffffffff8117a039>] free_pcppages_bulk+0x39/0x620
[36299.095182]  [<ffffffff81177e14>] ? free_pages_prepare+0x3a4/0x550
[36299.102291]  [<ffffffff811c9936>] ? kfree_debugcheck+0x16/0x40
[36299.108987]  [<ffffffff8117a938>] free_hot_cold_page+0x178/0x1a0
[36299.115850]  [<ffffffff8117aa47>] __free_pages+0x37/0x50
[36299.121991]  [<ffffffff8116ae0a>] rb_free_aux+0xba/0xf0
[36299.128034]  [<ffffffff8116b0e7>] perf_aux_output_end+0xb7/0xf0
[36299.134793]  [<ffffffff81037b0e>] intel_bts_interrupt+0x8e/0xd0
[36299.141543]  [<ffffffff810338bf>] intel_pmu_handle_irq+0x4f/0x450
[36299.148482]  [<ffffffff810bc288>] ? check_chain_key+0x128/0x1e0
[36299.155249]  [<ffffffff8102a4ab>] perf_event_nmi_handler+0x2b/0x50
[36299.162273]  [<ffffffff810185d0>] nmi_handle+0xa0/0x150
[36299.168278]  [<ffffffff81018535>] ? nmi_handle+0x5/0x150
[36299.174377]  [<ffffffff8101887a>] default_do_nmi+0x4a/0x140
[36299.180735]  [<ffffffff81018a08>] do_nmi+0x98/0xe0
[36299.186219]  [<ffffffff816e13ef>] end_repeat_nmi+0x1e/0x2e
[36299.192501]  [<ffffffff810bdc4e>] ? __lock_acquire.isra.31+0x27e/0x1000
[36299.199951]  [<ffffffff810bdc4e>] ? __lock_acquire.isra.31+0x27e/0x1000
[36299.207410]  [<ffffffff810bdc4e>] ? __lock_acquire.isra.31+0x27e/0x1000
[36299.214898]  <<EOE>>  [<ffffffff810bdd89>] ? __lock_acquire.isra.31+0x3b9/0x1000

While I was trying to cut and paste that, the locked-up Haswell took down
the network switch, so I can't get the rest of the trace until tomorrow.

The warning was
[27716.785131] WARNING: CPU: 2 PID: 17655 at kernel/events/ring_buffer.c:282 perf_aux_output_begin+0x1ce/0x1f0()

which corresponds to 
        /*
         * Nesting is not supported for AUX area, make sure nested
         * writers are caught early
         */
        if (WARN_ON_ONCE(local_xchg(&rb->aux_nest, 1)))
                goto err_put;
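
For anyone unfamiliar with that check, here is a minimal userspace sketch of
the same exchange-based nesting guard. It assumes C11 atomics as a stand-in
for the kernel's local_xchg(); struct fake_rb, aux_begin() and aux_end() are
hypothetical names for illustration and are not the kernel implementation.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the ring buffer; only the nesting flag is modeled. */
struct fake_rb {
	atomic_long aux_nest;
};

/*
 * Try to start an AUX transaction.
 * Returns 0 on success, -1 if a writer is already active (a nested begin).
 */
static int aux_begin(struct fake_rb *rb)
{
	/* atomic_exchange() returns the previous value; nonzero means nested. */
	if (atomic_exchange(&rb->aux_nest, 1))
		return -1;
	return 0;
}

/* End the transaction and allow the next writer in. */
static void aux_end(struct fake_rb *rb)
{
	atomic_store(&rb->aux_nest, 0);
}
```

In this sketch a second aux_begin() without an intervening aux_end() fails,
which is the condition the WARN_ON_ONCE above reports.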

I have again lost access to the machine with the serial console, so the
full backtrace will have to wait until I'm no longer remote.

Vince