From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757507AbcATVjV (ORCPT ); Wed, 20 Jan 2016 16:39:21 -0500
Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:36063 "EHLO
	mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751034AbcATVjT (ORCPT ); Wed, 20 Jan 2016 16:39:19 -0500
Date: Wed, 20 Jan 2016 13:39:01 -0800
From: Shaohua Li
To: Jan Kara
CC: LKML , , Tejun Heo , Daniel Bilik , Sasha Levin
Subject: Re: Crashes with 874bbfe600a6 in 3.18.25
Message-ID: <20160120213901.GA755895@devbig084.prn1.facebook.com>
References: <20160120211926.GJ10810@quack.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <20160120211926.GJ10810@quack.suse.cz>
User-Agent: Mutt/1.5.20 (2009-12-10)
X-Originating-IP: [192.168.52.123]
X-Proofpoint-Spam-Reason: safe
X-FB-Internal: Safe
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-01-20_07:,, signatures=0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 20, 2016 at 10:19:26PM +0100, Jan Kara wrote:
> Hello,
>
> a friend of mine started seeing crashes with 3.18.25 kernel - once
> appropriate load is put on the machine it crashes within minutes. He
> tracked down that reverting commit 874bbfe600a6 (this is the commit ID from
> Linus' tree, in stable tree the commit ID is 1e7af294dd03) "workqueue: make
> sure delayed work run in local cpu" makes the kernel stable again. I'm
> attaching screenshot of the crash - sadly the initial part is missing but
> it seems that we crashed when processing timers on otherwise idle CPU. This
> is a production machine so experimentation is not easy but if we really
> need more information it may be possible to reproduce the issue again and
> gather it.
>
> Anyone has idea what is going on?
> I was looking into the code for a while
> but so far I have no good explanation. It would be good to understand the
> cause instead of just blindly reverting the commit from stable tree...

Tejun fixed a bug in the timer code: commit 22b886dd10180939. Is it included
in 3.18.25?

Thanks,
Shaohua