From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
  id S965154AbcBCRNX (ORCPT ); Wed, 3 Feb 2016 12:13:23 -0500
Received: from mail-wm0-f51.google.com ([74.125.82.51]:35147 "EHLO
  mail-wm0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
  with ESMTP id S964785AbcBCRNT (ORCPT ); Wed, 3 Feb 2016 12:13:19 -0500
Message-ID: <1454519595.6148.18.camel@gmail.com>
Subject: Re: Crashes with 874bbfe600a6 in 3.18.25
From: Mike Galbraith
To: Tejun Heo
Cc: Michal Hocko , Jiri Slaby , Thomas Gleixner , Petr Mladek ,
  Jan Kara , Ben Hutchings , Sasha Levin , Shaohua Li , LKML ,
  stable@vger.kernel.org, Daniel Bilik
Date: Wed, 03 Feb 2016 18:13:15 +0100
In-Reply-To: <20160203170652.GI14091@mtj.duckdns.org>
References: <20160122160903.GH32380@htj.duckdns.org>
  <1453515623.3734.156.camel@decadent.org.uk>
  <20160126093400.GV24938@quack.suse.cz>
  <20160126111438.GA731@pathway.suse.cz>
  <56B1C9E4.4020400@suse.cz>
  <20160203122855.GB6762@dhcp22.suse.cz>
  <20160203162441.GE14091@mtj.duckdns.org>
  <1454518913.6148.15.camel@gmail.com>
  <20160203170652.GI14091@mtj.duckdns.org>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.16.5
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2016-02-03 at 12:06 -0500, Tejun Heo wrote:
> On Wed, Feb 03, 2016 at 06:01:53PM +0100, Mike Galbraith wrote:
> > Hm, so it's ok to queue work to an offline CPU?  What happens if it
> > doesn't come back for an eternity or two?
>
> Right now, it just loses affinity.  A more interesting case is a cpu
> going offline while work items bound to the cpu are still running and
> the root problem is that we've never distinguished between affinity
> for correctness and optimization and thus can't flush or warn on the
> stragglers.  The plan is to ensure that all correctness users specify
> the CPU explicitly.  Once we're there, we can warn on illegal usages.

Ah, and the rest (the vast majority) can then be safely deflected away
from nohz_full cpus.

  -Mike