From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1758063AbcBCRHH (ORCPT <rfc822;w@1wt.eu>);
	Wed, 3 Feb 2016 12:07:07 -0500
Received: from mail-yk0-f178.google.com ([209.85.160.178]:34059 "EHLO
	mail-yk0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1757936AbcBCRHA (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Wed, 3 Feb 2016 12:07:00 -0500
Date: Wed, 3 Feb 2016 11:59:01 -0500
From: Tejun Heo <tj@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Jiri Slaby <jslaby@suse.cz>, Thomas Gleixner <tglx@linutronix.de>,
        Petr Mladek <pmladek@suse.com>, Jan Kara <jack@suse.cz>,
        Ben Hutchings <ben@decadent.org.uk>,
        Sasha Levin <sasha.levin@oracle.com>, Shaohua Li <shli@fb.com>,
        LKML <linux-kernel@vger.kernel.org>, stable@vger.kernel.org,
        Daniel Bilik <daniel.bilik@neosystem.cz>
Subject: Re: Crashes with 874bbfe600a6 in 3.18.25
Message-ID: <20160203165901.GH14091@mtj.duckdns.org>
References: <20160122160903.GH32380@htj.duckdns.org>
 <1453515623.3734.156.camel@decadent.org.uk>
 <alpine.DEB.2.11.1601231710210.3886@nanos>
 <20160126093400.GV24938@quack.suse.cz>
 <20160126111438.GA731@pathway.suse.cz>
 <alpine.DEB.2.11.1601261352010.3886@nanos>
 <56B1C9E4.4020400@suse.cz>
 <20160203122855.GB6762@dhcp22.suse.cz>
 <20160203162441.GE14091@mtj.duckdns.org>
 <20160203164852.GK6757@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160203164852.GK6757@dhcp22.suse.cz>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 03, 2016 at 05:48:52PM +0100, Michal Hocko wrote:
> > So, the proper fix here is keeping cpu <-> node mapping stable across
> > cpu on/offlining which has been being worked on for a long time now.
> > The patchset is pending and it fixes other issues too.
> 
> What if that node was memory offlined as well? It just doesn't make any
> sense to stick to the old node when the old cpu went away already. If

Whether a memory node is offlined or not doesn't affect how cpus map
to the node.  The mapping is fixed at the physical and firmware level
for as long as the system is running.  If the node becomes memory-less,
what changes is the memory allocation strategy for the node, not how
cpus map to nodes.  The only problem here is that we currently lose
how we mapped logical IDs to physical ones across off/online cycles.
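
As a rough illustration of what "stable" means here, the following is
a userspace toy model, not kernel code.  Every name in it (toy_cpu_up,
toy_cpu_down, toy_cpu_to_node, the tables) is hypothetical, and the
fixed physical-to-node table is an assumption standing in for what
firmware reports.  The point is only that the logical-to-physical
binding survives an offline, so the node lookup never changes.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Assumption: physical CPU -> node is fixed by firmware in this model. */
static const int toy_phys_to_node[TOY_NR_CPUS] = { 0, 0, 1, 1 };

static int  toy_logical_to_phys[TOY_NR_CPUS]; /* valid only if bound */
static bool toy_bound[TOY_NR_CPUS];           /* binding ever made?  */
static bool toy_online[TOY_NR_CPUS];

/* Online a physical CPU, reusing its earlier logical ID if one exists. */
int toy_cpu_up(int phys)
{
	int free_id = -1;

	assert(phys >= 0 && phys < TOY_NR_CPUS);
	for (int id = 0; id < TOY_NR_CPUS; id++) {
		if (toy_bound[id] && toy_logical_to_phys[id] == phys) {
			toy_online[id] = true;
			return id;
		}
		if (!toy_bound[id] && free_id < 0)
			free_id = id;
	}
	if (free_id < 0)
		return -1;
	toy_bound[free_id] = true;
	toy_logical_to_phys[free_id] = phys;
	toy_online[free_id] = true;
	return free_id;
}

/* Offline a logical CPU.  The binding is deliberately kept. */
void toy_cpu_down(int id)
{
	assert(id >= 0 && id < TOY_NR_CPUS);
	toy_online[id] = false;
}

/* Node lookup works for offline CPUs too, as long as they were bound. */
int toy_cpu_to_node(int id)
{
	assert(id >= 0 && id < TOY_NR_CPUS && toy_bound[id]);
	return toy_phys_to_node[toy_logical_to_phys[id]];
}
```

In this model a logical ID that was once handed to a physical cpu is
never reassigned to a different one, which is the property the
off/online cycle currently breaks.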

> anything and add_timer_on also for WORK_CPU_UNBOUND is really required
> then we should at least preserve WORK_CPU_UNBOUND in dwork->cpu so that
> __queue_work can actually move on to the local CPU properly and handle
> the offline cpu properly.

delayed_work->cpu is determined at queueing time.  Dealing with
offlined cpus at execution time is completely fine.  There's no need
to "preserve" anything.
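
To make the split between queueing time and execution time concrete,
here is a small userspace sketch.  It is not the workqueue code and
does not claim to mirror it.  The toy_ names, TOY_CPU_UNBOUND, and the
"first online cpu" fallback are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS     4
#define TOY_CPU_UNBOUND TOY_NR_CPUS /* toy stand-in for WORK_CPU_UNBOUND */

static bool toy_cpu_online[TOY_NR_CPUS] = { true, true, true, true };

struct toy_delayed_work {
	int cpu; /* resolved once, at queueing time */
};

/* Queueing time: resolve the target cpu and record it in the work. */
void toy_queue_delayed_work_on(struct toy_delayed_work *dw,
			       int req_cpu, int local_cpu)
{
	assert(local_cpu >= 0 && local_cpu < TOY_NR_CPUS);
	assert(req_cpu >= 0 && req_cpu <= TOY_CPU_UNBOUND);
	dw->cpu = (req_cpu == TOY_CPU_UNBOUND) ? local_cpu : req_cpu;
}

/*
 * Execution time: use the recorded cpu if it is still online,
 * otherwise fall back to the first online cpu (assumed policy).
 * Returns -1 if no cpu is online.
 */
int toy_pick_exec_cpu(const struct toy_delayed_work *dw)
{
	if (dw->cpu >= 0 && dw->cpu < TOY_NR_CPUS && toy_cpu_online[dw->cpu])
		return dw->cpu;
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_cpu_online[cpu])
			return cpu;
	return -1;
}
```

The recorded cpu is never rewritten after queueing.  An offlined
target is handled purely by the execution-side check.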

> > It's not bogus.  We can't flip a property that has been guaranteed
> > without any provision for verification.  Why do you think vmstat blew
> > up in the first place?
> 
> Because it wants to have a strong per-cpu guarantee while it used
> to fail to tell so. My understanding was that this is exactly what
> queue_delayed_work_on is for while WORK_CPU_UNBOUND tells that the
> caller doesn't really insist on any particular CPU (just local CPU is
> preferred).

What you said just doesn't fit the reality.  Again, think about why
vmstat crashed.  Why is this difficult to understand?

-- 
tejun