LKML Archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Tejun Heo <tj@kernel.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>,
	Michal Hocko <mhocko@kernel.org>, Jiri Slaby <jslaby@suse.cz>,
	Thomas Gleixner <tglx@linutronix.de>,
	Petr Mladek <pmladek@suse.com>, Jan Kara <jack@suse.cz>,
	Ben Hutchings <ben@decadent.org.uk>,
	Sasha Levin <sasha.levin@oracle.com>, Shaohua Li <shli@fb.com>,
	LKML <linux-kernel@vger.kernel.org>,
	stable <stable@vger.kernel.org>,
	Daniel Bilik <daniel.bilik@neosystem.cz>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: Crashes with 874bbfe600a6 in 3.18.25
Date: Tue, 9 Feb 2016 10:06:04 -0800
Message-ID: <CA+55aFzpBgyWHh9bHUNW2vX+nJRLAmtXV3VFVazppb+SaY78AQ@mail.gmail.com>
In-Reply-To: <20160209175101.GB3741@mtj.duckdns.org>

On Tue, Feb 9, 2016 at 9:51 AM, Tejun Heo <tj@kernel.org> wrote:
>>
>>  (a) actually dequeue timers and work queues that are bound to a
>> particular CPU when a CPU goes down.
>>
> This goes the same for work items and timers.  If we want to do
> explicit dequeueing or flushing of cpu-bound stuff on cpu down, we'll
> have to either dedicate *_on() interfaces for correctness or introduce
> a separate set of interfaces to use for optimization and correctness.

We already do that. "add_timer_on()" for timers, and cpu !=
WORK_CPU_UNBOUND for work items.

>    Maybe we can get away with
> declaring that _on() usages are absolute.

I really think that anything else would be odd as hell. If you asked
for a timer (or work) on a particular CPU, and you get it on another
one, that's a bug.

It's much better to just dequeue those entries and say "sorry, your
CPU went away".

Of course, we could play around with just running them early at CPU-down
time (and anybody trying to requeue would get an error because the CPU
is in the process of going down), but that sounds like more work for
any users, and like a much more fundamental difference. The "just
silently dequeue" makes more sense, and pairs well with anything that
sets things up on CPU-up time (which a percpu entity will have to do
anyway).
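
To make the "silently dequeue" semantics concrete, here is a small
userspace model in C. It is only a sketch under stated assumptions, not
kernel code: struct model, model_queue_on(), model_queue() and
model_cpu_down() are hypothetical names invented for illustration, and
CPU_UNBOUND merely stands in for WORK_CPU_UNBOUND. Entries queued on a
specific CPU are dropped when that CPU goes down; unbound entries are
moved to a CPU that is still online.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS     4
#define MAX_ITEMS   8
#define CPU_UNBOUND (-1)  /* stand-in for WORK_CPU_UNBOUND; value is arbitrary */

struct item {
	int  requested_cpu;  /* CPU_UNBOUND, or the CPU the caller insisted on */
	int  cpu;            /* CPU the item currently sits on */
	bool queued;
};

struct model {
	bool        online[NR_CPUS];
	struct item items[MAX_ITEMS];
	size_t      nr_items;
};

void model_init(struct model *m)
{
	for (int i = 0; i < NR_CPUS; i++)
		m->online[i] = true;
	m->nr_items = 0;
}

/* Lowest-numbered online CPU, or -1 if every CPU is down. */
static int first_online(const struct model *m)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (m->online[i])
			return i;
	return -1;
}

static struct item *add_item(struct model *m, int requested_cpu, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	/* Queueing on a CPU that is down (or going down) is refused. */
	if (!m->online[cpu] || m->nr_items == MAX_ITEMS)
		return NULL;
	struct item *it = &m->items[m->nr_items++];
	it->requested_cpu = requested_cpu;
	it->cpu = cpu;
	it->queued = true;
	return it;
}

/* "_on" style request: the placement is absolute. */
struct item *model_queue_on(struct model *m, int cpu)
{
	return add_item(m, cpu, cpu);
}

/* Unbound request: starts on the local CPU but may be moved later. */
struct item *model_queue(struct model *m, int local_cpu)
{
	return add_item(m, CPU_UNBOUND, local_cpu);
}

/*
 * Take a CPU down. Bound items on it are silently dequeued ("sorry,
 * your CPU went away"); unbound items migrate to an online CPU.
 * Returns the number of items that were dequeued.
 */
int model_cpu_down(struct model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	int dropped = 0;

	m->online[cpu] = false;
	int target = first_online(m);

	for (size_t i = 0; i < m->nr_items; i++) {
		struct item *it = &m->items[i];

		if (!it->queued || it->cpu != cpu)
			continue;
		if (it->requested_cpu == CPU_UNBOUND && target >= 0) {
			it->cpu = target;
		} else {
			it->queued = false;
			dropped++;
		}
	}
	return dropped;
}
```

In this model a per-CPU user would call model_queue_on() again from
whatever hook it runs at CPU-up time, which is the pairing described
above.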

> So, how about reverting 874bbfe6 and performing random foreign
> queueing during -rc's for a couple cycles so that we can at least find
> out the broken ones quickly in devel branch and backport fixes as
> they're found?

Yeah, that sounds good to me. Having some "cpu work/timer debug"
config option that ends up spreading out non-cpu-specific timers and
work in order to find bugs sounds like a good idea. And I don't think
it should be limited to rc releases, I think lots of people might be
willing to run that (the same way we had people - and even
distributions - that did DEBUG_PAGEALLOC, which is a much bigger
hammer).
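
As a rough illustration of what such a debug option might do, the
following userspace C sketch picks the CPU for a queueing request. It
is an assumption-laden model, not a proposed patch: struct spreader,
its debug_spread flag and pick_cpu() are hypothetical names, the
"unbound requests default to the local CPU" rule is a modelling
assumption, and plain round-robin is used in place of random placement
to keep the example deterministic. Requests for a specific CPU are
always honoured; only unbound requests are pushed onto a foreign CPU
when the debug flag is set.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS     4     /* must be at least 2 for spreading to work */
#define CPU_UNBOUND (-1)  /* stand-in for WORK_CPU_UNBOUND; value is arbitrary */

struct spreader {
	bool         debug_spread;  /* hypothetical "cpu work/timer debug" switch */
	unsigned int next;          /* round-robin cursor */
};

/*
 * Decide where a work item or timer should be queued.
 *  - A specific CPU request is absolute and returned unchanged.
 *  - An unbound request normally stays on the local CPU.
 *  - With debug_spread set, an unbound request is deliberately placed
 *    on a CPU other than the local one, to flush out callers that
 *    silently depend on locality without asking for it.
 */
int pick_cpu(struct spreader *s, int requested_cpu, int local_cpu)
{
	assert(local_cpu >= 0 && local_cpu < NR_CPUS);

	if (requested_cpu != CPU_UNBOUND) {
		assert(requested_cpu >= 0 && requested_cpu < NR_CPUS);
		return requested_cpu;
	}
	if (!s->debug_spread)
		return local_cpu;

	int cpu = (int)(s->next++ % NR_CPUS);
	if (cpu == local_cpu)  /* skip the local CPU; the next slot differs */
		cpu = (int)(s->next++ % NR_CPUS);
	return cpu;
}
```

A caller that breaks when pick_cpu() returns a foreign CPU for an
unbound request is exactly the kind of bug the option is meant to
expose early.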

             Linus
