From: "Paul E. McKenney" <paulmck@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	linux-kernel@vger.kernel.org,
	Valentin Schneider <valentin.schneider@arm.com>,
	Qian Cai <cai@redhat.com>,
	Vincent Donnefort <vincent.donnefort@arm.com>,
	Dexuan Cui <decui@microsoft.com>,
	Lai Jiangshan <laijs@linux.alibaba.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH -tip V3 0/8] workqueue: break affinity initiatively
Date: Tue, 12 Jan 2021 09:14:11 -0800
Message-ID: <20210112171411.GA22823@paulmck-ThinkPad-P72>
In-Reply-To: <20210111215052.GA19589@paulmck-ThinkPad-P72>

On Mon, Jan 11, 2021 at 01:50:52PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 11, 2021 at 10:09:07AM -0800, Paul E. McKenney wrote:
> > On Mon, Jan 11, 2021 at 06:16:39PM +0100, Peter Zijlstra wrote:
> > > 
> > > While thinking more about this, I'm thinking a big part of the problem
> > > is that we're not distinguishing between genuine per-cpu kthreads and
> > > kthreads that just happen to be per-cpu.
> > > 
> > > Genuine per-cpu kthreads are kthread_bind() and have PF_NO_SETAFFINITY,
> > > but sadly a lot of non-per-cpu kthreads, that might happen to still be
> > > per-cpu also have that -- again workqueue does that even to its unbound
> > > workers :-(
> > > 
> > > Now, anything created by smpboot, is created through
> > > kthread_create_on_cpu() and that additionally sets to_kthread(p)->flags
> > > KTHREAD_IS_PER_CPU.
> > > 
> > > And I'm thinking that might be sufficient, if we modify
> > > is_per_cpu_kthread() to check that, then we only match smpboot threads
> > > (which include the hotplug and stopper threads, but notably not the idle
> > > thread)
> > > 
> > > Sadly it appears like io_uring() uses kthread_create_on_cpu() without
> > > then having any hotplug crud on, so that needs additional frobbing.
> > > 
> > > Also, init_task is PF_KTHREAD but doesn't have a struct kthread on.. and
> > > I suppose bound workqueues don't go through this either.
> > > 
> > > Let me rummage around a bit...
> > > 
> > > This seems to not insta-explode... opinions?
> > 
> > It passes quick tests on -rcu both with and without the rcutorture fixes,
> > which is encouraging.  I will start a more vigorous test in about an hour.
> 
> And 672 ten-minute instances of RUDE01 passed with this patch applied
> and with my rcutorture patch reverted.  So looking good, thank you!!!

Still on yesterday's patch, an overnight 12-hour run hit workqueue
warnings in three of four instances of the SRCU-P scenario, two
at not quite three hours in and the third at about ten hours in.
All runs were otherwise successful.  One of the runs also had "BUG:
using __this_cpu_read() in preemptible", so that is the warning
shown below.  There was a series of these BUGs, then things settled down.

This is the warning at the end of process_one_work() that is complaining
about being on the wrong CPU.
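
For readers who have not looked at that check, here is a minimal userspace
sketch of the condition it tests, under stated assumptions: the struct,
the flag name, and the helper (model_pool, MODEL_POOL_DISASSOCIATED,
model_wrong_cpu) are hypothetical stand-ins modeled on the workqueue code,
not the kernel's actual definitions.  The idea is that a worker belonging
to a pool still associated with a CPU is expected to be running on that
pool's CPU.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the workqueue "pool is disassociated" flag. */
#define MODEL_POOL_DISASSOCIATED (1u << 2)

/* Simplified model of a worker pool: only the fields the check needs. */
struct model_pool {
	unsigned int flags;	/* pool state bits */
	int cpu;		/* CPU this pool is associated with */
};

/*
 * Return true when a worker from @pool that finds itself running on
 * @running_cpu would trip the wrong-CPU warning: the pool is still
 * associated with its CPU, yet the worker is executing elsewhere.
 */
static bool model_wrong_cpu(const struct model_pool *pool, int running_cpu)
{
	return !(pool->flags & MODEL_POOL_DISASSOCIATED) &&
	       running_cpu != pool->cpu;
}
```

In this model, a pool with cpu == 3 and no disassociated flag trips the
check when its worker runs on CPU 0, which matches the shape of the splat
below (kworker/3:3 reported on CPU 0).  Setting the disassociated flag
suppresses it.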

I will fire up some tests on the new series.

							Thanx, Paul

------------------------------------------------------------------------

WARNING: CPU: 0 PID: 413 at kernel/workqueue.c:2193 process_one_work+0x8c/0x5f0
Modules linked in:
CPU: 0 PID: 413 Comm: kworker/3:3 Not tainted 5.11.0-rc3+ #1104
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7 04/01/2014
Workqueue:  0x0 (events)
RIP: 0010:process_one_work+0x8c/0x5f0
Code: 48 8b 46 38 41 83 e6 20 48 89 45 c0 48 8b 46 40 48 89 45 c8 41 f6 44 24 4c 04 75 10 65 8b 05 eb 5d 78 59 41 39 44 24 40 74 02 <0f> 0b 48 ba eb 83 b5 80 46 86 c8 61 48 0f af d3 48 c1 ea 3a 49 8b
RSP: 0018:ffffb5a540847e70 EFLAGS: 00010006
RAX: 0000000000000000 RBX: ffff8fcc5f4f27e0 RCX: 2b970af959bb2a7d
RDX: ffff8fcc5f4f27e8 RSI: ffff8fcc5f4f27e0 RDI: ffff8fcc4306e3c0
RBP: ffffb5a540847ed0 R08: 0000000000000001 R09: ffff8fcc425e4680
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8fcc5f4eadc0
R13: ffff8fcc5f4ef700 R14: 0000000000000000 R15: ffff8fcc4306e3c0
FS:  0000000000000000(0000) GS:ffff8fcc5f400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004001e1 CR3: 0000000003084000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? process_one_work+0x5f0/0x5f0
 worker_thread+0x28/0x3c0
 ? process_one_work+0x5f0/0x5f0
 kthread+0x13b/0x160
 ? kthread_insert_work_sanity_check+0x50/0x50
 ret_from_fork+0x22/0x30
irq event stamp: 138141554
hardirqs last  enabled at (138141553): [<ffffffffa74a928f>] _raw_spin_unlock_irq+0x1f/0x40
hardirqs last disabled at (138141554): [<ffffffffa74a9071>] _raw_spin_lock_irq+0x41/0x50
softirqs last  enabled at (138140828): [<ffffffffa68ece37>] srcu_invoke_callbacks+0xe7/0x1a0
softirqs last disabled at (138140824): [<ffffffffa68ece37>] srcu_invoke_callbacks+0xe7/0x1a0
---[ end trace e31d6dded2c52564 ]---
kvm-guest: stealtime: cpu 3, msr 1f4d7b00
BUG: using __this_cpu_read() in preemptible [00000000] code: kworker/3:3/413
caller is refresh_cpu_vm_stats+0x1a6/0x320
CPU: 5 PID: 413 Comm: kworker/3:3 Tainted: G        W         5.11.0-rc3+ #1104
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7 04/01/2014
Workqueue: mm_percpu_wq vmstat_update
Call Trace:
 dump_stack+0x77/0x97
 check_preemption_disabled+0xb6/0xd0
 refresh_cpu_vm_stats+0x1a6/0x320
 vmstat_update+0xe/0x60
 process_one_work+0x2a0/0x5f0
 ? process_one_work+0x5f0/0x5f0
 worker_thread+0x28/0x3c0
 ? process_one_work+0x5f0/0x5f0
 kthread+0x13b/0x160
 ? kthread_insert_work_sanity_check+0x50/0x50
 ret_from_fork+0x22/0x30

Thread overview: 43+ messages
2020-12-26  2:51 [PATCH -tip V3 0/8] workqueue: break affinity initiatively Lai Jiangshan
2020-12-26  2:51 ` [PATCH -tip V3 1/8] workqueue: use cpu_possible_mask instead of cpu_active_mask to break affinity Lai Jiangshan
2020-12-26  2:51 ` [PATCH -tip V3 2/8] workqueue: Manually break affinity on pool detachment Lai Jiangshan
2020-12-26  2:51 ` [PATCH -tip V3 3/8] workqueue: introduce wq_online_cpumask Lai Jiangshan
2021-01-04 13:56   ` Peter Zijlstra
2021-01-05  2:41     ` Lai Jiangshan
2021-01-05  2:53       ` Lai Jiangshan
2021-01-05  8:23       ` Lai Jiangshan
2021-01-05 13:17         ` Peter Zijlstra
2021-01-05 14:37           ` Lai Jiangshan
2021-01-05 14:40             ` Lai Jiangshan
2021-01-05 16:24         ` Peter Zijlstra
2020-12-26  2:51 ` [PATCH -tip V3 4/8] workqueue: use wq_online_cpumask in restore_unbound_workers_cpumask() Lai Jiangshan
2020-12-26  2:51 ` [PATCH -tip V3 5/8] workqueue: Manually break affinity on hotplug for unbound pool Lai Jiangshan
     [not found]   ` <20201226101631.5448-1-hdanton@sina.com>
2020-12-27 14:04     ` Lai Jiangshan
2020-12-26  2:51 ` [PATCH -tip V3 6/8] workqueue: reorganize workqueue_online_cpu() Lai Jiangshan
2020-12-26  2:51 ` [PATCH -tip V3 7/8] workqueue: reorganize workqueue_offline_cpu() unbind_workers() Lai Jiangshan
2020-12-26  2:51 ` [PATCH -tip V3 8/8] workqueue: Fix affinity of kworkers when attaching into pool Lai Jiangshan
     [not found]   ` <20201229100639.2086-1-hdanton@sina.com>
2020-12-29 10:13     ` Lai Jiangshan
2021-01-08 11:46 ` [PATCH -tip V3 0/8] workqueue: break affinity initiatively Peter Zijlstra
2021-01-11 10:07   ` Thomas Gleixner
2021-01-11 11:01     ` Peter Zijlstra
2021-01-11 15:00       ` Paul E. McKenney
2021-01-11 17:16       ` Peter Zijlstra
2021-01-11 18:09         ` Paul E. McKenney
2021-01-11 21:50           ` Paul E. McKenney
2021-01-12 17:14             ` Paul E. McKenney [this message]
2021-01-12 23:53               ` Paul E. McKenney
2021-01-15  9:11                 ` Peter Zijlstra
2021-01-15 13:04                   ` Peter Zijlstra
2021-01-16  6:00                     ` Lai Jiangshan
2021-01-11 19:21         ` Valentin Schneider
2021-01-11 20:23           ` Peter Zijlstra
2021-01-11 22:47             ` Valentin Schneider
2021-01-12  4:33             ` Lai Jiangshan
2021-01-12 14:53               ` Peter Zijlstra
2021-01-12 15:38                 ` Lai Jiangshan
2021-01-13 11:10                   ` Peter Zijlstra
2021-01-13 12:00                     ` Lai Jiangshan
2021-01-13 12:57                     ` Lai Jiangshan
2021-01-12 17:52               ` Valentin Schneider
2021-01-12 14:57           ` Jens Axboe
2021-01-12 15:51             ` Peter Zijlstra
