From: Lai Jiangshan <jiangshanlai@gmail.com>
To: Hillf Danton <hdanton@sina.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Peter Zijlstra <peterz@infradead.org>, Qian Cai <cai@redhat.com>,
	Vincent Donnefort <vincent.donnefort@arm.com>,
	Dexuan Cui <decui@microsoft.com>,
	Lai Jiangshan <laijs@linux.alibaba.com>,
	Tejun Heo <tj@kernel.org>,
	Daniel Bristot de Oliveira <bristot@redhat.com>
Subject: Re: [PATCH -tip V3 5/8] workqueue: Manually break affinity on hotplug for unbound pool
Date: Sun, 27 Dec 2020 22:04:03 +0800	[thread overview]
Message-ID: <CAJhGHyDM89Kq_Dop-6c8_6B4K545MHMJDxGggpTmjxu4Wuz7zQ@mail.gmail.com> (raw)
In-Reply-To: <20201226101631.5448-1-hdanton@sina.com>

On Sat, Dec 26, 2020 at 6:16 PM Hillf Danton <hdanton@sina.com> wrote:
>
> Sat, 26 Dec 2020 10:51:13 +0800
> > From: Lai Jiangshan <laijs@linux.alibaba.com>
> >
> > It is possible that a per-node pool/worker's affinity is a single
> > CPU.  It can happen when the workqueue user changes the cpumask of the
> > workqueue or when wq_unbound_cpumask is changed by the system admin via
> > /sys/devices/virtual/workqueue/cpumask.  pool->attrs->cpumask is
> > workqueue's cpumask & wq_unbound_cpumask & possible_cpumask_of_the_node,
> > which can be a single CPU and makes the pool's workers "per cpu
> > kthreads".
> >
> > It can also happen when the CPU is the first to come online and has
> > been the only online CPU in pool->attrs->cpumask.  In this case, the
> > worker task's cpumask is a single CPU regardless of pool->attrs->cpumask,
> > since commit d945b5e9f0e3 ("workqueue: Fix setting affinity of unbound
> > worker threads").
> >
> > The scheduler won't break affinity on such "per cpu kthread" workers
> > when the CPU goes down, so we have to do it ourselves.
> >
> > We do it by reusing the existing restore_unbound_workers_cpumask(),
> > renamed to update_unbound_workers_cpumask().  When the number of
> > online CPUs in the pool's cpumask goes from 1 to 0, we break the
> > affinity proactively.
> >
> > Note that we break the affinity even for non-per-cpu-kthread workers:
> > first, this is a slow path not worth optimizing much; second, we then
> > don't need to rely on the conditions under which the scheduler forcibly
> > breaks affinity for us.
> >
> > The way we break affinity is to set the workers' affinity to
> > cpu_possible_mask, so that we preserve the same behavior as when
> > the scheduler breaks affinity for us.
> >
> > Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug")
> > Acked-by: Tejun Heo <tj@kernel.org>
> > Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> > ---
> >  kernel/workqueue.c | 48 ++++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 40 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index 0a95ae14d46f..79cc87df0cda 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -5019,16 +5019,18 @@ static void rebind_workers(struct worker_pool *pool)
> >  }
> >
> >  /**
> > - * restore_unbound_workers_cpumask - restore cpumask of unbound workers
> > + * update_unbound_workers_cpumask - update cpumask of unbound workers
> >   * @pool: unbound pool of interest
> > - * @cpu: the CPU which is coming up
> > + * @online: whether @cpu is coming up or going down
> > + * @cpu: the CPU which is coming up or going down
> >   *
> >   * An unbound pool may end up with a cpumask which doesn't have any online
> > - * CPUs.  When a worker of such pool get scheduled, the scheduler resets
> > - * its cpus_allowed.  If @cpu is in @pool's cpumask which didn't have any
> > - * online CPU before, cpus_allowed of all its workers should be restored.
> > + * CPUs.  We have to reset workers' cpus_allowed of such pool.  And we
> > + * restore the workers' cpus_allowed when the pool's cpumask has online
> > + * CPU.
> >   */
> > -static void restore_unbound_workers_cpumask(struct worker_pool *pool, int cpu)
> > +static void update_unbound_workers_cpumask(struct worker_pool *pool,
> > +                                        bool online, int cpu)
> >  {
> >       static cpumask_t cpumask;
> >       struct worker *worker;
> > @@ -5042,6 +5044,23 @@ static void restore_unbound_workers_cpumask(struct worker_pool *pool, int cpu)
> >
> >       cpumask_and(&cpumask, pool->attrs->cpumask, wq_online_cpumask);
> >
> > +     if (!online) {
> > +             if (cpumask_weight(&cpumask) > 0)
> > +                     return;
>
> We can apply the weight check also to the online case.
>
> > +             /*
> > +              * All unbound workers can be possibly "per cpu kthread"
> > +              * if this is the only online CPU in pool->attrs->cpumask
> > +              * from the last time it has been brought up until now.
> > +              * And the scheduler won't break affinity on the "per cpu
> > +              * kthread" workers when the CPU is going down, so we have
> > +              * to do it by our own.
> > +              */
> > +             for_each_pool_worker(worker, pool)
> > +                     WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
> > +
> > +             return;
> > +     }
> > +
> >       /* as we're called from CPU_ONLINE, the following shouldn't fail */
> >       for_each_pool_worker(worker, pool)
> >               WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, &cpumask) < 0);
>
> What is the reason that pool->attrs->cpumask is not restored, if that is
> not a typo, given that "restore" appears in the doc change above?

reason:

d945b5e9f0e3 ("workqueue: Fix setting affinity of unbound worker
threads").

I don't like this change either, but I don't want to touch it
now.  I will improve it later by moving the handling of unbound
wq/pool/worker to a work item (out of CPU hotplug processing), so that
we can restore pool->attrs->cpumask to the workers.

This is also the reason I dropped patch 1 of the V2 series.

Did you see any problem with d945b5e9f0e3, other than that it doesn't
update the comment and isn't very efficient?

>
> BTW is there a git tree available with this patchset tucked in?
>
> > @@ -5075,7 +5094,7 @@ int workqueue_online_cpu(unsigned int cpu)
> >               if (pool->cpu == cpu)
> >                       rebind_workers(pool);
> >               else if (pool->cpu < 0)
> > -                     restore_unbound_workers_cpumask(pool, cpu);
> > +                     update_unbound_workers_cpumask(pool, true, cpu);
> >
> >               mutex_unlock(&wq_pool_attach_mutex);
> >       }
> > @@ -5090,7 +5109,9 @@ int workqueue_online_cpu(unsigned int cpu)
> >
> >  int workqueue_offline_cpu(unsigned int cpu)
> >  {
> > +     struct worker_pool *pool;
> >       struct workqueue_struct *wq;
> > +     int pi;
> >
> >       /* unbinding per-cpu workers should happen on the local CPU */
> >       if (WARN_ON(cpu != smp_processor_id()))
> > @@ -5098,9 +5119,20 @@ int workqueue_offline_cpu(unsigned int cpu)
> >
> >       unbind_workers(cpu);
> >
> > -     /* update NUMA affinity of unbound workqueues */
> >       mutex_lock(&wq_pool_mutex);
> >       cpumask_clear_cpu(cpu, wq_online_cpumask);
> > +
> > +     /* update CPU affinity of workers of unbound pools */
> > +     for_each_pool(pool, pi) {
> > +             mutex_lock(&wq_pool_attach_mutex);
> > +
> > +             if (pool->cpu < 0)
> > +                     update_unbound_workers_cpumask(pool, false, cpu);
> > +
> > +             mutex_unlock(&wq_pool_attach_mutex);
> > +     }
> > +
> > +     /* update NUMA affinity of unbound workqueues */
> >       list_for_each_entry(wq, &workqueues, list)
> >               wq_update_unbound_numa(wq, cpu);
> >       mutex_unlock(&wq_pool_mutex);
> > --
> > 2.19.1.6.gb485710b

