From: "Jarkko Sakkinen" <jarkko@kernel.org>
To: "Maria Yu" <quic_aiquny@quicinc.com>, <ebiederm@xmission.com>
Cc: <kernel@quicinc.com>, <quic_pkondeti@quicinc.com>,
<keescook@chromium.or>, <viro@zeniv.linux.org.uk>,
<brauner@kernel.org>, <oleg@redhat.com>, <dhowells@redhat.com>,
<paul@paul-moore.com>, <jmorris@namei.org>, <serge@hallyn.com>,
<linux-mm@kvack.org>, <linux-fsdevel@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <keyrings@vger.kernel.org>,
<linux-security-module@vger.kernel.org>,
<linux-arm-msm@vger.kernel.org>
Subject: Re: [PATCH] kernel: Introduce a write lock/unlock wrapper for tasklist_lock
Date: Wed, 03 Jan 2024 16:04:54 +0200 [thread overview]
Message-ID: <CY54MOETXVFI.1102C6BQTO36@suppilovahvero> (raw)
In-Reply-To: <20231225081932.17752-1-quic_aiquny@quicinc.com>
On Mon Dec 25, 2023 at 10:19 AM EET, Maria Yu wrote:
> As a rwlock for tasklist_lock, there are multiple scenarios to acquire
> read lock which write lock needed to be waiting for.
> In freeze_process/thaw_processes it can take about 200+ms for holding read
> lock of tasklist_lock by walking and freezing/thawing tasks in commercial
> devices. And write_lock_irq will have preempt disabled and local irq
> disabled to spin until the tasklist_lock can be acquired. This leading to
> a bad responsive performance of current system.
> Take an example:
> 1. cpu0 is holding read lock of tasklist_lock to thaw_processes.
> 2. cpu1 is waiting write lock of tasklist_lock to exec a new thread with
> preempt_disabled and local irq disabled.
> 3. cpu2 is waiting write lock of tasklist_lock to do_exit with
> preempt_disabled and local irq disabled.
> 4. cpu3 is waiting write lock of tasklist_lock to do_exit with
> preempt_disabled and local irq disabled.
> So introduce a write lock/unlock wrapper for tasklist_lock specificly.
> The current taskslist_lock writers all have write_lock_irq to hold
> tasklist_lock, and write_unlock_irq to release tasklist_lock, that means
> the writers are not suitable or workable to wait on tasklist_lock in irq
> disabled scenarios. So the write lock/unlock wrapper here only follow the
> current design of directly use local_irq_disable and local_irq_enable,
> and not take already irq disabled writer callers into account.
> Use write_trylock in the loop and enabled irq for cpu to repsond if lock
> cannot be taken.
>
> Signed-off-by: Maria Yu <quic_aiquny@quicinc.com>
> ---
> fs/exec.c | 10 +++++-----
> include/linux/sched/task.h | 29 +++++++++++++++++++++++++++++
> kernel/exit.c | 16 ++++++++--------
> kernel/fork.c | 6 +++---
> kernel/ptrace.c | 12 ++++++------
> kernel/sys.c | 8 ++++----
> security/keys/keyctl.c | 4 ++--
> 7 files changed, 57 insertions(+), 28 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 4aa19b24f281..030eef6852eb 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1086,7 +1086,7 @@ static int de_thread(struct task_struct *tsk)
>
> for (;;) {
> cgroup_threadgroup_change_begin(tsk);
> - write_lock_irq(&tasklist_lock);
> + write_lock_tasklist_lock();
> /*
> * Do this under tasklist_lock to ensure that
> * exit_notify() can't miss ->group_exec_task
> @@ -1095,7 +1095,7 @@ static int de_thread(struct task_struct *tsk)
> if (likely(leader->exit_state))
> break;
> __set_current_state(TASK_KILLABLE);
> - write_unlock_irq(&tasklist_lock);
> + write_unlock_tasklist_lock();
> cgroup_threadgroup_change_end(tsk);
> schedule();
> if (__fatal_signal_pending(tsk))
> @@ -1150,7 +1150,7 @@ static int de_thread(struct task_struct *tsk)
> */
> if (unlikely(leader->ptrace))
> __wake_up_parent(leader, leader->parent);
> - write_unlock_irq(&tasklist_lock);
> + write_unlock_tasklist_lock();
> cgroup_threadgroup_change_end(tsk);
>
> release_task(leader);
> @@ -1198,13 +1198,13 @@ static int unshare_sighand(struct task_struct *me)
>
> refcount_set(&newsighand->count, 1);
>
> - write_lock_irq(&tasklist_lock);
> + write_lock_tasklist_lock();
> spin_lock(&oldsighand->siglock);
> memcpy(newsighand->action, oldsighand->action,
> sizeof(newsighand->action));
> rcu_assign_pointer(me->sighand, newsighand);
> spin_unlock(&oldsighand->siglock);
> - write_unlock_irq(&tasklist_lock);
> + write_unlock_tasklist_lock();
>
> __cleanup_sighand(oldsighand);
> }
> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> index a23af225c898..6f69d9a3c868 100644
> --- a/include/linux/sched/task.h
> +++ b/include/linux/sched/task.h
> @@ -50,6 +50,35 @@ struct kernel_clone_args {
> * a separate lock).
> */
> extern rwlock_t tasklist_lock;
> +
> +/*
> + * Tasklist_lock is a special lock, it takes a good amount of time of
> + * taskslist_lock readers to finish, and the pure write_irq_lock api
> + * will do local_irq_disable at the very first, and put the current cpu
> + * waiting for the lock while is non-responsive for interrupts.
> + *
> + * The current taskslist_lock writers all have write_lock_irq to hold
> + * tasklist_lock, and write_unlock_irq to release tasklist_lock, that
> + * means the writers are not suitable or workable to wait on
> + * tasklist_lock in irq disabled scenarios. So the write lock/unlock
> + * wrapper here only follow the current design of directly use
> + * local_irq_disable and local_irq_enable.
> + */
> +static inline void write_lock_tasklist_lock(void)
> +{
> + while (1) {
> + local_irq_disable();
> + if (write_trylock(&tasklist_lock))
> + break;
> + local_irq_enable();
> + cpu_relax();
> + }
Maybe:
local_irq_disable();
while (!write_trylock(&tasklist_lock)) {
local_irq_enable();
cpu_relax();
local_irq_disable();
}
BR, Jarkko
next prev parent reply other threads:[~2024-01-03 14:04 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-25 8:19 [PATCH] kernel: Introduce a write lock/unlock wrapper for tasklist_lock Maria Yu
2023-12-25 8:26 ` Aiqun Yu (Maria)
2024-01-03 14:04 ` Jarkko Sakkinen [this message]
-- strict thread matches above, loose matches on Subject: below --
2023-12-13 10:17 Maria Yu
2023-12-13 16:22 ` Matthew Wilcox
2023-12-13 18:27 ` Eric W. Biederman
2023-12-15 5:52 ` Aiqun Yu (Maria)
2023-12-28 22:20 ` Matthew Wilcox
2024-01-02 2:19 ` Aiqun Yu (Maria)
2024-01-02 9:14 ` Matthew Wilcox
2024-01-03 2:58 ` Aiqun Yu (Maria)
2024-01-03 18:18 ` Matthew Wilcox
2024-01-04 0:46 ` Aiqun Yu (Maria)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CY54MOETXVFI.1102C6BQTO36@suppilovahvero \
--to=jarkko@kernel.org \
--cc=brauner@kernel.org \
--cc=dhowells@redhat.com \
--cc=ebiederm@xmission.com \
--cc=jmorris@namei.org \
--cc=keescook@chromium.or \
--cc=kernel@quicinc.com \
--cc=keyrings@vger.kernel.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-security-module@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=paul@paul-moore.com \
--cc=quic_aiquny@quicinc.com \
--cc=quic_pkondeti@quicinc.com \
--cc=serge@hallyn.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).