From: Gabriele Monaco <gmonaco@redhat.com>
To: linux-kernel@vger.kernel.org,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
Shuah Khan <shuah@kernel.org>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Subject: [PATCH v6 0/3] sched: Restructure task_mm_cid_work for predictability
Date: Mon, 10 Feb 2025 16:32:49 +0100 [thread overview]
Message-ID: <20250210153253.460471-1-gmonaco@redhat.com> (raw)
This patchset moves the task_mm_cid_work to a preemptible and migratable
context. This reduces the impact of this task to the scheduling latency
of real time tasks.
The change makes the recurrence of the task a bit more predictable.
We also add optimisation and fixes to make sure the task_mm_cid_work
works as intended.
The behaviour causing latency was introduced in commit 223baf9d17f2
("sched: Fix performance regression introduced by mm_cid") which
introduced a task work tied to the scheduler tick.
That approach presents two possible issues:
* the task work runs before returning to user and causes, in fact, a
scheduling latency (with order of magnitude significant in PREEMPT_RT)
* periodic tasks with short runtime are less likely to run during the
tick, hence they might not run the task work at all
Patch 1 allows the mm_cids to be actually compacted when a process
reduces its number of threads, which was not the case since the same
mm_cids were reused to improve cache locality, more details in [4].
Patch 2 contains the main changes, removing the task_work on the
scheduler tick and using a delayed_work instead.
Additionally, we terminate the call immediately if we see that no mm_cid
is actually active, which could happen on processes sleeping for long
time or which exited but whose mm has not been freed yet.
Patch 3 adds a selftest to validate the functionality of the
task_mm_cid_work (i.e. to compact the mm_cids). The test fails if patch
1 is not applied and is flaky without patch 2. We expect it to always
pass with the entire patchset applied.
Changes since V5:
* Punctuation
Changes since V4 [1]:
* Fixes on the selftest
* Polished memory allocation and cleanup
* Handle the test failure in main
Changes since V3 [2]:
* Fixes on the selftest
* Minor style issues in comments and indentation
* Use of perror where possible
* Add a barrier to align threads execution
* Improve test failure and error handling
Changes since V2 [3]:
* Change the order of the patches
* Merge patches changing the main delayed_work logic
* Improved self-test to spawn 1 less thread and use the main one instead
Changes since V1 [4]:
* Re-arm the delayed_work at each invocation
* Cancel the work synchronously at mmdrop
* Remove next scan fields and completely rely on the delayed_work
* Shrink mm_cid allocation with nr thread/affinity (Mathieu Desnoyers)
* Add self test
Overhead comparison in [4]
[1] - https://lore.kernel.org/lkml/20250113074231.61638-4-gmonaco@redhat.com
[2] - https://lore.kernel.org/lkml/20241216130909.240042-1-gmonaco@redhat.com
[3] - https://lore.kernel.org/lkml/20241213095407.271357-1-gmonaco@redhat.com
[4] - https://lore.kernel.org/lkml/20241205083110.180134-2-gmonaco@redhat.com
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Peter Zijlstra <peterz@infradead.org>
To: Ingo Molnar <mingo@kernel.org>
To: Paul E. McKenney <paulmck@kernel.org>
To: Shuah Khan <shuah@kernel.org>
Gabriele Monaco (2):
sched: Move task_mm_cid_work to mm delayed work
rseq/selftests: Add test for mm_cid compaction
Mathieu Desnoyers (1):
sched: Compact RSEQ concurrency IDs with reduced threads and affinity
include/linux/mm_types.h | 23 +-
include/linux/sched.h | 1 -
kernel/sched/core.c | 66 +-----
kernel/sched/sched.h | 32 ++-
tools/testing/selftests/rseq/.gitignore | 1 +
tools/testing/selftests/rseq/Makefile | 2 +-
.../selftests/rseq/mm_cid_compaction_test.c | 200 ++++++++++++++++++
7 files changed, 246 insertions(+), 79 deletions(-)
create mode 100644 tools/testing/selftests/rseq/mm_cid_compaction_test.c
base-commit: a64dcfb451e254085a7daee5fe51bf22959d52d3
--
2.48.1
next reply other threads:[~2025-02-10 15:33 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-10 15:32 Gabriele Monaco [this message]
2025-02-10 15:32 ` [PATCH v6 1/3] sched: Compact RSEQ concurrency IDs with reduced threads and affinity Gabriele Monaco
2025-02-13 14:56 ` Mathieu Desnoyers
2025-02-18 7:53 ` Peter Zijlstra
2025-02-20 15:08 ` [tip: sched/urgent] " tip-bot2 for Mathieu Desnoyers
2025-02-10 15:32 ` [PATCH v6 2/3] sched: Move task_mm_cid_work to mm delayed work Gabriele Monaco
2025-02-13 6:52 ` kernel test robot
2025-02-13 13:25 ` Gabriele Monaco
2025-02-13 13:55 ` Mathieu Desnoyers
2025-02-13 14:37 ` Gabriele Monaco
2025-02-13 14:52 ` Mathieu Desnoyers
2025-02-13 14:54 ` Gabriele Monaco
2025-02-13 17:31 ` Mathieu Desnoyers
2025-02-14 6:44 ` Gabriele Monaco
2025-02-10 15:32 ` [PATCH v6 3/3] rseq/selftests: Add test for mm_cid compaction Gabriele Monaco
2025-02-10 16:09 ` [PATCH v6 0/3] sched: Restructure task_mm_cid_work for predictability Mathieu Desnoyers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250210153253.460471-1-gmonaco@redhat.com \
--to=gmonaco@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@kernel.org \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=shuah@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.