All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com,
	mingo@kernel.org, jiangshanlai@gmail.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org,
	rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
	fweisbec@gmail.com, oleg@redhat.com, joel@joelfernandes.org,
	"Paul E. McKenney" <paulmck@kernel.org>
Subject: [PATCH tip/core/rcu 14/26] rcutorture: Judge RCU priority boosting on grace periods, not callbacks
Date: Tue, 11 May 2021 16:12:11 -0700	[thread overview]
Message-ID: <20210511231223.2895398-14-paulmck@kernel.org> (raw)
In-Reply-To: <20210511231149.GA2895263@paulmck-ThinkPad-P17-Gen-1>

Currently, rcutorture's testing of RCU priority boosting insists not
only that grace periods complete, but also that callbacks be invoked.
Although this is in fact what the user would want, ensuring that there
is sufficient CPU bandwidth devoted to callback execution is in fact
the user's responsibility.  One could argue that rcutorture can take on
that responsibility, which is true in theory.  But in practice, ensuring
sufficient CPU bandwidth to ksoftirqd, any rcuc kthreads, and any rcuo
kthreads is not particularly consistent with rcutorture's main job,
that of stress-testing RCU.  In addition, if the system administrator
(say) makes very poor choices when pinning rcuo kthreads and then runs
rcutorture, there really isn't much rcutorture can do.

Besides, RCU priority boosting only boosts lagging readers, not all the
machinery required to invoke callbacks in a timely fashion.

This commit therefore switches rcutorture's evaluation of RCU priority
boosting from callback execution to grace-period completion by using
the new start_poll_synchronize_rcu() and poll_state_synchronize_rcu()
functions.  When rcutorture is built in (as in when there is no innocent
workload to inconvenience), the ksoftirqd ktheads are boosted to real-time
priority 2 in order to allow timeouts to work properly in the face of
rcutorture's testing of RCU priority boosting.

Indeed, it is not as easy as it looks to create a reliable test of RCU
priority boosting without destroying the rest of the kernel!

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/rcutorture.c | 111 ++++++++++++++++++----------------------
 1 file changed, 51 insertions(+), 60 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index bf488f957948..06d08f4f3e52 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -245,12 +245,6 @@ static const char *rcu_torture_writer_state_getname(void)
 	return rcu_torture_writer_state_names[i];
 }
 
-#if defined(CONFIG_RCU_BOOST) && defined(CONFIG_PREEMPT_RT)
-# define rcu_can_boost() 1
-#else
-# define rcu_can_boost() 0
-#endif
-
 #ifdef CONFIG_RCU_TRACE
 static u64 notrace rcu_trace_clock_local(void)
 {
@@ -511,7 +505,7 @@ static struct rcu_torture_ops rcu_ops = {
 	.gp_kthread_dbg	= show_rcu_gp_kthreads,
 	.stall_dur	= rcu_jiffies_till_stall_check,
 	.irq_capable	= 1,
-	.can_boost	= rcu_can_boost(),
+	.can_boost	= IS_ENABLED(CONFIG_RCU_BOOST),
 	.extendables	= RCUTORTURE_MAX_EXTEND,
 	.name		= "rcu"
 };
@@ -891,25 +885,11 @@ static unsigned long rcutorture_seq_diff(unsigned long new, unsigned long old)
 
 /*
  * RCU torture priority-boost testing.  Runs one real-time thread per
- * CPU for moderate bursts, repeatedly registering RCU callbacks and
- * spinning waiting for them to be invoked.  If a given callback takes
- * too long to be invoked, we assume that priority inversion has occurred.
+ * CPU for moderate bursts, repeatedly starting grace periods and waiting
+ * for them to complete.  If a given grace period takes too long, we assume
+ * that priority inversion has occurred.
  */
 
-struct rcu_boost_inflight {
-	struct rcu_head rcu;
-	int inflight;
-};
-
-static void rcu_torture_boost_cb(struct rcu_head *head)
-{
-	struct rcu_boost_inflight *rbip =
-		container_of(head, struct rcu_boost_inflight, rcu);
-
-	/* Ensure RCU-core accesses precede clearing ->inflight */
-	smp_store_release(&rbip->inflight, 0);
-}
-
 static int old_rt_runtime = -1;
 
 static void rcu_torture_disable_rt_throttle(void)
@@ -936,15 +916,18 @@ static void rcu_torture_enable_rt_throttle(void)
 	old_rt_runtime = -1;
 }
 
-static bool rcu_torture_boost_failed(unsigned long start, unsigned long end)
+static bool rcu_torture_boost_failed(unsigned long gp_state, unsigned long start, unsigned long end)
 {
 	static int dbg_done;
 
 	if (end - start > test_boost_duration * HZ - HZ / 2) {
 		VERBOSE_TOROUT_STRING("rcu_torture_boost boosting failed");
 		n_rcu_torture_boost_failure++;
-		if (!xchg(&dbg_done, 1) && cur_ops->gp_kthread_dbg)
+		if (!xchg(&dbg_done, 1) && cur_ops->gp_kthread_dbg) {
+			pr_info("Boost inversion thread ->rt_priority %u gp_state %lu jiffies %lu\n",
+				current->rt_priority, gp_state, end - start);
 			cur_ops->gp_kthread_dbg();
+		}
 
 		return true; /* failed */
 	}
@@ -954,21 +937,20 @@ static bool rcu_torture_boost_failed(unsigned long start, unsigned long end)
 
 static int rcu_torture_boost(void *arg)
 {
-	unsigned long call_rcu_time;
 	unsigned long endtime;
+	unsigned long gp_state;
+	unsigned long gp_state_time;
 	unsigned long oldstarttime;
-	struct rcu_boost_inflight rbi = { .inflight = 0 };
 
 	VERBOSE_TOROUT_STRING("rcu_torture_boost started");
 
 	/* Set real-time priority. */
 	sched_set_fifo_low(current);
 
-	init_rcu_head_on_stack(&rbi.rcu);
 	/* Each pass through the following loop does one boost-test cycle. */
 	do {
 		bool failed = false; // Test failed already in this test interval
-		bool firsttime = true;
+		bool gp_initiated = false;
 
 		/* Increment n_rcu_torture_boosts once per boost-test */
 		while (!kthread_should_stop()) {
@@ -992,33 +974,33 @@ static int rcu_torture_boost(void *arg)
 				goto checkwait;
 		}
 
-		/* Do one boost-test interval. */
+		// Do one boost-test interval.
 		endtime = oldstarttime + test_boost_duration * HZ;
 		while (time_before(jiffies, endtime)) {
-			/* If we don't have a callback in flight, post one. */
-			if (!smp_load_acquire(&rbi.inflight)) {
-				/* RCU core before ->inflight = 1. */
-				smp_store_release(&rbi.inflight, 1);
-				cur_ops->call(&rbi.rcu, rcu_torture_boost_cb);
-				/* Check if the boost test failed */
-				if (!firsttime && !failed)
-					failed = rcu_torture_boost_failed(call_rcu_time, jiffies);
-				call_rcu_time = jiffies;
-				firsttime = false;
+			// Has current GP gone too long?
+			if (gp_initiated && !failed && !cur_ops->poll_gp_state(gp_state))
+				failed = rcu_torture_boost_failed(gp_state, gp_state_time, jiffies);
+			// If we don't have a grace period in flight, start one.
+			if (!gp_initiated || cur_ops->poll_gp_state(gp_state)) {
+				gp_state = cur_ops->start_gp_poll();
+				gp_initiated = true;
+				gp_state_time = jiffies;
 			}
-			if (stutter_wait("rcu_torture_boost"))
+			if (stutter_wait("rcu_torture_boost")) {
 				sched_set_fifo_low(current);
+				// If the grace period already ended,
+				// we don't know when that happened, so
+				// start over.
+				if (cur_ops->poll_gp_state(gp_state))
+					gp_initiated = false;
+			}
 			if (torture_must_stop())
 				goto checkwait;
 		}
 
-		/*
-		 * If boost never happened, then inflight will always be 1, in
-		 * this case the boost check would never happen in the above
-		 * loop so do another one here.
-		 */
-		if (!firsttime && !failed && smp_load_acquire(&rbi.inflight))
-			rcu_torture_boost_failed(call_rcu_time, jiffies);
+		// In case the grace period extended beyond the end of the loop.
+		if (gp_initiated && !failed && !cur_ops->poll_gp_state(gp_state))
+			rcu_torture_boost_failed(gp_state, gp_state_time, jiffies);
 
 		/*
 		 * Set the start time of the next test interval.
@@ -1027,11 +1009,9 @@ static int rcu_torture_boost(void *arg)
 		 * interval.  Besides, we are running at RT priority,
 		 * so delays should be relatively rare.
 		 */
-		while (oldstarttime == boost_starttime &&
-		       !kthread_should_stop()) {
+		while (oldstarttime == boost_starttime && !kthread_should_stop()) {
 			if (mutex_trylock(&boost_mutex)) {
-				boost_starttime = jiffies +
-						  test_boost_interval * HZ;
+				boost_starttime = jiffies + test_boost_interval * HZ;
 				mutex_unlock(&boost_mutex);
 				break;
 			}
@@ -1043,15 +1023,11 @@ checkwait:	if (stutter_wait("rcu_torture_boost"))
 			sched_set_fifo_low(current);
 	} while (!torture_must_stop());
 
-	while (smp_load_acquire(&rbi.inflight))
-		schedule_timeout_uninterruptible(1); // rcu_barrier() deadlocks.
-
 	/* Clean up and exit. */
-	while (!kthread_should_stop() || smp_load_acquire(&rbi.inflight)) {
+	while (!kthread_should_stop()) {
 		torture_shutdown_absorb("rcu_torture_boost");
 		schedule_timeout_uninterruptible(1);
 	}
-	destroy_rcu_head_on_stack(&rbi.rcu);
 	torture_kthread_stopping("rcu_torture_boost");
 	return 0;
 }
@@ -2643,7 +2619,7 @@ static bool rcu_torture_can_boost(void)
 
 	if (!(test_boost == 1 && cur_ops->can_boost) && test_boost != 2)
 		return false;
-	if (!cur_ops->call)
+	if (!cur_ops->start_gp_poll || !cur_ops->poll_gp_state)
 		return false;
 
 	prio = rcu_get_gp_kthreads_prio();
@@ -2651,7 +2627,7 @@ static bool rcu_torture_can_boost(void)
 		return false;
 
 	if (prio < 2) {
-		if (boost_warn_once  == 1)
+		if (boost_warn_once == 1)
 			return false;
 
 		pr_alert("%s: WARN: RCU kthread priority too low to test boosting.  Skipping RCU boost test. Try passing rcutree.kthread_prio > 1 on the kernel command line.\n", KBUILD_MODNAME);
@@ -3129,6 +3105,21 @@ rcu_torture_init(void)
 		if (firsterr < 0)
 			goto unwind;
 		rcutor_hp = firsterr;
+
+		// Testing RCU priority boosting requires rcutorture do
+		// some serious abuse.  Counter this by running ksoftirqd
+		// at higher priority.
+		if (IS_BUILTIN(CONFIG_RCU_TORTURE_TEST)) {
+			for_each_online_cpu(cpu) {
+				struct sched_param sp;
+				struct task_struct *t;
+
+				t = per_cpu(ksoftirqd, cpu);
+				WARN_ON_ONCE(!t);
+				sp.sched_priority = 2;
+				sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+			}
+		}
 	}
 	shutdown_jiffies = jiffies + shutdown_secs * HZ;
 	firsterr = torture_shutdown_init(shutdown_secs, rcu_torture_cleanup);
-- 
2.31.1.189.g2e36527f23


  parent reply	other threads:[~2021-05-11 23:13 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-11 23:11 [PATCH tip/core/rcu 0/26] Torture-test updates for v5.14 Paul E. McKenney
2021-05-11 23:11 ` [PATCH tip/core/rcu 01/26] torture: Fix remaining erroneous torture.sh instance of $* Paul E. McKenney
2021-05-11 23:11 ` [PATCH tip/core/rcu 02/26] torture: Add "scenarios" option to kvm.sh --dryrun parameter Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 03/26] torture: Make kvm-again.sh use "scenarios" rather than "batches" file Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 04/26] refscale: Allow CPU hotplug to be enabled Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 05/26] rcuscale: " Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 06/26] torture: Add kvm-remote.sh script for distributed rcutorture test runs Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 07/26] refscale: Add acqrel, lock, and lock-irq Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 08/26] rcutorture: Abstract read-lock-held checks Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 09/26] torture: Fix grace-period rate output Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 10/26] torture: Abstract end-of-run summary Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 11/26] torture: Make kvm.sh use abstracted kvm-end-run-stats.sh Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 12/26] torture: Make the build machine control N in "make -jN" Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 13/26] torture: Make kvm-find-errors.sh account for kvm-remote.sh Paul E. McKenney
2021-05-11 23:12 ` Paul E. McKenney [this message]
2021-05-11 23:12 ` [PATCH tip/core/rcu 15/26] torture: Correctly fetch number of CPUs for non-English languages Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 16/26] torture: Set kvm.sh language to English Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 17/26] rcutorture: Delay-based false positives for RCU priority boosting tests Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 18/26] rcutorture: Consolidate rcu_torture_boost() timing and statistics Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 19/26] rcutorture: Make rcu_torture_boost_failed() check for GP end Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 20/26] rcutorture: Add BUSTED-BOOST to test RCU priority boosting tests Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 21/26] rcutorture: Forgive RCU boost failures when CPUs don't pass through QS Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 22/26] rcutorture: Don't count CPU-stalled time against priority boosting Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 23/26] torture: Make kvm-remote.sh account for network failure in pathname checks Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 24/26] torture: Don't cap remote runs by build-system number of CPUs Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 25/26] rcutorture: Move mem_dump_obj() tests into separate function Paul E. McKenney
2021-05-11 23:12 ` [PATCH tip/core/rcu 26/26] rcu: Don't penalize priority boosting when there is nothing to boost Paul E. McKenney
2021-05-11 23:23 ` [PATCH tip/core/rcu 01/10] kcsan: Add pointer to access-marking.txt to data_race() bullet Paul E. McKenney
2021-05-13 10:47   ` Akira Yokosawa
2021-05-13 10:53     ` Marco Elver
2021-05-13 17:50       ` Paul E. McKenney
2021-05-11 23:23 ` [PATCH tip/core/rcu 02/10] kcsan: Simplify value change detection Paul E. McKenney
2021-05-11 23:23 ` [PATCH tip/core/rcu 03/10] kcsan: Distinguish kcsan_report() calls Paul E. McKenney
2021-05-11 23:23 ` [PATCH tip/core/rcu 04/10] kcsan: Refactor passing watchpoint/other_info Paul E. McKenney
2021-05-11 23:23 ` [PATCH tip/core/rcu 05/10] kcsan: Fold panic() call into print_report() Paul E. McKenney
2021-05-11 23:23 ` [PATCH tip/core/rcu 06/10] kcsan: Refactor access_info initialization Paul E. McKenney
2021-05-11 23:23 ` [PATCH tip/core/rcu 07/10] kcsan: Remove reporting indirection Paul E. McKenney
2021-05-11 23:23 ` [PATCH tip/core/rcu 08/10] kcsan: Remove kcsan_report_type Paul E. McKenney
2021-05-11 23:24 ` [PATCH tip/core/rcu 09/10] kcsan: Report observed value changes Paul E. McKenney
2021-05-11 23:24 ` [PATCH tip/core/rcu 10/10] kcsan: Document "value changed" line Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210511231223.2895398-14-paulmck@kernel.org \
    --to=paulmck@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.