* [PATCH 0/2] locking/qrwlock: Fix interrupt handling problem
@ 2015-06-08 22:20 Waiman Long
  2015-06-08 22:20 ` [PATCH 1/2] locking/qrwlock: Fix bug in interrupt handling code Waiman Long
  2015-06-08 22:20 ` [PATCH 2/2] locking/qrwlock: Don't contend with readers when setting _QW_WAITING Waiman Long
From: Waiman Long @ 2015-06-08 22:20 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnd Bergmann
  Cc: linux-arch, linux-kernel, Scott J Norton, Douglas Hatch,
	Waiman Long

This patch series contains 2 patches on qrwlock. The first one is a
resend of the patch that I sent out a few weeks ago. The second one
optimizes the writer slowpath.

Waiman Long (2):
  locking/qrwlock: Fix bug in interrupt handling code
  locking/qrwlock: Don't contend with readers when setting _QW_WAITING

 include/asm-generic/qrwlock.h |    4 +-
 kernel/locking/qrwlock.c      |   42 +++++++++++++++++++++++++++++++---------
 2 files changed, 34 insertions(+), 12 deletions(-)



* [PATCH 1/2] locking/qrwlock: Fix bug in interrupt handling code
  2015-06-08 22:20 [PATCH 0/2] locking/qrwlock: Fix interrupt handling problem Waiman Long
@ 2015-06-08 22:20 ` Waiman Long
  2015-06-08 22:20 ` [PATCH 2/2] locking/qrwlock: Don't contend with readers when setting _QW_WAITING Waiman Long
From: Waiman Long @ 2015-06-08 22:20 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnd Bergmann
  Cc: linux-arch, linux-kernel, Scott J Norton, Douglas Hatch,
	Waiman Long

The qrwlock is fair in process context, but it becomes unfair in
interrupt context to support use cases like the tasklist_lock.
However, the unfair code path taken in interrupt context has a
problem that may cause deadlock.

The fast path increments the reader count. In interrupt context, a
reader that falls into the slowpath will then wait until the writer
releases the lock. However, if other readers still hold the lock and
the writer is only in the waiting state, the writer can never take
the write lock because that interrupt-context reader has already
incremented the reader count. The result is a deadlock.
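
In timeline form, the scenario just described would look roughly like
this (a sketch; R1, R2 and W are hypothetical contexts):

	/*
	 * R1 (process context)    takes the read lock on the fast path
	 * W  (writer)             sets _QW_WAITING and spins until the
	 *                         reader count drops to zero
	 * R2 (interrupt context)  fast path increments the reader count,
	 *                         sees a writer, enters the slowpath and
	 *                         spins waiting for W to release the lock
	 *
	 * W waits for R2's count increment to go away while R2 waits
	 * for W to unlock: deadlock.
	 */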

This patch fixes the problem by checking the state of the
reader/writer count retrieved in the fast path. If the writer is in
the waiting state, the reader gets the lock immediately and returns.
Otherwise, it waits until the writer releases the lock, as before.
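
A simplified sketch of the resulting reader flow (helpers and constants
as in include/asm-generic/qrwlock.h and kernel/locking/qrwlock.c; the
queued part of the slowpath is elided):

	static inline void queue_read_lock(struct qrwlock *lock)
	{
		u32 cnts;

		cnts = atomic_add_return(_QR_BIAS, &lock->cnts);
		if (likely(!(cnts & _QW_WMASK)))
			return;		/* no writer, lock acquired */

		/* Hand the fast-path snapshot of cnts to the slowpath. */
		queue_read_lock_slowpath(lock, cnts);
	}

	void queue_read_lock_slowpath(struct qrwlock *lock, u32 cnts)
	{
		if (unlikely(in_interrupt())) {
			/*
			 * A waiting (not yet locked) writer cannot take
			 * the lock while our count increment is in place,
			 * so the interrupt-context reader returns at once.
			 */
			if ((cnts & _QW_WMASK) != _QW_LOCKED)
				return;
			rspin_until_writer_unlock(lock, cnts);
			return;
		}

		/* Process context: drop the bias and wait in the queue. */
		atomic_sub(_QR_BIAS, &lock->cnts);
		/* ... queued slowpath as before ... */
	}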

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 include/asm-generic/qrwlock.h |    4 ++--
 kernel/locking/qrwlock.c      |   14 ++++++++------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 6383d54..865d021 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -36,7 +36,7 @@
 /*
  * External function declarations
  */
-extern void queue_read_lock_slowpath(struct qrwlock *lock);
+extern void queue_read_lock_slowpath(struct qrwlock *lock, u32 cnts);
 extern void queue_write_lock_slowpath(struct qrwlock *lock);
 
 /**
@@ -105,7 +105,7 @@ static inline void queue_read_lock(struct qrwlock *lock)
 		return;
 
 	/* The slowpath will decrement the reader count, if necessary. */
-	queue_read_lock_slowpath(lock);
+	queue_read_lock_slowpath(lock, cnts);
 }
 
 /**
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index 00c12bb..d7d7557 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -43,22 +43,24 @@ rspin_until_writer_unlock(struct qrwlock *lock, u32 cnts)
  * queue_read_lock_slowpath - acquire read lock of a queue rwlock
  * @lock: Pointer to queue rwlock structure
  */
-void queue_read_lock_slowpath(struct qrwlock *lock)
+void queue_read_lock_slowpath(struct qrwlock *lock, u32 cnts)
 {
-	u32 cnts;
-
 	/*
 	 * Readers come here when they cannot get the lock without waiting
 	 */
 	if (unlikely(in_interrupt())) {
 		/*
-		 * Readers in interrupt context will spin until the lock is
-		 * available without waiting in the queue.
+		 * Readers in interrupt context will get the lock immediately
+		 * if the writer is just waiting (not holding the lock yet)
+		 * or they will spin until the lock is available without
+		 * waiting in the queue.
 		 */
-		cnts = smp_load_acquire((u32 *)&lock->cnts);
+		if ((cnts & _QW_WMASK) != _QW_LOCKED)
+			return;
 		rspin_until_writer_unlock(lock, cnts);
 		return;
 	}
+
 	atomic_sub(_QR_BIAS, &lock->cnts);
 
 	/*
-- 
1.7.1



* [PATCH 2/2] locking/qrwlock: Don't contend with readers when setting _QW_WAITING
  2015-06-08 22:20 [PATCH 0/2] locking/qrwlock: Fix interrupt handling problem Waiman Long
  2015-06-08 22:20 ` [PATCH 1/2] locking/qrwlock: Fix bug in interrupt handling code Waiman Long
@ 2015-06-08 22:20 ` Waiman Long
  2015-06-09 12:04   ` Peter Zijlstra
From: Waiman Long @ 2015-06-08 22:20 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnd Bergmann
  Cc: linux-arch, linux-kernel, Scott J Norton, Douglas Hatch,
	Waiman Long

The current cmpxchg() loop that sets the _QW_WAITING flag for writers
in queue_write_lock_slowpath() contends with incoming readers, possibly
causing extra, wasteful cmpxchg() operations. This patch changes the
code to do a byte-wide cmpxchg() on the writer byte only, eliminating
contention with new readers.
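
In outline, the loop changes from a full-word cmpxchg() to a byte-wide
one (a before/after sketch matching the diff below):

	/* Before: every arriving reader changes lock->cnts, which can
	 * make the full-word cmpxchg() fail and force another pass. */
	for (;;) {
		cnts = atomic_read(&lock->cnts);
		if (!(cnts & _QW_WMASK) &&
		    (atomic_cmpxchg(&lock->cnts, cnts,
				    cnts | _QW_WAITING) == cnts))
			break;

		cpu_relax_lowlatency();
	}

	/* After: readers only modify the reader-count bytes, so a byte
	 * cmpxchg() on the writer-mode byte is undisturbed by them. */
	for (;;) {
		struct __qrwlock *l = (struct __qrwlock *)lock;

		if (!READ_ONCE(l->wmode) &&
		    (cmpxchg(&l->wmode, 0, _QW_WAITING) == 0))
			break;

		cpu_relax_lowlatency();
	}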

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 kernel/locking/qrwlock.c |   28 ++++++++++++++++++++++++----
 1 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index d7d7557..559198a 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -22,6 +22,26 @@
 #include <linux/hardirq.h>
 #include <asm/qrwlock.h>
 
+/*
+ * This internal data structure is used for optimizing access to some of
+ * the subfields within the atomic_t cnts.
+ */
+struct __qrwlock {
+	union {
+		atomic_t cnts;
+		struct {
+#ifdef __LITTLE_ENDIAN
+			u8 wmode;	/* Writer mode   */
+			u8 rcnts[3];	/* Reader counts */
+#else
+			u8 rcnts[3];	/* Reader counts */
+			u8 wmode;	/* Writer mode   */
+#endif
+		};
+	};
+	arch_spinlock_t	lock;
+};
+
 /**
  * rspin_until_writer_unlock - inc reader count & spin until writer is gone
  * @lock  : Pointer to queue rwlock structure
@@ -109,10 +129,10 @@ void queue_write_lock_slowpath(struct qrwlock *lock)
 	 * or wait for a previous writer to go away.
 	 */
 	for (;;) {
-		cnts = atomic_read(&lock->cnts);
-		if (!(cnts & _QW_WMASK) &&
-		    (atomic_cmpxchg(&lock->cnts, cnts,
-				    cnts | _QW_WAITING) == cnts))
+		struct __qrwlock *l = (struct __qrwlock *)lock;
+
+		if (!READ_ONCE(l->wmode) &&
+		   (cmpxchg(&l->wmode, 0, _QW_WAITING) == 0))
 			break;
 
 		cpu_relax_lowlatency();
-- 
1.7.1
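
One note on the __qrwlock union above: the endianness #ifdef keeps
wmode overlaid on the least-significant byte of cnts, which is where
the _QW_* writer bits live, on both little- and big-endian machines.
A hypothetical sanity check (not part of the patch) could read:

	/* Hypothetical check: the writer-mode byte must alias the
	 * low-order byte of cnts, since _QW_LOCKED and _QW_WAITING
	 * occupy bits 0-7.
	 */
	struct __qrwlock q;

	atomic_set(&q.cnts, _QW_WAITING);
	WARN_ON(q.wmode != _QW_WAITING);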



* Re: [PATCH 2/2] locking/qrwlock: Don't contend with readers when setting _QW_WAITING
  2015-06-08 22:20 ` [PATCH 2/2] locking/qrwlock: Don't contend with readers when setting _QW_WAITING Waiman Long
@ 2015-06-09 12:04   ` Peter Zijlstra
  2015-06-09 15:23     ` Waiman Long
From: Peter Zijlstra @ 2015-06-09 12:04 UTC
  To: Waiman Long
  Cc: Ingo Molnar, Arnd Bergmann, linux-arch, linux-kernel,
	Scott J Norton, Douglas Hatch

On Mon, Jun 08, 2015 at 06:20:44PM -0400, Waiman Long wrote:
> The current cmpxchg() loop that sets the _QW_WAITING flag for writers
> in queue_write_lock_slowpath() contends with incoming readers, possibly
> causing extra, wasteful cmpxchg() operations. This patch changes the
> code to do a byte-wide cmpxchg() on the writer byte only, eliminating
> contention with new readers.

This is very narrow; would not the main cost still be the cacheline
transfers?

Do you have any numbers to back this up? I would feel much better about
this if there are real numbers attached.


* Re: [PATCH 2/2] locking/qrwlock: Don't contend with readers when setting _QW_WAITING
  2015-06-09 12:04   ` Peter Zijlstra
@ 2015-06-09 15:23     ` Waiman Long
From: Waiman Long @ 2015-06-09 15:23 UTC
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnd Bergmann, linux-arch, linux-kernel,
	Scott J Norton, Douglas Hatch

On 06/09/2015 08:04 AM, Peter Zijlstra wrote:
> On Mon, Jun 08, 2015 at 06:20:44PM -0400, Waiman Long wrote:
>> The current cmpxchg() loop that sets the _QW_WAITING flag for writers
>> in queue_write_lock_slowpath() contends with incoming readers, possibly
>> causing extra, wasteful cmpxchg() operations. This patch changes the
>> code to do a byte-wide cmpxchg() on the writer byte only, eliminating
>> contention with new readers.
> This is very narrow; would not the main cost still be the cacheline
> transfers?
>
> Do you have any numbers to back this up? I would feel much better about
> this if there are real numbers attached.

I have just sent out a v2 patch with microbenchmark data for the 2nd
patch. The extra cmpxchg() caused by reader contention should cost
about the same as a cacheline miss. The performance gain therefore
depends on how often this kind of reader contention happens.

Regards,
Longman

