From: Waiman Long <Waiman.Long@hpe.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
Scott J Norton <scott.norton@hp.com>,
Douglas Hatch <doug.hatch@hp.com>,
Davidlohr Bueso <dave@stgolabs.net>,
Waiman Long <Waiman.Long@hpe.com>
Subject: [PATCH v6 0/6] locking/qspinlock: Enhance pvqspinlock performance
Date: Fri, 11 Sep 2015 14:37:32 -0400 [thread overview]
Message-ID: <1441996658-62854-1-git-send-email-Waiman.Long@hpe.com> (raw)
v5->v6:
- Added a new patch 1 to relax the cmpxchg and xchg operations in
the native code path to reduce performance overhead on non-x86
architectures.
- Updated the unconditional PV kick patch as suggested by PeterZ.
- Added a new patch to allow one lock stealing attempt at slowpath
entry point to reduce performance penalty due to lock waiter
preemption.
- Removed the pending bit and kick-ahead patches as they didn't show
any noticeable performance improvement on top of the lock stealing
patch.
- Simplified the adaptive spinning patch as the lock stealing patch
allows more aggressive pv_wait() without much performance penalty
in non-overcommitted VMs.
v4->v5:
- Rebased the patch to the latest tip tree.
- Corrected the comments and commit log for patch 1.
- Removed the v4 patch 5 as PV kick deferment is no longer needed with
the new tip tree.
- Simplified the adaptive spinning patch (patch 6) & improved its
performance a bit further.
- Re-ran the benchmark test with the new patch.
v3->v4:
- Patch 1: add comment about possible racing condition in PV unlock.
- Patch 2: simplified the pv_pending_lock() function as suggested by
Davidlohr.
- Move PV unlock optimization patch forward to patch 4 & rerun
performance test.
v2->v3:
- Moved the deferred kicking enablement patch forward & moved the
kick-ahead patch back to make the effect of kick-ahead more visible.
- Reworked patch 6 to make it more readable.
- Reverted to using state as a tri-state variable instead of
adding an additional binary variable.
- Added performance data for different values of PV_KICK_AHEAD_MAX.
- Added a new patch to optimize PV unlock code path performance.
v1->v2:
- Took out the queued unfair lock patches
- Added a patch to simplify the PV unlock code
- Moved the pending bit and statistics collection patches to the front
- Kept vCPU kicking in pv_kick_node(), but deferred it to unlock time
when appropriate.
- Changed the wait-early patch to use adaptive spinning to better
balance the differing effects on normal and over-committed guests.
- Added patch-to-patch performance changes to the patch commit logs.
This patchset tries to improve the performance of both regular and
over-committed VM guests. The adaptive spinning patch was inspired
by the "Do Virtual Machines Really Scale?" blog from Sanidhya Kashyap.
Patch 1 relaxes the memory order restriction of atomic operations by
using less restrictive _acquire and _release variants of cmpxchg()
and xchg(). This will reduce performance overhead when ported to other
non-x86 architectures.
Patch 2 simplifies the unlock code by removing the unnecessary
state check, kicking the PV CPU unconditionally when _Q_SLOW_VAL
is set.
Patch 3 optimizes the PV unlock code path performance for x86-64
architecture.
Patch 4 allows the collection of various event counts that are useful
for seeing what is happening in the system. Enabling it adds a small
amount of overhead, slowing performance slightly.
Patch 5 allows one lock stealing attempt at slowpath entry. This causes
a pretty big performance improvement for over-committed VM guests.
Patch 6 enables adaptive spinning in the queue nodes. This patch
leads to further performance improvement in over-committed guests,
though the gain is not as big as that of the previous patch.
Waiman Long (6):
locking/qspinlock: relaxes cmpxchg & xchg ops in native code
locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL
locking/pvqspinlock, x86: Optimize PV unlock code path
locking/pvqspinlock: Collect slowpath lock statistics
locking/pvqspinlock: Allow 1 lock stealing attempt
locking/pvqspinlock: Queue node adaptive spinning
arch/x86/Kconfig | 7 +
arch/x86/include/asm/qspinlock.h | 2 +-
arch/x86/include/asm/qspinlock_paravirt.h | 59 +++++
include/asm-generic/qspinlock.h | 6 +-
kernel/locking/qspinlock.c | 45 +++--
kernel/locking/qspinlock_paravirt.h | 378 +++++++++++++++++++++++++----
6 files changed, 431 insertions(+), 66 deletions(-)
Thread overview: 32+ messages
2015-09-11 18:37 Waiman Long [this message]
2015-09-11 18:37 ` [PATCH v6 1/6] locking/qspinlock: relaxes cmpxchg & xchg ops in native code Waiman Long
2015-09-11 22:27 ` Davidlohr Bueso
2015-09-14 12:06 ` Peter Zijlstra
2015-09-14 18:40 ` Waiman Long
2015-09-14 15:16 ` Waiman Long
2015-09-11 18:37 ` [PATCH v6 2/6] locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL Waiman Long
2015-09-18 8:50 ` [tip:locking/core] locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL tip-bot for Waiman Long
2015-09-11 18:37 ` [PATCH v6 3/6] locking/pvqspinlock, x86: Optimize PV unlock code path Waiman Long
2015-09-11 18:37 ` [PATCH v6 4/6] locking/pvqspinlock: Collect slowpath lock statistics Waiman Long
2015-09-11 23:13 ` Davidlohr Bueso
2015-09-14 15:25 ` Waiman Long
2015-09-14 21:41 ` Davidlohr Bueso
2015-09-15 3:47 ` Waiman Long
2015-09-11 18:37 ` [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt Waiman Long
2015-09-14 13:57 ` Peter Zijlstra
2015-09-14 19:02 ` Waiman Long
2015-09-14 14:00 ` Peter Zijlstra
2015-09-14 19:15 ` Waiman Long
2015-09-14 19:38 ` Waiman Long
2015-09-15 8:24 ` Peter Zijlstra
2015-09-15 15:29 ` Waiman Long
2015-09-16 15:01 ` Peter Zijlstra
2015-09-17 15:08 ` Waiman Long
2015-09-14 14:04 ` Peter Zijlstra
2015-09-14 19:19 ` Waiman Long
2015-09-11 18:37 ` [PATCH v6 6/6] locking/pvqspinlock: Queue node adaptive spinning Waiman Long
2015-09-14 14:10 ` Peter Zijlstra
2015-09-14 19:37 ` Waiman Long
2015-09-15 8:38 ` Peter Zijlstra
2015-09-15 15:32 ` Waiman Long
2015-09-16 15:03 ` Peter Zijlstra