From: Peter Zijlstra <peterz@infradead.org>
To: Alex Kogan <alex.kogan@oracle.com>
Cc: linux@armlinux.org.uk, mingo@redhat.com, will.deacon@arm.com,
	arnd@arndb.de, longman@redhat.com, linux-arch@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de, bp@alien8.de,
	hpa@zytor.com, x86@kernel.org, guohanjun@huawei.com,
	jglauber@marvell.com, steven.sistare@oracle.com,
	daniel.m.jordan@oracle.com, dave.dice@oracle.com
Subject: Re: [PATCH v14 3/6] locking/qspinlock: Introduce CNA into the slow path of qspinlock
Date: Tue, 13 Apr 2021 13:30:06 +0200
Message-ID: <YHWAvjymlE5svU71@hirez.programming.kicks-ass.net>
In-Reply-To: <20210401153156.1165900-4-alex.kogan@oracle.com>

On Thu, Apr 01, 2021 at 11:31:53AM -0400, Alex Kogan wrote:

> +/*
> + * cna_splice_tail -- splice the next node from the primary queue onto
> + * the secondary queue.
> + */
> +static void cna_splice_next(struct mcs_spinlock *node,
> +			    struct mcs_spinlock *next,
> +			    struct mcs_spinlock *nnext)

You forgot to update the comment when you changed the name on this
thing.
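
I.e. presumably that wants to be:

	/*
	 * cna_splice_next -- splice the next node from the primary queue
	 * onto the secondary queue.
	 */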

> +/*
> + * cna_order_queue - check whether the next waiter in the main queue is on
> + * the same NUMA node as the lock holder; if not, and it has a waiter behind
> + * it in the main queue, move the former onto the secondary queue.
> + */
> +static void cna_order_queue(struct mcs_spinlock *node)
> +{
> +	struct mcs_spinlock *next = READ_ONCE(node->next);
> +	struct cna_node *cn = (struct cna_node *)node;
> +	int numa_node, next_numa_node;
> +
> +	if (!next) {
> +		cn->partial_order = LOCAL_WAITER_NOT_FOUND;
> +		return;
> +	}
> +
> +	numa_node = cn->numa_node;
> +	next_numa_node = ((struct cna_node *)next)->numa_node;
> +
> +	if (next_numa_node != numa_node) {
> +		struct mcs_spinlock *nnext = READ_ONCE(next->next);
> +
> +		if (nnext) {
> +			cna_splice_next(node, next, nnext);
> +			next = nnext;
> +		}
> +		/*
> +		 * Inherit NUMA node id of primary queue, to maintain the
> +		 * preference even if the next waiter is on a different node.
> +		 */
> +		((struct cna_node *)next)->numa_node = numa_node;
> +	}
> +}

So the obvious change since the last time I looked at this is that it
now only looks one entry ahead. Which makes sense, I suppose.

I'm not really a fan of the 'partial_order' name combined with that
silly enum { LOCAL_WAITER_FOUND, LOCAL_WAITER_NOT_FOUND }. That's just
really bad naming all around: the enum is about having a waiter while
the variable is about partial order, and the two don't match at all.

If you rename the variable to 'has_waiter' and simply use 0/1 values,
things would be ever so much more readable.
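
Something like this, say (just a sketch; 'has_waiter' being my name
for it, not anything in the patch):

	if (!next) {
		cn->has_waiter = 0;	/* nobody queued behind us */
		return;
	}
	...
	cn->has_waiter = 1;

But I don't think that makes sense anyway, see below.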

I'm also not sure about that whole numa_node thing; why would you
overwrite the NUMA node, and why at this point?

> +
> +/* Abuse the pv_wait_head_or_lock() hook to get some work done */
> +static __always_inline u32 cna_wait_head_or_lock(struct qspinlock *lock,
> +						 struct mcs_spinlock *node)
> +{
> +	/*
> +	 * Try and put the time otherwise spent spin waiting on
> +	 * _Q_LOCKED_PENDING_MASK to use by sorting our lists.
> +	 */
> +	cna_order_queue(node);
> +
> +	return 0; /* we lied; we didn't wait, go do so now */

So here we inspect one entry ahead and then quit. I can't remember,
but did we try something like:

	/*
	 * Try and put the time otherwise spent spin waiting on
	 * _Q_LOCKED_PENDING_MASK to use by sorting our lists.
	 * Move one entry at a go until either the list is fully
	 * sorted or we run out of spin condition.
	 */
	while (READ_ONCE(lock->val) & _Q_LOCKED_PENDING_MASK &&
	       ((struct cna_node *)node)->partial_order)
		cna_order_queue(node);

	return 0;

This will keep moving @next to the remote list until such a time that
we're forced to continue or @next is local.

> +}
> +
> +static inline void cna_lock_handoff(struct mcs_spinlock *node,
> +				 struct mcs_spinlock *next)
> +{
> +	struct cna_node *cn = (struct cna_node *)node;
> +	u32 val = 1;
> +
> +	u32 partial_order = cn->partial_order;
> +
> +	if (partial_order == LOCAL_WAITER_NOT_FOUND)
> +		cna_order_queue(node);
> +

AFAICT this is where playing silly games with ->numa_node belong; but
right along with that goes a comment that describes why any of that
makes sense.

I mean, if you leave your node, for any reason, why bother coming back
to it? Why not accept it as a sign from the gods that you're overdue
for a node-change?

Was the efficacy of this scheme tested?

> +	/*
> +	 * We have a local waiter, either real or fake one;
> +	 * reload @next in case it was changed by cna_order_queue().
> +	 */
> +	next = node->next;
> +	if (node->locked > 1)
> +		val = node->locked;	/* preserve secondary queue */

IIRC we used to do:

	val |= node->locked;

Which is simpler for not having branches. Why change a good thing?
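
(Assuming the >1 'secondary queue' encoding keeps the low bit set, the
OR gives the same value back and @val stays branch-free either way; if
it doesn't, that would explain the branch, but then a comment saying
so would help.)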

> +
> +	arch_mcs_lock_handoff(&next->locked, val);
> +}
