From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752183AbbIOMlu (ORCPT ); Tue, 15 Sep 2015 08:41:50 -0400 Received: from e39.co.us.ibm.com ([32.97.110.160]:49555 "EHLO e39.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751640AbbIOMls (ORCPT ); Tue, 15 Sep 2015 08:41:48 -0400 X-Helo: d03dlp01.boulder.ibm.com X-MailFrom: paulmck@linux.vnet.ibm.com X-RcptTo: linux-kernel@vger.kernel.org Date: Tue, 15 Sep 2015 05:41:42 -0700 From: "Paul E. McKenney" To: Peter Zijlstra Cc: Davidlohr Bueso , Ingo Molnar , Thomas Gleixner , linux-kernel@vger.kernel.org, Davidlohr Bueso Subject: Re: [PATCH -tip 2/3] sched/wake_q: Relax to acquire semantics Message-ID: <20150915124142.GF4029@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <1442216244-4409-1-git-send-email-dave@stgolabs.net> <1442216244-4409-2-git-send-email-dave@stgolabs.net> <20150914123241.GR18489@twins.programming.kicks-ass.net> <20150914210806.GG19736@linux-q0g1.site> <20150915094949.GA16853@twins.programming.kicks-ass.net> <20150915095512.GA18673@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150915095512.GA18673@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15091512-0033-0000-0000-000005E1EF08 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Sep 15, 2015 at 11:55:12AM +0200, Peter Zijlstra wrote: > On Tue, Sep 15, 2015 at 11:49:49AM +0200, Peter Zijlstra wrote: > > On Mon, Sep 14, 2015 at 02:08:06PM -0700, Davidlohr Bueso wrote: > > > On Mon, 14 Sep 2015, Peter Zijlstra wrote: > > > > > > >On Mon, Sep 14, 2015 at 12:37:23AM -0700, Davidlohr Bueso wrote: > > > >> /* > > > >>+ * Atomically grab the task. If ->wake_q is non-nil (failed cmpxchg) > > > >>+ * then the task is already queued (by us or someone else) and will > > > >>+ * get the wakeup due to that. > > > >> * > > > >>+ * Use acquire semantics to add the next pointer, which pairs with the > > > >>+ * write barrier implied by the wakeup in wake_up_list(). > > > >> */ > > > >>+ if (cmpxchg_acquire(&node->next, NULL, WAKE_Q_TAIL)) > > > >> return; > > > >> > > > >> get_task_struct(task); > > > > > > > >I'm not seeing a _why_ on the acquire semantics. Not saying the patch is > > > >wrong, just saying I want words on why acquire is correct. > > > > > > Well, I was just taking advantage of removing the upper barrier. Considering > > > that the formal semantics, you are right that we need not actual acquire per-se > > > (ie for node->next) but instead merely ensure a barrier in wake_q_add(). This is > > > kind of why I had hinted of going full _relaxed(). We could also rephrase the > > > comment, something like: > > > > > > * Use ACQUIRE semantics to add the next pointer, such that > > > * wake_q_add() implies a full barrier. This pairs with the > > > * write barrier implied by the wakeup in wake_up_list(). > > > */ > > > > > > What do you think? > > > > Still befuddled. I'm thinking that if you want to remove a barrier, > > you'd remove that second and keep the first. That is RELEASE. > > > > That way, you know the stores prior to the wake queue are done by the > > time you observe the queued entry, and therefore (transitively) know > > those stores are done by the time you do the actual wakeup. > > > > Two issues with that though; firstly RELEASE is not actually guaranteed > > to be transitive -- now the only arch that does not implement it with a > > full barrier is ARGH64, so we could just ask Will, but I'm not sure its > > 'good' to start relying on this. > > Never mind, the PPC people will implement this with lwsync and that is > very much not transitive IIRC. I am probably lost on context, but... It turns out that lwsync is transitive in special cases. One of them is a series of release-acquire pairs, which can extend indefinitely. Does that help in this case? Thanx, Paul > That said, you could do: > > smp_mb__before_atomic(); > cmpxchg_relaxed(); > > Which would still be a full barrier and therefore transitive. However > this point still stands: > > > Secondly, the wake queues are not concurrent, they're in context, so I > > don't see ordering matter at all. The only reason its a cmpxchg() is > > because there is the (small) possibility of two contexts wanting to wake > > the same task, and we use task_struct storage for the queue. > > I don't think we need _any_ barriers here, unless we have concurrent > users of the wake queues (or want to allow any, do we?). >