From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757937AbbFQQhu (ORCPT ); Wed, 17 Jun 2015 12:37:50 -0400 Received: from e35.co.us.ibm.com ([32.97.110.153]:47027 "EHLO e35.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753997AbbFQQhl (ORCPT ); Wed, 17 Jun 2015 12:37:41 -0400 X-Helo: d03dlp03.boulder.ibm.com X-MailFrom: paulmck@linux.vnet.ibm.com X-RcptTo: linux-kernel@vger.kernel.org Date: Wed, 17 Jun 2015 09:37:31 -0700 From: "Paul E. McKenney" To: Peter Zijlstra Cc: umgwanakikbuti@gmail.com, mingo@elte.hu, ktkhai@parallels.com, rostedt@goodmis.org, tglx@linutronix.de, juri.lelli@gmail.com, pang.xunlei@linaro.org, oleg@redhat.com, wanpeng.li@linux.intel.com, linux-kernel@vger.kernel.org, Al Viro , Linus Torvalds Subject: Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier() Message-ID: <20150617163731.GD3913@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20150611124636.448700267@infradead.org> <20150611124743.374180021@infradead.org> <20150611153341.GK3913@linux.vnet.ibm.com> <20150611214557.GA4249@linux.vnet.ibm.com> <20150617122924.GP3644@twins.programming.kicks-ass.net> <20150617145712.GZ3913@linux.vnet.ibm.com> <20150617154926.GE19282@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150617154926.GE19282@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15061716-0013-0000-0000-00000E7BCA33 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 17, 2015 at 05:49:27PM +0200, Peter Zijlstra wrote: > On Wed, Jun 17, 2015 at 07:57:12AM -0700, Paul E. McKenney wrote: > > On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote: > > > I did leave off the READ/WRITE ONCE stuff, because I could not come up > > > with a scenario where it makes a difference -- I appreciate paranoia, > > > but I also think we should not overdo the thing. > > > > I can only conclude that you have not read this document: > > > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html > > > > Specifically, please keep in mind that unless you mark either the variable > > or the memory access, the compiler is within its rights to assume that > > there are no concurrent accesses to that variable. For but one example, > > if you do a normal store to a given variable, then the compiler is > > within its rights to use that variable as temporary storage prior to > > that store. And yes, you can reasonably argue that no sane compiler > > would store something else to s->sequence given that it could free up > > a register by storing the incremented value, but the fact remains that > > you have given it permission to do so if it wants. > > This is the "Optimizations without Atomics" section you're referring to. > > It has words such as: "if the compiler can prove they are not accessed > in other threads concurrently" and "This requires escape analysis: the > compiler must see the full scope of the memory location 'p', or must > know that leaf functions don't capture 'p' and aren't used concurrently, > for this optimization to be valid." Yes, but... These qualifiers apply only in cases where the code at hand is not writing to the variable. In this case, it is legal for other threads to be concurrently reading the variable. After all, if the value of the variable is not changing, then the various read-side optimizations have no effect -- even the silly ones, like reading the variable one bit at a time. If the compiler introduces a write into code that was only reading the variable, then it is the compiler's job to ensure that no other thread is reading it. If the compiler were to introduce a write to such a variable that other threads might be reading, then the compiler would have introduced a data race, which would mean that the compiler introduced undefined behavior to correct code. The compiler therefore must be extremely careful when introducing a write to a variable that is only read by the code at hand. (Though there are still some "interesting" escape clauses for the compiler if the variable in question is allocated on the stack and the function can return without transferring control to some function in some other translation unit.) Recall that a data race occurs when there are multiple concurrent normal accesses to a given normal variable, at least one of which is a write. Here normal accesses and normal variables are those that are not marked as atomic (in the C11 sense). Accesses and variables marked as volatile also disable most (perhaps all) of the dangerous optimizations that lead to the undefined behavior. That said, many compiler people hate volatile, and will therefore automatically argue that it is useless in a misguided attempt to convince people not to use it. :-/ On the other hand, if the code is already writing to the variable (as it is in the s->sequence++ case), then if there are any concurrent accesses, the code -already- contains a data race. This data race invokes undefined behavior in the code as written, so the compiler is within its rights to do anything at all, even spawn the proverbial game of rogue. A somewhat more reasonable compiler is within its rights to assume that no one is concurrently reading any normal variable to which the code does a normal write. The compiler is therefore within its rights to use that variable as scratch storage at any time between the most recent read and the write in question. > But then it starts weasel wording and saying that the lack of > std::atomic<> usage implies a lack of concurrency, to which I strongly > object. Heh! The point of std::atomic<> (and of the equivalent C11 syntax) is to force the compiler to suppress optimizations that are unsafe for shared variables. We get more or less the same effect with volatile, protests from compiler people notwithstanding. I often tell the compiler guys that they have to expect make -some- concessions for being 30 years late to the concurrency party, but it nevertheless makes sense to future-proof our code where it is reasonable to do so. All that aside, I agree that "s->sequence++" is relatively low priority, given that the compiler can easily free up a register by storing the actual value. But that might well be a failure of imagination on my part. > Esp. seeing how -ffreestanding does not have access to any of the atomic > stuff since its library bits and not language bits (something which I've > often said was a failure in the spec). Agreed, given that atomics can almost always be inlined, it would be nice if -ffreestanding didn't cut off the compiler's concurrency nose to spite its concurrency face. The reasoning behind it is that it is legal (but, in my opinion, stupid) to create large atomic data structures. The compilers normally introduce locks to implement these, which is inappropriate in free-standing environments. I believe that a better strategy would be for -ffreestanding to implement only those atomics that are machine-word sized, as in ATOMIC_..._LOCK_FREE==2. Thanx, Paul