From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Jan Beulich" Subject: Re: [PATCH] xen/HVM: atomically access pointers in bufioreq handling Date: Wed, 22 Jul 2015 09:34:14 -0600 Message-ID: <55AFD416020000780009426D__48797.0154652887$1437579331$gmane$org@prv-mh.provo.novell.com> References: <5582E14F0200007800086A37@mail.emea.novell.com> <55AFBEDC020000780009417A@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZHw2H-0000vz-UT for xen-devel@lists.xenproject.org; Wed, 22 Jul 2015 15:34:22 +0000 In-Reply-To: Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Stefano Stabellini Cc: xen-devel , qemu-devel@nongnu.org List-Id: xen-devel@lists.xenproject.org >>> On 22.07.15 at 16:50, wrote: > On Wed, 22 Jul 2015, Jan Beulich wrote: >> >> --- a/xen-hvm.c >> >> +++ b/xen-hvm.c >> >> @@ -981,19 +981,30 @@ static void handle_ioreq(XenIOState *sta >> >> >> >> static int handle_buffered_iopage(XenIOState *state) >> >> { >> >> + buffered_iopage_t *buf_page = state->buffered_io_page; >> >> buf_ioreq_t *buf_req = NULL; >> >> ioreq_t req; >> >> int qw; >> >> >> >> - if (!state->buffered_io_page) { >> >> + if (!buf_page) { >> >> return 0; >> >> } >> >> >> >> memset(&req, 0x00, sizeof(req)); >> >> >> >> - while (state->buffered_io_page->read_pointer != state->buffered_io_page->write_pointer) { >> >> - buf_req = &state->buffered_io_page->buf_ioreq[ >> >> - state->buffered_io_page->read_pointer % IOREQ_BUFFER_SLOT_NUM]; >> >> + for (;;) { >> >> + uint32_t rdptr = buf_page->read_pointer, wrptr; >> >> + >> >> + xen_rmb(); >> > >> > We don't need this barrier. >> >> How would we not? We need to make sure we read in this order >> read_pointer, write_pointer, and read_pointer again (in the >> comparison). Only that way we can be certain to hold a matching >> pair in hands at the end. > > See below > > >> >> + wrptr = buf_page->write_pointer; >> >> + xen_rmb(); >> >> + if (rdptr != buf_page->read_pointer) { >> > >> > I think you have to use atomic_read to be sure that the second read to >> > buf_page->read_pointer is up to date and not optimized away. >> >> No, suppressing such an optimization is an intended (side) effect >> of the barriers used. > > I understand what you are saying but I am not sure if your assumption > is correct. Can the compiler optimize the second read anyway? No, it can't, due to the barrier. >> > But if I think that it would be best to simply use atomic_read to read >> > both pointers at once using uint64_t as type, so you are sure to get a >> > consistent view and there is no need for this check. >> >> But I'm specifically trying to avoid e.g. a locked cmpxchg8b here on >> ix86. > > OK, but we don't need cmpxchg8b, just: > > #define atomic_read(ptr) (*(__typeof__(*ptr) volatile*) (ptr)) This only gives the impression of being atomic when the type is wider than a machine word. There's no ix86 (i.e. 32-bit) instruction other than LOCK CMPXCHG8B (and possibly MMX/SSE/AVX ones) allowing to atomically read a 64-bit quantity. > something like: > > for (;;) { > uint64_t ptrs; > uint32_t rdptr, wrptr; > > ptrs = atomic_read((uint64_t*)&state->buffered_io_page->read_pointer); > rdptr = (uint32_t)ptrs; > wrptr = *(((uint32_t*)&ptrs) + 1); > > if (rdptr == wrptr) { > continue; > } > > [work] > > atomic_add(&buf_page->read_pointer, qw + 1); > } > > it would work, wouldn't it? Looks like so, but the amount of casts alone makes me wish for no-one to consider this (but I agree that the casts could be taken care of). Still I think (as btw done elsewhere) the lock free access is preferable. Jan