Message-Id: <55AFD416020000780009426D@prv-mh.provo.novell.com>
Date: Wed, 22 Jul 2015 09:34:14 -0600
From: "Jan Beulich"
References: <5582E14F0200007800086A37@mail.emea.novell.com> <55AFBEDC020000780009417A@prv-mh.provo.novell.com>
Subject: Re: [Qemu-devel] [PATCH] xen/HVM: atomically access pointers in bufioreq handling
To: Stefano Stabellini
Cc: xen-devel, qemu-devel@nongnu.org

>>> On 22.07.15 at 16:50, wrote:
> On Wed, 22 Jul 2015, Jan Beulich wrote:
>> >> --- a/xen-hvm.c
>> >> +++ b/xen-hvm.c
>> >> @@ -981,19 +981,30 @@ static void handle_ioreq(XenIOState *sta
>> >>  
>> >>  static int handle_buffered_iopage(XenIOState *state)
>> >>  {
>> >> +    buffered_iopage_t *buf_page = state->buffered_io_page;
>> >>      buf_ioreq_t *buf_req = NULL;
>> >>      ioreq_t req;
>> >>      int qw;
>> >>  
>> >> -    if (!state->buffered_io_page) {
>> >> +    if (!buf_page) {
>> >>          return 0;
>> >>      }
>> >>  
>> >>      memset(&req, 0x00, sizeof(req));
>> >>  
>> >> -    while (state->buffered_io_page->read_pointer != state->buffered_io_page->write_pointer) {
>> >> -        buf_req = &state->buffered_io_page->buf_ioreq[
>> >> -            state->buffered_io_page->read_pointer % IOREQ_BUFFER_SLOT_NUM];
>> >> +    for (;;) {
>> >> +        uint32_t rdptr = buf_page->read_pointer, wrptr;
>> >> +
>> >> +        xen_rmb();
>> > 
>> > We don't need this barrier.
>> 
>> How would we not? We need to make sure we read in this order:
>> read_pointer, write_pointer, and read_pointer again (in the
>> comparison). Only that way can we be certain to hold a matching
>> pair in our hands at the end.
> 
> See below
> 
>> >> +        wrptr = buf_page->write_pointer;
>> >> +        xen_rmb();
>> >> +        if (rdptr != buf_page->read_pointer) {
>> > 
>> > I think you have to use atomic_read to be sure that the second read of
>> > buf_page->read_pointer is up to date and not optimized away.
>> 
>> No, suppressing such an optimization is an intended (side) effect
>> of the barriers used.
> 
> I understand what you are saying, but I am not sure your assumption
> is correct. Can the compiler optimize the second read away anyway?

No, it can't, due to the barrier.

>> > But I think that it would be best to simply use atomic_read to read
>> > both pointers at once using uint64_t as the type, so you are sure to get
>> > a consistent view and there is no need for this check.
>> 
>> But I'm specifically trying to avoid e.g. a locked cmpxchg8b here on
>> ix86.
> 
> OK, but we don't need cmpxchg8b, just:
> 
> #define atomic_read(ptr)       (*(__typeof__(*ptr) volatile*) (ptr))

This only gives the impression of being atomic when the type is wider
than a machine word. There's no ix86 (i.e. 32-bit) instruction other
than LOCK CMPXCHG8B (and possibly MMX/SSE/AVX ones) that allows
atomically reading a 64-bit quantity.

> something like:
> 
>     for (;;) {
>         uint64_t ptrs;
>         uint32_t rdptr, wrptr;
> 
>         ptrs = atomic_read((uint64_t *)&state->buffered_io_page->read_pointer);
>         rdptr = (uint32_t)ptrs;
>         wrptr = *(((uint32_t *)&ptrs) + 1);
> 
>         if (rdptr == wrptr) {
>             continue;
>         }
> 
>         [work]
> 
>         atomic_add(&buf_page->read_pointer, qw + 1);
>     }
> 
> it would work, wouldn't it?
Looks like it would, but the number of casts alone makes me wish for
no-one to consider this (though I agree the casts could be taken care
of). Still, I think (as is btw done elsewhere) the lock-free access is
preferable.

Jan