From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ian Campbell Subject: Re: QEMU bumping memory bug analysis Date: Tue, 9 Jun 2015 13:45:05 +0100 Message-ID: <1433853905.7108.550.camel@citrix.com> References: <20150605164354.GK29102@zion.uk.xensource.com> <1433530180.3342.17.camel@citrix.com> <1433765498.7108.480.camel@citrix.com> <5575A4C5.5070702@eu.citrix.com> <5575AE47.3020208@one.verizon.com> <5575B6D0.8010407@eu.citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Stefano Stabellini Cc: Wei Liu , George Dunlap , Andrew Cooper , Ian Jackson , Don Slutz , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org On Tue, 2015-06-09 at 11:14 +0100, Stefano Stabellini wrote: > I don't think that the current solution is inherently racy. If we are > really interested in an honest evaluation of the current solution in > terms of races, I would be happy to do so. Consider a domain with 1M of RAM (==target and maxmem for the sake of argument) and two simultaneous calls to libxl_set_memory_target, both with relative=0 and enforce=1, one with target=3 and the other with target=5. target=3 call target=5 call get ptr.max_memkb, gets 1 get ptr.max_memkb, gets 1 start transaction do xenstore stuff, seeing target=1, setting target=3 memorykb = ptr.max_memkb - current_target_memkb + new_target_memkb; memorykb = 1 - 1 + 3 xc_setmaxmem(3) transaction commit (success) Now target=maxmem=3 start transaction do xenstore stuff, seeing target=3, setting target=5 memorykb = ptr.max_memkb - current_target_memkb + new_target_memkb; memorykb = 1 - 3 + 5 xc_setmaxmem(3) transaction commit (success) Now target=5, maxmem=3. The obvious fix of moving the get ptr.max_memkb inside the transaction fails in a different way in the case where the first transaction commit fails and is retried after the second one, I think. Prior to 0c029c4da21 "libxl_set_memory_target: retain the same maxmem offset on top of the current target" this issue didn't exist because memorykb was just memorykb = new_target_memkb + videoram. BTW, I noticed some other (unrelated) dubious stuff in there while looking at this, in particular the setmaxmem is not rolled back if set_pod_target fails. Also the transiently "wrong" maxmem between the setmaxmem and a failed transaction getting redone might possibly give rise to some interesting cases, especially if anything else fails the second time around the loop. > When I committed it, I didn't do it without thinking. Likewise I thought about it when reviewing, but it seems we have both missed an important aspect. Ian.