From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefano Stabellini
Subject: Re: QEMU bumping memory bug analysis
Date: Mon, 8 Jun 2015 12:40:19 +0100
References: <20150605164354.GK29102@zion.uk.xensource.com> <1433530180.3342.17.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1433530180.3342.17.camel@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: Wei Liu, Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson, dslutz@verizon.com, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On Fri, 5 Jun 2015, Ian Campbell wrote:
> On Fri, 2015-06-05 at 18:10 +0100, Stefano Stabellini wrote:
> > On Fri, 5 Jun 2015, Wei Liu wrote:
> > > Hi all
> > >
> > > This bug is now considered a blocker for 4.6 release.
> > >
> > > The premises of the problem remain the same (George's translated
> > > version):
> > >
> > > 1. QEMU may need extra pages from Xen to implement option ROMs, and so at
> > > the moment it calls set_max_mem() to increase max_pages so that it can
> > > allocate more pages to the guest. libxl doesn't know what max_pages a
> > > domain needs prior to qemu start-up.
> > >
> > > 2. Libxl doesn't know max_pages even after qemu start-up, because there
> > > is no mechanism to communicate between qemu and libxl.
> >
> > I might not know what is the right design for the overall solution, but
> > I do know that libxl shouldn't have its own state tracking for
> > max_pages, because max_pages is kept, maintained and enforced by Xen.
> >
> > Ian might still remember, but at the beginning of the xl/libxl project,
> > we had a few simple design principles. One of which was that we should not
> > have two places where we keep track of the same thing. If Xen keeps
> > track of something, libxl should avoid it.
>
> This isn't about libxl tracking something duplicating information in
> Xen. It is about who gets to choose what that value is, which is not the
> same as who stores that value.
>
> So this is about libxl being the owner of what the current maxmem value
> is. It can do this by using setmaxmem and getmaxmem to set and retrieve
> the value with no state in libxl.
>
> > I disagree that libxl should be the arbitrator of a property that is
> > stored, maintained and enforced by Xen. Xen should be the arbitrator.
>
> That's not what "arbitrator" means, an arbitrator decides what the value
> should be, but that doesn't necessarily imply that it either stores,
> maintains nor enforces that value.

The way you describe it, it looks like some kind of host-wide memory
policy manager, which also doesn't belong in xl/libxl, in the same way
that other memory management tools were never accepted into xl/libxl but
were kept as separate daemons, like xenballoond or squeezed.

Let's step away from this specific issue for a second. If it is not a
host-wide policy manager but a per-VM layer on top of libxc, what value
is this indirection actually adding?

> > Even if QEMU called into libxl to change maxmem, I don't think that
> > libxl should store maxmem anywhere. It is already stored in Xen.
>
> I don't think anyone suggested otherwise, did they?
>
> What locking is there around QEMU's read/modify/write of the maxmem
> value? What happens if someone else also modifies maxmem at the same
> time?

It only happens at start of day, before the domain is unpaused. At the
time I couldn't come up with a scenario where it would be an issue,
unless the admin is purposely trying to shoot himself in the foot.
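
To make the read/modify/write concern concrete, here is a minimal sketch
of how an unsynchronised bump could lose an update. It is a toy Python
model, not Xen, libxc, libxl or QEMU code: ToyXen, interleaved_bumps and
serialized_bumps are hypothetical names invented for illustration, and
the "admin" actor is an assumed second writer, not something taken from
the code under discussion.

```python
# Toy model only: ToyXen stands in for the hypervisor's per-domain maxmem.
# None of these names exist in Xen, libxc, libxl or QEMU.


class ToyXen:
    """Holds a single per-domain maxmem value, in KiB."""

    def __init__(self, maxmem_kib):
        self.maxmem_kib = maxmem_kib

    def get_maxmem(self):
        return self.maxmem_kib

    def set_maxmem(self, value_kib):
        self.maxmem_kib = value_kib


def interleaved_bumps(xen, qemu_extra_kib, admin_extra_kib):
    """Two actors do read/modify/write, with both reads before any write."""
    qemu_seen = xen.get_maxmem()    # QEMU reads the current value
    admin_seen = xen.get_maxmem()   # a second writer reads the same value
    xen.set_maxmem(qemu_seen + qemu_extra_kib)     # QEMU writes its bump
    xen.set_maxmem(admin_seen + admin_extra_kib)   # overwrites QEMU's bump
    return xen.get_maxmem()


def serialized_bumps(xen, qemu_extra_kib, admin_extra_kib):
    """Same two bumps, but each read/modify/write completes before the next."""
    xen.set_maxmem(xen.get_maxmem() + qemu_extra_kib)
    xen.set_maxmem(xen.get_maxmem() + admin_extra_kib)
    return xen.get_maxmem()
```

Starting from 1048576 KiB with a QEMU bump of 1024 and a second bump of
4096, the interleaved case ends at 1052672 (QEMU's 1024 is lost), while
the serialized case ends at 1053696. If the bump really only happens
before the domain is unpaused, the interleaved ordering would need a
second writer acting in that same window.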