From: Wei Liu <wei.liu2@citrix.com>
To: xen-devel@lists.xen.org
Cc: wei.liu2@citrix.com, Ian Campbell <ian.campbell@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	dslutz@verizon.com
Subject: QEMU bumping memory bug analysis
Date: Fri, 5 Jun 2015 17:43:54 +0100
Message-ID: <20150605164354.GK29102@zion.uk.xensource.com>

Hi all

This bug is now considered a blocker for the 4.6 release.

The premises of the problem remain the same (George's translated
version):

1. QEMU may need extra pages from Xen to implement option ROMs, and so at
   the moment it calls set_max_mem() to increase max_pages so that it can
   allocate more pages to the guest.  libxl doesn't know what max_pages a
   domain needs prior to qemu start-up.

2. Libxl doesn't know max_pages even after qemu start-up, because there
   is no mechanism to communicate between qemu and libxl.

3. QEMU calls xc_domain_setmaxmem to increase max_pages by N pages.
   Those pages are only accounted for in the hypervisor.  Libxl
   (currently) does not extract that value from the hypervisor.
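
In other words, when QEMU discovers it needs N extra pages for option
ROMs, it does something along the lines of the sketch below. This is
illustrative only, written against the public libxc interface; it is not
the literal QEMU code, and the helper name is made up:

    #include <xenctrl.h>

    /* Raise the domain's max_pages by nr_pages, behind libxl's back.
     * xc_domain_setmaxmem() takes the new limit in KiB. */
    static int bump_max_mem(xc_interface *xch, uint32_t domid,
                            unsigned long nr_pages)
    {
        xc_dominfo_t info;

        if (xc_domain_getinfo(xch, domid, 1, &info) != 1 ||
            info.domid != domid)
            return -1;

        return xc_domain_setmaxmem(xch, domid,
                                   info.max_memkb +
                                   nr_pages * (XC_PAGE_SIZE / 1024));
    }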

Several solutions were proposed:

1. Add a new record type to the libxc migration stream and call setmaxmem
   in the middle of the xc migration stream.
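
Concretely, this would mean a record shaped roughly like the sketch
below, following the migration v2 framing of a 32-bit type plus a 32-bit
body length. Neither the record nor the type value exists today; this is
purely to illustrate what would have to be added:

    #include <stdint.h>

    /* Hypothetical stream record carrying the max_pages value, to be fed
     * to xc_domain_setmaxmem() on the receiving side before the memory
     * image is populated. */
    struct rec_hdr {
        uint32_t type;      /* e.g. a new REC_TYPE_MAX_PAGES */
        uint32_t length;    /* sizeof(struct rec_max_pages) */
    };

    struct rec_max_pages {
        uint64_t max_pages;
    };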

The main objection is that calling xc_domain_setmaxmem in the middle of
the xc migration stream is a layering violation. It also prevents us
from disaggregating domain construction to a less privileged domain.

2. Use the libxl toolstack save/restore blob to transmit max pages
   information to the remote end.

This is considered a bodge, and has been proven not to work because the
toolstack blob is only restored after xc_domain_restore has run.

3. Add a libxl layer that wraps the necessary information, taking over
   Andrew's work on libxl migration v2.  Having a libxl layer that's not
   part of migration v2 would be a waste of effort.

There are several obstacles for libxl migration v2 at the moment. The
libxl layer in migration v2 still has unresolved issues, and it has
inter-dependencies with Remus / COLO.

Most importantly, it doesn't inherently solve the problem. It still
requires the current libxl JSON blob to contain information about max
pages (or information used to derive max pages).

Andrew, correct me if I'm wrong.

4. Add a non-user-configurable field to the current libxl JSON structure
   to record max pages information.

This is not desirable. All fields in libxl JSON should be user
configurable.

5. Add a user configurable field to the current libxl JSON structure to
   record how much extra memory this domain needs.  The admin is required
   to fill in that value manually.  In the meantime we revert the change
   in QEMU and declare any QEMU with that change buggy.

There has been no response to this so far, but in fact I consider it the
most viable solution.

It's a simple enough solution that is achievable within the 4.6 time
frame. It doesn't prevent us from doing useful work in the future
(disaggregated architecture with stricter security policy). It provides
a way to work around buggy QEMU (the admin sets that value to prevent
QEMU from bumping the memory limit). It's orthogonal to migration v2,
which means it won't be blocked by migration v2 or block migration v2.
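
To make that concrete, the sketch below shows how such a field could be
folded into the max_pages calculation libxl already performs at domain
build time, so that libxl stays in charge and QEMU never needs to call
setmaxmem itself. The field name qemu_extra_memkb is made up for
illustration, and the 1024 KiB of slack mirrors libxl's existing
LIBXL_MAXMEM_CONSTANT:

    #include <stdint.h>
    #include <xenctrl.h>

    /* Sketch of option 5: libxl, not QEMU, raises max_pages once, using
     * the admin-supplied extra amount from the domain configuration. */
    static int set_domain_maxmem(xc_interface *xch, uint32_t domid,
                                 uint64_t target_memkb,
                                 uint64_t qemu_extra_memkb)
    {
        return xc_domain_setmaxmem(xch, domid,
                                   target_memkb + 1024 + qemu_extra_memkb);
    }

The value feeding qemu_extra_memkb here is the new admin-supplied field
described above; its exact name and placement are left open.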

I tend to go with solution 5. Speak up if you don't agree with my
analysis or you think I've missed some aspects.

For the long term we need to:

1. Establish libxl as the arbitrator of how many pages a domain can have.
   Anything else doing stuff behind the arbitrator's back is considered
   buggy. This principle probably applies to other aspects of a domain as
   well.

2. Work out a mechanism for QEMU and libxl to communicate. This can be
   expanded to cover other components in a Xen setup, but currently we
   only have QEMU.
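
One possible shape for such a mechanism, sketched below, is for QEMU to
record what it wants under a xenstore key that libxl then reads, leaving
the actual decision to libxl. The path used here is invented purely for
illustration and is not an existing protocol:

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>
    #include <xenstore.h>

    /* QEMU side: report how many extra pages the device model would like.
     * libxl, as the arbitrator, decides whether and how to honour it. */
    static bool report_extra_pages(struct xs_handle *xsh, int domid,
                                   unsigned long nr_pages)
    {
        char path[64], val[32];

        snprintf(path, sizeof(path),
                 "/local/domain/%d/device-model/extra-pages", domid);
        snprintf(val, sizeof(val), "%lu", nr_pages);

        return xs_write(xsh, XBT_NULL, path, val, strlen(val));
    }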

Wei.
