All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: "Li, Liang Z" <liang.z.li@intel.com>,
	"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "aarcange@redhat.com" <aarcange@redhat.com>,
	"yamahata@private.email.ne.jp" <yamahata@private.email.ne.jp>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"luis@cs.umu.se" <luis@cs.umu.se>,
	"amit.shah@redhat.com" <amit.shah@redhat.com>,
	"david@gibson.dropbear.id.au" <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH v7 01/42] Start documenting how postcopy works.
Date: Thu, 18 Jun 2015 10:28:35 +0200	[thread overview]
Message-ID: <55828133.90400@redhat.com> (raw)
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E55056F@shsmsx102.ccr.corp.intel.com>



On 18/06/2015 09:50, Li, Liang Z wrote:
> Do you have any idea or plan to deal with the failure happened during
> the postcopy phase?
> 
> Lost the guest  is too frightening for a cloud provider, we have a
> discussion with Alibaba, they said that they can't use the postcopy
> feature unless there is a mechanism to find the guest back.

There's no solution to this problem, except for rollback to a previous
snapshot.

To give an idea, an example of an intended usecase for postcopy is
datacenter evacuation in 30 minutes after a tsunami alert.  That's not a
case where you care much about losing guests to network failures.

Why is there no solution?  Let's look at one of the best surveys on
migration,
http://courses.cs.vt.edu/~cs5204/fall05-kafura/Papers/Migration/ProcessMigration.pdf
(warning, 59 pages!):

  [3.2] If only part of the task state is transferred to another node,
  the task can start executing sooner, and the initial migration costs
  are lower.

  [3.4] Fault resilience can be improved in several ways. The impact of
  failures during migration can be reduced by maintaining process state
  on both the source and destination sites until the destination site
  instance is successfully promoted to a regular process and the source
  node is informed about this.

  [3.5] Migration algorithms should avoid linear dependencies on the
  amount of state to be transferred. For example, the eager data
  transfer strategy has costs proportional to the address space size

"Pre"copy means "start copying *before* promoting the destination to be
the primary host" and it has such a linear dependency on the amount of
state to be transferred. "Post"copy means "delay some copying to *after*
promoting the destination to be the primary host".

So we have:

                           Precopy            Postcopy
   3.2 Performance            - (1)             - (2)
   3.4 Fault resilience       +                 -
   3.5 Scalability            -                 +

      (1) smaller impact, longer freeze time
      (2) larger impact, extremely short freeze time

Postcopy can also limit the length of the non-resilient phase, by
starting with a precopy phase and only switching to postcopy after some
time.  Then you have:

                           Precopy        Hybrid      Postcopy
   3.2 Performance            - (1)          + (3)        - (2)
   3.4 Fault resilience       +              -            --
   3.5 Scalability            -              +            +

      (3) intermediate impact, extremely short freeze time

but there is still going to be a phase where migration is not resilient
to network faults.

Cloud operators can use a combination of precopy and postcopy.  For
example, I would not use postcopy for mass migration when doing
host updates, but it can be used as a last resort before a scheduled
downtime.

For example, say you're doing a rolling update and you want it complete
by next Sunday.  90% of the guests are shut down by the customers or can
be migrated successfully with precopy.  The others do not converge and
their SLA does not let you throttle them to complete precopy migration.

You then tell your customers that either they shutdown and restart their
instances before Saturday 8:00 PM, or they might be shut down forcibly.
 Then for customers who haven't rebooted you can do
postcopy---you have alerted them that something might go wrong.  So even
though postcopy would not be a first choice, it can still help cloud
operators.

Paolo

  parent reply	other threads:[~2015-06-18  8:28 UTC|newest]

Thread overview: 209+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-16 10:26 [Qemu-devel] [PATCH v7 00/42] Postcopy implementation Dr. David Alan Gilbert (git)
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 01/42] Start documenting how postcopy works Dr. David Alan Gilbert (git)
2015-06-17 11:42   ` Juan Quintela
2015-06-17 12:30     ` Dr. David Alan Gilbert
2015-06-18  7:50   ` Li, Liang Z
2015-06-18  8:10     ` Dr. David Alan Gilbert
2015-06-18  8:28     ` Paolo Bonzini [this message]
2015-06-19 17:52       ` Dr. David Alan Gilbert
2015-06-26  6:46   ` Yang Hongyang
2015-06-26  7:53     ` zhanghailiang
2015-06-26  8:00       ` Yang Hongyang
2015-06-26  8:10         ` Dr. David Alan Gilbert
2015-06-26  8:19           ` Yang Hongyang
2015-08-04  5:20   ` Amit Shah
2015-08-05 12:21     ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 02/42] Provide runtime Target page information Dr. David Alan Gilbert (git)
2015-06-17 11:43   ` Juan Quintela
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 03/42] Init page sizes in qtest Dr. David Alan Gilbert (git)
2015-06-17 11:49   ` Juan Quintela
2015-07-06  6:14   ` Amit Shah
2015-08-04  5:23   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 04/42] qemu_ram_block_from_host Dr. David Alan Gilbert (git)
2015-06-17 11:54   ` Juan Quintela
2015-07-10  8:36   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 05/42] Add qemu_get_buffer_less_copy to avoid copies some of the time Dr. David Alan Gilbert (git)
2015-06-17 11:57   ` Juan Quintela
2015-06-17 12:33     ` Dr. David Alan Gilbert
2015-07-13  9:08   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 06/42] Add wrapper for setting blocking status on a QEMUFile Dr. David Alan Gilbert (git)
2015-06-17 11:59   ` Juan Quintela
2015-06-17 12:34     ` Dr. David Alan Gilbert
2015-06-17 12:57       ` Juan Quintela
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 07/42] ram_debug_dump_bitmap: Dump a migration bitmap as text Dr. David Alan Gilbert (git)
2015-06-17 12:17   ` Juan Quintela
2015-06-19 17:04     ` Dr. David Alan Gilbert
2015-07-13 10:15       ` Juan Quintela
2015-07-13  9:12   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 08/42] migrate_init: Call from savevm Dr. David Alan Gilbert (git)
2015-06-17 12:18   ` Juan Quintela
2015-07-13  9:13   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 09/42] Rename save_live_complete to save_live_complete_precopy Dr. David Alan Gilbert (git)
2015-06-17 12:20   ` Juan Quintela
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 10/42] Return path: Open a return path on QEMUFile for sockets Dr. David Alan Gilbert (git)
2015-06-17 12:23   ` Juan Quintela
2015-06-17 17:07     ` Dr. David Alan Gilbert
2015-07-13 10:12   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 11/42] Return path: socket_writev_buffer: Block even on non-blocking fd's Dr. David Alan Gilbert (git)
2015-06-17 12:28   ` Juan Quintela
2015-06-19 17:18     ` Dr. David Alan Gilbert
2015-07-13 12:37   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 12/42] Migration commands Dr. David Alan Gilbert (git)
2015-06-17 12:31   ` Juan Quintela
2015-06-19 17:38     ` Dr. David Alan Gilbert
2015-07-13 12:45   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 13/42] Return path: Control commands Dr. David Alan Gilbert (git)
2015-06-17 12:49   ` Juan Quintela
2015-06-23 18:57     ` Dr. David Alan Gilbert
2015-07-13 12:55   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 14/42] Return path: Send responses from destination to source Dr. David Alan Gilbert (git)
2015-06-17 16:30   ` Juan Quintela
2015-06-19 18:42     ` Dr. David Alan Gilbert
2015-07-01  9:29       ` Juan Quintela
2015-08-06 12:18         ` Dr. David Alan Gilbert
2015-07-15  7:31   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 15/42] Return path: Source handling of return path Dr. David Alan Gilbert (git)
2015-07-13 10:29   ` Juan Quintela
2015-08-18 10:23     ` Dr. David Alan Gilbert
2015-07-15  7:50   ` Amit Shah
2015-07-16 11:32     ` Dr. David Alan Gilbert
2015-08-05  8:06   ` zhanghailiang
2015-08-18 10:45     ` Dr. David Alan Gilbert
2015-08-18 11:29       ` zhanghailiang
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 16/42] Rework loadvm path for subloops Dr. David Alan Gilbert (git)
2015-07-13 10:33   ` Juan Quintela
2015-07-15  9:34   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 17/42] Add migration-capability boolean for postcopy-ram Dr. David Alan Gilbert (git)
2015-06-16 15:43   ` Eric Blake
2015-06-16 15:58     ` Dr. David Alan Gilbert
2015-07-15  9:39       ` Amit Shah
2015-07-13 10:35   ` Juan Quintela
2015-07-15  9:40   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 18/42] Add wrappers and handlers for sending/receiving the postcopy-ram migration messages Dr. David Alan Gilbert (git)
2015-07-13 11:02   ` Juan Quintela
2015-07-20 10:13     ` Amit Shah
2015-08-26 14:48     ` Dr. David Alan Gilbert
2015-07-20 10:06   ` Amit Shah
2015-07-27  9:55     ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 19/42] MIG_CMD_PACKAGED: Send a packaged chunk of migration stream Dr. David Alan Gilbert (git)
2015-07-13 11:07   ` Juan Quintela
2015-07-21  6:11   ` Amit Shah
2015-07-27 17:28     ` Dr. David Alan Gilbert
2015-08-04  5:27   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 20/42] Modify save_live_pending for postcopy Dr. David Alan Gilbert (git)
2015-07-13 11:12   ` Juan Quintela
2015-07-31 16:13     ` Dr. David Alan Gilbert
2015-07-21  6:17   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 21/42] postcopy: OS support test Dr. David Alan Gilbert (git)
2015-07-13 11:20   ` Juan Quintela
2015-07-13 16:31     ` Dr. David Alan Gilbert
2015-07-21  7:29   ` Amit Shah
2015-07-27 17:38     ` Dr. David Alan Gilbert
2015-08-04  5:28   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 22/42] migrate_start_postcopy: Command to trigger transition to postcopy Dr. David Alan Gilbert (git)
2015-07-13 11:23   ` Juan Quintela
2015-07-13 17:13     ` Dr. David Alan Gilbert
2015-07-13 18:07       ` Juan Quintela
2015-07-21  7:40         ` Amit Shah
2015-09-24  9:59           ` Dr. David Alan Gilbert
2015-09-24 14:20         ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 23/42] MIGRATION_STATUS_POSTCOPY_ACTIVE: Add new migration state Dr. David Alan Gilbert (git)
2015-07-13 11:27   ` Juan Quintela
2015-07-13 15:53     ` Dr. David Alan Gilbert
2015-07-13 16:26       ` Juan Quintela
2015-07-13 16:48         ` Dr. David Alan Gilbert
2015-07-13 18:05           ` Juan Quintela
2015-07-21 10:33   ` Amit Shah
2015-09-23 17:04     ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 24/42] Add qemu_savevm_state_complete_postcopy Dr. David Alan Gilbert (git)
2015-07-13 11:35   ` Juan Quintela
2015-07-13 15:33     ` Dr. David Alan Gilbert
2015-07-21 10:42   ` Amit Shah
2015-07-27 17:58     ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 25/42] Postcopy: Maintain sentmap and calculate discard Dr. David Alan Gilbert (git)
2015-07-13 11:47   ` Juan Quintela
2015-09-15 17:01     ` Dr. David Alan Gilbert
2015-07-21 11:36   ` Amit Shah
2015-07-31 16:51     ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 26/42] postcopy: Incoming initialisation Dr. David Alan Gilbert (git)
2015-07-13 12:04   ` Juan Quintela
2015-09-23 19:06     ` Dr. David Alan Gilbert
2015-07-22  6:19   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 27/42] postcopy: ram_enable_notify to switch on userfault Dr. David Alan Gilbert (git)
2015-07-13 12:10   ` Juan Quintela
2015-07-13 17:36     ` Dr. David Alan Gilbert
2015-07-23  5:22   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 28/42] Postcopy: Postcopy startup in migration thread Dr. David Alan Gilbert (git)
2015-07-13 12:56   ` Juan Quintela
2015-07-13 17:56     ` Dr. David Alan Gilbert
2015-07-13 18:09       ` Juan Quintela
2015-09-23 17:56         ` Dr. David Alan Gilbert
2015-07-23  5:53       ` Amit Shah
2015-07-23  5:55   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 29/42] Postcopy end in migration_thread Dr. David Alan Gilbert (git)
2015-07-13 13:15   ` Juan Quintela
2015-07-23  6:41     ` Amit Shah
2015-08-04 11:31     ` Dr. David Alan Gilbert
2015-07-23  6:41   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 30/42] Page request: Add MIG_RP_MSG_REQ_PAGES reverse command Dr. David Alan Gilbert (git)
2015-07-13 13:24   ` Juan Quintela
2015-08-06 14:15     ` Dr. David Alan Gilbert
2015-07-23  6:50   ` Amit Shah
2015-08-06 14:21     ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 31/42] Page request: Process incoming page request Dr. David Alan Gilbert (git)
2015-07-14  9:18   ` Juan Quintela
2015-08-06 10:45     ` Dr. David Alan Gilbert
2015-10-20 10:29       ` Juan Quintela
2015-07-23 12:23   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 32/42] Page request: Consume pages off the post-copy queue Dr. David Alan Gilbert (git)
2015-07-14  9:40   ` Juan Quintela
2015-09-16 18:36     ` Dr. David Alan Gilbert
2015-07-27  6:05   ` Amit Shah
2015-09-16 18:48     ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 33/42] postcopy_ram.c: place_page and helpers Dr. David Alan Gilbert (git)
2015-07-14 10:05   ` Juan Quintela
2015-07-27  6:11     ` Amit Shah
2015-09-23 16:45     ` Dr. David Alan Gilbert
2015-07-27  6:11   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 34/42] Postcopy: Use helpers to map pages during migration Dr. David Alan Gilbert (git)
2015-07-14 12:34   ` Juan Quintela
2015-07-17 17:31     ` Dr. David Alan Gilbert
2015-07-27  7:39   ` Amit Shah
2015-08-06 11:22     ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 35/42] Don't sync dirty bitmaps in postcopy Dr. David Alan Gilbert (git)
2015-07-14 12:36   ` Juan Quintela
2015-07-14 13:13     ` Dr. David Alan Gilbert
2015-07-27  7:43   ` Amit Shah
2015-07-31  9:50     ` Dr. David Alan Gilbert
2015-08-04  5:46       ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 36/42] Host page!=target page: Cleanup bitmaps Dr. David Alan Gilbert (git)
2015-07-14 15:01   ` Juan Quintela
2015-07-31 15:53     ` Dr. David Alan Gilbert
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 37/42] Postcopy; Handle userfault requests Dr. David Alan Gilbert (git)
2015-07-14 15:10   ` Juan Quintela
2015-07-14 15:15     ` Dr. David Alan Gilbert
2015-07-14 15:25       ` Juan Quintela
2015-07-27 14:29   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 38/42] Start up a postcopy/listener thread ready for incoming page data Dr. David Alan Gilbert (git)
2015-07-14 15:12   ` Juan Quintela
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 39/42] postcopy: Wire up loadvm_postcopy_handle_ commands Dr. David Alan Gilbert (git)
2015-07-14 15:14   ` Juan Quintela
2015-07-28  5:53   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 40/42] End of migration for postcopy Dr. David Alan Gilbert (git)
2015-07-14 15:15   ` Juan Quintela
2015-07-28  5:55   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 41/42] Disable mlock around incoming postcopy Dr. David Alan Gilbert (git)
2015-07-14 15:22   ` Juan Quintela
2015-07-28  6:02     ` Amit Shah
2015-07-28 11:32       ` Juan Quintela
2015-08-06 14:55         ` Dr. David Alan Gilbert
2015-08-07  3:05           ` zhanghailiang
2015-09-24 10:36     ` Dr. David Alan Gilbert
2015-07-28  6:02   ` Amit Shah
2015-06-16 10:26 ` [Qemu-devel] [PATCH v7 42/42] Inhibit ballooning during postcopy Dr. David Alan Gilbert (git)
2015-07-14 15:24   ` Juan Quintela
2015-07-28  6:15   ` Amit Shah
2015-07-28  9:08     ` Dr. David Alan Gilbert
2015-07-28 10:01       ` Amit Shah
2015-07-28 11:16         ` Dr. David Alan Gilbert
2015-07-28  6:21 ` [Qemu-devel] [PATCH v7 00/42] Postcopy implementation Amit Shah

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55828133.90400@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=dgilbert@redhat.com \
    --cc=liang.z.li@intel.com \
    --cc=luis@cs.umu.se \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=yamahata@private.email.ne.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.