From: Juan Quintela <quintela@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp, liang.z.li@intel.com,
    qemu-devel@nongnu.org, luis@cs.umu.se, amit.shah@redhat.com,
    pbonzini@redhat.com, david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [PATCH v7 28/42] Postcopy: Postcopy startup in migration thread
Date: Mon, 13 Jul 2015 20:09:56 +0200
Message-ID: <87io9ohqtn.fsf@neno.neno>
In-Reply-To: <20150713175654.GM2492@work-vm> (David Alan Gilbert's message of
    "Mon, 13 Jul 2015 18:56:55 +0100")
References: <1434450415-11339-1-git-send-email-dgilbert@redhat.com>
    <1434450415-11339-29-git-send-email-dgilbert@redhat.com>
    <87a8v0jjvt.fsf@neno.neno> <20150713175654.GM2492@work-vm>

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

>> > +
>> > +    /*
>> > +     * send rest of state - note things that are doing postcopy
>> > +     * will notice we're in POSTCOPY_ACTIVE and not actually
>> > +     * wrap their state up here
>> > +     */
>> > +    qemu_file_set_rate_limit(ms->file, INT64_MAX);
>>
>> Do we undo this? Or are we sure that it is OK to maximize network
>> output?
>
> No, we don't undo it; it's a good question what we can do better.
> I'm trying to avoid delaying the postcopy-requested pages; ideally
> I'd like to separate those out so they get satisfied but still
> meet the bandwidth limit for the background transfer.
> The ideal is separate fds; however, something else I've considered
> is getting incoming postcopy requests to wake the outgoing side
> up when it's sleeping for the bandwidth limit, although I've
> not tried implementing that yet.

I see. (More on this below.)

>> > +    /* Ping just for debugging, helps line traces up */
>> > +    qemu_savevm_send_ping(ms->file, 2);
>>
>> Change the values 1, 2, 3 to constants?
>
> Suggestions for names? They're purely for debugging, so you can
> match them up on the destination.

>> > +    /*
>> > +     * We need to leave the fd free for page transfers during the
>> > +     * loading of the device state, so wrap all the remaining
>> > +     * commands and state into a package that gets sent in one go
>> > +     */
>> > +    QEMUFile *fb = qemu_bufopen("w", NULL);
>> > +    if (!fb) {
>> > +        error_report("Failed to create buffered file");
>> > +        goto fail;
>> > +    }
>> > +
>> > +    qemu_savevm_state_complete_precopy(fb);
>> > +    qemu_savevm_send_ping(fb, 3);
>> > +
>> > +    qemu_savevm_send_postcopy_run(fb);
>> > +
>> > +    /* <><> end of stuff going into the package */
>> > +    qsb = qemu_buf_get(fb);
>> > +
>> > +    /* Now send that blob */
>> > +    if (qemu_savevm_send_packaged(ms->file, qsb)) {
>> > +        goto fail_closefb;
>> > +    }
>> > +    qemu_fclose(fb);
>>
>> Why can't we send this directly without the extra copy?
>> I guess that there are some missing/extra section starts/ends or
>> something similar? Anything specific?
>
> The problem is that the destination has to be able to read the chunk
> of migration stream off the fd and leave the fd free for page requests
> that may be required during loading the device state.
> Since the migration stream is unstructured, there is no way to read
> a chunk of stream off without knowing its length, and the only way to
> know that length is to write the chunk to a buffer and then see how
> big it is.

Argh, OK. Could you add a comment explaining that?
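
For the record, here is how I now understand the two ends of it. Just
a sketch to check my reading, not the patch code: the destination-side
names are my guesses, and the exact wire format of the PACKAGED
command may differ:

    /*
     * Source: the package length is only known once the remaining
     * device state has been written into the memory buffer; that is
     * what forces the extra copy.  Conceptually:
     */
    qemu_put_be32(ms->file, qsb_get_length(qsb)); /* announce length */
    /* ... followed by exactly that many bytes of package data ... */

    /*
     * Destination (names guessed): pull exactly 'length' bytes off
     * the fd, which leaves the fd free for page requests, then run
     * the normal loadvm code over the in-memory copy.
     */
    uint32_t length = qemu_get_be32(mis->file);
    uint8_t *buffer = g_malloc(length);
    qemu_get_buffer(mis->file, buffer, length);

    QEMUSizedBuffer *pkg = qsb_create(buffer, length); /* wrap bytes */
    QEMUFile *packf = qemu_bufopen("r", pkg);  /* read-mode twin of  */
                                               /* the "w" open above */
    int ret = qemu_loadvm_state(packf);

If that is roughly right, a comment to that effect next to
qemu_savevm_send_packaged() would be enough for me.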

>> > +        ms->downtime = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - time_at_stop;
>>
>> Now that we are here: is there a counter of the time that the
>> postcopy stage takes? Just curious.
>
> No, not separate.

>> > +/*
>> >  * Master migration thread on the source VM.
>> >  * It drives the migration and pumps the data down the outgoing channel.
>> >  */
>> >  static void *migration_thread(void *opaque)
>> >  {
>> >      MigrationState *s = opaque;
>> > +    /* Used by the bandwidth calcs, updated later */
>> >      int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>> >      int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>> >      int64_t initial_bytes = 0;
>> >      int64_t max_size = 0;
>> >      int64_t start_time = initial_time;
>> >      bool old_vm_running = false;
>> > +    bool entered_postcopy = false;
>> > +    /* The active state we expect to be in; ACTIVE or POSTCOPY_ACTIVE */
>> > +    enum MigrationStatus current_active_type = MIGRATION_STATUS_ACTIVE;
>>
>> current_active_state?
>
> Changed.
>
> Dave
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
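
PS: Coming back to the rate-limit question at the top: the "wake the
outgoing side up when a postcopy request arrives" idea sounds workable
to me. A completely untested sketch of the shape I have in mind;
rate_limit_sem is invented, and I am only guessing where the source
handles incoming page requests:

    /*
     * In migration_thread(): when over the bandwidth limit, wait on a
     * semaphore with a timeout instead of sleeping unconditionally,
     * so an urgent request can cut the wait short.
     */
    if (qemu_file_rate_limit(s->file)) {
        qemu_sem_timedwait(&s->rate_limit_sem, 100 /* ms */);
    }

    /* Wherever the source handles an incoming page request: */
    qemu_sem_post(&s->rate_limit_sem); /* urgent page wanted, wake up */

That would keep the background transfer throttled while letting the
postcopy-requested pages go out straight away.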