Date: Wed, 16 Sep 2015 19:36:06 +0100
From: "Dr. David Alan Gilbert"
Message-ID: <20150916183605.GA7809@work-vm>
References: <1434450415-11339-1-git-send-email-dgilbert@redhat.com>
 <1434450415-11339-33-git-send-email-dgilbert@redhat.com>
 <87a8uzgjrb.fsf@neno.neno>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87a8uzgjrb.fsf@neno.neno>
Subject: Re: [Qemu-devel] [PATCH v7 32/42] Page request: Consume pages off the
 post-copy queue
To: Juan Quintela
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp, liang.z.li@intel.com,
 qemu-devel@nongnu.org, luis@cs.umu.se, amit.shah@redhat.com,
 pbonzini@redhat.com, david@gibson.dropbear.id.au

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert (git)" wrote:
> > From: "Dr. David Alan Gilbert"
> >
> > When transmitting RAM pages, consume pages that have been queued by
> > MIG_RPCOMM_REQPAGE commands and send them ahead of normal page scanning.
> >
> > Note:
> >    a) After a queued page the linear walk carries on from after the
> >       unqueued page; there is a reasonable chance that the destination
> >       was about to ask for other nearby pages anyway.
> >
> >    b) We have to be careful of any assumptions that the page walking
> >       code makes, in particular it does some shortcuts on its first
> >       linear walk that break as soon as we do a queued page.
> >
> >    c) We have to be careful to not break up host-page size chunks, since
> >       this makes it harder to place the pages on the destination.
> >
> > Signed-off-by: Dr. David Alan Gilbert
> >
> > +static bool last_was_from_queue;
>
> Are we using this variable later in the series?

That was a straggler; I've killed it off.

> >  static void migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
> >  {
> >      migration_dirty_pages +=
> > @@ -923,6 +933,41 @@ static int ram_save_compressed_page(QEMUFile *f, RAMBlock *block,
> >      return pages;
> >  }
> >
> > +/*
> > + * Unqueue a page from the queue fed by postcopy page requests
> > + *
> > + * Returns:      The RAMBlock* to transmit from (or NULL if the queue is empty)
> > + *      ms:      MigrationState in
> > + *  offset:      the byte offset within the RAMBlock for the start of the page
> > + * ram_addr_abs: global offset in the dirty/sent bitmaps
> > + */
> > +static RAMBlock *ram_save_unqueue_page(MigrationState *ms, ram_addr_t *offset,
> > +                                       ram_addr_t *ram_addr_abs)
> > +{
> > +    RAMBlock *result = NULL;
> > +    qemu_mutex_lock(&ms->src_page_req_mutex);
> > +    if (!QSIMPLEQ_EMPTY(&ms->src_page_requests)) {
> > +        struct MigrationSrcPageRequest *entry =
> > +                                QSIMPLEQ_FIRST(&ms->src_page_requests);
> > +        result = entry->rb;
> > +        *offset = entry->offset;
> > +        *ram_addr_abs = (entry->offset + entry->rb->offset) & TARGET_PAGE_MASK;
> > +
> > +        if (entry->len > TARGET_PAGE_SIZE) {
> > +            entry->len -= TARGET_PAGE_SIZE;
> > +            entry->offset += TARGET_PAGE_SIZE;
> > +        } else {
> > +            memory_region_unref(result->mr);
>
> Here it is the unref, but I still don't understand why we don't need to
> undo that on the error case on previous patch.
I've added an unref to the 'flush_page_queue' routine that's called during
cleanup; thus we take a ref whenever anything is added to the queue, and
release it either when we remove it to use it, or during cleanup.

> > +            QSIMPLEQ_REMOVE_HEAD(&ms->src_page_requests, next_req);
> > +            g_free(entry);
> > +        }
> > +    }
> > +    qemu_mutex_unlock(&ms->src_page_req_mutex);
> > +
> > +    return result;
> > +}
> > +
> > +
> >  /**
> >   * Queue the pages for transmission, e.g. a request from postcopy destination
> >   *   ms: MigrationStatus in which the queue is held
> > @@ -987,6 +1032,58 @@ err:
> >
> >
> > @@ -997,65 +1094,102 @@ err:
> >   * @f: QEMUFile where to send the data
> >   * @last_stage: if we are at the completion stage
> >   * @bytes_transferred: increase it with the number of transferred bytes
> > + *
> > + * On systems where host-page-size > target-page-size it will send all the
> > + * pages in a host page that are dirty.
> >   */
> >
> >  static int ram_find_and_save_block(QEMUFile *f, bool last_stage,
> >                                     uint64_t *bytes_transferred)
> >  {
> > +    MigrationState *ms = migrate_get_current();
> >      RAMBlock *block = last_seen_block;
> > +    RAMBlock *tmpblock;
> >      ram_addr_t offset = last_offset;
> > +    ram_addr_t tmpoffset;
> >      bool complete_round = false;
> >      int pages = 0;
> > -    MemoryRegion *mr;
> >      ram_addr_t dirty_ram_abs; /* Address of the start of the dirty page in
> >                                   ram_addr_t space */
> >
> > -    if (!block)
> > +    if (!block) {
> >          block = QLIST_FIRST_RCU(&ram_list.blocks);
> > +        last_was_from_queue = false;
> > +    }
> >
> > -    while (true) {
> > -        mr = block->mr;
> > -        offset = migration_bitmap_find_and_reset_dirty(mr, offset,
> > -                                                       &dirty_ram_abs);
> > -        if (complete_round && block == last_seen_block &&
> > -            offset >= last_offset) {
> > -            break;
> > -        }
> > -        if (offset >= block->used_length) {
> > -            offset = 0;
> > -            block = QLIST_NEXT_RCU(block, next);
> > -            if (!block) {
> > -                block = QLIST_FIRST_RCU(&ram_list.blocks);
> > -                complete_round = true;
> > -                ram_bulk_stage = false;
> > -                if (migrate_use_xbzrle()) {
> > -                    /* If xbzrle is on, stop using the data compression at this
> > -                     * point. In theory, xbzrle can do better than compression.
> > -                     */
> > -                    flush_compressed_data(f);
> > -                    compression_switch = false;
> > -                }
> > +    while (true) { /* Until we send a block or run out of stuff to send */
> > +        tmpblock = ram_save_unqueue_page(ms, &tmpoffset, &dirty_ram_abs);
>
> This function was ugly.  You already split it in the past.  This patch
> makes it even more complicated.  Can we try something like add a
>
> ram_find_next_page() and try to put some of the code inside the while
> there?

I've just posted a pair of patches separately that do this; please let me
know if they're on the right lines; they can be applied without postcopy.

> Once here, can we agree to send the next N pages (if they are contiguous)
> if we receive a queued request?

Yeap, deciding N means testing and measuring.

> And can wait for this to be integrated.

Yes we could do that; at the moment I'm working in host page sized chunks.

> > +
> > +        if (tmpblock) {
> > +            /* We've got a block from the postcopy queue */
> > +            trace_ram_find_and_save_block_postcopy(tmpblock->idstr,
> > +                                                   (uint64_t)tmpoffset,
> > +                                                   (uint64_t)dirty_ram_abs);
> > +            /*
> > +             * We're sending this page, and since it's postcopy nothing else
> > +             * will dirty it, and we must make sure it doesn't get sent again
> > +             * even if this queue request was received after the background
> > +             * search already sent it.
> > +             */
> > +            if (!test_bit(dirty_ram_abs >> TARGET_PAGE_BITS,
> > +                          migration_bitmap)) {
>
> I think this test can be inside ram_save_unqueue_page()
>
> I.e. rename to:
>
> ram_save_get_next_queued_page()

Renamed to the shorter get_queued_page (it's static anyway).

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK