From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
Date: Tue, 16 Jun 2015 16:01:56 +0100
Message-ID: <55803A64.5040601@citrix.com>
References: <1434375880-30914-1-git-send-email-andrew.cooper3@citrix.com> <1434375880-30914-17-git-send-email-andrew.cooper3@citrix.com> <1434465119.13744.196.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1434465119.13744.196.camel@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel
List-Id: xen-devel@lists.xenproject.org

On 16/06/15 15:31, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> From: Ross Lagerwall
>>
>> Signed-off-by: Ross Lagerwall
>> Signed-off-by: Andrew Cooper
>> CC: Ian Campbell
>> CC: Ian Jackson
>> CC: Wei Liu
> Overall looks good, I've got some comments below and I think it almost
> certainly wants eyes from Ian who knows more about the dc infra etc.
>
>> +void libxl__stream_read_start(libxl__egc *egc,
>> +                              libxl__stream_read_state *stream)
>> +{
>> +    libxl__datacopier_state *dc = &stream->dc;
>> +    int ret = 0;
>> +
>> +    /* State initialisation. */
>> +    assert(!stream->running);
>> +
>> +    memset(dc, 0, sizeof(*dc));
> libxl__datacopier_init, please

That call is made by libxl__datacopier_start() each and every time, and
unlike here, is matched with an equivalent _kill() call.

>
>> +    dc->ao = stream->ao;
>> +    dc->readfd = stream->fd;
>> +    dc->writefd = -1;
>> +
>> +    /* Start reading the stream header. */
>> +    dc->readwhat = "stream header";
>> +    dc->readbuf = &stream->hdr;
>> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
>> +    dc->used = 0;
>> +    dc->callback = stream_header_done;
> This pattern of resetting and reinitialising the dc occurs in multiple
> places, I think a helper would be in order, some sort of
> stream_next_record_init or something perhaps?

The only feasible helper would have to take everything as parameters;
there is insufficient similarity between all users. I dunno whether that
would be harder to read...

>
>> +void libxl__stream_read_abort(libxl__egc *egc,
>> +                              libxl__stream_read_state *stream, int rc)
>> +{
>> +    stream_failed(egc, stream, rc);
>> +}
>> +
>> +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
>> +{
>> +    stream->rc = 0;
>> +    stream->running = false;
>> +
>> +    stream_done(egc, stream);
> Push the running = false into stream_done and flip the assert there?
> Logically the stream is still running until it is done, so having done
> assert it isn't running seems counter-intuitive.

This is more for peace of mind. stream_done() must strictly only ever be
called once, hence its assert.
>
>> +static void stream_done(libxl__egc *egc,
>> +                        libxl__stream_read_state *stream)
>> +{
>> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
>> +
>> +    assert(!stream->running);
>> +
>> +    stream->completion_callback(egc, dcs, stream->rc);
>> +}
>> +
>> +static void stream_header_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int onwrite, int errnoval)
>> +{
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl_sr_hdr *hdr = &stream->hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    int ret = 0;
>> +
>> +    if (onwrite || dc->used != stream->expected_len) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
>> +            onwrite, errnoval, stream->expected_len, dc->used);
>> +        goto err;
>> +    }
> I think you need to check errnoval == 0 in the !onwrite case, otherwise
> you may miss a read error?

"dc->used != stream->expected_len" covers all possible read errors, in
the "something went wrong" kind of way.

>
> Also it looks like onwrite can be -1, which is a separate error case.
>
>> +
>> +static void record_header_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int onwrite, int errnoval)
>> +{
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    int ret = 0;
>> +
>> +    if (onwrite || dc->used != stream->expected_len) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
>> +            onwrite, errnoval, stream->expected_len, dc->used);
>> +        goto err;
>> +    }
> Same comments wrt the arguments as the previous one.
>
> Maybe a common helper to check (and log) the status at the head of each
> callback? So you can effectively do if (!everything_ok(stream, dc) goto
> err?

I will see what I can do.

>
>> +    assert(!ret);
>> +    if (rec_hdr->length) {
>> +        free(stream->rec_body);
>> +        stream->rec_body = NULL;
> reset length too?
>
>> +static void read_emulator_body(libxl__egc *egc,
>> +                               libxl__stream_read_state *stream)
>> +{
>> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
>> +    libxl__datacopier_state *dc = &stream->dc;
>> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
>> +    libxl_sr_emulator_hdr *emu_hdr = stream->rec_body;
>> +    STATE_AO_GC(stream->ao);
>> +    char path[256];
>> +    int ret = 0;
>> +
>> +    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
>> +
>> +    dc->readwhat = "save/migration stream";
>> +    dc->copywhat = "emulator context";
>> +    dc->writewhat = "qemu save file";
>> +    dc->readbuf = NULL;
>> +    dc->writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
> Since this is all done in the same process (or children of it) with
> no setuid etc, I think 0600 would be better to avoid accidentally
> leaving the save state world readable (just in case it matters).

Probably best.

>
> Also, should consider whether this fd needs to be subject to the carefd
> machinery.

Probably does.

>
> Sharing the dc between all these differing usages is starting to rankle a
> little, but I think it is necessary because it may have queued data from
> a previous read which was larger than the current record, correct?
>
> Hrm, isn't setting dc->used = 0 on each reset potentially throwing some
> stuff away?

We should never be in a case where we are setting up a new read/write
from the dc with any previous IO pending.

>
>> +    if (dc->writefd == -1) {
>> +        ret = ERROR_FAIL;
>> +        LOGE(ERROR, "Unable to open '%s'", path);
>> +        goto err;
>> +    }
>> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
>> +    stream->expected_len = dc->used = 0;
> expecting 0? This differs from the pattern common everywhere else and
> I'm not sure why.

The datacopier has been overloaded so many times, it is messy to use. In
this case, we are splicing from a read fd to a write fd, rather than to a
local buffer. Therefore, when the IO is complete, we expect 0 bytes in
the local buffer, as all data should end up in the fd.

>
>> +    dc->callback = emulator_body_done;
>> +
>> +    ret = libxl__datacopier_start(dc);
>> +    if (ret)
>> +        goto err;
>> +    return;
>> +
>> + err:
>> +    assert(ret);
>> +    stream_failed(egc, stream, ret);
>> +}
>> +
>> +static void emulator_body_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int onwrite, int errnoval)
>> +{
>> +    /* Safe to be static, as it is a write-only discard buffer. */
>> +    static char padding[1U << REC_ALIGN_ORDER];
>> +
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    unsigned int nr_padding_bytes = (1U << REC_ALIGN_ORDER);
>> +    int ret = 0;
>> +
>> +    if (onwrite || dc->used != stream->expected_len) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
>> +            onwrite, errnoval, stream->expected_len, dc->used);
>> +        goto err;
>> +    }
>> +
>> +    /* Undo modifications for splicing the emulator context. */
> Hrm, not so much undo as nuke and rebuild. Is that really necessary,
> can't you just reset what you need to in the inverse of the other thing?
>
> If there isn't a problem with buffered stuff on callback, then perhaps
> it would be clearer to use a separate dc, at least for the qemu side. Or
> to _always_ teardown and restart the dc from scratch instead of doing it
> partially in some places and fully in others.
>
>
>> +    memset(dc, 0, sizeof(*dc));
>> +    dc->ao = stream->ao;
>> +    dc->readfd = stream->fd;
>> +    dc->writefd = -1;
>> +
>> +    /* Do we need to eat some padding out of the stream? */
> Why only now and not for e.g. the xenstore stuff (which doesn't appear
> to be explicitly padded).

Any record which is read into a local buffer has the local buffer aligned
up, and the padding read onto the end.
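
For illustration, the rounding described above can be sketched roughly as
follows. This is not the code from the patch: REC_ALIGN_ORDER is assumed
to be 3 here purely for the example, and rec_aligned_len() and
rec_padding_len() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: 8-byte record alignment, chosen for illustration only. */
#define REC_ALIGN_ORDER 3U
#define REC_ALIGN ((size_t)1 << REC_ALIGN_ORDER)

/* Hypothetical helper: round a record body length up to the alignment. */
static size_t rec_aligned_len(size_t len)
{
    return (len + REC_ALIGN - 1) & ~(REC_ALIGN - 1);
}

/* Hypothetical helper: padding bytes trailing a body of 'len' bytes. */
static size_t rec_padding_len(size_t len)
{
    size_t pad = rec_aligned_len(len) - len;

    /* Padding is always strictly smaller than one alignment unit. */
    assert(pad < REC_ALIGN);
    return pad;
}
```

Under that assumption, a buffered record with a 13-byte body would be read
as 16 bytes, with the last 3 being padding which simply lands on the end of
the local buffer.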
>
> And given that why not handle this in some central place rather than in
> the emulator only place?

Experimentally, some versions of Qemu barf if they have trailing zeros in
the save file. I think they expect to find eof() on a qemu record
boundary.

~Andrew