From: Marc-André Lureau
To: Markus Armbruster
Cc: QEMU
Subject: Re: [Qemu-devel] RFC: async commands with QMP
Date: Mon, 14 Sep 2015 18:01:31 +0200
In-Reply-To: <87613dp12b.fsf@blackfin.pond.sub.org>
References: <20150729150531.GI16847@redhat.com> <87613dp12b.fsf@blackfin.pond.sub.org>

Hi

On Mon, Sep 14, 2015 at 3:45 PM, Markus Armbruster wrote:
>> Querying status is not incompatible with having async commands. It's
>> not because command return immediately that the job is finished, it's
>> async already!
>
> I'm not sure I got your point here. But see below.

I am trying to explain that we already have commands that are async (the
monitor returns nothing, but you can later query status and receive
events for completion). It would be useful to start calling them -async
and to have a common way to deal with them, instead of having to create
dissociated commands and events, with each command implementing them in
a different way (on both the callee and the caller side).

>> This proposal is about having a return when the job is
>> actually finished to avoid having to continuously query, that's the
>> main motivation. Today's return for async commands is pretty useless,
>> you can admit, it's just a ack that the command got accepted and
>> eventually started...
>
> Let's step back and consider the general life cycle of a "job". I'm
> calling it "job" to distinguish it from "QMP command".
>
>     User orders a job to get done
>
>     System either accepts or rejects the job
>         if reject, we're done
>
>     Job runs for a while, then either fails or succeeds
>
> Anthony's initial QMP design wanted this wrapped in QMP as follows:
>
>     User sends a QMP command
>     If we reject the job, send a QMP error response
>     Else, the job runs until it fails or succeeds
>         when it fails, send a QMP error response
>         when it succeeds, send a QMP success response
>
> The idea was that *all* commands are asynchronous! However, the initial
> implementation was in fact completely synchronous.
>
> For most jobs, the "while" in "runs for a while" is reliably very short,
> so this wasn't a problem. When the first job was added where that isn't
> the case, the implementation was hacked up to support asynchronous
> commands. However, we didn't change the existing commands to become
> asynchronous then. Probably because we were pushing very hard to get
> QMP reasonably complete, so clients can migrate off HMP. In retrospect,
> that was probably the last chance to go back the original design,
> because from then on, commands being synchronous became ABI real quick.
> Asynchronous commands remained a freaky exception, and we never even
> bothered to fix its bugs.
>
> Instead, we went down a different route: we used QMP events to signal
> job completion. Events already existed, so this was natural.
> Jobs become wrapped in QMP as follows:
>
>     User sends a QMP command
>     If we reject the job, send a QMP error response
>     Else, send a QMP success response
>         now the job runs until it fails or succeeds
>         in either case, send a QMP event
>
> For the event to make sense, the initial command usually needs to
> establish a suitable job ID.
>
> Now let's refine the life cycle some: clients want to query job status,
> and cancel jobs.
>
> The synchronous commands + events design can support it easily. Simply
> have a command to query status, taking the job ID as argument.
> Likewise, have a command to cancel. Both are synchronous. Cancel may
> fail (say because the job has completed meanwhile).
>
> The asynchronous command design could support it as well, with one
> problem: we need a job ID. The QMP command ID is no good, because it's
> tied to a connection, while a job ID must remain valid as long as the
> job runs. Solvable.
>
> So yes, "querying status is not incompatible with having async
> commands".
>
> My point is: asynchronous commands vs. synchronous commands + events
> appears to be a wash: both get the job done. For better or worse, we
> have the latter working, but not the former. Why should we add the
> former now?

If you think about it, the caller already has to deal with unexpected
messages:

1)
Caller sends QMP command
Caller may receive unrelated QMP event (repeat)
Caller receives QMP return

With async commands today, this is how it goes:

2)
Caller sends QMP async command id "foo"
Caller may receive unrelated QMP event (repeat)
Caller receives QMP return
Caller may receive unrelated QMP event (repeat)
Caller receives QMP event for "foo" completion

After all, an event is just a message, so why not let this message be a
regular "return" for the caller?

3)
Caller sends QMP async command id "foo"
Caller may receive unrelated QMP event (repeat)
Caller receives QMP return for "foo"

There is no big difference with 2).
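
To make the comparison between 2) and 3) a bit more concrete, here is a
rough sketch of the caller side in Python. It is only an illustration
under assumptions: the "JOB_COMPLETED" event name, the "job" field, the
"job0" job ID and both helper functions are made up, and the messages
are hand-written samples rather than real QMP output. Only the general
"return" / "event" / "id" message layout is taken from QMP.

```python
def wait_events_style(messages, cmd_id, job_id, done_event="JOB_COMPLETED"):
    """Flow 2): the return is only an ack; completion arrives as an event.

    done_event and the "job" data field are hypothetical names.
    """
    acked = False
    for msg in messages:
        if msg.get("id") == cmd_id and "return" in msg:
            acked = True  # command accepted, job may still be running
        elif (acked and msg.get("event") == done_event
              and msg.get("data", {}).get("job") == job_id):
            return msg["data"]  # job finished, result carried by the event
        # anything else is an unrelated message: skip it
    return None


def wait_async_style(messages, cmd_id):
    """Flow 3): the return for cmd_id is itself the completion."""
    for msg in messages:
        if msg.get("id") == cmd_id and "return" in msg:
            return msg["return"]
        # unrelated events are skipped exactly as in flow 2)
    return None


# Hand-written sample streams, one per flow (not real QMP traffic).
flow2 = [
    {"event": "OTHER", "data": {}},
    {"return": {}, "id": "foo"},
    {"event": "OTHER", "data": {}},
    {"event": "JOB_COMPLETED", "data": {"job": "job0", "result": "ok"}},
]
flow3 = [
    {"event": "OTHER", "data": {}},
    {"return": {"result": "ok"}, "id": "foo"},
]

result2 = wait_events_style(flow2, "foo", "job0")
result3 = wait_async_style(flow3, "foo")
```

In both helpers the caller loops over incoming messages and skips the
ones it does not care about. The difference is that the second one needs
no extra event name and no separate job ID to recognise completion.
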
Here the return is associated with the command, so there is no need for
an initial "return" and no need to come up with a new event for
completion. The event isn't broadcast to all listeners.

So it's not so much about introducing async, but rather about providing
helpers to deal with async commands. Events are not well associated with
commands in the API, and they are better suited to broadcasting than to
replying to a command.

The resulting QAPI isn't actually changed:

But callers should opt in, because the return may come out of order,
which is similar to how events stand in for the return today.

-- 
Marc-André Lureau