[Qemu-devel] RFC: async commands with QMP

All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] RFC: async commands with QMP
@ 2015-07-29 14:49 Marc-André Lureau
  2015-07-29 14:51 ` Marc-André Lureau
  2015-07-29 15:05 ` Daniel P. Berrange
  0 siblings, 2 replies; 9+ messages in thread
From: Marc-André Lureau @ 2015-07-29 14:49 UTC (permalink / raw)
  To: QEMU

Hi

It seems there is no support for async commands with QAPI/QMP other
than having to poll for status, or a "sync" command followed
eventually by an event. But, there is no direct relation between the
command and the event.

As a monitor command shouldn't block, it would be useful to have async
commands support, if possible with a common scheme.

Here is a simple proposal, let's take an existing command:

                -> { "execute": "drive-backup", "arguments": {
"device": "drive0",
                                                               "sync": "full",

"target": "backup.img" },
                      "id": 1234 }
                <- { "return": {}, "id": 1234 }

You can then check regularly the state with query-block-jobs.. Suppose
instead of returning immediately, a similar function would return when
the task is completed. It would be nice if you wouldn't have to
duplicate existing blocking functions to an _async variant. Could we
introduce an additional "async" member? (rightfully rejected on qemu
today)

                -> { "execute": "drive-backup", "arguments": {
"device": "drive0",
                                                               "sync": "full",

"target": "backup.img" },
                      "id": 1234,
                      "async": true }
Here, with "async": true, the caller knows he shouldn't expect an
immediate return, and he can exchange other messages:
                -> { "execute"...
                <- { "event", ...
And when "drive-backup" finishes:
                <- { "return": {}, "id": 1234 }

In order to make "async" variant possible, "id" would have to be
provided and unique. I think the async support should also be
announced in the qmp_capabilities. Commands would have to opt-in for
async support in the schema. An async return wouldn't be allowed when
a blocking command is ongoing.

There are many variations possible on the same idea. We could
introduce new _async functions instead, and keep the "id"
requirements. Or alternatively, an "async_id" could be generated by
qemu and returned immediately. Then the reply of the command could be
an event instead. Ex:

                -> { "execute": "drive-backup-async", "arguments": {
"device": "drive0",
                                                               "sync": "full",

"target": "backup.img" },
                      "id": 1234 }
                <- { "return-async": {}, "async_id": 42 }
... later on:
                <- {"event": "ASYNC_COMPLETED", "async_id": 42,
"data": {.return values could go there..}, "id" 1234}

I haven't looked much on the implementation side, but I can try to
implement a proof-of-concept. Let see if this threads brings some
discussion first.

cheers

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] RFC: async commands with QMP
  2015-07-29 14:49 [Qemu-devel] RFC: async commands with QMP Marc-André Lureau
@ 2015-07-29 14:51 ` Marc-André Lureau
  2015-07-29 15:05 ` Daniel P. Berrange
  1 sibling, 0 replies; 9+ messages in thread
From: Marc-André Lureau @ 2015-07-29 14:51 UTC (permalink / raw)
  To: QEMU

(sorry, the indentation was a bad idea)

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] RFC: async commands with QMP
  2015-07-29 14:49 [Qemu-devel] RFC: async commands with QMP Marc-André Lureau
  2015-07-29 14:51 ` Marc-André Lureau
@ 2015-07-29 15:05 ` Daniel P. Berrange
  2015-07-29 15:10   ` Eric Blake
  2015-07-29 15:18   ` Marc-André Lureau
  1 sibling, 2 replies; 9+ messages in thread
From: Daniel P. Berrange @ 2015-07-29 15:05 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: QEMU

On Wed, Jul 29, 2015 at 04:49:56PM +0200, Marc-André Lureau wrote:
> Hi
> 
> It seems there is no support for async commands with QAPI/QMP other
> than having to poll for status, or a "sync" command followed
> eventually by an event. But, there is no direct relation between the
> command and the event.
> 
> As a monitor command shouldn't block, it would be useful to have async
> commands support, if possible with a common scheme.
> 
> Here is a simple proposal, let's take an existing command:
> 
>                 -> { "execute": "drive-backup", "arguments": {
> "device": "drive0",
>                                                                "sync": "full",
> 
> "target": "backup.img" },
>                       "id": 1234 }
>                 <- { "return": {}, "id": 1234 }
> 
> 
> You can then check regularly the state with query-block-jobs.. Suppose
> instead of returning immediately, a similar function would return when
> the task is completed. It would be nice if you wouldn't have to
> duplicate existing blocking functions to an _async variant. Could we
> introduce an additional "async" member? (rightfully rejected on qemu
> today)
> 
>                 -> { "execute": "drive-backup", "arguments": {
> "device": "drive0",
>                                                                "sync": "full",
> 
> "target": "backup.img" },
>                       "id": 1234,
>                       "async": true }
> Here, with "async": true, the caller knows he shouldn't expect an
> immediate return, and he can exchange other messages:
>                 -> { "execute"...
>                 <- { "event", ...
> And when "drive-backup" finishes:
>                 <- { "return": {}, "id": 1234 }
> 
> In order to make "async" variant possible, "id" would have to be
> provided and unique. I think the async support should also be
> announced in the qmp_capabilities. Commands would have to opt-in for
> async support in the schema. An async return wouldn't be allowed when
> a blocking command is ongoing.
> 
> There are many variations possible on the same idea. We could
> introduce new _async functions instead, and keep the "id"
> requirements. Or alternatively, an "async_id" could be generated by
> qemu and returned immediately. Then the reply of the command could be
> an event instead. Ex:
> 
>                 -> { "execute": "drive-backup-async", "arguments": {
> "device": "drive0",
>                                                                "sync": "full",
> 
> "target": "backup.img" },
>                       "id": 1234 }
>                 <- { "return-async": {}, "async_id": 42 }
> ... later on:
>                 <- {"event": "ASYNC_COMPLETED", "async_id": 42,
> "data": {.return values could go there..}, "id" 1234}
> 
> I haven't looked much on the implementation side, but I can try to
> implement a proof-of-concept. Let see if this threads brings some
> discussion first.

When QMP was originally written there was some infrastructure for doing
async commands, but over time this has all been ripped out because it
was never really used, complicated the code and didn't work too well
either. It seems we pretty much settled on the approach that all
commands should be fast to execute, and where there is a long term
job being run, we have commands to query its status, cancel it, and
sometimes events too.  One of the benefits of this is that it means
that libvirt can determine current status of ongoing background jobs
when it restarts and reconnects to a previously launched QEMU, where
as an async command approach is tied to the specific monitor connection
that is open.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] RFC: async commands with QMP
  2015-07-29 15:05 ` Daniel P. Berrange
@ 2015-07-29 15:10   ` Eric Blake
  2015-07-29 15:32     ` Marc-André Lureau
  2015-07-29 15:18   ` Marc-André Lureau
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Blake @ 2015-07-29 15:10 UTC (permalink / raw)
  To: Daniel P. Berrange, Marc-André Lureau; +Cc: QEMU

[-- Attachment #1: Type: text/plain, Size: 1680 bytes --]

On 07/29/2015 09:05 AM, Daniel P. Berrange wrote:
>> In order to make "async" variant possible, "id" would have to be
>> provided and unique. I think the async support should also be
>> announced in the qmp_capabilities. Commands would have to opt-in for
>> async support in the schema. An async return wouldn't be allowed when
>> a blocking command is ongoing.
>>
>> There are many variations possible on the same idea. We could
>> introduce new _async functions instead, and keep the "id"
>> requirements. Or alternatively, an "async_id" could be generated by
>> qemu and returned immediately. Then the reply of the command could be
>> an event instead. Ex:
>>

> 
> When QMP was originally written there was some infrastructure for doing
> async commands, but over time this has all been ripped out because it
> was never really used, complicated the code and didn't work too well
> either. It seems we pretty much settled on the approach that all
> commands should be fast to execute, and where there is a long term
> job being run, we have commands to query its status, cancel it, and
> sometimes events too.

In fact, see commit 65207c59, where we ripped it out with prejudice.


>  One of the benefits of this is that it means
> that libvirt can determine current status of ongoing background jobs
> when it restarts and reconnects to a previously launched QEMU, where
> as an async command approach is tied to the specific monitor connection
> that is open.

And that is a real concern with any new proposal for async commands.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] RFC: async commands with QMP
  2015-07-29 15:05 ` Daniel P. Berrange
  2015-07-29 15:10   ` Eric Blake
@ 2015-07-29 15:18   ` Marc-André Lureau
  2015-09-14 13:45     ` Markus Armbruster
  1 sibling, 1 reply; 9+ messages in thread
From: Marc-André Lureau @ 2015-07-29 15:18 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: QEMU

Hi Daniel

On Wed, Jul 29, 2015 at 5:05 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
> When QMP was originally written there was some infrastructure for doing
> async commands, but over time this has all been ripped out because it
> was never really used, complicated the code and didn't work too well
> either. It seems we pretty much settled on the approach that all
> commands should be fast to execute, and where there is a long term
> job being run, we have commands to query its status, cancel it, and
> sometimes events too.  One of the benefits of this is that it means
> that libvirt can determine current status of ongoing background jobs
> when it restarts and reconnects to a previously launched QEMU, where
> as an async command approach is tied to the specific monitor connection
> that is open.

Thanks for the historic reasons, I ignored that.

Querying status is not incompatible with having async commands. It's
not because command return immediately that the job is finished, it's
async already! This proposal is about having a return when the job is
actually finished to avoid having to continuously query, that's the
main motivation. Today's return for async commands is pretty useless,
you can admit, it's just a ack that the command got accepted and
eventually started...

Also commands can be tight to the lifecycle of a client. Take memsave
or screendump, if you give them a client fd, you actually don't care
if the client is gone: there is no need to query about the status of
those commands. Instead, they should be cancelled when the client is
gone.

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] RFC: async commands with QMP
  2015-07-29 15:10   ` Eric Blake
@ 2015-07-29 15:32     ` Marc-André Lureau
  2015-07-29 15:45       ` Daniel P. Berrange
  0 siblings, 1 reply; 9+ messages in thread
From: Marc-André Lureau @ 2015-07-29 15:32 UTC (permalink / raw)
  To: Eric Blake; +Cc: QEMU

Hi Eric

On Wed, Jul 29, 2015 at 5:10 PM, Eric Blake <eblake@redhat.com> wrote:
> In fact, see commit 65207c59, where we ripped it out with prejudice.

Whoo, I should have looked at git history :)

That implementation looks quite bad indeed (suspending monitor?) And
it wasn't properly documented it seems. No wonders it wasn't used.
Btw, apparently it's the reason why "id" is there in the first place.
I don't think there is a good reason to keep having it in sync
commands.

>
>>  One of the benefits of this is that it means
>> that libvirt can determine current status of ongoing background jobs
>> when it restarts and reconnects to a previously launched QEMU, where
>> as an async command approach is tied to the specific monitor connection
>> that is open.
>
> And that is a real concern with any new proposal for async commands.

I don't think it's incompatible with having async commands, since in
fact they are already async commands with dummy quick return. Some
async should be cancelled when the client is gone (this depends on the
async command, whether it is tight to a client or not). However, no
old async return should be given to a new client.

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] RFC: async commands with QMP
  2015-07-29 15:32     ` Marc-André Lureau
@ 2015-07-29 15:45       ` Daniel P. Berrange
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel P. Berrange @ 2015-07-29 15:45 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: QEMU

On Wed, Jul 29, 2015 at 05:32:59PM +0200, Marc-André Lureau wrote:
> Hi Eric
> 
> On Wed, Jul 29, 2015 at 5:10 PM, Eric Blake <eblake@redhat.com> wrote:
> > In fact, see commit 65207c59, where we ripped it out with prejudice.
> 
> Whoo, I should have looked at git history :)
> 
> That implementation looks quite bad indeed (suspending monitor?) And
> it wasn't properly documented it seems. No wonders it wasn't used.
> Btw, apparently it's the reason why "id" is there in the first place.
> I don't think there is a good reason to keep having it in sync
> commands.

Yep, the 'id' is pretty pointless now, but we keep accepting it
and sending it in responses in order to not break compatibility
with any existing apps which might be sending it / checking for
it in responses.

> >>  One of the benefits of this is that it means
> >> that libvirt can determine current status of ongoing background jobs
> >> when it restarts and reconnects to a previously launched QEMU, where
> >> as an async command approach is tied to the specific monitor connection
> >> that is open.
> >
> > And that is a real concern with any new proposal for async commands.
> 
> I don't think it's incompatible with having async commands, since in
> fact they are already async commands with dummy quick return. Some
> async should be cancelled when the client is gone (this depends on the
> async command, whether it is tight to a client or not). However, no
> old async return should be given to a new client.

Agreed, they're not mutually exclusive. I guess it is more a question
of whether there is a compelling enough reason to support both usage
approaches

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] RFC: async commands with QMP
  2015-07-29 15:18   ` Marc-André Lureau
@ 2015-09-14 13:45     ` Markus Armbruster
  2015-09-14 16:01       ` Marc-André Lureau
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Armbruster @ 2015-09-14 13:45 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: QEMU

I'm so far behind on e-mail it's not funny anymore.  My apologies...

Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi Daniel
>
> On Wed, Jul 29, 2015 at 5:05 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
>> When QMP was originally written there was some infrastructure for doing
>> async commands, but over time this has all been ripped out because it
>> was never really used, complicated the code and didn't work too well
>> either. It seems we pretty much settled on the approach that all
>> commands should be fast to execute, and where there is a long term
>> job being run, we have commands to query its status, cancel it, and
>> sometimes events too.  One of the benefits of this is that it means
>> that libvirt can determine current status of ongoing background jobs
>> when it restarts and reconnects to a previously launched QEMU, where
>> as an async command approach is tied to the specific monitor connection
>> that is open.
>
> Thanks for the historic reasons, I ignored that.
>
> Querying status is not incompatible with having async commands. It's
> not because command return immediately that the job is finished, it's
> async already!

I'm not sure I got your point here.  But see below.

>                This proposal is about having a return when the job is
> actually finished to avoid having to continuously query, that's the
> main motivation. Today's return for async commands is pretty useless,
> you can admit, it's just a ack that the command got accepted and
> eventually started...

Let's step back and consider the general life cycle of a "job".  I'm
calling it "job" to distinguish it from "QMP command".

    User orders a job to get done

    System either accepts or rejects the job
    if reject, we're done

    Job runs for a while, then either fails or succeeds

Anthony's initial QMP design wanted this wrapped in QMP as follows:

    User sends a QMP command
    If we reject the job, send a QMP error response
    Else, the job runs until it fails or succeeds
        when it fails, send a QMP error response
        when it succeeds, send a QMP success response

The idea was that *all* commands are asynchronous!  However, the initial
implementation was in fact completely synchronous.

For most jobs, the "while" in "runs for a while" is reliably very short,
so this wasn't a problem.  When the first job was added where that isn't
the case, the implementation was hacked up to support asynchronous
commands.  However, we didn't change the existing commands to become
asynchronous then.  Probably because we were pushing very hard to get
QMP reasonably complete, so clients can migrate off HMP.  In retrospect,
that was probably the last chance to go back the original design,
because from then on, commands being synchronous became ABI real quick.
Asynchronous commands remained a freaky exception, and we never even
bothered to fix its bugs.

Instead, we went down a different route: we used QMP events to signal
job completion.  Events already existed, so this was natural.  Jobs
become wrapped in QMP as follows:

    User sends a QMP command
    If we reject the job, send a QMP error response
    Else, send a QMP success response
        now the job runs until it fails or succeeds
        in either case, send a QMP event

For the event to make sense, the initial command usually needs to
establish a suitable job ID.

Now let's refine the life cycle some: clients want to query job status,
and cancel jobs.

The synchronous commands + events design can support it easily.  Simply
have a command to query status, taking the job ID as argument.
Likewise, have a command to cancel.  Both are synchronous.  Cancel may
fail (say because the job has completed meanwhile).

The asynchronous command design could support it as well, with one
problem: we need a job ID.  The QMP command ID is no good, because it's
tied to a connection, while a job ID must remain valid as long as the
job runs.  Solvable.

So yes, "querying status is not incompatible with having async
commands".

My point is: asynchronous commands vs. synchronous commands + events
appears to be a wash: both get the job done.  For better or worse, we
have the latter working, but not the former.  Why should we add the
former now?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] RFC: async commands with QMP
  2015-09-14 13:45     ` Markus Armbruster
@ 2015-09-14 16:01       ` Marc-André Lureau
  0 siblings, 0 replies; 9+ messages in thread
From: Marc-André Lureau @ 2015-09-14 16:01 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: QEMU

Hi

On Mon, Sep 14, 2015 at 3:45 PM, Markus Armbruster <armbru@redhat.com> wrote:
>> Querying status is not incompatible with having async commands. It's
>> not because command return immediately that the job is finished, it's
>> async already!
>
> I'm not sure I got your point here.  But see below.

I try to explain that we have commands that are async (the monitor
returns nothing but you can later query status and receive events for
completion).

It is useful to start calling them -async, and have a common way to
deal with them, instead of having to create dissociated commands and
events and having each command to implement them in different way
(both on callee and caller side).

>>                This proposal is about having a return when the job is
>> actually finished to avoid having to continuously query, that's the
>> main motivation. Today's return for async commands is pretty useless,
>> you can admit, it's just a ack that the command got accepted and
>> eventually started...
>
> Let's step back and consider the general life cycle of a "job".  I'm
> calling it "job" to distinguish it from "QMP command".
>
>     User orders a job to get done
>
>     System either accepts or rejects the job
>     if reject, we're done
>
>     Job runs for a while, then either fails or succeeds
>
> Anthony's initial QMP design wanted this wrapped in QMP as follows:
>
>     User sends a QMP command
>     If we reject the job, send a QMP error response
>     Else, the job runs until it fails or succeeds
>         when it fails, send a QMP error response
>         when it succeeds, send a QMP success response
>
> The idea was that *all* commands are asynchronous!  However, the initial
> implementation was in fact completely synchronous.
>
> For most jobs, the "while" in "runs for a while" is reliably very short,
> so this wasn't a problem.  When the first job was added where that isn't
> the case, the implementation was hacked up to support asynchronous
> commands.  However, we didn't change the existing commands to become
> asynchronous then.  Probably because we were pushing very hard to get
> QMP reasonably complete, so clients can migrate off HMP.  In retrospect,
> that was probably the last chance to go back the original design,
> because from then on, commands being synchronous became ABI real quick.
> Asynchronous commands remained a freaky exception, and we never even
> bothered to fix its bugs.
>
> Instead, we went down a different route: we used QMP events to signal
> job completion.  Events already existed, so this was natural.  Jobs
> become wrapped in QMP as follows:
>
>     User sends a QMP command
>     If we reject the job, send a QMP error response
>     Else, send a QMP success response
>         now the job runs until it fails or succeeds
>         in either case, send a QMP event
>
> For the event to make sense, the initial command usually needs to
> establish a suitable job ID.
>
> Now let's refine the life cycle some: clients want to query job status,
> and cancel jobs.
>
> The synchronous commands + events design can support it easily.  Simply
> have a command to query status, taking the job ID as argument.
> Likewise, have a command to cancel.  Both are synchronous.  Cancel may
> fail (say because the job has completed meanwhile).
>
> The asynchronous command design could support it as well, with one
> problem: we need a job ID.  The QMP command ID is no good, because it's
> tied to a connection, while a job ID must remain valid as long as the
> job runs.  Solvable.
>
> So yes, "querying status is not incompatible with having async
> commands".
>
> My point is: asynchronous commands vs. synchronous commands + events
> appears to be a wash: both get the job done.  For better or worse, we
> have the latter working, but not the former.  Why should we add the
> former now?

If you think about it, the caller already has to deal with unexpected messages:
1)
Caller send QMP command
Caller may receive unrelated QMP event (repeat)
Caller receives QMP return

With async commands today, this is how it goes:
2)
Caller send QMP async command id "foo"
Caller may receive unrelated QMP event (repeat)
Caller receives QMP return
Caller may receive unrelated QMP event (repeat)
Caller receives QMP event for "foo" completion

After all an event is just a message, why not let this message be a
regular "return" for the caller then?
3)
Caller send QMP async command id "foo"
Caller may receive unrelated QMP event (repeat)
Caller receives QMP return for "foo"

There is no big difference with 2). Here the return is associated with
the command, there is no need for an initial "return" and no need to
come up with a new event for completion. The event isn't broadcasted
to all listeners.

So it's not so much about introducing async, but rather providing
helpers to help deal with async commands. Events are not well
associated with commands in the API and they are better for broadcast
than to reply from command. The resulting QAPI isn't actually changed:
But callers should opt-in because return may come out of order and
that's similar to events for return today.

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-09-14 16:01 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-29 14:49 [Qemu-devel] RFC: async commands with QMP Marc-André Lureau
2015-07-29 14:51 ` Marc-André Lureau
2015-07-29 15:05 ` Daniel P. Berrange
2015-07-29 15:10   ` Eric Blake
2015-07-29 15:32     ` Marc-André Lureau
2015-07-29 15:45       ` Daniel P. Berrange
2015-07-29 15:18   ` Marc-André Lureau
2015-09-14 13:45     ` Markus Armbruster
2015-09-14 16:01       ` Marc-André Lureau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.