All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn
@ 2015-07-10  3:46 Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 01/15] qapi: Add transaction support to block-dirty-bitmap operations Fam Zheng
                   ` (14 more replies)
  0 siblings, 15 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

v3: Simplify the code a bit to implement the idea as discussed in v2 thread:
    https://lists.gnu.org/archive/html/qemu-devel/2015-07/msg02130.html

    Unchanged:
    [01/15] qapi: Add transaction support to block-dirty-bitmap operations
    [02/15] iotests: add transactional incremental backup test
    [03/15] block: rename BlkTransactionState and BdrvActionOps
    [04/15] block: keep bitmap if incremental backup job is cancelled

    New patch to refactor out dirty bitmap reclaimation code in backup:
    [05/15] backup: Extract dirty bitmap handling as a separate function

    Two new transaction specific operations for block job driver:
    [06/15] blockjob: Add .txn_commit and .txn_abort transaction actions

    Two new fields that will make managing state of block job easier:
    [07/15] blockjob: Add "completed" and "ret" in BlockJob

    New. Necessary to avoid race conditions between jobs:
    [08/15] blockjob: Simplify block_job_finish_sync
    [09/15] blockjob: Move BlockJobDeferToMainLoopData into BlockJob

    Partly rewrote the implementation:
    [10/15] block: add block job transactions

    Slightly adjust to the new API (no block_job_txn_begin):
    [11/15] blockdev: make BlockJobTxn available to qmp 'transaction'

    Slightly adjust to the new API (no block_job_txn_job_done).
    [12/15] block/backup: support block job transactions

    Unchanged:
    [13/15] iotests: 124 - transactional failure test
    [14/15] qmp-commands.hx: Update the supported 'transaction' operations
    [15/15] tests: add BlockJobTxn unit test


---
Original cover letter from Stefan's v2, note that block_job_txn_begin and
block_job_txn_job_done are not necessary now:

v2:
 Patch 5:
  * Set txn pointer to NULL in block_job_txn_begin() [jsnow]
  * Rename block_job_txn_job_done to block_job_txn_job_done [jsnow]
  * Rename block_job_txn_complete to block_job_txn_kick [jsnow]
  * Add BLOCK_JOB_TXN_CANCEL_PENDING to solve race condition on cancel [jsnow]
  * Document when txn may be NULL
 Patch 7:
  * Convert blockdev-backup in addition to drive-backup
  * Add 'transactional-cancel' argument so users can enable/disable
    transactional behavior.  This preserves semantics for existing users
    and allows fine-grained control over when to use transaction
    semantics. [jsnow]
 Patch 10:
  * Test fail/cancel race condition [jsnow]
  * Use new 'transactional-cancel' QMP argument

This series uses patches from John Snow's "[PATCH v6 00/10] block: incremental
backup transactions" series but implements the feature with a new transaction
mechanism for blockjobs called BlockJobTxn.

Recap: motivation for block job transactions
--------------------------------------------
If an incremental backup block job fails then we reclaim the bitmap so
the job can be retried.  The problem comes when multiple jobs are started as
part of a qmp 'transaction' command.  We need to group these jobs in a
transaction so that either all jobs complete successfully or all bitmaps are
reclaimed.

Without transactions, there is a case where some jobs complete successfully and
throw away their bitmaps, making it impossible to retry the backup by rerunning
the command if one of the jobs fails.

How does this implementation work?
----------------------------------
These patches add a BlockJobTxn object with the following API:

  txn = block_job_txn_new();
  block_job_txn_add_job(txn, job1);
  block_job_txn_add_job(txn, job2);
  block_job_txn_begin();

The jobs either both complete successfully or they both fail/cancel.  If the
user cancels job1 then job2 will also be cancelled and vice versa.

Jobs stay alive waiting for other jobs to complete.  They can be cancelled by
the user during this time.  Job blockers are still in effect and no other block
job can run on this device in the meantime (since QEMU currently only allows 1
job per device).  This is the main drawback to this approach but reasonable
since you probably don't want to run other jobs/operations until you're sure
the backup was successful (you won't be able to retry a failed backup if
there's a new job running).

Adding transaction support to the backup job is very easy.  It just needs to
make a call before throwing away the bitmap and returning from its coroutine:

  block_job_txn_job_done(job->txn, job, ret);

  if (job->sync_bitmap) {
      BdrvDirtyBitmap *bm;
      if (ret < 0 || block_job_is_cancelled(&job->common)) {
          ...

Fam Zheng (5):
  backup: Extract dirty bitmap handling as a separate function
  blockjob: Add .txn_commit and .txn_abort transaction actions
  blockjob: Add "completed" and "ret" in BlockJob
  blockjob: Simplify block_job_finish_sync
  blockjob: Move BlockJobDeferToMainLoopData into BlockJob

John Snow (4):
  qapi: Add transaction support to block-dirty-bitmap operations
  iotests: add transactional incremental backup test
  block: rename BlkTransactionState and BdrvActionOps
  iotests: 124 - transactional failure test

Kashyap Chamarthy (1):
  qmp-commands.hx: Update the supported 'transaction' operations

Stefan Hajnoczi (5):
  block: keep bitmap if incremental backup job is cancelled
  block: add block job transactions
  blockdev: make BlockJobTxn available to qmp 'transaction'
  block/backup: support block job transactions
  tests: add BlockJobTxn unit test

 block.c                    |  19 ++-
 block/backup.c             |  49 ++++--
 blockdev.c                 | 360 ++++++++++++++++++++++++++++++++++++---------
 blockjob.c                 | 146 +++++++++++++-----
 docs/bitmaps.md            |   6 +-
 hmp.c                      |   2 +-
 include/block/block.h      |   2 +-
 include/block/block_int.h  |   6 +-
 include/block/blockjob.h   |  63 +++++++-
 qapi-schema.json           |   6 +-
 qapi/block-core.json       |  14 +-
 qmp-commands.hx            |  21 ++-
 tests/Makefile             |   3 +
 tests/qemu-iotests/124     | 182 ++++++++++++++++++++++-
 tests/qemu-iotests/124.out |   4 +-
 tests/test-blockjob-txn.c  | 228 ++++++++++++++++++++++++++++
 16 files changed, 963 insertions(+), 148 deletions(-)
 create mode 100644 tests/test-blockjob-txn.c

-- 
2.4.3

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 01/15] qapi: Add transaction support to block-dirty-bitmap operations
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 02/15] iotests: add transactional incremental backup test Fam Zheng
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: John Snow <jsnow@redhat.com>

This adds two qmp commands to transactions.

block-dirty-bitmap-add allows you to create a bitmap simultaneously
alongside a new full backup to accomplish a clean synchronization
point.

block-dirty-bitmap-clear allows you to reset a bitmap back to as-if
it were new, which can also be used alongside a full backup to
accomplish a clean synchronization point.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c                   |  19 +++++++-
 blockdev.c                | 114 +++++++++++++++++++++++++++++++++++++++++++++-
 docs/bitmaps.md           |   6 +--
 include/block/block.h     |   1 -
 include/block/block_int.h |   3 ++
 qapi-schema.json          |   6 ++-
 6 files changed, 139 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 5e80336..0d1623a 100644
--- a/block.c
+++ b/block.c
@@ -3510,10 +3510,25 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
     hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
-void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 {
     assert(bdrv_dirty_bitmap_enabled(bitmap));
-    hbitmap_reset_all(bitmap->bitmap);
+    if (!out) {
+        hbitmap_reset_all(bitmap->bitmap);
+    } else {
+        HBitmap *backup = bitmap->bitmap;
+        bitmap->bitmap = hbitmap_alloc(bitmap->size,
+                                       hbitmap_granularity(backup));
+        *out = backup;
+    }
+}
+
+void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
+{
+    HBitmap *tmp = bitmap->bitmap;
+    assert(bdrv_dirty_bitmap_enabled(bitmap));
+    bitmap->bitmap = in;
+    hbitmap_free(tmp);
 }
 
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
diff --git a/blockdev.c b/blockdev.c
index 7fee519..a4d8f65 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1708,6 +1708,106 @@ static void blockdev_backup_clean(BlkTransactionState *common)
     }
 }
 
+typedef struct BlockDirtyBitmapState {
+    BlkTransactionState common;
+    BdrvDirtyBitmap *bitmap;
+    BlockDriverState *bs;
+    AioContext *aio_context;
+    HBitmap *backup;
+    bool prepared;
+} BlockDirtyBitmapState;
+
+static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+                                           Error **errp)
+{
+    Error *local_err = NULL;
+    BlockDirtyBitmapAdd *action;
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    action = common->action->block_dirty_bitmap_add;
+    /* AIO context taken and released within qmp_block_dirty_bitmap_add */
+    qmp_block_dirty_bitmap_add(action->node, action->name,
+                               action->has_granularity, action->granularity,
+                               &local_err);
+
+    if (!local_err) {
+        state->prepared = true;
+    } else {
+        error_propagate(errp, local_err);
+    }
+}
+
+static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+{
+    BlockDirtyBitmapAdd *action;
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    action = common->action->block_dirty_bitmap_add;
+    /* Should not be able to fail: IF the bitmap was added via .prepare(),
+     * then the node reference and bitmap name must have been valid.
+     */
+    if (state->prepared) {
+        qmp_block_dirty_bitmap_remove(action->node, action->name, &error_abort);
+    }
+}
+
+static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
+                                             Error **errp)
+{
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+    BlockDirtyBitmap *action;
+
+    action = common->action->block_dirty_bitmap_clear;
+    state->bitmap = block_dirty_bitmap_lookup(action->node,
+                                              action->name,
+                                              &state->bs,
+                                              &state->aio_context,
+                                              errp);
+    if (!state->bitmap) {
+        return;
+    }
+
+    if (bdrv_dirty_bitmap_frozen(state->bitmap)) {
+        error_setg(errp, "Cannot modify a frozen bitmap");
+        return;
+    } else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
+        error_setg(errp, "Cannot clear a disabled bitmap");
+        return;
+    }
+
+    bdrv_clear_dirty_bitmap(state->bitmap, &state->backup);
+    /* AioContext is released in .clean() */
+}
+
+static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
+{
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    bdrv_undo_clear_dirty_bitmap(state->bitmap, state->backup);
+}
+
+static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
+{
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    hbitmap_free(state->backup);
+}
+
+static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
+{
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    if (state->aio_context) {
+        aio_context_release(state->aio_context);
+    }
+}
+
 static void abort_prepare(BlkTransactionState *common, Error **errp)
 {
     error_setg(errp, "Transaction aborted using Abort action");
@@ -1748,6 +1848,18 @@ static const BdrvActionOps actions[] = {
         .abort = internal_snapshot_abort,
         .clean = internal_snapshot_clean,
     },
+    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ADD] = {
+        .instance_size = sizeof(BlockDirtyBitmapState),
+        .prepare = block_dirty_bitmap_add_prepare,
+        .abort = block_dirty_bitmap_add_abort,
+    },
+    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_CLEAR] = {
+        .instance_size = sizeof(BlockDirtyBitmapState),
+        .prepare = block_dirty_bitmap_clear_prepare,
+        .commit = block_dirty_bitmap_clear_commit,
+        .abort = block_dirty_bitmap_clear_abort,
+        .clean = block_dirty_bitmap_clear_clean,
+    }
 };
 
 /*
@@ -2131,7 +2243,7 @@ void qmp_block_dirty_bitmap_clear(const char *node, const char *name,
         goto out;
     }
 
-    bdrv_clear_dirty_bitmap(bitmap);
+    bdrv_clear_dirty_bitmap(bitmap, NULL);
 
  out:
     aio_context_release(aio_context);
diff --git a/docs/bitmaps.md b/docs/bitmaps.md
index fa87f07..9fd8ea6 100644
--- a/docs/bitmaps.md
+++ b/docs/bitmaps.md
@@ -97,11 +97,7 @@ which is included at the end of this document.
 }
 ```
 
-## Transactions (Not yet implemented)
-
-* Transactional commands are forthcoming in a future version,
-  and are not yet available for use. This section serves as
-  documentation of intent for their design and usage.
+## Transactions
 
 ### Justification
 
diff --git a/include/block/block.h b/include/block/block.h
index 06e4137..7437590 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -497,7 +497,6 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
                            int64_t cur_sector, int nr_sectors);
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
                              int64_t cur_sector, int nr_sectors);
-void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, struct HBitmapIter *hbi);
 void bdrv_set_dirty_iter(struct HBitmapIter *hbi, int64_t offset);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8996baf..6eb22c7 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -663,4 +663,7 @@ void blk_dev_resize_cb(BlockBackend *blk);
 
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
 
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out);
+void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in);
+
 #endif /* BLOCK_INT_H */
diff --git a/qapi-schema.json b/qapi-schema.json
index 1285b8c..d5fbc46 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1493,6 +1493,8 @@
 # abort since 1.6
 # blockdev-snapshot-internal-sync since 1.7
 # blockdev-backup since 2.3
+# block-dirty-bitmap-add since 2.4
+# block-dirty-bitmap-clear since 2.4
 ##
 { 'union': 'TransactionAction',
   'data': {
@@ -1500,7 +1502,9 @@
        'drive-backup': 'DriveBackup',
        'blockdev-backup': 'BlockdevBackup',
        'abort': 'Abort',
-       'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal'
+       'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
+       'block-dirty-bitmap-add': 'BlockDirtyBitmapAdd',
+       'block-dirty-bitmap-clear': 'BlockDirtyBitmap'
    } }
 
 ##
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 02/15] iotests: add transactional incremental backup test
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 01/15] qapi: Add transaction support to block-dirty-bitmap operations Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 03/15] block: rename BlkTransactionState and BdrvActionOps Fam Zheng
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: John Snow <jsnow@redhat.com>

Test simple usage cases for using transactions to create
and synchronize incremental backups.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/124     | 54 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/124.out |  4 ++--
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 9ccd118..9c1977e 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -36,6 +36,23 @@ def try_remove(img):
         pass
 
 
+def transaction_action(action, **kwargs):
+    return {
+        'type': action,
+        'data': kwargs
+    }
+
+
+def transaction_bitmap_clear(node, name, **kwargs):
+    return transaction_action('block-dirty-bitmap-clear',
+                              node=node, name=name, **kwargs)
+
+
+def transaction_drive_backup(device, target, **kwargs):
+    return transaction_action('drive-backup', device=device, target=target,
+                              **kwargs)
+
+
 class Bitmap:
     def __init__(self, name, drive):
         self.name = name
@@ -264,6 +281,43 @@ class TestIncrementalBackup(iotests.QMPTestCase):
         return self.do_incremental_simple(granularity=131072)
 
 
+    def test_incremental_transaction(self):
+        '''Test: Verify backups made from transactionally created bitmaps.
+
+        Create a bitmap "before" VM execution begins, then create a second
+        bitmap AFTER writes have already occurred. Use transactions to create
+        a full backup and synchronize both bitmaps to this backup.
+        Create an incremental backup through both bitmaps and verify that
+        both backups match the current drive0 image.
+        '''
+
+        drive0 = self.drives[0]
+        bitmap0 = self.add_bitmap('bitmap0', drive0)
+        self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+                                          ('0xfe', '16M', '256k'),
+                                          ('0x64', '32736k', '64k')))
+        bitmap1 = self.add_bitmap('bitmap1', drive0)
+
+        result = self.vm.qmp('transaction', actions=[
+            transaction_bitmap_clear(bitmap0.drive['id'], bitmap0.name),
+            transaction_bitmap_clear(bitmap1.drive['id'], bitmap1.name),
+            transaction_drive_backup(drive0['id'], drive0['backup'],
+                                     sync='full', format=drive0['fmt'])
+        ])
+        self.assert_qmp(result, 'return', {})
+        self.wait_until_completed(drive0['id'])
+        self.files.append(drive0['backup'])
+
+        self.hmp_io_writes(drive0['id'], (('0x9a', 0, 512),
+                                          ('0x55', '8M', '352k'),
+                                          ('0x78', '15872k', '1M')))
+        # Both bitmaps should be correctly in sync.
+        self.create_incremental(bitmap0)
+        self.create_incremental(bitmap1)
+        self.vm.shutdown()
+        self.check_backups()
+
+
     def test_incremental_failure(self):
         '''Test: Verify backups made after a failure are correct.
 
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 2f7d390..594c16f 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-.......
+........
 ----------------------------------------------------------------------
-Ran 7 tests
+Ran 8 tests
 
 OK
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 03/15] block: rename BlkTransactionState and BdrvActionOps
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 01/15] qapi: Add transaction support to block-dirty-bitmap operations Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 02/15] iotests: add transactional incremental backup test Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 04/15] block: keep bitmap if incremental backup job is cancelled Fam Zheng
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: John Snow <jsnow@redhat.com>

These structures are misnomers, somewhat.

(1) BlockTransactionState is not state for a transaction,
    but is rather state for a single transaction action.
    Rename it "BlkActionState" to be more accurate.

(2) The BdrvActionOps describes operations for the BlkActionState,
    above. This name might imply a 'BdrvAction' or a 'BdrvActionState',
    which there isn't.
    Rename this to 'BlkActionOps' to match 'BlkActionState'.

Lastly, update the surrounding in-line documentation and comments
to reflect the current nature of how Transactions operate.

This patch changes only comments and names, and should not affect
behavior in any way.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 116 ++++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 65 insertions(+), 51 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index a4d8f65..0ab8ad9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1240,43 +1240,57 @@ static BdrvDirtyBitmap *block_dirty_bitmap_lookup(const char *node,
 
 /* New and old BlockDriverState structs for atomic group operations */
 
-typedef struct BlkTransactionState BlkTransactionState;
+typedef struct BlkActionState BlkActionState;
 
-/* Only prepare() may fail. In a single transaction, only one of commit() or
-   abort() will be called, clean() will always be called if it present. */
-typedef struct BdrvActionOps {
-    /* Size of state struct, in bytes. */
+/**
+ * BlkActionOps:
+ * Table of operations that define an Action.
+ *
+ * @instance_size: Size of state struct, in bytes.
+ * @prepare: Prepare the work, must NOT be NULL.
+ * @commit: Commit the changes, can be NULL.
+ * @abort: Abort the changes on fail, can be NULL.
+ * @clean: Clean up resources after all transaction actions have called
+ *         commit() or abort(). Can be NULL.
+ *
+ * Only prepare() may fail. In a single transaction, only one of commit() or
+ * abort() will be called. clean() will always be called if it is present.
+ */
+typedef struct BlkActionOps {
     size_t instance_size;
-    /* Prepare the work, must NOT be NULL. */
-    void (*prepare)(BlkTransactionState *common, Error **errp);
-    /* Commit the changes, can be NULL. */
-    void (*commit)(BlkTransactionState *common);
-    /* Abort the changes on fail, can be NULL. */
-    void (*abort)(BlkTransactionState *common);
-    /* Clean up resource in the end, can be NULL. */
-    void (*clean)(BlkTransactionState *common);
-} BdrvActionOps;
+    void (*prepare)(BlkActionState *common, Error **errp);
+    void (*commit)(BlkActionState *common);
+    void (*abort)(BlkActionState *common);
+    void (*clean)(BlkActionState *common);
+} BlkActionOps;
 
-/*
- * This structure must be arranged as first member in child type, assuming
- * that compiler will also arrange it to the same address with parent instance.
- * Later it will be used in free().
+/**
+ * BlkActionState:
+ * Describes one Action's state within a Transaction.
+ *
+ * @action: QAPI-defined enum identifying which Action to perform.
+ * @ops: Table of ActionOps this Action can perform.
+ * @entry: List membership for all Actions in this Transaction.
+ *
+ * This structure must be arranged as first member in a subclassed type,
+ * assuming that the compiler will also arrange it to the same offsets as the
+ * base class.
  */
-struct BlkTransactionState {
+struct BlkActionState {
     TransactionAction *action;
-    const BdrvActionOps *ops;
-    QSIMPLEQ_ENTRY(BlkTransactionState) entry;
+    const BlkActionOps *ops;
+    QSIMPLEQ_ENTRY(BlkActionState) entry;
 };
 
 /* internal snapshot private data */
 typedef struct InternalSnapshotState {
-    BlkTransactionState common;
+    BlkActionState common;
     BlockDriverState *bs;
     AioContext *aio_context;
     QEMUSnapshotInfo sn;
 } InternalSnapshotState;
 
-static void internal_snapshot_prepare(BlkTransactionState *common,
+static void internal_snapshot_prepare(BlkActionState *common,
                                       Error **errp)
 {
     Error *local_err = NULL;
@@ -1372,7 +1386,7 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
     state->bs = bs;
 }
 
-static void internal_snapshot_abort(BlkTransactionState *common)
+static void internal_snapshot_abort(BlkActionState *common)
 {
     InternalSnapshotState *state =
                              DO_UPCAST(InternalSnapshotState, common, common);
@@ -1395,7 +1409,7 @@ static void internal_snapshot_abort(BlkTransactionState *common)
     }
 }
 
-static void internal_snapshot_clean(BlkTransactionState *common)
+static void internal_snapshot_clean(BlkActionState *common)
 {
     InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
                                              common, common);
@@ -1407,13 +1421,13 @@ static void internal_snapshot_clean(BlkTransactionState *common)
 
 /* external snapshot private data */
 typedef struct ExternalSnapshotState {
-    BlkTransactionState common;
+    BlkActionState common;
     BlockDriverState *old_bs;
     BlockDriverState *new_bs;
     AioContext *aio_context;
 } ExternalSnapshotState;
 
-static void external_snapshot_prepare(BlkTransactionState *common,
+static void external_snapshot_prepare(BlkActionState *common,
                                       Error **errp)
 {
     BlockDriver *drv;
@@ -1534,7 +1548,7 @@ static void external_snapshot_prepare(BlkTransactionState *common,
     }
 }
 
-static void external_snapshot_commit(BlkTransactionState *common)
+static void external_snapshot_commit(BlkActionState *common)
 {
     ExternalSnapshotState *state =
                              DO_UPCAST(ExternalSnapshotState, common, common);
@@ -1552,7 +1566,7 @@ static void external_snapshot_commit(BlkTransactionState *common)
     aio_context_release(state->aio_context);
 }
 
-static void external_snapshot_abort(BlkTransactionState *common)
+static void external_snapshot_abort(BlkActionState *common)
 {
     ExternalSnapshotState *state =
                              DO_UPCAST(ExternalSnapshotState, common, common);
@@ -1565,13 +1579,13 @@ static void external_snapshot_abort(BlkTransactionState *common)
 }
 
 typedef struct DriveBackupState {
-    BlkTransactionState common;
+    BlkActionState common;
     BlockDriverState *bs;
     AioContext *aio_context;
     BlockJob *job;
 } DriveBackupState;
 
-static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
+static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
     BlockDriverState *bs;
@@ -1612,7 +1626,7 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
     state->job = state->bs->job;
 }
 
-static void drive_backup_abort(BlkTransactionState *common)
+static void drive_backup_abort(BlkActionState *common)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
     BlockDriverState *bs = state->bs;
@@ -1623,7 +1637,7 @@ static void drive_backup_abort(BlkTransactionState *common)
     }
 }
 
-static void drive_backup_clean(BlkTransactionState *common)
+static void drive_backup_clean(BlkActionState *common)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
 
@@ -1633,13 +1647,13 @@ static void drive_backup_clean(BlkTransactionState *common)
 }
 
 typedef struct BlockdevBackupState {
-    BlkTransactionState common;
+    BlkActionState common;
     BlockDriverState *bs;
     BlockJob *job;
     AioContext *aio_context;
 } BlockdevBackupState;
 
-static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
+static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
     BlockdevBackup *backup;
@@ -1688,7 +1702,7 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
     state->job = state->bs->job;
 }
 
-static void blockdev_backup_abort(BlkTransactionState *common)
+static void blockdev_backup_abort(BlkActionState *common)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
     BlockDriverState *bs = state->bs;
@@ -1699,7 +1713,7 @@ static void blockdev_backup_abort(BlkTransactionState *common)
     }
 }
 
-static void blockdev_backup_clean(BlkTransactionState *common)
+static void blockdev_backup_clean(BlkActionState *common)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
 
@@ -1709,7 +1723,7 @@ static void blockdev_backup_clean(BlkTransactionState *common)
 }
 
 typedef struct BlockDirtyBitmapState {
-    BlkTransactionState common;
+    BlkActionState common;
     BdrvDirtyBitmap *bitmap;
     BlockDriverState *bs;
     AioContext *aio_context;
@@ -1717,7 +1731,7 @@ typedef struct BlockDirtyBitmapState {
     bool prepared;
 } BlockDirtyBitmapState;
 
-static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+static void block_dirty_bitmap_add_prepare(BlkActionState *common,
                                            Error **errp)
 {
     Error *local_err = NULL;
@@ -1738,7 +1752,7 @@ static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
     }
 }
 
-static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+static void block_dirty_bitmap_add_abort(BlkActionState *common)
 {
     BlockDirtyBitmapAdd *action;
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
@@ -1753,7 +1767,7 @@ static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
     }
 }
 
-static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
+static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
                                              Error **errp)
 {
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
@@ -1782,7 +1796,7 @@ static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
     /* AioContext is released in .clean() */
 }
 
-static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
+static void block_dirty_bitmap_clear_abort(BlkActionState *common)
 {
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
@@ -1790,7 +1804,7 @@ static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
     bdrv_undo_clear_dirty_bitmap(state->bitmap, state->backup);
 }
 
-static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
+static void block_dirty_bitmap_clear_commit(BlkActionState *common)
 {
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
@@ -1798,7 +1812,7 @@ static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
     hbitmap_free(state->backup);
 }
 
-static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
+static void block_dirty_bitmap_clear_clean(BlkActionState *common)
 {
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
@@ -1808,17 +1822,17 @@ static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
     }
 }
 
-static void abort_prepare(BlkTransactionState *common, Error **errp)
+static void abort_prepare(BlkActionState *common, Error **errp)
 {
     error_setg(errp, "Transaction aborted using Abort action");
 }
 
-static void abort_commit(BlkTransactionState *common)
+static void abort_commit(BlkActionState *common)
 {
     g_assert_not_reached(); /* this action never succeeds */
 }
 
-static const BdrvActionOps actions[] = {
+static const BlkActionOps actions[] = {
     [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = {
         .instance_size = sizeof(ExternalSnapshotState),
         .prepare  = external_snapshot_prepare,
@@ -1838,7 +1852,7 @@ static const BdrvActionOps actions[] = {
         .clean = blockdev_backup_clean,
     },
     [TRANSACTION_ACTION_KIND_ABORT] = {
-        .instance_size = sizeof(BlkTransactionState),
+        .instance_size = sizeof(BlkActionState),
         .prepare = abort_prepare,
         .commit = abort_commit,
     },
@@ -1869,10 +1883,10 @@ static const BdrvActionOps actions[] = {
 void qmp_transaction(TransactionActionList *dev_list, Error **errp)
 {
     TransactionActionList *dev_entry = dev_list;
-    BlkTransactionState *state, *next;
+    BlkActionState *state, *next;
     Error *local_err = NULL;
 
-    QSIMPLEQ_HEAD(snap_bdrv_states, BlkTransactionState) snap_bdrv_states;
+    QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
     QSIMPLEQ_INIT(&snap_bdrv_states);
 
     /* drain all i/o before any operations */
@@ -1881,7 +1895,7 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
     /* We don't do anything in this loop that commits us to the operations */
     while (NULL != dev_entry) {
         TransactionAction *dev_info = NULL;
-        const BdrvActionOps *ops;
+        const BlkActionOps *ops;
 
         dev_info = dev_entry->value;
         dev_entry = dev_entry->next;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 04/15] block: keep bitmap if incremental backup job is cancelled
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (2 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 03/15] block: rename BlkTransactionState and BdrvActionOps Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-13 19:48   ` John Snow
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 05/15] backup: Extract dirty bitmap handling as a separate function Fam Zheng
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: Stefan Hajnoczi <stefanha@redhat.com>

Reclaim the dirty bitmap if an incremental backup block job is
cancelled.  The ret variable may be 0 when the job is cancelled so it's
not enough to check ret < 0.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/backup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/backup.c b/block/backup.c
index d3c7d9f..965654d 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -431,7 +431,7 @@ static void coroutine_fn backup_run(void *opaque)
 
     if (job->sync_bitmap) {
         BdrvDirtyBitmap *bm;
-        if (ret < 0) {
+        if (ret < 0 || block_job_is_cancelled(&job->common)) {
             /* Merge the successor back into the parent, delete nothing. */
             bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
             assert(bm);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 05/15] backup: Extract dirty bitmap handling as a separate function
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (3 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 04/15] block: keep bitmap if incremental backup job is cancelled Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-13 23:06   ` John Snow
  2015-07-14  8:26   ` Stefan Hajnoczi
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 06/15] blockjob: Add .txn_commit and .txn_abort transaction actions Fam Zheng
                   ` (9 subsequent siblings)
  14 siblings, 2 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

This will be reused by the coming new transactional completion code.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/backup.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 965654d..6e24384 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -210,6 +210,21 @@ static void backup_iostatus_reset(BlockJob *job)
 
     bdrv_iostatus_reset(s->target);
 }
+static void backup_handle_dirty_bitmap(BackupBlockJob *job, int ret)
+{
+    BdrvDirtyBitmap *bm;
+    BlockDriverState *bs = job->common.bs;
+
+    if (ret < 0 || block_job_is_cancelled(&job->common)) {
+        /* Merge the successor back into the parent, delete nothing. */
+        bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
+        assert(bm);
+    } else {
+        /* Everything is fine, delete this bitmap and install the backup. */
+        bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
+        assert(bm);
+    }
+}
 
 static const BlockJobDriver backup_job_driver = {
     .instance_size  = sizeof(BackupBlockJob),
@@ -430,16 +445,7 @@ static void coroutine_fn backup_run(void *opaque)
     qemu_co_rwlock_unlock(&job->flush_rwlock);
 
     if (job->sync_bitmap) {
-        BdrvDirtyBitmap *bm;
-        if (ret < 0 || block_job_is_cancelled(&job->common)) {
-            /* Merge the successor back into the parent, delete nothing. */
-            bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
-            assert(bm);
-        } else {
-            /* Everything is fine, delete this bitmap and install the backup. */
-            bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
-            assert(bm);
-        }
+        backup_handle_dirty_bitmap(job, ret);
     }
     hbitmap_free(job->bitmap);
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 06/15] blockjob: Add .txn_commit and .txn_abort transaction actions
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (4 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 05/15] backup: Extract dirty bitmap handling as a separate function Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-13 23:06   ` John Snow
  2015-07-14  8:35   ` Stefan Hajnoczi
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 07/15] blockjob: Add "completed" and "ret" in BlockJob Fam Zheng
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

They will be called if the job is part of a transaction, after all jobs in a
transaction are completed or cancelled, before calling job->cb().

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/block/blockjob.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index dd9d5e6..a7b7f66 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -50,6 +50,18 @@ typedef struct BlockJobDriver {
      * manually.
      */
     void (*complete)(BlockJob *job, Error **errp);
+
+    /**
+     * Optional callback for job types that can be in a transaction. Called
+     * when the transaction succeeds.
+     */
+    void (*txn_commit)(BlockJob *job);
+
+    /**
+     * Optional callback for job types that can be in a transaction. Call when
+     * the transaction fails.
+     */
+    void (*txn_abort)(BlockJob *job);
 } BlockJobDriver;
 
 /**
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 07/15] blockjob: Add "completed" and "ret" in BlockJob
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (5 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 06/15] blockjob: Add .txn_commit and .txn_abort transaction actions Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-13 23:08   ` John Snow
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 08/15] blockjob: Simplify block_job_finish_sync Fam Zheng
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

They are set when block_job_completed is called.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockjob.c               | 3 +++
 include/block/blockjob.h | 9 +++++++++
 2 files changed, 12 insertions(+)

diff --git a/blockjob.c b/blockjob.c
index 62bb906..fb1d9e7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -89,6 +89,9 @@ void block_job_completed(BlockJob *job, int ret)
     BlockDriverState *bs = job->bs;
 
     assert(bs->job == job);
+    assert(!job->completed);
+    job->completed = true;
+    job->ret = ret;
     job->cb(job->opaque, ret);
     block_job_release(bs);
 }
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index a7b7f66..40d0776 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -134,6 +134,15 @@ struct BlockJob {
 
     /** The opaque value that is passed to the completion function.  */
     void *opaque;
+
+    /* True if this job has reported completion by calling block_job_completed.
+     */
+    bool completed;
+
+    /* ret code passed to block_job_completed.
+     */
+    int ret;
+
 };
 
 /**
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 08/15] blockjob: Simplify block_job_finish_sync
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (6 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 07/15] blockjob: Add "completed" and "ret" in BlockJob Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-13 23:08   ` John Snow
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 09/15] blockjob: Move BlockJobDeferToMainLoopData into BlockJob Fam Zheng
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

With job->completed and job->ret to replace BlockFinishData.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockjob.c | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index fb1d9e7..11b48f5 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -179,43 +179,24 @@ struct BlockFinishData {
     int ret;
 };
 
-static void block_job_finish_cb(void *opaque, int ret)
-{
-    struct BlockFinishData *data = opaque;
-
-    data->cancelled = block_job_is_cancelled(data->job);
-    data->ret = ret;
-    data->cb(data->opaque, ret);
-}
-
 static int block_job_finish_sync(BlockJob *job,
                                  void (*finish)(BlockJob *, Error **errp),
                                  Error **errp)
 {
-    struct BlockFinishData data;
     BlockDriverState *bs = job->bs;
     Error *local_err = NULL;
 
     assert(bs->job == job);
 
-    /* Set up our own callback to store the result and chain to
-     * the original callback.
-     */
-    data.job = job;
-    data.cb = job->cb;
-    data.opaque = job->opaque;
-    data.ret = -EINPROGRESS;
-    job->cb = block_job_finish_cb;
-    job->opaque = &data;
     finish(job, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return -EBUSY;
     }
-    while (data.ret == -EINPROGRESS) {
+    while (!job->completed) {
         aio_poll(bdrv_get_aio_context(bs), true);
     }
-    return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
+    return (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
 }
 
 /* A wrapper around block_job_cancel() taking an Error ** parameter so it may be
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 09/15] blockjob: Move BlockJobDeferToMainLoopData into BlockJob
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (7 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 08/15] blockjob: Simplify block_job_finish_sync Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-14 10:03   ` Stefan Hajnoczi
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 10/15] block: add block job transactions Fam Zheng
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

This will bind the BH data to block job data and is therefore easier to
manage, for example during cancellation.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockjob.c               | 34 +++++++++++++++-------------------
 include/block/blockjob.h | 16 ++++++++++++++--
 2 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 11b48f5..e057dd5 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -92,6 +92,10 @@ void block_job_completed(BlockJob *job, int ret)
     assert(!job->completed);
     job->completed = true;
     job->ret = ret;
+    if (job->defer_to_main_loop_data.bh) {
+        qemu_bh_delete(job->defer_to_main_loop_data.bh);
+        job->defer_to_main_loop_data.bh = NULL;
+    }
     job->cb(job->opaque, ret);
     block_job_release(bs);
 }
@@ -344,44 +348,36 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
     return action;
 }
 
-typedef struct {
-    BlockJob *job;
-    QEMUBH *bh;
-    AioContext *aio_context;
-    BlockJobDeferToMainLoopFn *fn;
-    void *opaque;
-} BlockJobDeferToMainLoopData;
-
 static void block_job_defer_to_main_loop_bh(void *opaque)
 {
-    BlockJobDeferToMainLoopData *data = opaque;
+    BlockJob *job = opaque;
+    /* Copy the struct in case job get released in data.fn() */
+    BlockJobDeferToMainLoopData data = job->defer_to_main_loop_data;
     AioContext *aio_context;
 
-    qemu_bh_delete(data->bh);
+    qemu_bh_delete(data.bh);
 
     /* Prevent race with block_job_defer_to_main_loop() */
-    aio_context_acquire(data->aio_context);
+    aio_context_acquire(data.aio_context);
 
     /* Fetch BDS AioContext again, in case it has changed */
-    aio_context = bdrv_get_aio_context(data->job->bs);
+    aio_context = bdrv_get_aio_context(job->bs);
     aio_context_acquire(aio_context);
 
-    data->fn(data->job, data->opaque);
+    data.fn(job, data.opaque);
 
     aio_context_release(aio_context);
 
-    aio_context_release(data->aio_context);
-
-    g_free(data);
+    aio_context_release(data.aio_context);
 }
 
 void block_job_defer_to_main_loop(BlockJob *job,
                                   BlockJobDeferToMainLoopFn *fn,
                                   void *opaque)
 {
-    BlockJobDeferToMainLoopData *data = g_malloc(sizeof(*data));
-    data->job = job;
-    data->bh = qemu_bh_new(block_job_defer_to_main_loop_bh, data);
+    BlockJobDeferToMainLoopData *data = &job->defer_to_main_loop_data;
+    assert(!data->bh);
+    data->bh = qemu_bh_new(block_job_defer_to_main_loop_bh, job);
     data->aio_context = bdrv_get_aio_context(job->bs);
     data->fn = fn;
     data->opaque = opaque;
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 40d0776..5bac2e2 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -64,6 +64,15 @@ typedef struct BlockJobDriver {
     void (*txn_abort)(BlockJob *job);
 } BlockJobDriver;
 
+typedef void BlockJobDeferToMainLoopFn(BlockJob *job, void *opaque);
+
+typedef struct {
+    QEMUBH *bh;
+    AioContext *aio_context;
+    BlockJobDeferToMainLoopFn *fn;
+    void *opaque;
+} BlockJobDeferToMainLoopData;
+
 /**
  * BlockJob:
  *
@@ -83,6 +92,11 @@ struct BlockJob {
     Coroutine *co;
 
     /**
+     * The data used by block_job_defer_to_main_loop.
+     */
+    BlockJobDeferToMainLoopData defer_to_main_loop_data;
+
+    /**
      * Set to true if the job should cancel itself.  The flag must
      * always be tested just before toggling the busy flag from false
      * to true.  After a job has been cancelled, it should only yield
@@ -359,8 +373,6 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
                                         BlockdevOnError on_err,
                                         int is_read, int error);
 
-typedef void BlockJobDeferToMainLoopFn(BlockJob *job, void *opaque);
-
 /**
  * block_job_defer_to_main_loop:
  * @job: The job
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 10/15] block: add block job transactions
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (8 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 09/15] blockjob: Move BlockJobDeferToMainLoopData into BlockJob Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-13 23:12   ` John Snow
  2015-07-14 10:27   ` Stefan Hajnoczi
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 11/15] blockdev: make BlockJobTxn available to qmp 'transaction' Fam Zheng
                   ` (4 subsequent siblings)
  14 siblings, 2 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: Stefan Hajnoczi <stefanha@redhat.com>

Sometimes block jobs must execute as a transaction group.  Finishing
jobs wait until all other jobs are ready to complete successfully.
Failure or cancellation of one job cancels the other jobs in the group.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[Rewrite the implementation which is now contained in block_job_completed.
--Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockjob.c               | 86 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/block/block.h    |  1 +
 include/block/blockjob.h | 26 +++++++++++++++
 3 files changed, 113 insertions(+)

diff --git a/blockjob.c b/blockjob.c
index e057dd5..7b59b53 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -36,6 +36,16 @@
 #include "qemu/timer.h"
 #include "qapi-event.h"
 
+/* Transactional group of block jobs */
+struct BlockJobTxn {
+
+    /* Is this txn being cancelled? */
+    bool aborting;
+
+    /* List of jobs */
+    QLIST_HEAD(, BlockJob) jobs;
+};
+
 void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
                        int64_t speed, BlockCompletionFunc *cb,
                        void *opaque, Error **errp)
@@ -84,6 +94,59 @@ void block_job_release(BlockDriverState *bs)
     g_free(job);
 }
 
+static void block_job_completed_txn(BlockJobTxn *txn, BlockJob *job, int ret)
+{
+    AioContext *ctx;
+    BlockJob *other_job, *next;
+    if (ret < 0 || block_job_is_cancelled(job)) {
+        if (!txn->aborting) {
+            /* We are the first failed job. Cancel other jobs. */
+            txn->aborting = true;
+            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+                if (other_job == job || other_job->completed) {
+                    continue;
+                }
+                ctx = bdrv_get_aio_context(other_job->bs);
+                aio_context_acquire(ctx);
+                block_job_cancel_sync(other_job);
+                assert(other_job->completed);
+                aio_context_release(ctx);
+            }
+            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+                if (other_job->driver->txn_abort) {
+                    other_job->driver->txn_abort(other_job);
+                }
+                other_job->cb(other_job->opaque,
+                              other_job->ret ? : -ECANCELED);
+                block_job_release(other_job->bs);
+            }
+        } else {
+            /*
+             * We are cancelled by another job, who will handle everything.
+             */
+            return;
+        }
+    } else {
+        /*
+         * Successful completion, see if there is other running jobs in this
+         * txn. */
+        QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
+            if (!other_job->completed) {
+                return;
+            }
+        }
+        /* We are the last completed job, commit the transaction. */
+        QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+            if (other_job->driver->txn_commit) {
+                other_job->driver->txn_commit(other_job);
+            }
+            assert(other_job->ret == 0);
+            other_job->cb(other_job->opaque, 0);
+            block_job_release(other_job->bs);
+        }
+    }
+}
+
 void block_job_completed(BlockJob *job, int ret)
 {
     BlockDriverState *bs = job->bs;
@@ -96,6 +159,10 @@ void block_job_completed(BlockJob *job, int ret)
         qemu_bh_delete(job->defer_to_main_loop_data.bh);
         job->defer_to_main_loop_data.bh = NULL;
     }
+    if (job->txn) {
+        block_job_completed_txn(job->txn, job, ret);
+        return;
+    }
     job->cb(job->opaque, ret);
     block_job_release(bs);
 }
@@ -384,3 +451,22 @@ void block_job_defer_to_main_loop(BlockJob *job,
 
     qemu_bh_schedule(data->bh);
 }
+
+BlockJobTxn *block_job_txn_new(void)
+{
+    BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
+    QLIST_INIT(&txn->jobs);
+    return txn;
+}
+
+void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
+{
+    if (!txn) {
+        return;
+    }
+
+    assert(!job->txn);
+    job->txn = txn;
+
+    QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
+}
diff --git a/include/block/block.h b/include/block/block.h
index 7437590..c7fc5b6 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -13,6 +13,7 @@
 typedef struct BlockDriver BlockDriver;
 typedef struct BlockJob BlockJob;
 typedef struct BdrvChildRole BdrvChildRole;
+typedef struct BlockJobTxn BlockJobTxn;
 
 typedef struct BlockDriverInfo {
     /* in bytes, 0 if irrelevant */
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 5bac2e2..d854ee0 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -157,6 +157,9 @@ struct BlockJob {
      */
     int ret;
 
+    /** Non-NULL if this job is part of a transaction */
+    BlockJobTxn *txn;
+    QLIST_ENTRY(BlockJob) txn_list;
 };
 
 /**
@@ -389,4 +392,27 @@ void block_job_defer_to_main_loop(BlockJob *job,
                                   BlockJobDeferToMainLoopFn *fn,
                                   void *opaque);
 
+/**
+ * block_job_txn_new:
+ *
+ * Allocate and return a new block job transaction.  Jobs can be added to the
+ * transaction using block_job_txn_add_job().
+ *
+ * All jobs in the transaction either complete successfully or fail/cancel as a
+ * group.  Jobs wait for each other before completing.  Cancelling one job
+ * cancels all jobs in the transaction.
+ */
+BlockJobTxn *block_job_txn_new(void);
+
+/**
+ * block_job_txn_add_job:
+ * @txn: The transaction (may be NULL)
+ * @job: Job to add to the transaction
+ *
+ * Add @job to the transaction.  The @job must not already be in a transaction.
+ * The block job driver must call block_job_txn_prepare_to_complete() before
+ * final cleanup and completion.
+ */
+void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job);
+
 #endif
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 11/15] blockdev: make BlockJobTxn available to qmp 'transaction'
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (9 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 10/15] block: add block job transactions Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 12/15] block/backup: support block job transactions Fam Zheng
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: Stefan Hajnoczi <stefanha@redhat.com>

Provide a BlockJobTxn to actions executed in a qmp 'transaction'
command.  This allows actions to make their block jobs either complete
as a group or fail/cancel together.

The next patch adds the first user.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 0ab8ad9..116e144 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1279,6 +1279,7 @@ typedef struct BlkActionOps {
 struct BlkActionState {
     TransactionAction *action;
     const BlkActionOps *ops;
+    BlockJobTxn *block_job_txn;
     QSIMPLEQ_ENTRY(BlkActionState) entry;
 };
 
@@ -1883,12 +1884,15 @@ static const BlkActionOps actions[] = {
 void qmp_transaction(TransactionActionList *dev_list, Error **errp)
 {
     TransactionActionList *dev_entry = dev_list;
+    BlockJobTxn *block_job_txn;
     BlkActionState *state, *next;
     Error *local_err = NULL;
 
     QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
     QSIMPLEQ_INIT(&snap_bdrv_states);
 
+    block_job_txn = block_job_txn_new();
+
     /* drain all i/o before any operations */
     bdrv_drain_all();
 
@@ -1908,6 +1912,7 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
         state = g_malloc0(ops->instance_size);
         state->ops = ops;
         state->action = dev_info;
+        state->block_job_txn = block_job_txn;
         QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, state, entry);
 
         state->ops->prepare(state, &local_err);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 12/15] block/backup: support block job transactions
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (10 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 11/15] blockdev: make BlockJobTxn available to qmp 'transaction' Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-13 23:14   ` John Snow
  2015-07-14 10:32   ` Stefan Hajnoczi
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 13/15] iotests: 124 - transactional failure test Fam Zheng
                   ` (2 subsequent siblings)
  14 siblings, 2 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: Stefan Hajnoczi <stefanha@redhat.com>

Join the transaction when the 'transactional-cancel' QMP argument is
true.

This ensures that the sync bitmap is not thrown away if another block
job in the transaction is cancelled or fails.  This is critical so
incremental backup with multiple disks can be retried in case of
cancellation/failure.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/backup.c            |  23 +++++++-
 blockdev.c                | 139 ++++++++++++++++++++++++++++++++++++----------
 hmp.c                     |   2 +-
 include/block/block_int.h |   3 +-
 qapi/block-core.json      |  14 ++++-
 5 files changed, 147 insertions(+), 34 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 6e24384..4f0eb17 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -226,11 +226,29 @@ static void backup_handle_dirty_bitmap(BackupBlockJob *job, int ret)
     }
 }
 
+static void backup_txn_commit(BlockJob *job)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+    if (s->sync_bitmap) {
+        backup_handle_dirty_bitmap(s, 0);
+    }
+}
+
+static void backup_txn_abort(BlockJob *job)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+    if (s->sync_bitmap) {
+        backup_handle_dirty_bitmap(s, -1);
+    }
+}
+
 static const BlockJobDriver backup_job_driver = {
     .instance_size  = sizeof(BackupBlockJob),
     .job_type       = BLOCK_JOB_TYPE_BACKUP,
     .set_speed      = backup_set_speed,
     .iostatus_reset = backup_iostatus_reset,
+    .txn_commit     = backup_txn_commit,
+    .txn_abort      = backup_txn_abort,
 };
 
 static BlockErrorAction backup_error_action(BackupBlockJob *job,
@@ -444,7 +462,7 @@ static void coroutine_fn backup_run(void *opaque)
     qemu_co_rwlock_wrlock(&job->flush_rwlock);
     qemu_co_rwlock_unlock(&job->flush_rwlock);
 
-    if (job->sync_bitmap) {
+    if (!job->common.txn && job->sync_bitmap) {
         backup_handle_dirty_bitmap(job, ret);
     }
     hbitmap_free(job->bitmap);
@@ -463,7 +481,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
                   BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
                   BlockCompletionFunc *cb, void *opaque,
-                  Error **errp)
+                  BlockJobTxn *txn, Error **errp)
 {
     int64_t len;
 
@@ -545,6 +563,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
                        sync_bitmap : NULL;
     job->common.len = len;
     job->common.co = qemu_coroutine_create(backup_run);
+    block_job_txn_add_job(txn, &job->common);
     qemu_coroutine_enter(job->common.co, job);
     return;
 
diff --git a/blockdev.c b/blockdev.c
index 116e144..30ea52d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1586,12 +1586,25 @@ typedef struct DriveBackupState {
     BlockJob *job;
 } DriveBackupState;
 
+static void do_drive_backup(const char *device, const char *target,
+                            bool has_format, const char *format,
+                            enum MirrorSyncMode sync,
+                            bool has_mode, enum NewImageMode mode,
+                            bool has_speed, int64_t speed,
+                            bool has_bitmap, const char *bitmap,
+                            bool has_on_source_error,
+                            BlockdevOnError on_source_error,
+                            bool has_on_target_error,
+                            BlockdevOnError on_target_error,
+                            BlockJobTxn *txn, Error **errp);
+
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
     BlockDriverState *bs;
     BlockBackend *blk;
     DriveBackup *backup;
+    BlockJobTxn *txn = NULL;
     Error *local_err = NULL;
 
     assert(common->action->kind == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
@@ -1609,15 +1622,20 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
     state->aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(state->aio_context);
 
-    qmp_drive_backup(backup->device, backup->target,
-                     backup->has_format, backup->format,
-                     backup->sync,
-                     backup->has_mode, backup->mode,
-                     backup->has_speed, backup->speed,
-                     backup->has_bitmap, backup->bitmap,
-                     backup->has_on_source_error, backup->on_source_error,
-                     backup->has_on_target_error, backup->on_target_error,
-                     &local_err);
+    if (backup->has_transactional_cancel &&
+        backup->transactional_cancel) {
+        txn = common->block_job_txn;
+    }
+
+    do_drive_backup(backup->device, backup->target,
+                    backup->has_format, backup->format,
+                    backup->sync,
+                    backup->has_mode, backup->mode,
+                    backup->has_speed, backup->speed,
+                    backup->has_bitmap, backup->bitmap,
+                    backup->has_on_source_error, backup->on_source_error,
+                    backup->has_on_target_error, backup->on_target_error,
+                    txn, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
@@ -1654,12 +1672,22 @@ typedef struct BlockdevBackupState {
     AioContext *aio_context;
 } BlockdevBackupState;
 
+static void do_blockdev_backup(const char *device, const char *target,
+                               enum MirrorSyncMode sync,
+                               bool has_speed, int64_t speed,
+                               bool has_on_source_error,
+                               BlockdevOnError on_source_error,
+                               bool has_on_target_error,
+                               BlockdevOnError on_target_error,
+                               BlockJobTxn *txn, Error **errp);
+
 static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
     BlockdevBackup *backup;
     BlockDriverState *bs, *target;
     BlockBackend *blk;
+    BlockJobTxn *txn = NULL;
     Error *local_err = NULL;
 
     assert(common->action->kind == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
@@ -1688,12 +1716,17 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
     }
     aio_context_acquire(state->aio_context);
 
-    qmp_blockdev_backup(backup->device, backup->target,
-                        backup->sync,
-                        backup->has_speed, backup->speed,
-                        backup->has_on_source_error, backup->on_source_error,
-                        backup->has_on_target_error, backup->on_target_error,
-                        &local_err);
+    if (backup->has_transactional_cancel &&
+        backup->transactional_cancel) {
+        txn = common->block_job_txn;
+    }
+
+    do_blockdev_backup(backup->device, backup->target,
+                       backup->sync,
+                       backup->has_speed, backup->speed,
+                       backup->has_on_source_error, backup->on_source_error,
+                       backup->has_on_target_error, backup->on_target_error,
+                       txn, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
@@ -2578,15 +2611,17 @@ out:
     aio_context_release(aio_context);
 }
 
-void qmp_drive_backup(const char *device, const char *target,
-                      bool has_format, const char *format,
-                      enum MirrorSyncMode sync,
-                      bool has_mode, enum NewImageMode mode,
-                      bool has_speed, int64_t speed,
-                      bool has_bitmap, const char *bitmap,
-                      bool has_on_source_error, BlockdevOnError on_source_error,
-                      bool has_on_target_error, BlockdevOnError on_target_error,
-                      Error **errp)
+static void do_drive_backup(const char *device, const char *target,
+                            bool has_format, const char *format,
+                            enum MirrorSyncMode sync,
+                            bool has_mode, enum NewImageMode mode,
+                            bool has_speed, int64_t speed,
+                            bool has_bitmap, const char *bitmap,
+                            bool has_on_source_error,
+                            BlockdevOnError on_source_error,
+                            bool has_on_target_error,
+                            BlockdevOnError on_target_error,
+                            BlockJobTxn *txn, Error **errp)
 {
     BlockBackend *blk;
     BlockDriverState *bs;
@@ -2703,7 +2738,7 @@ void qmp_drive_backup(const char *device, const char *target,
 
     backup_start(bs, target_bs, speed, sync, bmap,
                  on_source_error, on_target_error,
-                 block_job_cb, bs, &local_err);
+                 block_job_cb, bs, txn, &local_err);
     if (local_err != NULL) {
         bdrv_unref(target_bs);
         error_propagate(errp, local_err);
@@ -2714,19 +2749,44 @@ out:
     aio_context_release(aio_context);
 }
 
+void qmp_drive_backup(const char *device, const char *target,
+                      bool has_format, const char *format,
+                      enum MirrorSyncMode sync,
+                      bool has_mode, enum NewImageMode mode,
+                      bool has_speed, int64_t speed,
+                      bool has_bitmap, const char *bitmap,
+                      bool has_on_source_error, BlockdevOnError on_source_error,
+                      bool has_on_target_error, BlockdevOnError on_target_error,
+                      bool has_transactional_cancel, bool transactional_cancel,
+                      Error **errp)
+{
+    if (has_transactional_cancel && transactional_cancel) {
+        error_setg(errp, "Transactional cancel can only be used in the "
+                   "'transaction' command");
+        return;
+    }
+
+    return do_drive_backup(device, target, has_format, format, sync,
+                           has_mode, mode, has_speed, speed,
+                           has_bitmap, bitmap,
+                           has_on_source_error, on_source_error,
+                           has_on_target_error, on_target_error,
+                           NULL, errp);
+}
+
 BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
 {
     return bdrv_named_nodes_list(errp);
 }
 
-void qmp_blockdev_backup(const char *device, const char *target,
+void do_blockdev_backup(const char *device, const char *target,
                          enum MirrorSyncMode sync,
                          bool has_speed, int64_t speed,
                          bool has_on_source_error,
                          BlockdevOnError on_source_error,
                          bool has_on_target_error,
                          BlockdevOnError on_target_error,
-                         Error **errp)
+                         BlockJobTxn *txn, Error **errp)
 {
     BlockBackend *blk;
     BlockDriverState *bs;
@@ -2764,7 +2824,7 @@ void qmp_blockdev_backup(const char *device, const char *target,
     bdrv_ref(target_bs);
     bdrv_set_aio_context(target_bs, aio_context);
     backup_start(bs, target_bs, speed, sync, NULL, on_source_error,
-                 on_target_error, block_job_cb, bs, &local_err);
+                 on_target_error, block_job_cb, bs, txn, &local_err);
     if (local_err != NULL) {
         bdrv_unref(target_bs);
         error_propagate(errp, local_err);
@@ -2773,6 +2833,29 @@ out:
     aio_context_release(aio_context);
 }
 
+void qmp_blockdev_backup(const char *device, const char *target,
+                         enum MirrorSyncMode sync,
+                         bool has_speed, int64_t speed,
+                         bool has_on_source_error,
+                         BlockdevOnError on_source_error,
+                         bool has_on_target_error,
+                         BlockdevOnError on_target_error,
+                         bool has_transactional_cancel,
+                         bool transactional_cancel,
+                         Error **errp)
+{
+    if (has_transactional_cancel && transactional_cancel) {
+        error_setg(errp, "Transactional cancel can only be used in the "
+                   "'transaction' command");
+        return;
+    }
+
+    do_blockdev_backup(device, target, sync, has_speed, speed,
+                       has_on_source_error, on_source_error,
+                       has_on_target_error, on_target_error,
+                       NULL, errp);
+}
+
 #define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
 
 void qmp_drive_mirror(const char *device, const char *target,
diff --git a/hmp.c b/hmp.c
index dcc66f1..f492965 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1090,7 +1090,7 @@ void hmp_drive_backup(Monitor *mon, const QDict *qdict)
     qmp_drive_backup(device, filename, !!format, format,
                      full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
                      true, mode, false, 0, false, NULL,
-                     false, 0, false, 0, &err);
+                     false, 0, false, 0, false, false, &err);
     hmp_handle_error(mon, &err);
 }
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 6eb22c7..5382422 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -642,6 +642,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
  * @on_target_error: The action to take upon error writing to the target.
  * @cb: Completion function for the job.
  * @opaque: Opaque pointer value passed to @cb.
+ * @txn: Transaction that this job is part of (may be NULL).
  *
  * Start a backup operation on @bs.  Clusters in @bs are written to @target
  * until the job is cancelled or manually completed.
@@ -652,7 +653,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
                   BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
                   BlockCompletionFunc *cb, void *opaque,
-                  Error **errp);
+                  BlockJobTxn *txn, Error **errp);
 
 void blk_dev_change_media_cb(BlockBackend *blk, bool load);
 bool blk_dev_has_removable_media(BlockBackend *blk);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7b2efb8..d5e33fd 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -736,6 +736,10 @@
 #                   default 'report' (no limitations, since this applies to
 #                   a different block device than @device).
 #
+# @transactional-cancel: #optional whether failure or cancellation of other
+#                        block jobs with @transactional-cancel true causes the
+#                        whole group to cancel.
+#
 # Note that @on-source-error and @on-target-error only affect background I/O.
 # If an error occurs during a guest write request, the device's rerror/werror
 # actions will be used.
@@ -747,7 +751,8 @@
             'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
             '*speed': 'int', '*bitmap': 'str',
             '*on-source-error': 'BlockdevOnError',
-            '*on-target-error': 'BlockdevOnError' } }
+            '*on-target-error': 'BlockdevOnError',
+            '*transactional-cancel': 'bool' } }
 
 ##
 # @BlockdevBackup
@@ -771,6 +776,10 @@
 #                   default 'report' (no limitations, since this applies to
 #                   a different block device than @device).
 #
+# @transactional-cancel: #optional whether failure or cancellation of other
+#                        block jobs with @transactional-cancel true causes the
+#                        whole group to cancel.
+#
 # Note that @on-source-error and @on-target-error only affect background I/O.
 # If an error occurs during a guest write request, the device's rerror/werror
 # actions will be used.
@@ -782,7 +791,8 @@
             'sync': 'MirrorSyncMode',
             '*speed': 'int',
             '*on-source-error': 'BlockdevOnError',
-            '*on-target-error': 'BlockdevOnError' } }
+            '*on-target-error': 'BlockdevOnError',
+            '*transactional-cancel': 'bool' } }
 
 ##
 # @blockdev-snapshot-sync
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 13/15] iotests: 124 - transactional failure test
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (11 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 12/15] block/backup: support block job transactions Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 14/15] qmp-commands.hx: Update the supported 'transaction' operations Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 15/15] tests: add BlockJobTxn unit test Fam Zheng
  14 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: John Snow <jsnow@redhat.com>

Use a transaction to request an incremental backup across two drives.
Coerce one of the jobs to fail, and then re-run the transaction.

Verify that no bitmap data was lost due to the partial transaction
failure.

To support the 'transactional-cancel' QMP argument name it's necessary
for transaction_action() to convert underscores in Python argument names
to hyphens for QMP argument names.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/124     | 130 ++++++++++++++++++++++++++++++++++++++++++++-
 tests/qemu-iotests/124.out |   4 +-
 2 files changed, 130 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 9c1977e..33d00ef 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -39,7 +39,7 @@ def try_remove(img):
 def transaction_action(action, **kwargs):
     return {
         'type': action,
-        'data': kwargs
+        'data': dict((k.replace('_', '-'), v) for k, v in kwargs.iteritems())
     }
 
 
@@ -139,9 +139,12 @@ class TestIncrementalBackup(iotests.QMPTestCase):
     def do_qmp_backup(self, error='Input/output error', **kwargs):
         res = self.vm.qmp('drive-backup', **kwargs)
         self.assert_qmp(res, 'return', {})
+        return self.wait_qmp_backup(kwargs['device'], error)
 
+
+    def wait_qmp_backup(self, device, error='Input/output error'):
         event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
-                                   match={'data': {'device': kwargs['device']}})
+                                   match={'data': {'device': device}})
         self.assertNotEqual(event, None)
 
         try:
@@ -156,6 +159,12 @@ class TestIncrementalBackup(iotests.QMPTestCase):
             return False
 
 
+    def wait_qmp_backup_cancelled(self, device):
+        event = self.vm.event_wait(name='BLOCK_JOB_CANCELLED',
+                                   match={'data': {'device': device}})
+        self.assertNotEqual(event, None)
+
+
     def create_anchor_backup(self, drive=None):
         if drive is None:
             drive = self.drives[-1]
@@ -375,6 +384,123 @@ class TestIncrementalBackup(iotests.QMPTestCase):
         self.check_backups()
 
 
+    def test_transaction_failure(self):
+        '''Test: Verify backups made from a transaction that partially fails.
+
+        Add a second drive with its own unique pattern, and add a bitmap to each
+        drive. Use blkdebug to interfere with the backup on just one drive and
+        attempt to create a coherent incremental backup across both drives.
+
+        verify a failure in one but not both, then delete the failed stubs and
+        re-run the same transaction.
+
+        verify that both incrementals are created successfully.
+        '''
+
+        # Create a second drive, with pattern:
+        drive1 = self.add_node('drive1')
+        self.img_create(drive1['file'], drive1['fmt'])
+        io_write_patterns(drive1['file'], (('0x14', 0, 512),
+                                           ('0x5d', '1M', '32k'),
+                                           ('0xcd', '32M', '124k')))
+
+        # Create a blkdebug interface to this img as 'drive1'
+        result = self.vm.qmp('blockdev-add', options={
+            'id': drive1['id'],
+            'driver': drive1['fmt'],
+            'file': {
+                'driver': 'blkdebug',
+                'image': {
+                    'driver': 'file',
+                    'filename': drive1['file']
+                },
+                'set-state': [{
+                    'event': 'flush_to_disk',
+                    'state': 1,
+                    'new_state': 2
+                }],
+                'inject-error': [{
+                    'event': 'read_aio',
+                    'errno': 5,
+                    'state': 2,
+                    'immediately': False,
+                    'once': True
+                }],
+            }
+        })
+        self.assert_qmp(result, 'return', {})
+
+        # Create bitmaps and full backups for both drives
+        drive0 = self.drives[0]
+        dr0bm0 = self.add_bitmap('bitmap0', drive0)
+        dr1bm0 = self.add_bitmap('bitmap0', drive1)
+        self.create_anchor_backup(drive0)
+        self.create_anchor_backup(drive1)
+        self.assert_no_active_block_jobs()
+        self.assertFalse(self.vm.get_qmp_events(wait=False))
+
+        # Emulate some writes
+        self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+                                          ('0xfe', '16M', '256k'),
+                                          ('0x64', '32736k', '64k')))
+        self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
+                                          ('0xef', '16M', '256k'),
+                                          ('0x46', '32736k', '64k')))
+
+        # Create incremental backup targets
+        target0 = self.prepare_backup(dr0bm0)
+        target1 = self.prepare_backup(dr1bm0)
+
+        # Ask for a new incremental backup per-each drive,
+        # expecting drive1's backup to fail:
+        transaction = [
+            transaction_drive_backup(drive0['id'], target0, sync='incremental',
+                                     format=drive0['fmt'], mode='existing',
+                                     bitmap=dr0bm0.name,
+                                     transactional_cancel=True),
+            transaction_drive_backup(drive1['id'], target1, sync='incremental',
+                                     format=drive1['fmt'], mode='existing',
+                                     bitmap=dr1bm0.name,
+                                     transactional_cancel=True),
+        ]
+        result = self.vm.qmp('transaction', actions=transaction)
+        self.assert_qmp(result, 'return', {})
+
+        # Observe that drive0's backup is cancelled and drive1 completes with
+        # an error.
+        self.wait_qmp_backup_cancelled(drive0['id'])
+        self.assertFalse(self.wait_qmp_backup(drive1['id']))
+        error = self.vm.event_wait('BLOCK_JOB_ERROR')
+        self.assert_qmp(error, 'data', {'device': drive1['id'],
+                                        'action': 'report',
+                                        'operation': 'read'})
+        self.assertFalse(self.vm.get_qmp_events(wait=False))
+        self.assert_no_active_block_jobs()
+
+        # Delete drive0's successful target and eliminate our record of the
+        # unsuccessful drive1 target. Then re-run the same transaction.
+        dr0bm0.del_target()
+        dr1bm0.del_target()
+        target0 = self.prepare_backup(dr0bm0)
+        target1 = self.prepare_backup(dr1bm0)
+
+        # Re-run the exact same transaction.
+        result = self.vm.qmp('transaction', actions=transaction)
+        self.assert_qmp(result, 'return', {})
+
+        # Both should complete successfully this time.
+        self.assertTrue(self.wait_qmp_backup(drive0['id']))
+        self.assertTrue(self.wait_qmp_backup(drive1['id']))
+        self.make_reference_backup(dr0bm0)
+        self.make_reference_backup(dr1bm0)
+        self.assertFalse(self.vm.get_qmp_events(wait=False))
+        self.assert_no_active_block_jobs()
+
+        # And the images should of course validate.
+        self.vm.shutdown()
+        self.check_backups()
+
+
     def test_sync_dirty_bitmap_missing(self):
         self.assert_no_active_block_jobs()
         self.files.append(self.err_img)
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 594c16f..dae404e 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-........
+.........
 ----------------------------------------------------------------------
-Ran 8 tests
+Ran 9 tests
 
 OK
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 14/15] qmp-commands.hx: Update the supported 'transaction' operations
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (12 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 13/15] iotests: 124 - transactional failure test Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 15/15] tests: add BlockJobTxn unit test Fam Zheng
  14 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: Kashyap Chamarthy <kchamart@redhat.com>

Although the canonical source of reference for QMP commands is
qapi-schema.json, for consistency's sake, update qmp-commands.hx to
state the list of supported transactionable operations, namely:

    drive-backup
    blockdev-backup
    blockdev-snapshot-internal-sync
    abort
    block-dirty-bitmap-add
    block-dirty-bitmap-clear

Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qmp-commands.hx | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index e1bcc60..fe39467 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1238,11 +1238,22 @@ SQMP
 transaction
 -----------
 
-Atomically operate on one or more block devices.  The only supported operations
-for now are drive-backup, internal and external snapshotting.  A list of
-dictionaries is accepted, that contains the actions to be performed.
-If there is any failure performing any of the operations, all operations
-for the group are abandoned.
+Atomically operate on one or more block devices.  Operations that are
+currently supported:
+
+    - drive-backup
+    - blockdev-backup
+    - blockdev-snapshot-sync
+    - blockdev-snapshot-internal-sync
+    - abort
+    - block-dirty-bitmap-add
+    - block-dirty-bitmap-clear
+
+Refer to the qemu/qapi-schema.json file for minimum required QEMU
+versions for these operations.  A list of dictionaries is accepted,
+that contains the actions to be performed.  If there is any failure
+performing any of the operations, all operations for the group are
+abandoned.
 
 For external snapshots, the dictionary contains the device, the file to use for
 the new snapshot, and the format.  The default format, if not specified, is
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v3 15/15] tests: add BlockJobTxn unit test
  2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
                   ` (13 preceding siblings ...)
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 14/15] qmp-commands.hx: Update the supported 'transaction' operations Fam Zheng
@ 2015-07-10  3:46 ` Fam Zheng
  2015-07-13 23:14   ` John Snow
  14 siblings, 1 reply; 35+ messages in thread
From: Fam Zheng @ 2015-07-10  3:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, Max Reitz, vsementsov,
	stefanha

From: Stefan Hajnoczi <stefanha@redhat.com>

The BlockJobTxn unit test verifies that both single jobs and pairs of
jobs behave as a transaction group.  Either all jobs complete
successfully or the group is cancelled.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/Makefile            |   3 +
 tests/test-blockjob-txn.c | 228 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 231 insertions(+)
 create mode 100644 tests/test-blockjob-txn.c

diff --git a/tests/Makefile b/tests/Makefile
index 2cd1195..da79721 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -45,6 +45,8 @@ check-unit-y += tests/test-thread-pool$(EXESUF)
 gcov-files-test-thread-pool-y = thread-pool.c
 gcov-files-test-hbitmap-y = util/hbitmap.c
 check-unit-y += tests/test-hbitmap$(EXESUF)
+gcov-files-test-hbitmap-y = blockjob.c
+check-unit-y += tests/test-blockjob-txn$(EXESUF)
 check-unit-y += tests/test-x86-cpuid$(EXESUF)
 # all code tested by test-x86-cpuid is inside topology.h
 gcov-files-test-x86-cpuid-y =
@@ -286,6 +288,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
 tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-rfifolock$(EXESUF): tests/test-rfifolock.o libqemuutil.a libqemustub.a
 tests/test-throttle$(EXESUF): tests/test-throttle.o $(block-obj-y) libqemuutil.a libqemustub.a
+tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
 tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o libqemuutil.a libqemustub.a
diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
new file mode 100644
index 0000000..62a7510
--- /dev/null
+++ b/tests/test-blockjob-txn.c
@@ -0,0 +1,228 @@
+/*
+ * Blockjob transactions tests
+ *
+ * Copyright Red Hat, Inc. 2015
+ *
+ * Authors:
+ *  Stefan Hajnoczi    <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include <glib.h>
+#include "qapi/error.h"
+#include "qemu/main-loop.h"
+#include "block/blockjob.h"
+
+typedef struct {
+    BlockJob common;
+    unsigned int iterations;
+    bool use_timer;
+    int rc;
+} TestBlockJob;
+
+static const BlockJobDriver test_block_job_driver = {
+    .instance_size = sizeof(TestBlockJob),
+};
+
+static void test_block_job_complete(BlockJob *job, void *opaque)
+{
+    BlockDriverState *bs = job->bs;
+    int rc = (intptr_t)opaque;
+
+    if (block_job_is_cancelled(job)) {
+        rc = -ECANCELED;
+    }
+
+    block_job_completed(job, rc);
+    bdrv_unref(bs);
+}
+
+static void coroutine_fn test_block_job_run(void *opaque)
+{
+    TestBlockJob *s = opaque;
+    BlockJob *job = &s->common;
+
+    while (s->iterations--) {
+        if (s->use_timer) {
+            block_job_sleep_ns(job, QEMU_CLOCK_REALTIME, 0);
+        } else {
+            block_job_yield(job);
+        }
+
+        if (block_job_is_cancelled(job)) {
+            break;
+        }
+    }
+
+    block_job_defer_to_main_loop(job, test_block_job_complete,
+                                 (void *)(intptr_t)s->rc);
+}
+
+static void test_block_job_cb(void *opaque, int ret)
+{
+    *(int *)opaque = ret;
+}
+
+/* Create a block job that completes with a given return code after a given
+ * number of event loop iterations.  The return code is stored in the given
+ * result pointer.
+ *
+ * The event loop iterations can either be handled automatically with a 0 delay
+ * timer, or they can be stepped manually by entering the coroutine.
+ */
+static BlockJob *test_block_job_start(unsigned int iterations,
+                                      bool use_timer,
+                                      int rc, int *result)
+{
+    BlockDriverState *bs;
+    TestBlockJob *s;
+
+    bs = bdrv_new();
+    s = block_job_create(&test_block_job_driver, bs, 0, test_block_job_cb,
+                         result, &error_abort);
+    s->iterations = iterations;
+    s->use_timer = use_timer;
+    s->rc = rc;
+    s->common.co = qemu_coroutine_create(test_block_job_run);
+    qemu_coroutine_enter(s->common.co, s);
+    return &s->common;
+}
+
+static void test_single_job(int expected)
+{
+    BlockJob *job;
+    BlockJobTxn *txn;
+    int result = -EINPROGRESS;
+
+    txn = block_job_txn_new();
+    job = test_block_job_start(1, true, expected, &result);
+    block_job_txn_add_job(txn, job);
+
+    if (expected == -ECANCELED) {
+        block_job_cancel(job);
+    }
+
+    while (result == -EINPROGRESS) {
+        aio_poll(qemu_get_aio_context(), true);
+    }
+    g_assert_cmpint(result, ==, expected);
+}
+
+static void test_single_job_success(void)
+{
+    test_single_job(0);
+}
+
+static void test_single_job_failure(void)
+{
+    test_single_job(-EIO);
+}
+
+static void test_single_job_cancel(void)
+{
+    test_single_job(-ECANCELED);
+}
+
+static void test_pair_jobs(int expected1, int expected2)
+{
+    BlockJob *job1;
+    BlockJob *job2;
+    BlockJobTxn *txn;
+    int result1 = -EINPROGRESS;
+    int result2 = -EINPROGRESS;
+
+    txn = block_job_txn_new();
+    job1 = test_block_job_start(1, true, expected1, &result1);
+    block_job_txn_add_job(txn, job1);
+    job2 = test_block_job_start(2, true, expected2, &result2);
+    block_job_txn_add_job(txn, job2);
+
+    if (expected1 == -ECANCELED) {
+        block_job_cancel(job1);
+    }
+    if (expected2 == -ECANCELED) {
+        block_job_cancel(job2);
+    }
+
+    while (result1 == -EINPROGRESS || result2 == -EINPROGRESS) {
+        aio_poll(qemu_get_aio_context(), true);
+    }
+
+    /* Failure or cancellation of one job cancels the other job */
+    if (expected1 != 0) {
+        expected2 = -ECANCELED;
+    } else if (expected2 != 0) {
+        expected1 = -ECANCELED;
+    }
+
+    g_assert_cmpint(result1, ==, expected1);
+    g_assert_cmpint(result2, ==, expected2);
+}
+
+static void test_pair_jobs_success(void)
+{
+    test_pair_jobs(0, 0);
+}
+
+static void test_pair_jobs_failure(void)
+{
+    /* Test both orderings.  The two jobs run for a different number of
+     * iterations so the code path is different depending on which job fails
+     * first.
+     */
+    test_pair_jobs(-EIO, 0);
+    test_pair_jobs(0, -EIO);
+}
+
+static void test_pair_jobs_cancel(void)
+{
+    test_pair_jobs(-ECANCELED, 0);
+    test_pair_jobs(0, -ECANCELED);
+}
+
+static void test_pair_jobs_fail_cancel_race(void)
+{
+    BlockJob *job1;
+    BlockJob *job2;
+    BlockJobTxn *txn;
+    int result1 = -EINPROGRESS;
+    int result2 = -EINPROGRESS;
+
+    txn = block_job_txn_new();
+    job1 = test_block_job_start(1, true, -ECANCELED, &result1);
+    block_job_txn_add_job(txn, job1);
+    job2 = test_block_job_start(2, false, 0, &result2);
+    block_job_txn_add_job(txn, job2);
+
+    block_job_cancel(job1);
+
+    /* Now make job2 finish before the main loop kicks jobs.  This means
+     * simulates the race between a pending kick and another job completing.
+     */
+    block_job_enter(job2);
+    block_job_enter(job2);
+
+    while (result1 == -EINPROGRESS || result2 == -EINPROGRESS) {
+        aio_poll(qemu_get_aio_context(), true);
+    }
+
+    g_assert_cmpint(result1, ==, -ECANCELED);
+    g_assert_cmpint(result2, ==, -ECANCELED);
+}
+
+int main(int argc, char **argv)
+{
+    qemu_init_main_loop(&error_abort);
+
+    g_test_init(&argc, &argv, NULL);
+    g_test_add_func("/single/success", test_single_job_success);
+    g_test_add_func("/single/failure", test_single_job_failure);
+    g_test_add_func("/single/cancel", test_single_job_cancel);
+    g_test_add_func("/pair/success", test_pair_jobs_success);
+    g_test_add_func("/pair/failure", test_pair_jobs_failure);
+    g_test_add_func("/pair/cancel", test_pair_jobs_cancel);
+    g_test_add_func("/pair/fail-cancel-race", test_pair_jobs_fail_cancel_race);
+    return g_test_run();
+}
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 04/15] block: keep bitmap if incremental backup job is cancelled
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 04/15] block: keep bitmap if incremental backup job is cancelled Fam Zheng
@ 2015-07-13 19:48   ` John Snow
  0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-07-13 19:48 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, stefanha, Max Reitz



On 07/09/2015 11:46 PM, Fam Zheng wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>
> 
> Reclaim the dirty bitmap if an incremental backup block job is
> cancelled.  The ret variable may be 0 when the job is cancelled so it's
> not enough to check ret < 0.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/backup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index d3c7d9f..965654d 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -431,7 +431,7 @@ static void coroutine_fn backup_run(void *opaque)
>  
>      if (job->sync_bitmap) {
>          BdrvDirtyBitmap *bm;
> -        if (ret < 0) {
> +        if (ret < 0 || block_job_is_cancelled(&job->common)) {
>              /* Merge the successor back into the parent, delete nothing. */
>              bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
>              assert(bm);
> 

Jeff Cody should've merged this one by now, but I guess he hasn't sent a
pull request for it yet.

This should be in 2.4, I think.

--js

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 05/15] backup: Extract dirty bitmap handling as a separate function
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 05/15] backup: Extract dirty bitmap handling as a separate function Fam Zheng
@ 2015-07-13 23:06   ` John Snow
  2015-07-14  2:46     ` Fam Zheng
  2015-07-14  8:26   ` Stefan Hajnoczi
  1 sibling, 1 reply; 35+ messages in thread
From: John Snow @ 2015-07-13 23:06 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, stefanha, Max Reitz



On 07/09/2015 11:46 PM, Fam Zheng wrote:
> This will be reused by the coming new transactional completion code.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/backup.c | 26 ++++++++++++++++----------
>  1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 965654d..6e24384 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -210,6 +210,21 @@ static void backup_iostatus_reset(BlockJob *job)
>  
>      bdrv_iostatus_reset(s->target);
>  }
> +static void backup_handle_dirty_bitmap(BackupBlockJob *job, int ret)
> +{
> +    BdrvDirtyBitmap *bm;
> +    BlockDriverState *bs = job->common.bs;
> +
> +    if (ret < 0 || block_job_is_cancelled(&job->common)) {
> +        /* Merge the successor back into the parent, delete nothing. */
> +        bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
> +        assert(bm);
> +    } else {
> +        /* Everything is fine, delete this bitmap and install the backup. */
> +        bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
> +        assert(bm);
> +    }
> +}
>  
>  static const BlockJobDriver backup_job_driver = {
>      .instance_size  = sizeof(BackupBlockJob),
> @@ -430,16 +445,7 @@ static void coroutine_fn backup_run(void *opaque)
>      qemu_co_rwlock_unlock(&job->flush_rwlock);
>  
>      if (job->sync_bitmap) {
> -        BdrvDirtyBitmap *bm;
> -        if (ret < 0 || block_job_is_cancelled(&job->common)) {
> -            /* Merge the successor back into the parent, delete nothing. */
> -            bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
> -            assert(bm);
> -        } else {
> -            /* Everything is fine, delete this bitmap and install the backup. */
> -            bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
> -            assert(bm);
> -        }
> +        backup_handle_dirty_bitmap(job, ret);
>      }
>      hbitmap_free(job->bitmap);
>  
> 

Bike-shedding: strange name, I may have used 'cleanup' or 'finalize' or
so, but that's neither here nor there.

Reviewed-by: John Snow <jsnow@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 06/15] blockjob: Add .txn_commit and .txn_abort transaction actions
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 06/15] blockjob: Add .txn_commit and .txn_abort transaction actions Fam Zheng
@ 2015-07-13 23:06   ` John Snow
  2015-07-14  8:35   ` Stefan Hajnoczi
  1 sibling, 0 replies; 35+ messages in thread
From: John Snow @ 2015-07-13 23:06 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, stefanha, Max Reitz



On 07/09/2015 11:46 PM, Fam Zheng wrote:
> They will be called if the job is part of a transaction, after all jobs in a
> transaction are completed or cancelled, before calling job->cb().
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  include/block/blockjob.h | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index dd9d5e6..a7b7f66 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -50,6 +50,18 @@ typedef struct BlockJobDriver {
>       * manually.
>       */
>      void (*complete)(BlockJob *job, Error **errp);
> +
> +    /**
> +     * Optional callback for job types that can be in a transaction. Called
> +     * when the transaction succeeds.
> +     */
> +    void (*txn_commit)(BlockJob *job);
> +
> +    /**
> +     * Optional callback for job types that can be in a transaction. Call when
> +     * the transaction fails.
> +     */
> +    void (*txn_abort)(BlockJob *job);
>  } BlockJobDriver;
>  
>  /**
> 

Reviewed-by: John Snow <jsnow@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/15] blockjob: Add "completed" and "ret" in BlockJob
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 07/15] blockjob: Add "completed" and "ret" in BlockJob Fam Zheng
@ 2015-07-13 23:08   ` John Snow
  0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-07-13 23:08 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, stefanha, Max Reitz



On 07/09/2015 11:46 PM, Fam Zheng wrote:
> They are set when block_job_completed is called.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockjob.c               | 3 +++
>  include/block/blockjob.h | 9 +++++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 62bb906..fb1d9e7 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -89,6 +89,9 @@ void block_job_completed(BlockJob *job, int ret)
>      BlockDriverState *bs = job->bs;
>  
>      assert(bs->job == job);
> +    assert(!job->completed);
> +    job->completed = true;
> +    job->ret = ret;
>      job->cb(job->opaque, ret);
>      block_job_release(bs);
>  }
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index a7b7f66..40d0776 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -134,6 +134,15 @@ struct BlockJob {
>  
>      /** The opaque value that is passed to the completion function.  */
>      void *opaque;
> +
> +    /* True if this job has reported completion by calling block_job_completed.
> +     */

The end comment marker looks a little doofy on a newline, no?

> +    bool completed;
> +
> +    /* ret code passed to block_job_completed.
> +     */
> +    int ret;
> +
>  };
>  
>  /**
> 

Reviewed-by: John Snow <jsnow@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 08/15] blockjob: Simplify block_job_finish_sync
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 08/15] blockjob: Simplify block_job_finish_sync Fam Zheng
@ 2015-07-13 23:08   ` John Snow
  0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-07-13 23:08 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, stefanha, Max Reitz



On 07/09/2015 11:46 PM, Fam Zheng wrote:
> With job->completed and job->ret to replace BlockFinishData.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockjob.c | 23 ++---------------------
>  1 file changed, 2 insertions(+), 21 deletions(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index fb1d9e7..11b48f5 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -179,43 +179,24 @@ struct BlockFinishData {
>      int ret;
>  };
>  
> -static void block_job_finish_cb(void *opaque, int ret)
> -{
> -    struct BlockFinishData *data = opaque;
> -
> -    data->cancelled = block_job_is_cancelled(data->job);
> -    data->ret = ret;
> -    data->cb(data->opaque, ret);
> -}
> -
>  static int block_job_finish_sync(BlockJob *job,
>                                   void (*finish)(BlockJob *, Error **errp),
>                                   Error **errp)
>  {
> -    struct BlockFinishData data;
>      BlockDriverState *bs = job->bs;
>      Error *local_err = NULL;
>  
>      assert(bs->job == job);
>  
> -    /* Set up our own callback to store the result and chain to
> -     * the original callback.
> -     */
> -    data.job = job;
> -    data.cb = job->cb;
> -    data.opaque = job->opaque;
> -    data.ret = -EINPROGRESS;
> -    job->cb = block_job_finish_cb;
> -    job->opaque = &data;
>      finish(job, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return -EBUSY;
>      }
> -    while (data.ret == -EINPROGRESS) {
> +    while (!job->completed) {
>          aio_poll(bdrv_get_aio_context(bs), true);
>      }
> -    return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
> +    return (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
>  }
>  
>  /* A wrapper around block_job_cancel() taking an Error ** parameter so it may be
> 

Hmm ... I'm less sure of this one because I can't imagine why it's split
out the way it is to begin with.

Looks sane to me, though...

Reviewed-by: John Snow <jsnow@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 10/15] block: add block job transactions
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 10/15] block: add block job transactions Fam Zheng
@ 2015-07-13 23:12   ` John Snow
  2015-07-14  3:04     ` Fam Zheng
  2015-07-14 10:27   ` Stefan Hajnoczi
  1 sibling, 1 reply; 35+ messages in thread
From: John Snow @ 2015-07-13 23:12 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, stefanha, Max Reitz



On 07/09/2015 11:46 PM, Fam Zheng wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>
> 
> Sometimes block jobs must execute as a transaction group.  Finishing
> jobs wait until all other jobs are ready to complete successfully.
> Failure or cancellation of one job cancels the other jobs in the group.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> [Rewrite the implementation which is now contained in block_job_completed.
> --Fam]
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockjob.c               | 86 ++++++++++++++++++++++++++++++++++++++++++++++++
>  include/block/block.h    |  1 +
>  include/block/blockjob.h | 26 +++++++++++++++
>  3 files changed, 113 insertions(+)
> 
> diff --git a/blockjob.c b/blockjob.c
> index e057dd5..7b59b53 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -36,6 +36,16 @@
>  #include "qemu/timer.h"
>  #include "qapi-event.h"
>  
> +/* Transactional group of block jobs */
> +struct BlockJobTxn {
> +
> +    /* Is this txn being cancelled? */
> +    bool aborting;
> +
> +    /* List of jobs */
> +    QLIST_HEAD(, BlockJob) jobs;
> +};
> +
>  void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
>                         int64_t speed, BlockCompletionFunc *cb,
>                         void *opaque, Error **errp)
> @@ -84,6 +94,59 @@ void block_job_release(BlockDriverState *bs)
>      g_free(job);
>  }
>  
> +static void block_job_completed_txn(BlockJobTxn *txn, BlockJob *job, int ret)
> +{
> +    AioContext *ctx;
> +    BlockJob *other_job, *next;
> +    if (ret < 0 || block_job_is_cancelled(job)) {
> +        if (!txn->aborting) {
> +            /* We are the first failed job. Cancel other jobs. */

What guarantee against a race is there at this point? Something to do
with deferring back to the main loop before we call completion?

Do jobs always complete in the main thread?

> +            txn->aborting = true;
> +            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> +                if (other_job == job || other_job->completed) {
> +                    continue;
> +                }
> +                ctx = bdrv_get_aio_context(other_job->bs);
> +                aio_context_acquire(ctx);
> +                block_job_cancel_sync(other_job);
> +                assert(other_job->completed);
> +                aio_context_release(ctx);
> +            }
> +            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> +                if (other_job->driver->txn_abort) {
> +                    other_job->driver->txn_abort(other_job);
> +                }
> +                other_job->cb(other_job->opaque,
> +                              other_job->ret ? : -ECANCELED);
> +                block_job_release(other_job->bs);
> +            }
> +        } else {
> +            /*
> +             * We are cancelled by another job, who will handle everything.
> +             */
> +            return;
> +        }
> +    } else {
> +        /*
> +         * Successful completion, see if there is other running jobs in this
> +         * txn. */

s/is/are/

> +        QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
> +            if (!other_job->completed) {
> +                return;
> +            }
> +        }

Clever (though quadratic!) :~)

> +        /* We are the last completed job, commit the transaction. */
> +        QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> +            if (other_job->driver->txn_commit) {
> +                other_job->driver->txn_commit(other_job);
> +            }
> +            assert(other_job->ret == 0);
> +            other_job->cb(other_job->opaque, 0);
> +            block_job_release(other_job->bs);

Worth creating some static local/inline routine and sharing it with
block_job_completed?

> +        }
> +    }
> +}
> +
>  void block_job_completed(BlockJob *job, int ret)
>  {
>      BlockDriverState *bs = job->bs;
> @@ -96,6 +159,10 @@ void block_job_completed(BlockJob *job, int ret)
>          qemu_bh_delete(job->defer_to_main_loop_data.bh);
>          job->defer_to_main_loop_data.bh = NULL;
>      }
> +    if (job->txn) {
> +        block_job_completed_txn(job->txn, job, ret);
> +        return;
> +    }
>      job->cb(job->opaque, ret);
>      block_job_release(bs);
>  }
> @@ -384,3 +451,22 @@ void block_job_defer_to_main_loop(BlockJob *job,
>  
>      qemu_bh_schedule(data->bh);
>  }
> +
> +BlockJobTxn *block_job_txn_new(void)
> +{
> +    BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
> +    QLIST_INIT(&txn->jobs);
> +    return txn;
> +}
> +
> +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
> +{
> +    if (!txn) {
> +        return;
> +    }
> +
> +    assert(!job->txn);
> +    job->txn = txn;
> +
> +    QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
> +}
> diff --git a/include/block/block.h b/include/block/block.h
> index 7437590..c7fc5b6 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -13,6 +13,7 @@
>  typedef struct BlockDriver BlockDriver;
>  typedef struct BlockJob BlockJob;
>  typedef struct BdrvChildRole BdrvChildRole;
> +typedef struct BlockJobTxn BlockJobTxn;
>  
>  typedef struct BlockDriverInfo {
>      /* in bytes, 0 if irrelevant */
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index 5bac2e2..d854ee0 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -157,6 +157,9 @@ struct BlockJob {
>       */
>      int ret;
>  
> +    /** Non-NULL if this job is part of a transaction */
> +    BlockJobTxn *txn;
> +    QLIST_ENTRY(BlockJob) txn_list;
>  };
>  
>  /**
> @@ -389,4 +392,27 @@ void block_job_defer_to_main_loop(BlockJob *job,
>                                    BlockJobDeferToMainLoopFn *fn,
>                                    void *opaque);
>  
> +/**
> + * block_job_txn_new:
> + *
> + * Allocate and return a new block job transaction.  Jobs can be added to the
> + * transaction using block_job_txn_add_job().
> + *
> + * All jobs in the transaction either complete successfully or fail/cancel as a
> + * group.  Jobs wait for each other before completing.  Cancelling one job
> + * cancels all jobs in the transaction.
> + */
> +BlockJobTxn *block_job_txn_new(void);
> +
> +/**
> + * block_job_txn_add_job:
> + * @txn: The transaction (may be NULL)
> + * @job: Job to add to the transaction
> + *
> + * Add @job to the transaction.  The @job must not already be in a transaction.
> + * The block job driver must call block_job_txn_prepare_to_complete() before
> + * final cleanup and completion.
> + */
> +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job);
> +
>  #endif
> 

This definitely combines the best of our approaches AND has less code to
boot. Great!

Nitpicks and comment spelling aside:
Reviewed-by: John Snow <jsnow@redhat.com>

--
(linguistic postscript: I don't know how to say "Stefan and I's patches"
because I think 'I' is incorrect here because the subject of the
sentence is the patches, not Stefan or myself. I think it's some variant
on "me" "my" or "mine," but can't quite work it out. English is hard, or
I'm stupid, or both.)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 12/15] block/backup: support block job transactions
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 12/15] block/backup: support block job transactions Fam Zheng
@ 2015-07-13 23:14   ` John Snow
  2015-07-14  3:12     ` Fam Zheng
  2015-07-14 10:32   ` Stefan Hajnoczi
  1 sibling, 1 reply; 35+ messages in thread
From: John Snow @ 2015-07-13 23:14 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, Markus Armbruster, Max Reitz, vsementsov,
	stefanha



On 07/09/2015 11:46 PM, Fam Zheng wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>
> 
> Join the transaction when the 'transactional-cancel' QMP argument is
> true.
> 
> This ensures that the sync bitmap is not thrown away if another block
> job in the transaction is cancelled or fails.  This is critical so
> incremental backup with multiple disks can be retried in case of
> cancellation/failure.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/backup.c            |  23 +++++++-
>  blockdev.c                | 139 ++++++++++++++++++++++++++++++++++++----------
>  hmp.c                     |   2 +-
>  include/block/block_int.h |   3 +-
>  qapi/block-core.json      |  14 ++++-
>  5 files changed, 147 insertions(+), 34 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 6e24384..4f0eb17 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -226,11 +226,29 @@ static void backup_handle_dirty_bitmap(BackupBlockJob *job, int ret)
>      }
>  }
>  
> +static void backup_txn_commit(BlockJob *job)
> +{
> +    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
> +    if (s->sync_bitmap) {
> +        backup_handle_dirty_bitmap(s, 0);
> +    }
> +}
> +
> +static void backup_txn_abort(BlockJob *job)
> +{
> +    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
> +    if (s->sync_bitmap) {
> +        backup_handle_dirty_bitmap(s, -1);
> +    }
> +}
> +

If you're going to check for sync_bitmap out here in these two functions
instead of inside backup_handle_dirty_bitmap, add an assertion into the
called function that job->sync_bitmap *will* be present.

>  static const BlockJobDriver backup_job_driver = {
>      .instance_size  = sizeof(BackupBlockJob),
>      .job_type       = BLOCK_JOB_TYPE_BACKUP,
>      .set_speed      = backup_set_speed,
>      .iostatus_reset = backup_iostatus_reset,
> +    .txn_commit     = backup_txn_commit,
> +    .txn_abort      = backup_txn_abort,
>  };
>  
>  static BlockErrorAction backup_error_action(BackupBlockJob *job,
> @@ -444,7 +462,7 @@ static void coroutine_fn backup_run(void *opaque)
>      qemu_co_rwlock_wrlock(&job->flush_rwlock);
>      qemu_co_rwlock_unlock(&job->flush_rwlock);
>  
> -    if (job->sync_bitmap) {
> +    if (!job->common.txn && job->sync_bitmap) {
>          backup_handle_dirty_bitmap(job, ret);
>      }

Can we add a little call to blockjobs, like:

bool block_cleanup_needed(BlockJob *job)
{
    return job->common.txn == NULL;
}

To make this status check look less magical?
Then, if the sync bitmap check is pushed into backup_handle, you can
just simply say:

if (blockjob_cleanup_needed(job)) {
    backup_handle_dirty_bitmap(job, ret);
}

which IMO is nicer.

>      hbitmap_free(job->bitmap);
> @@ -463,7 +481,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>                    BlockdevOnError on_source_error,
>                    BlockdevOnError on_target_error,
>                    BlockCompletionFunc *cb, void *opaque,
> -                  Error **errp)
> +                  BlockJobTxn *txn, Error **errp)
>  {
>      int64_t len;
>  
> @@ -545,6 +563,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>                         sync_bitmap : NULL;
>      job->common.len = len;
>      job->common.co = qemu_coroutine_create(backup_run);
> +    block_job_txn_add_job(txn, &job->common);

OK, so all backup jobs will participate in the transaction -- but they
can control this based on whether or not they pass forward the txn
parameter.

>      qemu_coroutine_enter(job->common.co, job);
>      return;
>  
> diff --git a/blockdev.c b/blockdev.c
> index 116e144..30ea52d 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1586,12 +1586,25 @@ typedef struct DriveBackupState {
>      BlockJob *job;
>  } DriveBackupState;
>  
> +static void do_drive_backup(const char *device, const char *target,
> +                            bool has_format, const char *format,
> +                            enum MirrorSyncMode sync,
> +                            bool has_mode, enum NewImageMode mode,
> +                            bool has_speed, int64_t speed,
> +                            bool has_bitmap, const char *bitmap,
> +                            bool has_on_source_error,
> +                            BlockdevOnError on_source_error,
> +                            bool has_on_target_error,
> +                            BlockdevOnError on_target_error,
> +                            BlockJobTxn *txn, Error **errp);
> +
>  static void drive_backup_prepare(BlkActionState *common, Error **errp)
>  {
>      DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
>      BlockDriverState *bs;
>      BlockBackend *blk;
>      DriveBackup *backup;
> +    BlockJobTxn *txn = NULL;
>      Error *local_err = NULL;
>  
>      assert(common->action->kind == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
> @@ -1609,15 +1622,20 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
>      state->aio_context = bdrv_get_aio_context(bs);
>      aio_context_acquire(state->aio_context);
>  
> -    qmp_drive_backup(backup->device, backup->target,
> -                     backup->has_format, backup->format,
> -                     backup->sync,
> -                     backup->has_mode, backup->mode,
> -                     backup->has_speed, backup->speed,
> -                     backup->has_bitmap, backup->bitmap,
> -                     backup->has_on_source_error, backup->on_source_error,
> -                     backup->has_on_target_error, backup->on_target_error,
> -                     &local_err);
> +    if (backup->has_transactional_cancel &&
> +        backup->transactional_cancel) {
> +        txn = common->block_job_txn;
> +    }
> +
> +    do_drive_backup(backup->device, backup->target,
> +                    backup->has_format, backup->format,
> +                    backup->sync,
> +                    backup->has_mode, backup->mode,
> +                    backup->has_speed, backup->speed,
> +                    backup->has_bitmap, backup->bitmap,
> +                    backup->has_on_source_error, backup->on_source_error,
> +                    backup->has_on_target_error, backup->on_target_error,
> +                    txn, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
> @@ -1654,12 +1672,22 @@ typedef struct BlockdevBackupState {
>      AioContext *aio_context;
>  } BlockdevBackupState;
>  
> +static void do_blockdev_backup(const char *device, const char *target,
> +                               enum MirrorSyncMode sync,
> +                               bool has_speed, int64_t speed,
> +                               bool has_on_source_error,
> +                               BlockdevOnError on_source_error,
> +                               bool has_on_target_error,
> +                               BlockdevOnError on_target_error,
> +                               BlockJobTxn *txn, Error **errp);
> +
>  static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
>  {
>      BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
>      BlockdevBackup *backup;
>      BlockDriverState *bs, *target;
>      BlockBackend *blk;
> +    BlockJobTxn *txn = NULL;
>      Error *local_err = NULL;
>  
>      assert(common->action->kind == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
> @@ -1688,12 +1716,17 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
>      }
>      aio_context_acquire(state->aio_context);
>  
> -    qmp_blockdev_backup(backup->device, backup->target,
> -                        backup->sync,
> -                        backup->has_speed, backup->speed,
> -                        backup->has_on_source_error, backup->on_source_error,
> -                        backup->has_on_target_error, backup->on_target_error,
> -                        &local_err);
> +    if (backup->has_transactional_cancel &&
> +        backup->transactional_cancel) {
> +        txn = common->block_job_txn;
> +    }
> +
> +    do_blockdev_backup(backup->device, backup->target,
> +                       backup->sync,
> +                       backup->has_speed, backup->speed,
> +                       backup->has_on_source_error, backup->on_source_error,
> +                       backup->has_on_target_error, backup->on_target_error,
> +                       txn, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
> @@ -2578,15 +2611,17 @@ out:
>      aio_context_release(aio_context);
>  }
>  
> -void qmp_drive_backup(const char *device, const char *target,
> -                      bool has_format, const char *format,
> -                      enum MirrorSyncMode sync,
> -                      bool has_mode, enum NewImageMode mode,
> -                      bool has_speed, int64_t speed,
> -                      bool has_bitmap, const char *bitmap,
> -                      bool has_on_source_error, BlockdevOnError on_source_error,
> -                      bool has_on_target_error, BlockdevOnError on_target_error,
> -                      Error **errp)
> +static void do_drive_backup(const char *device, const char *target,
> +                            bool has_format, const char *format,
> +                            enum MirrorSyncMode sync,
> +                            bool has_mode, enum NewImageMode mode,
> +                            bool has_speed, int64_t speed,
> +                            bool has_bitmap, const char *bitmap,
> +                            bool has_on_source_error,
> +                            BlockdevOnError on_source_error,
> +                            bool has_on_target_error,
> +                            BlockdevOnError on_target_error,
> +                            BlockJobTxn *txn, Error **errp)
>  {
>      BlockBackend *blk;
>      BlockDriverState *bs;
> @@ -2703,7 +2738,7 @@ void qmp_drive_backup(const char *device, const char *target,
>  
>      backup_start(bs, target_bs, speed, sync, bmap,
>                   on_source_error, on_target_error,
> -                 block_job_cb, bs, &local_err);
> +                 block_job_cb, bs, txn, &local_err);
>      if (local_err != NULL) {
>          bdrv_unref(target_bs);
>          error_propagate(errp, local_err);
> @@ -2714,19 +2749,44 @@ out:
>      aio_context_release(aio_context);
>  }
>  
> +void qmp_drive_backup(const char *device, const char *target,
> +                      bool has_format, const char *format,
> +                      enum MirrorSyncMode sync,
> +                      bool has_mode, enum NewImageMode mode,
> +                      bool has_speed, int64_t speed,
> +                      bool has_bitmap, const char *bitmap,
> +                      bool has_on_source_error, BlockdevOnError on_source_error,
> +                      bool has_on_target_error, BlockdevOnError on_target_error,
> +                      bool has_transactional_cancel, bool transactional_cancel,
> +                      Error **errp)
> +{
> +    if (has_transactional_cancel && transactional_cancel) {
> +        error_setg(errp, "Transactional cancel can only be used in the "
> +                   "'transaction' command");
> +        return;
> +    }
> +
> +    return do_drive_backup(device, target, has_format, format, sync,
> +                           has_mode, mode, has_speed, speed,
> +                           has_bitmap, bitmap,
> +                           has_on_source_error, on_source_error,
> +                           has_on_target_error, on_target_error,
> +                           NULL, errp);
> +}
> +
>  BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
>  {
>      return bdrv_named_nodes_list(errp);
>  }
>  
> -void qmp_blockdev_backup(const char *device, const char *target,
> +void do_blockdev_backup(const char *device, const char *target,
>                           enum MirrorSyncMode sync,
>                           bool has_speed, int64_t speed,
>                           bool has_on_source_error,
>                           BlockdevOnError on_source_error,
>                           bool has_on_target_error,
>                           BlockdevOnError on_target_error,
> -                         Error **errp)
> +                         BlockJobTxn *txn, Error **errp)
>  {
>      BlockBackend *blk;
>      BlockDriverState *bs;
> @@ -2764,7 +2824,7 @@ void qmp_blockdev_backup(const char *device, const char *target,
>      bdrv_ref(target_bs);
>      bdrv_set_aio_context(target_bs, aio_context);
>      backup_start(bs, target_bs, speed, sync, NULL, on_source_error,
> -                 on_target_error, block_job_cb, bs, &local_err);
> +                 on_target_error, block_job_cb, bs, txn, &local_err);
>      if (local_err != NULL) {
>          bdrv_unref(target_bs);
>          error_propagate(errp, local_err);
> @@ -2773,6 +2833,29 @@ out:
>      aio_context_release(aio_context);
>  }
>  
> +void qmp_blockdev_backup(const char *device, const char *target,
> +                         enum MirrorSyncMode sync,
> +                         bool has_speed, int64_t speed,
> +                         bool has_on_source_error,
> +                         BlockdevOnError on_source_error,
> +                         bool has_on_target_error,
> +                         BlockdevOnError on_target_error,
> +                         bool has_transactional_cancel,
> +                         bool transactional_cancel,
> +                         Error **errp)
> +{
> +    if (has_transactional_cancel && transactional_cancel) {
> +        error_setg(errp, "Transactional cancel can only be used in the "
> +                   "'transaction' command");
> +        return;
> +    }
> +
> +    do_blockdev_backup(device, target, sync, has_speed, speed,
> +                       has_on_source_error, on_source_error,
> +                       has_on_target_error, on_target_error,
> +                       NULL, errp);
> +}
> +
>  #define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
>  
>  void qmp_drive_mirror(const char *device, const char *target,
> diff --git a/hmp.c b/hmp.c
> index dcc66f1..f492965 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -1090,7 +1090,7 @@ void hmp_drive_backup(Monitor *mon, const QDict *qdict)
>      qmp_drive_backup(device, filename, !!format, format,
>                       full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
>                       true, mode, false, 0, false, NULL,
> -                     false, 0, false, 0, &err);
> +                     false, 0, false, 0, false, false, &err);
>      hmp_handle_error(mon, &err);
>  }
>  
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 6eb22c7..5382422 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -642,6 +642,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
>   * @on_target_error: The action to take upon error writing to the target.
>   * @cb: Completion function for the job.
>   * @opaque: Opaque pointer value passed to @cb.
> + * @txn: Transaction that this job is part of (may be NULL).
>   *
>   * Start a backup operation on @bs.  Clusters in @bs are written to @target
>   * until the job is cancelled or manually completed.
> @@ -652,7 +653,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>                    BlockdevOnError on_source_error,
>                    BlockdevOnError on_target_error,
>                    BlockCompletionFunc *cb, void *opaque,
> -                  Error **errp);
> +                  BlockJobTxn *txn, Error **errp);
>  
>  void blk_dev_change_media_cb(BlockBackend *blk, bool load);
>  bool blk_dev_has_removable_media(BlockBackend *blk);
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 7b2efb8..d5e33fd 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -736,6 +736,10 @@
>  #                   default 'report' (no limitations, since this applies to
>  #                   a different block device than @device).
>  #
> +# @transactional-cancel: #optional whether failure or cancellation of other
> +#                        block jobs with @transactional-cancel true causes the
> +#                        whole group to cancel.
> +#
>  # Note that @on-source-error and @on-target-error only affect background I/O.
>  # If an error occurs during a guest write request, the device's rerror/werror
>  # actions will be used.
> @@ -747,7 +751,8 @@
>              'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
>              '*speed': 'int', '*bitmap': 'str',
>              '*on-source-error': 'BlockdevOnError',
> -            '*on-target-error': 'BlockdevOnError' } }
> +            '*on-target-error': 'BlockdevOnError',
> +            '*transactional-cancel': 'bool' } }
>  
>  ##
>  # @BlockdevBackup
> @@ -771,6 +776,10 @@
>  #                   default 'report' (no limitations, since this applies to
>  #                   a different block device than @device).
>  #
> +# @transactional-cancel: #optional whether failure or cancellation of other
> +#                        block jobs with @transactional-cancel true causes the
> +#                        whole group to cancel.
> +#
>  # Note that @on-source-error and @on-target-error only affect background I/O.
>  # If an error occurs during a guest write request, the device's rerror/werror
>  # actions will be used.
> @@ -782,7 +791,8 @@
>              'sync': 'MirrorSyncMode',
>              '*speed': 'int',
>              '*on-source-error': 'BlockdevOnError',
> -            '*on-target-error': 'BlockdevOnError' } }
> +            '*on-target-error': 'BlockdevOnError',
> +            '*transactional-cancel': 'bool' } }
>  
>  ##
>  # @blockdev-snapshot-sync
> 

Thinking out loud:

I am /wondering/ if we shouldn't add "transactional-cancel" as an option
to /transaction itself/...

It seems slightly awkward to have transactional commands affixed to the
QMP commands, where one might want the option to apply to all actions.

Specifying the option for EACH action also seems slightly awkward, but
it does give you a very precise control.

I blathered to Eric Blake a little bit, and I was wondering if we
couldn't have "optional" and "mandatory" default-options for
transactions, where:

(A) Any mandatory transaction-wide options would propagate to all
actions below, which *must* support that option.
(B) Any optional transaction-wide options would propagate to all actions
below, but the action itself doesn't necessarily have to support it.
(C) The set of options the QMP commands take could be dictated by a base
structure and a derived child structure that had the optional
transaction arguments, for instance.

That way, you could:

- Specify a preference exactly once ("I want all transactions to fail
together, if possible.")
- But override it for specific actions, if desired, and
- Not have to add in explicit support for those options right away
thanks to an optional-obey mechanism.

Of course, this is a bit of an undertaking in the QAPI design arena and
perhaps it's not worth it if we don't expect to extend the transactional
mechanisms much in the future. Worth asking Markus and Eric, though,
before we add transactionally related parameters directly into specific
commands.

We might want to devise a more general purpose solution first.

--js

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 15/15] tests: add BlockJobTxn unit test
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 15/15] tests: add BlockJobTxn unit test Fam Zheng
@ 2015-07-13 23:14   ` John Snow
  0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-07-13 23:14 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, stefanha, Max Reitz



On 07/09/2015 11:46 PM, Fam Zheng wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>
> 
> The BlockJobTxn unit test verifies that both single jobs and pairs of
> jobs behave as a transaction group.  Either all jobs complete
> successfully or the group is cancelled.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/Makefile            |   3 +
>  tests/test-blockjob-txn.c | 228 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 231 insertions(+)
>  create mode 100644 tests/test-blockjob-txn.c
> 
> diff --git a/tests/Makefile b/tests/Makefile
> index 2cd1195..da79721 100644
> --- a/tests/Makefile
> +++ b/tests/Makefile
> @@ -45,6 +45,8 @@ check-unit-y += tests/test-thread-pool$(EXESUF)
>  gcov-files-test-thread-pool-y = thread-pool.c
>  gcov-files-test-hbitmap-y = util/hbitmap.c
>  check-unit-y += tests/test-hbitmap$(EXESUF)
> +gcov-files-test-hbitmap-y = blockjob.c
> +check-unit-y += tests/test-blockjob-txn$(EXESUF)
>  check-unit-y += tests/test-x86-cpuid$(EXESUF)
>  # all code tested by test-x86-cpuid is inside topology.h
>  gcov-files-test-x86-cpuid-y =
> @@ -286,6 +288,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
>  tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
>  tests/test-rfifolock$(EXESUF): tests/test-rfifolock.o libqemuutil.a libqemustub.a
>  tests/test-throttle$(EXESUF): tests/test-throttle.o $(block-obj-y) libqemuutil.a libqemustub.a
> +tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o $(block-obj-y) libqemuutil.a libqemustub.a
>  tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
>  tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
>  tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o libqemuutil.a libqemustub.a
> diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
> new file mode 100644
> index 0000000..62a7510
> --- /dev/null
> +++ b/tests/test-blockjob-txn.c
> @@ -0,0 +1,228 @@
> +/*
> + * Blockjob transactions tests
> + *
> + * Copyright Red Hat, Inc. 2015
> + *
> + * Authors:
> + *  Stefan Hajnoczi    <stefanha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
> + * See the COPYING.LIB file in the top-level directory.
> + */
> +
> +#include <glib.h>
> +#include "qapi/error.h"
> +#include "qemu/main-loop.h"
> +#include "block/blockjob.h"
> +
> +typedef struct {
> +    BlockJob common;
> +    unsigned int iterations;
> +    bool use_timer;
> +    int rc;
> +} TestBlockJob;
> +
> +static const BlockJobDriver test_block_job_driver = {
> +    .instance_size = sizeof(TestBlockJob),
> +};
> +
> +static void test_block_job_complete(BlockJob *job, void *opaque)
> +{
> +    BlockDriverState *bs = job->bs;
> +    int rc = (intptr_t)opaque;
> +
> +    if (block_job_is_cancelled(job)) {
> +        rc = -ECANCELED;
> +    }
> +
> +    block_job_completed(job, rc);
> +    bdrv_unref(bs);
> +}
> +
> +static void coroutine_fn test_block_job_run(void *opaque)
> +{
> +    TestBlockJob *s = opaque;
> +    BlockJob *job = &s->common;
> +
> +    while (s->iterations--) {
> +        if (s->use_timer) {
> +            block_job_sleep_ns(job, QEMU_CLOCK_REALTIME, 0);
> +        } else {
> +            block_job_yield(job);
> +        }
> +
> +        if (block_job_is_cancelled(job)) {
> +            break;
> +        }
> +    }
> +
> +    block_job_defer_to_main_loop(job, test_block_job_complete,
> +                                 (void *)(intptr_t)s->rc);
> +}
> +
> +static void test_block_job_cb(void *opaque, int ret)
> +{
> +    *(int *)opaque = ret;
> +}
> +
> +/* Create a block job that completes with a given return code after a given
> + * number of event loop iterations.  The return code is stored in the given
> + * result pointer.
> + *
> + * The event loop iterations can either be handled automatically with a 0 delay
> + * timer, or they can be stepped manually by entering the coroutine.
> + */
> +static BlockJob *test_block_job_start(unsigned int iterations,
> +                                      bool use_timer,
> +                                      int rc, int *result)
> +{
> +    BlockDriverState *bs;
> +    TestBlockJob *s;
> +
> +    bs = bdrv_new();
> +    s = block_job_create(&test_block_job_driver, bs, 0, test_block_job_cb,
> +                         result, &error_abort);
> +    s->iterations = iterations;
> +    s->use_timer = use_timer;
> +    s->rc = rc;
> +    s->common.co = qemu_coroutine_create(test_block_job_run);
> +    qemu_coroutine_enter(s->common.co, s);
> +    return &s->common;
> +}
> +
> +static void test_single_job(int expected)
> +{
> +    BlockJob *job;
> +    BlockJobTxn *txn;
> +    int result = -EINPROGRESS;
> +
> +    txn = block_job_txn_new();
> +    job = test_block_job_start(1, true, expected, &result);
> +    block_job_txn_add_job(txn, job);
> +
> +    if (expected == -ECANCELED) {
> +        block_job_cancel(job);
> +    }
> +
> +    while (result == -EINPROGRESS) {
> +        aio_poll(qemu_get_aio_context(), true);
> +    }
> +    g_assert_cmpint(result, ==, expected);
> +}
> +
> +static void test_single_job_success(void)
> +{
> +    test_single_job(0);
> +}
> +
> +static void test_single_job_failure(void)
> +{
> +    test_single_job(-EIO);
> +}
> +
> +static void test_single_job_cancel(void)
> +{
> +    test_single_job(-ECANCELED);
> +}
> +
> +static void test_pair_jobs(int expected1, int expected2)
> +{
> +    BlockJob *job1;
> +    BlockJob *job2;
> +    BlockJobTxn *txn;
> +    int result1 = -EINPROGRESS;
> +    int result2 = -EINPROGRESS;
> +
> +    txn = block_job_txn_new();
> +    job1 = test_block_job_start(1, true, expected1, &result1);
> +    block_job_txn_add_job(txn, job1);
> +    job2 = test_block_job_start(2, true, expected2, &result2);
> +    block_job_txn_add_job(txn, job2);
> +
> +    if (expected1 == -ECANCELED) {
> +        block_job_cancel(job1);
> +    }
> +    if (expected2 == -ECANCELED) {
> +        block_job_cancel(job2);
> +    }
> +
> +    while (result1 == -EINPROGRESS || result2 == -EINPROGRESS) {
> +        aio_poll(qemu_get_aio_context(), true);
> +    }
> +
> +    /* Failure or cancellation of one job cancels the other job */
> +    if (expected1 != 0) {
> +        expected2 = -ECANCELED;
> +    } else if (expected2 != 0) {
> +        expected1 = -ECANCELED;
> +    }
> +
> +    g_assert_cmpint(result1, ==, expected1);
> +    g_assert_cmpint(result2, ==, expected2);
> +}
> +
> +static void test_pair_jobs_success(void)
> +{
> +    test_pair_jobs(0, 0);
> +}
> +
> +static void test_pair_jobs_failure(void)
> +{
> +    /* Test both orderings.  The two jobs run for a different number of
> +     * iterations so the code path is different depending on which job fails
> +     * first.
> +     */
> +    test_pair_jobs(-EIO, 0);
> +    test_pair_jobs(0, -EIO);
> +}
> +
> +static void test_pair_jobs_cancel(void)
> +{
> +    test_pair_jobs(-ECANCELED, 0);
> +    test_pair_jobs(0, -ECANCELED);
> +}
> +
> +static void test_pair_jobs_fail_cancel_race(void)
> +{
> +    BlockJob *job1;
> +    BlockJob *job2;
> +    BlockJobTxn *txn;
> +    int result1 = -EINPROGRESS;
> +    int result2 = -EINPROGRESS;
> +
> +    txn = block_job_txn_new();
> +    job1 = test_block_job_start(1, true, -ECANCELED, &result1);
> +    block_job_txn_add_job(txn, job1);
> +    job2 = test_block_job_start(2, false, 0, &result2);
> +    block_job_txn_add_job(txn, job2);
> +
> +    block_job_cancel(job1);
> +
> +    /* Now make job2 finish before the main loop kicks jobs.  This means
> +     * simulates the race between a pending kick and another job completing.
> +     */
> +    block_job_enter(job2);
> +    block_job_enter(job2);
> +
> +    while (result1 == -EINPROGRESS || result2 == -EINPROGRESS) {
> +        aio_poll(qemu_get_aio_context(), true);
> +    }
> +
> +    g_assert_cmpint(result1, ==, -ECANCELED);
> +    g_assert_cmpint(result2, ==, -ECANCELED);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +    qemu_init_main_loop(&error_abort);
> +
> +    g_test_init(&argc, &argv, NULL);
> +    g_test_add_func("/single/success", test_single_job_success);
> +    g_test_add_func("/single/failure", test_single_job_failure);
> +    g_test_add_func("/single/cancel", test_single_job_cancel);
> +    g_test_add_func("/pair/success", test_pair_jobs_success);
> +    g_test_add_func("/pair/failure", test_pair_jobs_failure);
> +    g_test_add_func("/pair/cancel", test_pair_jobs_cancel);
> +    g_test_add_func("/pair/fail-cancel-race", test_pair_jobs_fail_cancel_race);
> +    return g_test_run();
> +}
> 

Reviewed-by: John Snow <jsnow@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 05/15] backup: Extract dirty bitmap handling as a separate function
  2015-07-13 23:06   ` John Snow
@ 2015-07-14  2:46     ` Fam Zheng
  0 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-14  2:46 UTC (permalink / raw)
  To: John Snow
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	stefanha

On Mon, 07/13 19:06, John Snow wrote:
> 
> 
> On 07/09/2015 11:46 PM, Fam Zheng wrote:
> > This will be reused by the coming new transactional completion code.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block/backup.c | 26 ++++++++++++++++----------
> >  1 file changed, 16 insertions(+), 10 deletions(-)
> > 
> > diff --git a/block/backup.c b/block/backup.c
> > index 965654d..6e24384 100644
> > --- a/block/backup.c
> > +++ b/block/backup.c
> > @@ -210,6 +210,21 @@ static void backup_iostatus_reset(BlockJob *job)
> >  
> >      bdrv_iostatus_reset(s->target);
> >  }
> > +static void backup_handle_dirty_bitmap(BackupBlockJob *job, int ret)
> > +{
> > +    BdrvDirtyBitmap *bm;
> > +    BlockDriverState *bs = job->common.bs;
> > +
> > +    if (ret < 0 || block_job_is_cancelled(&job->common)) {
> > +        /* Merge the successor back into the parent, delete nothing. */
> > +        bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
> > +        assert(bm);
> > +    } else {
> > +        /* Everything is fine, delete this bitmap and install the backup. */
> > +        bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
> > +        assert(bm);
> > +    }
> > +}
> >  
> >  static const BlockJobDriver backup_job_driver = {
> >      .instance_size  = sizeof(BackupBlockJob),
> > @@ -430,16 +445,7 @@ static void coroutine_fn backup_run(void *opaque)
> >      qemu_co_rwlock_unlock(&job->flush_rwlock);
> >  
> >      if (job->sync_bitmap) {
> > -        BdrvDirtyBitmap *bm;
> > -        if (ret < 0 || block_job_is_cancelled(&job->common)) {
> > -            /* Merge the successor back into the parent, delete nothing. */
> > -            bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
> > -            assert(bm);
> > -        } else {
> > -            /* Everything is fine, delete this bitmap and install the backup. */
> > -            bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
> > -            assert(bm);
> > -        }
> > +        backup_handle_dirty_bitmap(job, ret);
> >      }
> >      hbitmap_free(job->bitmap);
> >  
> > 
> 
> Bike-shedding: strange name, I may have used 'cleanup' or 'finalize' or
> so, but that's neither here nor there.

Good suggestion, so I'll change to "cleanup" while keeping your reviewed-by,
ok? :)

Fam

> 
> Reviewed-by: John Snow <jsnow@redhat.com>
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 10/15] block: add block job transactions
  2015-07-13 23:12   ` John Snow
@ 2015-07-14  3:04     ` Fam Zheng
  2015-07-14 15:05       ` John Snow
  0 siblings, 1 reply; 35+ messages in thread
From: Fam Zheng @ 2015-07-14  3:04 UTC (permalink / raw)
  To: John Snow
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	stefanha

On Mon, 07/13 19:12, John Snow wrote:
> 
> 
> On 07/09/2015 11:46 PM, Fam Zheng wrote:
> > From: Stefan Hajnoczi <stefanha@redhat.com>
> > 
> > Sometimes block jobs must execute as a transaction group.  Finishing
> > jobs wait until all other jobs are ready to complete successfully.
> > Failure or cancellation of one job cancels the other jobs in the group.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > [Rewrite the implementation which is now contained in block_job_completed.
> > --Fam]
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  blockjob.c               | 86 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/block/block.h    |  1 +
> >  include/block/blockjob.h | 26 +++++++++++++++
> >  3 files changed, 113 insertions(+)
> > 
> > diff --git a/blockjob.c b/blockjob.c
> > index e057dd5..7b59b53 100644
> > --- a/blockjob.c
> > +++ b/blockjob.c
> > @@ -36,6 +36,16 @@
> >  #include "qemu/timer.h"
> >  #include "qapi-event.h"
> >  
> > +/* Transactional group of block jobs */
> > +struct BlockJobTxn {
> > +
> > +    /* Is this txn being cancelled? */
> > +    bool aborting;
> > +
> > +    /* List of jobs */
> > +    QLIST_HEAD(, BlockJob) jobs;
> > +};
> > +
> >  void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
> >                         int64_t speed, BlockCompletionFunc *cb,
> >                         void *opaque, Error **errp)
> > @@ -84,6 +94,59 @@ void block_job_release(BlockDriverState *bs)
> >      g_free(job);
> >  }
> >  
> > +static void block_job_completed_txn(BlockJobTxn *txn, BlockJob *job, int ret)
> > +{
> > +    AioContext *ctx;
> > +    BlockJob *other_job, *next;
> > +    if (ret < 0 || block_job_is_cancelled(job)) {
> > +        if (!txn->aborting) {
> > +            /* We are the first failed job. Cancel other jobs. */
> 
> What guarantee against a race is there at this point? Something to do
> with deferring back to the main loop before we call completion?

This is always in the main thread, so there can't be any race.
> 
> Do jobs always complete in the main thread?

Yes.

> 
> > +            txn->aborting = true;
> > +            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> > +                if (other_job == job || other_job->completed) {
> > +                    continue;
> > +                }
> > +                ctx = bdrv_get_aio_context(other_job->bs);
> > +                aio_context_acquire(ctx);
> > +                block_job_cancel_sync(other_job);
> > +                assert(other_job->completed);
> > +                aio_context_release(ctx);
> > +            }
> > +            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> > +                if (other_job->driver->txn_abort) {
> > +                    other_job->driver->txn_abort(other_job);
> > +                }
> > +                other_job->cb(other_job->opaque,
> > +                              other_job->ret ? : -ECANCELED);
> > +                block_job_release(other_job->bs);
> > +            }
> > +        } else {
> > +            /*
> > +             * We are cancelled by another job, who will handle everything.
> > +             */
> > +            return;
> > +        }
> > +    } else {
> > +        /*
> > +         * Successful completion, see if there is other running jobs in this
> > +         * txn. */
> 
> s/is/are/

OK.

> 
> > +        QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
> > +            if (!other_job->completed) {
> > +                return;
> > +            }
> > +        }
> 
> Clever (though quadratic!) :~)

If someone suggest there could be 1000+ jobs in a txn, I can add a counter.

> 
> > +        /* We are the last completed job, commit the transaction. */
> > +        QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> > +            if (other_job->driver->txn_commit) {
> > +                other_job->driver->txn_commit(other_job);
> > +            }
> > +            assert(other_job->ret == 0);
> > +            other_job->cb(other_job->opaque, 0);
> > +            block_job_release(other_job->bs);
> 
> Worth creating some static local/inline routine and sharing it with
> block_job_completed?

Mayb when a 3rd common line comes?
> 
> > +        }
> > +    }
> > +}
> > +
> >  void block_job_completed(BlockJob *job, int ret)
> >  {
> >      BlockDriverState *bs = job->bs;
> > @@ -96,6 +159,10 @@ void block_job_completed(BlockJob *job, int ret)
> >          qemu_bh_delete(job->defer_to_main_loop_data.bh);
> >          job->defer_to_main_loop_data.bh = NULL;
> >      }
> > +    if (job->txn) {
> > +        block_job_completed_txn(job->txn, job, ret);
> > +        return;
> > +    }
> >      job->cb(job->opaque, ret);
> >      block_job_release(bs);
> >  }
> > @@ -384,3 +451,22 @@ void block_job_defer_to_main_loop(BlockJob *job,
> >  
> >      qemu_bh_schedule(data->bh);
> >  }
> > +
> > +BlockJobTxn *block_job_txn_new(void)
> > +{
> > +    BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
> > +    QLIST_INIT(&txn->jobs);
> > +    return txn;
> > +}
> > +
> > +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
> > +{
> > +    if (!txn) {
> > +        return;
> > +    }
> > +
> > +    assert(!job->txn);
> > +    job->txn = txn;
> > +
> > +    QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
> > +}
> > diff --git a/include/block/block.h b/include/block/block.h
> > index 7437590..c7fc5b6 100644
> > --- a/include/block/block.h
> > +++ b/include/block/block.h
> > @@ -13,6 +13,7 @@
> >  typedef struct BlockDriver BlockDriver;
> >  typedef struct BlockJob BlockJob;
> >  typedef struct BdrvChildRole BdrvChildRole;
> > +typedef struct BlockJobTxn BlockJobTxn;
> >  
> >  typedef struct BlockDriverInfo {
> >      /* in bytes, 0 if irrelevant */
> > diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> > index 5bac2e2..d854ee0 100644
> > --- a/include/block/blockjob.h
> > +++ b/include/block/blockjob.h
> > @@ -157,6 +157,9 @@ struct BlockJob {
> >       */
> >      int ret;
> >  
> > +    /** Non-NULL if this job is part of a transaction */
> > +    BlockJobTxn *txn;
> > +    QLIST_ENTRY(BlockJob) txn_list;
> >  };
> >  
> >  /**
> > @@ -389,4 +392,27 @@ void block_job_defer_to_main_loop(BlockJob *job,
> >                                    BlockJobDeferToMainLoopFn *fn,
> >                                    void *opaque);
> >  
> > +/**
> > + * block_job_txn_new:
> > + *
> > + * Allocate and return a new block job transaction.  Jobs can be added to the
> > + * transaction using block_job_txn_add_job().
> > + *
> > + * All jobs in the transaction either complete successfully or fail/cancel as a
> > + * group.  Jobs wait for each other before completing.  Cancelling one job
> > + * cancels all jobs in the transaction.
> > + */
> > +BlockJobTxn *block_job_txn_new(void);
> > +
> > +/**
> > + * block_job_txn_add_job:
> > + * @txn: The transaction (may be NULL)
> > + * @job: Job to add to the transaction
> > + *
> > + * Add @job to the transaction.  The @job must not already be in a transaction.
> > + * The block job driver must call block_job_txn_prepare_to_complete() before
> > + * final cleanup and completion.
> > + */
> > +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job);
> > +
> >  #endif
> > 
> 
> This definitely combines the best of our approaches AND has less code to
> boot. Great!
> 
> Nitpicks and comment spelling aside:
> Reviewed-by: John Snow <jsnow@redhat.com>

Thanks!

> 
> --
> (linguistic postscript: I don't know how to say "Stefan and I's patches"
> because I think 'I' is incorrect here because the subject of the
> sentence is the patches, not Stefan or myself. I think it's some variant
> on "me" "my" or "mine," but can't quite work it out. English is hard, or
> I'm stupid, or both.)
> 

LOL!

Fam

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 12/15] block/backup: support block job transactions
  2015-07-13 23:14   ` John Snow
@ 2015-07-14  3:12     ` Fam Zheng
  0 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-14  3:12 UTC (permalink / raw)
  To: John Snow
  Cc: Kevin Wolf, Markus Armbruster, Jeff Cody, qemu-devel, Max Reitz,
	vsementsov, stefanha

On Mon, 07/13 19:14, John Snow wrote:
> > +static void backup_txn_commit(BlockJob *job)
> > +{
> > +    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
> > +    if (s->sync_bitmap) {
> > +        backup_handle_dirty_bitmap(s, 0);
> > +    }
> > +}
> > +
> > +static void backup_txn_abort(BlockJob *job)
> > +{
> > +    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
> > +    if (s->sync_bitmap) {
> > +        backup_handle_dirty_bitmap(s, -1);
> > +    }
> > +}
> > +
> 
> If you're going to check for sync_bitmap out here in these two functions
> instead of inside backup_handle_dirty_bitmap, add an assertion into the
> called function that job->sync_bitmap *will* be present.

The assertion is not really necessary because we'll get a seg fault if it's
NULL.

> 
> >  static const BlockJobDriver backup_job_driver = {
> >      .instance_size  = sizeof(BackupBlockJob),
> >      .job_type       = BLOCK_JOB_TYPE_BACKUP,
> >      .set_speed      = backup_set_speed,
> >      .iostatus_reset = backup_iostatus_reset,
> > +    .txn_commit     = backup_txn_commit,
> > +    .txn_abort      = backup_txn_abort,
> >  };
> >  
> >  static BlockErrorAction backup_error_action(BackupBlockJob *job,
> > @@ -444,7 +462,7 @@ static void coroutine_fn backup_run(void *opaque)
> >      qemu_co_rwlock_wrlock(&job->flush_rwlock);
> >      qemu_co_rwlock_unlock(&job->flush_rwlock);
> >  
> > -    if (job->sync_bitmap) {
> > +    if (!job->common.txn && job->sync_bitmap) {
> >          backup_handle_dirty_bitmap(job, ret);
> >      }
> 
> Can we add a little call to blockjobs, like:
> 
> bool block_cleanup_needed(BlockJob *job)
> {
>     return job->common.txn == NULL;
> }
> 
> To make this status check look less magical?
> Then, if the sync bitmap check is pushed into backup_handle, you can
> just simply say:
> 
> if (blockjob_cleanup_needed(job)) {
>     backup_handle_dirty_bitmap(job, ret);
> }
> 
> which IMO is nicer.

Good idea.

> 
> >      hbitmap_free(job->bitmap);
> > @@ -463,7 +481,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
> >                    BlockdevOnError on_source_error,
> >                    BlockdevOnError on_target_error,
> >                    BlockCompletionFunc *cb, void *opaque,
> > -                  Error **errp)
> > +                  BlockJobTxn *txn, Error **errp)
> >  {
> >      int64_t len;
> >  
> > @@ -545,6 +563,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
> >                         sync_bitmap : NULL;
> >      job->common.len = len;
> >      job->common.co = qemu_coroutine_create(backup_run);
> > +    block_job_txn_add_job(txn, &job->common);
> 
> OK, so all backup jobs will participate in the transaction -- but they
> can control this based on whether or not they pass forward the txn
> parameter.

It's a NOP if txn is NULL.

> 
> >      qemu_coroutine_enter(job->common.co, job);
> >      return;
> >  

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 05/15] backup: Extract dirty bitmap handling as a separate function
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 05/15] backup: Extract dirty bitmap handling as a separate function Fam Zheng
  2015-07-13 23:06   ` John Snow
@ 2015-07-14  8:26   ` Stefan Hajnoczi
  1 sibling, 0 replies; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-07-14  8:26 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	John Snow

[-- Attachment #1: Type: text/plain, Size: 790 bytes --]

On Fri, Jul 10, 2015 at 11:46:42AM +0800, Fam Zheng wrote:
> This will be reused by the coming new transactional completion code.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/backup.c | 26 ++++++++++++++++----------
>  1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 965654d..6e24384 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -210,6 +210,21 @@ static void backup_iostatus_reset(BlockJob *job)
>  
>      bdrv_iostatus_reset(s->target);
>  }
> +static void backup_handle_dirty_bitmap(BackupBlockJob *job, int ret)

"Dirty bitmap" is a general term.  This block job uses dirty bitmaps
besides the sync_bitmap.  I suggest making the name clearer:

backup_cleanup_sync_bitmap()

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 06/15] blockjob: Add .txn_commit and .txn_abort transaction actions
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 06/15] blockjob: Add .txn_commit and .txn_abort transaction actions Fam Zheng
  2015-07-13 23:06   ` John Snow
@ 2015-07-14  8:35   ` Stefan Hajnoczi
  2015-07-14  9:26     ` Fam Zheng
  1 sibling, 1 reply; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-07-14  8:35 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	John Snow

[-- Attachment #1: Type: text/plain, Size: 1358 bytes --]

On Fri, Jul 10, 2015 at 11:46:43AM +0800, Fam Zheng wrote:
> They will be called if the job is part of a transaction, after all jobs in a
> transaction are completed or cancelled, before calling job->cb().
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  include/block/blockjob.h | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index dd9d5e6..a7b7f66 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -50,6 +50,18 @@ typedef struct BlockJobDriver {
>       * manually.
>       */
>      void (*complete)(BlockJob *job, Error **errp);
> +
> +    /**
> +     * Optional callback for job types that can be in a transaction. Called
> +     * when the transaction succeeds.
> +     */
> +    void (*txn_commit)(BlockJob *job);
> +
> +    /**
> +     * Optional callback for job types that can be in a transaction. Call when
> +     * the transaction fails.
> +     */
> +    void (*txn_abort)(BlockJob *job);
>  } BlockJobDriver;

The semantics of transactions aren't fully documented.  My understanding
is that you want:
1. either .txn_commit() or .txn_abort() to be called
2. after all jobs complete (due to success, error, or cancellation).

These two points are important for understanding these interfaces.

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 06/15] blockjob: Add .txn_commit and .txn_abort transaction actions
  2015-07-14  8:35   ` Stefan Hajnoczi
@ 2015-07-14  9:26     ` Fam Zheng
  0 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-14  9:26 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	John Snow

On Tue, 07/14 09:35, Stefan Hajnoczi wrote:
> On Fri, Jul 10, 2015 at 11:46:43AM +0800, Fam Zheng wrote:
> > They will be called if the job is part of a transaction, after all jobs in a
> > transaction are completed or cancelled, before calling job->cb().
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  include/block/blockjob.h | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> > index dd9d5e6..a7b7f66 100644
> > --- a/include/block/blockjob.h
> > +++ b/include/block/blockjob.h
> > @@ -50,6 +50,18 @@ typedef struct BlockJobDriver {
> >       * manually.
> >       */
> >      void (*complete)(BlockJob *job, Error **errp);
> > +
> > +    /**
> > +     * Optional callback for job types that can be in a transaction. Called
> > +     * when the transaction succeeds.
> > +     */
> > +    void (*txn_commit)(BlockJob *job);
> > +
> > +    /**
> > +     * Optional callback for job types that can be in a transaction. Call when
> > +     * the transaction fails.
> > +     */
> > +    void (*txn_abort)(BlockJob *job);
> >  } BlockJobDriver;
> 
> The semantics of transactions aren't fully documented.  My understanding
> is that you want:
> 1. either .txn_commit() or .txn_abort() to be called
> 2. after all jobs complete (due to success, error, or cancellation).
> 
> These two points are important for understanding these interfaces.

Yes, I'll add document these.

Fam

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 09/15] blockjob: Move BlockJobDeferToMainLoopData into BlockJob
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 09/15] blockjob: Move BlockJobDeferToMainLoopData into BlockJob Fam Zheng
@ 2015-07-14 10:03   ` Stefan Hajnoczi
  2015-07-14 10:36     ` Fam Zheng
  0 siblings, 1 reply; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-07-14 10:03 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	John Snow

[-- Attachment #1: Type: text/plain, Size: 2415 bytes --]

On Fri, Jul 10, 2015 at 11:46:46AM +0800, Fam Zheng wrote:
> diff --git a/blockjob.c b/blockjob.c
> index 11b48f5..e057dd5 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -92,6 +92,10 @@ void block_job_completed(BlockJob *job, int ret)
>      assert(!job->completed);
>      job->completed = true;
>      job->ret = ret;
> +    if (job->defer_to_main_loop_data.bh) {
> +        qemu_bh_delete(job->defer_to_main_loop_data.bh);
> +        job->defer_to_main_loop_data.bh = NULL;
> +    }
>      job->cb(job->opaque, ret);
>      block_job_release(bs);
>  }

This doesn't make sense to me:

1. This function is called from a defer_to_main_loop BH so
   job->defer_to_main_loop_data.bh should already be running here.

   In fact, we've already called qemu_bh_delete() in
   block_job_defer_to_main_loop_bh().  This call is pointless but you
   could change it to an assertion and assign bh = NULL in
   block_job_defer_to_main_loop_bh().

2. If there was a BH scheduled, why is it safe to silently delete it?
   Wouldn't the caller rely on the BH code executing to clean up, for
   example?

If you drop this if statement, is it necessary to move
BlockJobDeferToMainLoopData into BlockJob at all?  Maybe you can just
drop this patch.

> @@ -344,44 +348,36 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
>      return action;
>  }
>  
> -typedef struct {
> -    BlockJob *job;
> -    QEMUBH *bh;
> -    AioContext *aio_context;
> -    BlockJobDeferToMainLoopFn *fn;
> -    void *opaque;
> -} BlockJobDeferToMainLoopData;
> -
>  static void block_job_defer_to_main_loop_bh(void *opaque)
>  {
> -    BlockJobDeferToMainLoopData *data = opaque;
> +    BlockJob *job = opaque;
> +    /* Copy the struct in case job get released in data.fn() */
> +    BlockJobDeferToMainLoopData data = job->defer_to_main_loop_data;

A comment is warranted since this access is only safe due to the
following:

The job may still be executing in another thread (the AioContext hasn't
been acquired by us yet), but job->defer_to_main_loop_data isn't
modified by the other thread after the qemu_bh_schedule() call.

Additionally, the atomic_xchg memory barriers in qemu_bh_schedule() and
aio_bh_poll() to ensure that this thread sees the latest
job->defer_to_main_loop_data data.

Access to other job fields would probably not be safe here!

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 10/15] block: add block job transactions
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 10/15] block: add block job transactions Fam Zheng
  2015-07-13 23:12   ` John Snow
@ 2015-07-14 10:27   ` Stefan Hajnoczi
  1 sibling, 0 replies; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-07-14 10:27 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	John Snow

[-- Attachment #1: Type: text/plain, Size: 3831 bytes --]

On Fri, Jul 10, 2015 at 11:46:47AM +0800, Fam Zheng wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>

Please take authorship, most of the code is yours.

> +static void block_job_completed_txn(BlockJobTxn *txn, BlockJob *job, int ret)
> +{
> +    AioContext *ctx;
> +    BlockJob *other_job, *next;
> +    if (ret < 0 || block_job_is_cancelled(job)) {
> +        if (!txn->aborting) {
> +            /* We are the first failed job. Cancel other jobs. */
> +            txn->aborting = true;
> +            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> +                if (other_job == job || other_job->completed) {

You didn't leave any clues as to why other_job->complete is safe to
access outside its AioContext.

Perhaps because it's only modified from the main loop and we are in the
main loop too?

Please either access it inside the AioContext or add a comment
explaining why it is safe.

> +                    continue;
> +                }
> +                ctx = bdrv_get_aio_context(other_job->bs);
> +                aio_context_acquire(ctx);
> +                block_job_cancel_sync(other_job);
> +                assert(other_job->completed);
> +                aio_context_release(ctx);
> +            }
> +            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> +                if (other_job->driver->txn_abort) {
> +                    other_job->driver->txn_abort(other_job);
> +                }
> +                other_job->cb(other_job->opaque,
> +                              other_job->ret ? : -ECANCELED);

Please don't add ECANCELED here since block_job_completed() doesn't do
that either.  We should be consistent: currently cb() needs to check
block_job_is_cancelled() to determine whether the job was cancelled.

Also, if a job finished but another job aborts, then we need to mark the
first job as cancelled (other_job->cancelled == true).

> +                block_job_release(other_job->bs);
> +            }

No AioContext acquire/release in this loop?  Remember the job's bs can
still be accessed from another thread while we're running.

> +        } else {
> +            /*
> +             * We are cancelled by another job, who will handle everything.
> +             */
> +            return;
> +        }
> +    } else {
> +        /*
> +         * Successful completion, see if there is other running jobs in this

s/is/are/ (plural)

> +         * txn. */
> +        QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
> +            if (!other_job->completed) {
> +                return;
> +            }
> +        }
> +        /* We are the last completed job, commit the transaction. */
> +        QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> +            if (other_job->driver->txn_commit) {
> +                other_job->driver->txn_commit(other_job);
> +            }
> +            assert(other_job->ret == 0);
> +            other_job->cb(other_job->opaque, 0);
> +            block_job_release(other_job->bs);
> +        }

More other_job field accesses outside AioContext that need to be checked
and probably protected using acquire/release.

> +/**
> + * block_job_txn_new:
> + *
> + * Allocate and return a new block job transaction.  Jobs can be added to the
> + * transaction using block_job_txn_add_job().
> + *
> + * All jobs in the transaction either complete successfully or fail/cancel as a
> + * group.  Jobs wait for each other before completing.  Cancelling one job
> + * cancels all jobs in the transaction.
> + */
> +BlockJobTxn *block_job_txn_new(void);

The doc comments says "Allocate and return a new block job transaction"
but does not explain how/when the object is freed.  Please add a
sentence along the lines of: The transaction is automatically freed when
the last job completes or is cancelled.

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 12/15] block/backup: support block job transactions
  2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 12/15] block/backup: support block job transactions Fam Zheng
  2015-07-13 23:14   ` John Snow
@ 2015-07-14 10:32   ` Stefan Hajnoczi
  1 sibling, 0 replies; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-07-14 10:32 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	John Snow

[-- Attachment #1: Type: text/plain, Size: 2736 bytes --]

On Fri, Jul 10, 2015 at 11:46:49AM +0800, Fam Zheng wrote:
>  static BlockErrorAction backup_error_action(BackupBlockJob *job,
> @@ -444,7 +462,7 @@ static void coroutine_fn backup_run(void *opaque)
>      qemu_co_rwlock_wrlock(&job->flush_rwlock);
>      qemu_co_rwlock_unlock(&job->flush_rwlock);
>  
> -    if (job->sync_bitmap) {
> +    if (!job->common.txn && job->sync_bitmap) {
>          backup_handle_dirty_bitmap(job, ret);
>      }
>      hbitmap_free(job->bitmap);

It would be nice if the core blockjob code called commit/abort even when
there is no txn object.  That way we can avoid special case code

> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 7b2efb8..d5e33fd 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -736,6 +736,10 @@
>  #                   default 'report' (no limitations, since this applies to
>  #                   a different block device than @device).
>  #
> +# @transactional-cancel: #optional whether failure or cancellation of other
> +#                        block jobs with @transactional-cancel true causes the
> +#                        whole group to cancel.
> +#
>  # Note that @on-source-error and @on-target-error only affect background I/O.
>  # If an error occurs during a guest write request, the device's rerror/werror
>  # actions will be used.
> @@ -747,7 +751,8 @@
>              'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
>              '*speed': 'int', '*bitmap': 'str',
>              '*on-source-error': 'BlockdevOnError',
> -            '*on-target-error': 'BlockdevOnError' } }
> +            '*on-target-error': 'BlockdevOnError',
> +            '*transactional-cancel': 'bool' } }
>  
>  ##
>  # @BlockdevBackup
> @@ -771,6 +776,10 @@
>  #                   default 'report' (no limitations, since this applies to
>  #                   a different block device than @device).
>  #
> +# @transactional-cancel: #optional whether failure or cancellation of other
> +#                        block jobs with @transactional-cancel true causes the
> +#                        whole group to cancel.
> +#
>  # Note that @on-source-error and @on-target-error only affect background I/O.
>  # If an error occurs during a guest write request, the device's rerror/werror
>  # actions will be used.
> @@ -782,7 +791,8 @@
>              'sync': 'MirrorSyncMode',
>              '*speed': 'int',
>              '*on-source-error': 'BlockdevOnError',
> -            '*on-target-error': 'BlockdevOnError' } }
> +            '*on-target-error': 'BlockdevOnError',
> +            '*transactional-cancel': 'bool' } }

The doc comments are missing (Since: 2.5).  My fault, sorry!

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 09/15] blockjob: Move BlockJobDeferToMainLoopData into BlockJob
  2015-07-14 10:03   ` Stefan Hajnoczi
@ 2015-07-14 10:36     ` Fam Zheng
  0 siblings, 0 replies; 35+ messages in thread
From: Fam Zheng @ 2015-07-14 10:36 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	John Snow

On Tue, 07/14 11:03, Stefan Hajnoczi wrote:
> On Fri, Jul 10, 2015 at 11:46:46AM +0800, Fam Zheng wrote:
> > diff --git a/blockjob.c b/blockjob.c
> > index 11b48f5..e057dd5 100644
> > --- a/blockjob.c
> > +++ b/blockjob.c
> > @@ -92,6 +92,10 @@ void block_job_completed(BlockJob *job, int ret)
> >      assert(!job->completed);
> >      job->completed = true;
> >      job->ret = ret;
> > +    if (job->defer_to_main_loop_data.bh) {
> > +        qemu_bh_delete(job->defer_to_main_loop_data.bh);
> > +        job->defer_to_main_loop_data.bh = NULL;
> > +    }
> >      job->cb(job->opaque, ret);
> >      block_job_release(bs);
> >  }
> 
> This doesn't make sense to me:
> 
> 1. This function is called from a defer_to_main_loop BH so
>    job->defer_to_main_loop_data.bh should already be running here.
> 
>    In fact, we've already called qemu_bh_delete() in
>    block_job_defer_to_main_loop_bh().  This call is pointless but you
>    could change it to an assertion and assign bh = NULL in
>    block_job_defer_to_main_loop_bh().
> 
> 2. If there was a BH scheduled, why is it safe to silently delete it?
>    Wouldn't the caller rely on the BH code executing to clean up, for
>    example?
> 
> If you drop this if statement, is it necessary to move
> BlockJobDeferToMainLoopData into BlockJob at all?  Maybe you can just
> drop this patch.

You're right, I agree this patch is superfluous and can be dropped.

Fam

> 
> > @@ -344,44 +348,36 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
> >      return action;
> >  }
> >  
> > -typedef struct {
> > -    BlockJob *job;
> > -    QEMUBH *bh;
> > -    AioContext *aio_context;
> > -    BlockJobDeferToMainLoopFn *fn;
> > -    void *opaque;
> > -} BlockJobDeferToMainLoopData;
> > -
> >  static void block_job_defer_to_main_loop_bh(void *opaque)
> >  {
> > -    BlockJobDeferToMainLoopData *data = opaque;
> > +    BlockJob *job = opaque;
> > +    /* Copy the struct in case job get released in data.fn() */
> > +    BlockJobDeferToMainLoopData data = job->defer_to_main_loop_data;
> 
> A comment is warranted since this access is only safe due to the
> following:
> 
> The job may still be executing in another thread (the AioContext hasn't
> been acquired by us yet), but job->defer_to_main_loop_data isn't
> modified by the other thread after the qemu_bh_schedule() call.
> 
> Additionally, the atomic_xchg memory barriers in qemu_bh_schedule() and
> aio_bh_poll() to ensure that this thread sees the latest
> job->defer_to_main_loop_data data.
> 
> Access to other job fields would probably not be safe here!

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v3 10/15] block: add block job transactions
  2015-07-14  3:04     ` Fam Zheng
@ 2015-07-14 15:05       ` John Snow
  0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-07-14 15:05 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, Max Reitz, vsementsov,
	stefanha



On 07/13/2015 11:04 PM, Fam Zheng wrote:
> On Mon, 07/13 19:12, John Snow wrote:
>>
>>
>> On 07/09/2015 11:46 PM, Fam Zheng wrote:
>>> From: Stefan Hajnoczi <stefanha@redhat.com>
>>>
>>> Sometimes block jobs must execute as a transaction group.  Finishing
>>> jobs wait until all other jobs are ready to complete successfully.
>>> Failure or cancellation of one job cancels the other jobs in the group.
>>>
>>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>>> [Rewrite the implementation which is now contained in block_job_completed.
>>> --Fam]
>>> Signed-off-by: Fam Zheng <famz@redhat.com>
>>> ---
>>>  blockjob.c               | 86 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>  include/block/block.h    |  1 +
>>>  include/block/blockjob.h | 26 +++++++++++++++
>>>  3 files changed, 113 insertions(+)
>>>
>>> diff --git a/blockjob.c b/blockjob.c
>>> index e057dd5..7b59b53 100644
>>> --- a/blockjob.c
>>> +++ b/blockjob.c
>>> @@ -36,6 +36,16 @@
>>>  #include "qemu/timer.h"
>>>  #include "qapi-event.h"
>>>  
>>> +/* Transactional group of block jobs */
>>> +struct BlockJobTxn {
>>> +
>>> +    /* Is this txn being cancelled? */
>>> +    bool aborting;
>>> +
>>> +    /* List of jobs */
>>> +    QLIST_HEAD(, BlockJob) jobs;
>>> +};
>>> +
>>>  void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
>>>                         int64_t speed, BlockCompletionFunc *cb,
>>>                         void *opaque, Error **errp)
>>> @@ -84,6 +94,59 @@ void block_job_release(BlockDriverState *bs)
>>>      g_free(job);
>>>  }
>>>  
>>> +static void block_job_completed_txn(BlockJobTxn *txn, BlockJob *job, int ret)
>>> +{
>>> +    AioContext *ctx;
>>> +    BlockJob *other_job, *next;
>>> +    if (ret < 0 || block_job_is_cancelled(job)) {
>>> +        if (!txn->aborting) {
>>> +            /* We are the first failed job. Cancel other jobs. */
>>
>> What guarantee against a race is there at this point? Something to do
>> with deferring back to the main loop before we call completion?
> 
> This is always in the main thread, so there can't be any race.
>>
>> Do jobs always complete in the main thread?
> 
> Yes.
> 

Thought as much, now I know, thanks.

>>
>>> +            txn->aborting = true;
>>> +            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
>>> +                if (other_job == job || other_job->completed) {
>>> +                    continue;
>>> +                }
>>> +                ctx = bdrv_get_aio_context(other_job->bs);
>>> +                aio_context_acquire(ctx);
>>> +                block_job_cancel_sync(other_job);
>>> +                assert(other_job->completed);
>>> +                aio_context_release(ctx);
>>> +            }
>>> +            QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
>>> +                if (other_job->driver->txn_abort) {
>>> +                    other_job->driver->txn_abort(other_job);
>>> +                }
>>> +                other_job->cb(other_job->opaque,
>>> +                              other_job->ret ? : -ECANCELED);
>>> +                block_job_release(other_job->bs);
>>> +            }
>>> +        } else {
>>> +            /*
>>> +             * We are cancelled by another job, who will handle everything.
>>> +             */
>>> +            return;
>>> +        }
>>> +    } else {
>>> +        /*
>>> +         * Successful completion, see if there is other running jobs in this
>>> +         * txn. */
>>
>> s/is/are/
> 
> OK.
> 
>>
>>> +        QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
>>> +            if (!other_job->completed) {
>>> +                return;
>>> +            }
>>> +        }
>>
>> Clever (though quadratic!) :~)
> 
> If someone suggest there could be 1000+ jobs in a txn, I can add a counter.
> 

My idea of a joke. Forgive me :)

>>
>>> +        /* We are the last completed job, commit the transaction. */
>>> +        QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
>>> +            if (other_job->driver->txn_commit) {
>>> +                other_job->driver->txn_commit(other_job);
>>> +            }
>>> +            assert(other_job->ret == 0);
>>> +            other_job->cb(other_job->opaque, 0);
>>> +            block_job_release(other_job->bs);
>>
>> Worth creating some static local/inline routine and sharing it with
>> block_job_completed?
> 
> Mayb when a 3rd common line comes?

Fine by me, just my habit of asking out loud. The answer can always
simply be "I don't feel like it."

>>
>>> +        }
>>> +    }
>>> +}
>>> +
>>>  void block_job_completed(BlockJob *job, int ret)
>>>  {
>>>      BlockDriverState *bs = job->bs;
>>> @@ -96,6 +159,10 @@ void block_job_completed(BlockJob *job, int ret)
>>>          qemu_bh_delete(job->defer_to_main_loop_data.bh);
>>>          job->defer_to_main_loop_data.bh = NULL;
>>>      }
>>> +    if (job->txn) {
>>> +        block_job_completed_txn(job->txn, job, ret);
>>> +        return;
>>> +    }
>>>      job->cb(job->opaque, ret);
>>>      block_job_release(bs);
>>>  }
>>> @@ -384,3 +451,22 @@ void block_job_defer_to_main_loop(BlockJob *job,
>>>  
>>>      qemu_bh_schedule(data->bh);
>>>  }
>>> +
>>> +BlockJobTxn *block_job_txn_new(void)
>>> +{
>>> +    BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
>>> +    QLIST_INIT(&txn->jobs);
>>> +    return txn;
>>> +}
>>> +
>>> +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
>>> +{
>>> +    if (!txn) {
>>> +        return;
>>> +    }
>>> +
>>> +    assert(!job->txn);
>>> +    job->txn = txn;
>>> +
>>> +    QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
>>> +}
>>> diff --git a/include/block/block.h b/include/block/block.h
>>> index 7437590..c7fc5b6 100644
>>> --- a/include/block/block.h
>>> +++ b/include/block/block.h
>>> @@ -13,6 +13,7 @@
>>>  typedef struct BlockDriver BlockDriver;
>>>  typedef struct BlockJob BlockJob;
>>>  typedef struct BdrvChildRole BdrvChildRole;
>>> +typedef struct BlockJobTxn BlockJobTxn;
>>>  
>>>  typedef struct BlockDriverInfo {
>>>      /* in bytes, 0 if irrelevant */
>>> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
>>> index 5bac2e2..d854ee0 100644
>>> --- a/include/block/blockjob.h
>>> +++ b/include/block/blockjob.h
>>> @@ -157,6 +157,9 @@ struct BlockJob {
>>>       */
>>>      int ret;
>>>  
>>> +    /** Non-NULL if this job is part of a transaction */
>>> +    BlockJobTxn *txn;
>>> +    QLIST_ENTRY(BlockJob) txn_list;
>>>  };
>>>  
>>>  /**
>>> @@ -389,4 +392,27 @@ void block_job_defer_to_main_loop(BlockJob *job,
>>>                                    BlockJobDeferToMainLoopFn *fn,
>>>                                    void *opaque);
>>>  
>>> +/**
>>> + * block_job_txn_new:
>>> + *
>>> + * Allocate and return a new block job transaction.  Jobs can be added to the
>>> + * transaction using block_job_txn_add_job().
>>> + *
>>> + * All jobs in the transaction either complete successfully or fail/cancel as a
>>> + * group.  Jobs wait for each other before completing.  Cancelling one job
>>> + * cancels all jobs in the transaction.
>>> + */
>>> +BlockJobTxn *block_job_txn_new(void);
>>> +
>>> +/**
>>> + * block_job_txn_add_job:
>>> + * @txn: The transaction (may be NULL)
>>> + * @job: Job to add to the transaction
>>> + *
>>> + * Add @job to the transaction.  The @job must not already be in a transaction.
>>> + * The block job driver must call block_job_txn_prepare_to_complete() before
>>> + * final cleanup and completion.
>>> + */
>>> +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job);
>>> +
>>>  #endif
>>>
>>
>> This definitely combines the best of our approaches AND has less code to
>> boot. Great!
>>
>> Nitpicks and comment spelling aside:
>> Reviewed-by: John Snow <jsnow@redhat.com>
> 
> Thanks!
> 
>>
>> --
>> (linguistic postscript: I don't know how to say "Stefan and I's patches"
>> because I think 'I' is incorrect here because the subject of the
>> sentence is the patches, not Stefan or myself. I think it's some variant
>> on "me" "my" or "mine," but can't quite work it out. English is hard, or
>> I'm stupid, or both.)
>>
> 
> LOL!
> 
> Fam
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2015-07-14 15:05 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-10  3:46 [Qemu-devel] [PATCH v3 00/15] block: incremental backup transactions using BlockJobTxn Fam Zheng
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 01/15] qapi: Add transaction support to block-dirty-bitmap operations Fam Zheng
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 02/15] iotests: add transactional incremental backup test Fam Zheng
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 03/15] block: rename BlkTransactionState and BdrvActionOps Fam Zheng
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 04/15] block: keep bitmap if incremental backup job is cancelled Fam Zheng
2015-07-13 19:48   ` John Snow
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 05/15] backup: Extract dirty bitmap handling as a separate function Fam Zheng
2015-07-13 23:06   ` John Snow
2015-07-14  2:46     ` Fam Zheng
2015-07-14  8:26   ` Stefan Hajnoczi
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 06/15] blockjob: Add .txn_commit and .txn_abort transaction actions Fam Zheng
2015-07-13 23:06   ` John Snow
2015-07-14  8:35   ` Stefan Hajnoczi
2015-07-14  9:26     ` Fam Zheng
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 07/15] blockjob: Add "completed" and "ret" in BlockJob Fam Zheng
2015-07-13 23:08   ` John Snow
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 08/15] blockjob: Simplify block_job_finish_sync Fam Zheng
2015-07-13 23:08   ` John Snow
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 09/15] blockjob: Move BlockJobDeferToMainLoopData into BlockJob Fam Zheng
2015-07-14 10:03   ` Stefan Hajnoczi
2015-07-14 10:36     ` Fam Zheng
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 10/15] block: add block job transactions Fam Zheng
2015-07-13 23:12   ` John Snow
2015-07-14  3:04     ` Fam Zheng
2015-07-14 15:05       ` John Snow
2015-07-14 10:27   ` Stefan Hajnoczi
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 11/15] blockdev: make BlockJobTxn available to qmp 'transaction' Fam Zheng
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 12/15] block/backup: support block job transactions Fam Zheng
2015-07-13 23:14   ` John Snow
2015-07-14  3:12     ` Fam Zheng
2015-07-14 10:32   ` Stefan Hajnoczi
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 13/15] iotests: 124 - transactional failure test Fam Zheng
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 14/15] qmp-commands.hx: Update the supported 'transaction' operations Fam Zheng
2015-07-10  3:46 ` [Qemu-devel] [PATCH v3 15/15] tests: add BlockJobTxn unit test Fam Zheng
2015-07-13 23:14   ` John Snow

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.