* [Qemu-devel] [PATCH] block: bdrv_invalidate_cache_all: invalidate children first
@ 2017-01-30 14:17 Vladimir Sementsov-Ogievskiy
2017-01-31 11:26 ` [Qemu-devel] [PATCH] DROP THIS " Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 2+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-01-30 14:17 UTC (permalink / raw
To: qemu-block, qemu-devel
Cc: kwolf, mreitz, armbru, eblake, jsnow, famz, den, stefanha,
vsementsov, pbonzini
Current implementation iterates by bdrv_next, and, for example, will
invalidate firstly parent bds and then its children. This leads to the
following bug:
after incoming migration, in bdrv_invalidate_cache_all:
1. invalidate parent bds - reopen it with BDRV_O_INACTIVE cleared
2. child is not yet invalidated
3. parent check that its BDRV_O_INACTIVE is cleared
4. parent writes to child
5. assert in bdrv_co_pwritev, as BDRV_O_INACTIVE is set for child
This patch fixes it by just changing invalidate sequence: invalidate
children first.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
Hi all!
There is a bug, which was found during testing of my qcow2-bitmap series. Actually,
we run into this assert when reopen qcow2 with dirty bitmap after incoming migration.
How to test:
1. apply the following change (force qcow2 driver write to child on open):
diff --git a/block/qcow2.c b/block/qcow2.c
index 96fb8a8f16..96d803e593 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1152,6 +1152,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
/* Clear unknown autoclear feature bits */
+ s->autoclear_features = 1;
if (!bs->read_only && !(flags & BDRV_O_INACTIVE) && s->autoclear_features) {
s->autoclear_features = 0;
ret = qcow2_update_header(bs);
2. run iotest 091
It will fail, because of:
Program received signal SIGABRT, Aborted.
0x00007f56340925f7 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f56340925f7 in raise () from /lib64/libc.so.6
#1 0x00007f5634093ce8 in abort () from /lib64/libc.so.6
#2 0x00007f563408b566 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007f563408b612 in __assert_fail () from /lib64/libc.so.6
#4 0x00007f563bd5e2b7 in bdrv_co_pwritev (child=0x7f563dd8b990, offset=0, bytes=65536,
qiov=0x7ffec2c87120, flags=0) at block/io.c:1510
#5 0x00007f563bd5bbf4 in bdrv_rw_co_entry (opaque=0x7ffec2c87060) at block/io.c:591
#6 0x00007f563bdfb9e3 in coroutine_trampoline (i0=1056362256, i1=32598) at util/coroutine-ucontext.c:79
#7 0x00007f56340a4110 in ?? () from /lib64/libc.so.6
#8 0x00007ffec2c86870 in ?? ()
#9 0x0000000000000000 in ?? ()
(gdb) frame 4
#4 0x00007f563bd5e2b7 in bdrv_co_pwritev (child=0x7f563dd8b990, offset=0, bytes=65536,
qiov=0x7ffec2c87120, flags=0) at block/io.c:1510
1510 assert(!(bs->open_flags & BDRV_O_INACTIVE));
3. with both applied change (1.) and the following bugfix test doesn't fail.
block.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
diff --git a/block.c b/block.c
index a0346c80c6..d9e2ba9b5a 100644
--- a/block.c
+++ b/block.c
@@ -3263,6 +3263,29 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
}
}
+static void bdrv_invalidate_cache_recurse(BlockDriverState *bs, Error **errp)
+{
+ Error *local_err = NULL;
+ AioContext *aio_context = bdrv_get_aio_context(bs);
+ BdrvChild *child;
+
+ if (!(bs->open_flags & BDRV_O_INACTIVE)) {
+ return;
+ }
+
+ QLIST_FOREACH(child, &bs->children, next) {
+ bdrv_invalidate_cache_recurse(child->bs, &local_err);
+ if (local_err) {
+ error_propagate(errp, local_err);
+ return;
+ }
+ }
+
+ aio_context_acquire(aio_context);
+ bdrv_invalidate_cache(bs, errp);
+ aio_context_release(aio_context);
+}
+
void bdrv_invalidate_cache_all(Error **errp)
{
BlockDriverState *bs;
@@ -3270,11 +3293,7 @@ void bdrv_invalidate_cache_all(Error **errp)
BdrvNextIterator it;
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
- AioContext *aio_context = bdrv_get_aio_context(bs);
-
- aio_context_acquire(aio_context);
- bdrv_invalidate_cache(bs, &local_err);
- aio_context_release(aio_context);
+ bdrv_invalidate_cache_recurse(bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
--
2.11.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [Qemu-devel] [PATCH] DROP THIS block: bdrv_invalidate_cache_all: invalidate children first
2017-01-30 14:17 [Qemu-devel] [PATCH] block: bdrv_invalidate_cache_all: invalidate children first Vladimir Sementsov-Ogievskiy
@ 2017-01-31 11:26 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 2+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-01-31 11:26 UTC (permalink / raw
To: qemu-block, qemu-devel
Cc: kwolf, mreitz, armbru, eblake, jsnow, famz, den, stefanha,
pbonzini
New version is "[PATCH] block: bdrv_invalidate_cache: invalidate
children first"
30.01.2017 17:17, Vladimir Sementsov-Ogievskiy wrote:
> Current implementation iterates by bdrv_next, and, for example, will
> invalidate firstly parent bds and then its children. This leads to the
> following bug:
>
> after incoming migration, in bdrv_invalidate_cache_all:
> 1. invalidate parent bds - reopen it with BDRV_O_INACTIVE cleared
> 2. child is not yet invalidated
> 3. parent check that its BDRV_O_INACTIVE is cleared
> 4. parent writes to child
> 5. assert in bdrv_co_pwritev, as BDRV_O_INACTIVE is set for child
>
> This patch fixes it by just changing invalidate sequence: invalidate
> children first.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>
> Hi all!
>
> There is a bug, which was found during testing of my qcow2-bitmap series. Actually,
> we run into this assert when reopen qcow2 with dirty bitmap after incoming migration.
>
> How to test:
>
> 1. apply the following change (force qcow2 driver write to child on open):
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 96fb8a8f16..96d803e593 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1152,6 +1152,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
> }
>
> /* Clear unknown autoclear feature bits */
> + s->autoclear_features = 1;
> if (!bs->read_only && !(flags & BDRV_O_INACTIVE) && s->autoclear_features) {
> s->autoclear_features = 0;
> ret = qcow2_update_header(bs);
>
> 2. run iotest 091
> It will fail, because of:
> Program received signal SIGABRT, Aborted.
> 0x00007f56340925f7 in raise () from /lib64/libc.so.6
> (gdb) bt
> #0 0x00007f56340925f7 in raise () from /lib64/libc.so.6
> #1 0x00007f5634093ce8 in abort () from /lib64/libc.so.6
> #2 0x00007f563408b566 in __assert_fail_base () from /lib64/libc.so.6
> #3 0x00007f563408b612 in __assert_fail () from /lib64/libc.so.6
> #4 0x00007f563bd5e2b7 in bdrv_co_pwritev (child=0x7f563dd8b990, offset=0, bytes=65536,
> qiov=0x7ffec2c87120, flags=0) at block/io.c:1510
> #5 0x00007f563bd5bbf4 in bdrv_rw_co_entry (opaque=0x7ffec2c87060) at block/io.c:591
> #6 0x00007f563bdfb9e3 in coroutine_trampoline (i0=1056362256, i1=32598) at util/coroutine-ucontext.c:79
> #7 0x00007f56340a4110 in ?? () from /lib64/libc.so.6
> #8 0x00007ffec2c86870 in ?? ()
> #9 0x0000000000000000 in ?? ()
> (gdb) frame 4
> #4 0x00007f563bd5e2b7 in bdrv_co_pwritev (child=0x7f563dd8b990, offset=0, bytes=65536,
> qiov=0x7ffec2c87120, flags=0) at block/io.c:1510
> 1510 assert(!(bs->open_flags & BDRV_O_INACTIVE));
>
> 3. with both applied change (1.) and the following bugfix test doesn't fail.
>
> block.c | 29 ++++++++++++++++++++++++-----
> 1 file changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/block.c b/block.c
> index a0346c80c6..d9e2ba9b5a 100644
> --- a/block.c
> +++ b/block.c
> @@ -3263,6 +3263,29 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
> }
> }
>
> +static void bdrv_invalidate_cache_recurse(BlockDriverState *bs, Error **errp)
> +{
> + Error *local_err = NULL;
> + AioContext *aio_context = bdrv_get_aio_context(bs);
> + BdrvChild *child;
> +
> + if (!(bs->open_flags & BDRV_O_INACTIVE)) {
> + return;
> + }
> +
> + QLIST_FOREACH(child, &bs->children, next) {
> + bdrv_invalidate_cache_recurse(child->bs, &local_err);
> + if (local_err) {
> + error_propagate(errp, local_err);
> + return;
> + }
> + }
> +
> + aio_context_acquire(aio_context);
> + bdrv_invalidate_cache(bs, errp);
> + aio_context_release(aio_context);
> +}
> +
> void bdrv_invalidate_cache_all(Error **errp)
> {
> BlockDriverState *bs;
> @@ -3270,11 +3293,7 @@ void bdrv_invalidate_cache_all(Error **errp)
> BdrvNextIterator it;
>
> for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
> - AioContext *aio_context = bdrv_get_aio_context(bs);
> -
> - aio_context_acquire(aio_context);
> - bdrv_invalidate_cache(bs, &local_err);
> - aio_context_release(aio_context);
> + bdrv_invalidate_cache_recurse(bs, &local_err);
> if (local_err) {
> error_propagate(errp, local_err);
> return;
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2017-01-31 11:27 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-30 14:17 [Qemu-devel] [PATCH] block: bdrv_invalidate_cache_all: invalidate children first Vladimir Sementsov-Ogievskiy
2017-01-31 11:26 ` [Qemu-devel] [PATCH] DROP THIS " Vladimir Sementsov-Ogievskiy
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.