QEMU-Devel Archive mirror
* [Qemu-devel] [PATCH v2 0/2] AioContext: fix deadlock after aio_context_acquire() race
@ 2015-07-28 12:12 Stefan Hajnoczi
  2015-07-28 12:12 ` [Qemu-devel] [PATCH v2 1/2] AioContext: avoid leaking deleted BHs on cleanup Stefan Hajnoczi
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2015-07-28 12:12 UTC
  To: qemu-devel; +Cc: cornelia.huck, borntraeger, Stefan Hajnoczi, Paolo Bonzini

v2:
 * Free BHs after thread_pool_free(), which calls qemu_bh_delete() [Cornelia]
 * Remove assert for leaked BHs since we don't know how many existing cases
   there are yet and QEMU 2.4-rc3 is a poor time to risk assertion failures

See Patch 2 for details on the deadlock that occurs when two
aio_context_acquire() calls race.  This caused dataplane to hang on startup.

Patch 1 is a memory leak fix for AioContext that's needed by Patch 2.

Stefan Hajnoczi (2):
  AioContext: avoid leaking deleted BHs on cleanup
  AioContext: force event loop iteration using BH

 async.c             | 29 +++++++++++++++++++++++++++--
 include/block/aio.h |  3 +++
 2 files changed, 30 insertions(+), 2 deletions(-)

-- 
2.4.3

* [Qemu-devel] [PATCH v2 1/2] AioContext: avoid leaking deleted BHs on cleanup
  2015-07-28 12:12 [Qemu-devel] [PATCH v2 0/2] AioContext: fix deadlock after aio_context_acquire() race Stefan Hajnoczi
@ 2015-07-28 12:12 ` Stefan Hajnoczi
  2015-07-28 12:12 ` [Qemu-devel] [PATCH v2 2/2] AioContext: force event loop iteration using BH Stefan Hajnoczi
  2015-07-28 14:08 ` [Qemu-devel] [PATCH v2 0/2] AioContext: fix deadlock after aio_context_acquire() race Stefan Hajnoczi
  2 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2015-07-28 12:12 UTC
  To: qemu-devel; +Cc: cornelia.huck, borntraeger, Stefan Hajnoczi, Paolo Bonzini

Deleted BHs are only freed during aio_bh_poll().  This leaks memory if
there is no aio_bh_poll() call between qemu_bh_delete() and
aio_ctx_finalize().
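
For illustration, a minimal sketch of the leaking sequence (the callback
my_cb and the call sites are made up for this example; the AioContext API
calls are the real ones):

  QEMUBH *bh = aio_bh_new(ctx, my_cb, NULL);
  qemu_bh_delete(bh);      /* marks bh->deleted; bh stays on ctx->first_bh */
  /* ... no aio_bh_poll() runs on this AioContext ... */
  aio_context_unref(ctx);  /* aio_ctx_finalize() runs; without this patch
                            * the still-linked bh is never freed -> leak */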

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 async.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/async.c b/async.c
index 9a98a74..929d533 100644
--- a/async.c
+++ b/async.c
@@ -234,7 +234,20 @@ aio_ctx_finalize(GSource     *source)
     aio_set_event_notifier(ctx, &ctx->notifier, NULL);
     event_notifier_cleanup(&ctx->notifier);
     rfifolock_destroy(&ctx->lock);
+
+    qemu_mutex_lock(&ctx->bh_lock);
+    while (ctx->first_bh) {
+        QEMUBH *next = ctx->first_bh->next;
+
+        /* TODO ignore leaks for now, change to an assertion in the future */
+        if (ctx->first_bh->deleted) {
+            g_free(ctx->first_bh);
+        }
+        ctx->first_bh = next;
+    }
+    qemu_mutex_unlock(&ctx->bh_lock);
     qemu_mutex_destroy(&ctx->bh_lock);
+
     timerlistgroup_deinit(&ctx->tlg);
 }
 
-- 
2.4.3

* [Qemu-devel] [PATCH v2 2/2] AioContext: force event loop iteration using BH
  2015-07-28 12:12 [Qemu-devel] [PATCH v2 0/2] AioContext: fix deadlock after aio_context_acquire() race Stefan Hajnoczi
  2015-07-28 12:12 ` [Qemu-devel] [PATCH v2 1/2] AioContext: avoid leaking deleted BHs on cleanup Stefan Hajnoczi
@ 2015-07-28 12:12 ` Stefan Hajnoczi
  2015-07-28 14:08 ` [Qemu-devel] [PATCH v2 0/2] AioContext: fix deadlock after aio_context_acquire() race Stefan Hajnoczi
  2 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2015-07-28 12:12 UTC
  To: qemu-devel; +Cc: cornelia.huck, borntraeger, Stefan Hajnoczi, Paolo Bonzini

The notify_me optimization introduced in commit eabc97797310
("AioContext: fix broken ctx->dispatching optimization") skips
event_notifier_set() calls when the event loop thread is not blocked in
ppoll(2).

This optimization causes a deadlock if two aio_context_acquire() calls
race: notify_me is still 0 during the race, so the winning thread can
enter ppoll(2) unaware that the other thread is waiting its turn to
acquire the AioContext.
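
A rough sketch of the interleaving (the thread roles are illustrative):

  Winning thread                       Contending thread
  --------------                       -----------------
  owns ctx, not yet in ppoll(2)        aio_context_acquire(ctx) blocks;
  (ctx->notify_me == 0)                rfifolock runs aio_rfifolock_cb(),
                                       which calls aio_notify(ctx); it sees
                                       notify_me == 0 and skips
                                       event_notifier_set()
  aio_poll() raises notify_me and
  blocks in ppoll(2); no wakeup is
  pending, so it sleeps forever        never acquires the lock -> deadlock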

This patch forces ppoll(2) to return by scheduling a BH instead of
calling aio_notify().

The following deadlock with virtio-blk dataplane is fixed:

  qemu ... -object iothread,id=iothread0 \
           -drive if=none,id=drive0,file=test.img,... \
           -device virtio-blk-pci,iothread=iothread0,drive=drive0

Without this patch, this command-line hangs early during startup.

Thanks to Paolo Bonzini <pbonzini@redhat.com> for investigating this bug
with me.

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 async.c             | 16 ++++++++++++++--
 include/block/aio.h |  3 +++
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/async.c b/async.c
index 929d533..7dd6dd4 100644
--- a/async.c
+++ b/async.c
@@ -79,8 +79,10 @@ int aio_bh_poll(AioContext *ctx)
          * aio_notify again if necessary.
          */
         if (!bh->deleted && atomic_xchg(&bh->scheduled, 0)) {
-            if (!bh->idle)
+            /* Idle BHs and the notify BH don't count as progress */
+            if (!bh->idle && bh != ctx->notify_dummy_bh) {
                 ret = 1;
+            }
             bh->idle = 0;
             bh->cb(bh->opaque);
         }
@@ -230,6 +232,7 @@ aio_ctx_finalize(GSource     *source)
 {
     AioContext *ctx = (AioContext *) source;
 
+    qemu_bh_delete(ctx->notify_dummy_bh);
     thread_pool_free(ctx->thread_pool);
     aio_set_event_notifier(ctx, &ctx->notifier, NULL);
     event_notifier_cleanup(&ctx->notifier);
@@ -298,8 +301,15 @@ static void aio_timerlist_notify(void *opaque)
 
 static void aio_rfifolock_cb(void *opaque)
 {
+    AioContext *ctx = opaque;
+
     /* Kick owner thread in case they are blocked in aio_poll() */
-    aio_notify(opaque);
+    qemu_bh_schedule(ctx->notify_dummy_bh);
+}
+
+static void notify_dummy_bh(void *opaque)
+{
+    /* Do nothing, we were invoked just to force the event loop to iterate */
 }
 
 static void event_notifier_dummy_cb(EventNotifier *e)
@@ -326,6 +336,8 @@ AioContext *aio_context_new(Error **errp)
     rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
     timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
 
+    ctx->notify_dummy_bh = aio_bh_new(ctx, notify_dummy_bh, NULL);
+
     return ctx;
 }
 
diff --git a/include/block/aio.h b/include/block/aio.h
index 9dd32e0..400b1b0 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -114,6 +114,9 @@ struct AioContext {
     bool notified;
     EventNotifier notifier;
 
+    /* Scheduling this BH forces the event loop to iterate */
+    QEMUBH *notify_dummy_bh;
+
     /* Thread pool for performing work and receiving completion callbacks */
     struct ThreadPool *thread_pool;
 
-- 
2.4.3

* Re: [Qemu-devel] [PATCH v2 0/2] AioContext: fix deadlock after aio_context_acquire() race
  2015-07-28 12:12 [Qemu-devel] [PATCH v2 0/2] AioContext: fix deadlock after aio_context_acquire() race Stefan Hajnoczi
  2015-07-28 12:12 ` [Qemu-devel] [PATCH v2 1/2] AioContext: avoid leaking deleted BHs on cleanup Stefan Hajnoczi
  2015-07-28 12:12 ` [Qemu-devel] [PATCH v2 2/2] AioContext: force event loop iteration using BH Stefan Hajnoczi
@ 2015-07-28 14:08 ` Stefan Hajnoczi
  2015-07-28 16:17   ` Paolo Bonzini
  2 siblings, 1 reply; 5+ messages in thread
From: Stefan Hajnoczi @ 2015-07-28 14:08 UTC
  To: Stefan Hajnoczi
  Cc: Cornelia Huck, Christian Borntraeger, qemu-devel, Paolo Bonzini

On Tue, Jul 28, 2015 at 1:12 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> v2:
>  * Free BHs after thread_pool_free(), which calls qemu_bh_delete() [Cornelia]
>  * Remove assert for leaked BHs since we don't know how many existing cases
>    there are yet and QEMU 2.4-rc3 is a poor time to risk assertion failures

The v2 isn't necessary if we apply Paolo's "[PATCH for-2.4] block:
delete bottom halves before the AioContext is freed" on top of my v1.
Paolo has audited all BHs so the risk of hitting assertion failures is
very low.

Stefan

* Re: [Qemu-devel] [PATCH v2 0/2] AioContext: fix deadlock after aio_context_acquire() race
  2015-07-28 14:08 ` [Qemu-devel] [PATCH v2 0/2] AioContext: fix deadlock after aio_context_acquire() race Stefan Hajnoczi
@ 2015-07-28 16:17   ` Paolo Bonzini
  0 siblings, 0 replies; 5+ messages in thread
From: Paolo Bonzini @ 2015-07-28 16:17 UTC
  To: Stefan Hajnoczi, Stefan Hajnoczi
  Cc: Cornelia Huck, Christian Borntraeger, qemu-devel



On 28/07/2015 16:08, Stefan Hajnoczi wrote:
>> > v2:
>> >  * Free BHs after thread_pool_free(), which calls qemu_bh_delete() [Cornelia]
>> >  * Remove assert for leaked BHs since we don't know how many existing cases
>> >    there are yet and QEMU 2.4-rc3 is a poor time to risk assertion failures
> The v2 isn't necessary if we apply Paolo's "[PATCH for-2.4] block:
> delete bottom halves before the AioContext is freed" on top of my v1.
> Paolo has audited all BHs so the risk of hitting assertion failures is
> very low.

As you prefer; you're right that there is no use-after-free because of
the way you wrote patch 1/2.

Paolo
