From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 16 Jul 2015 11:42:36 +0200
From: Igor Mammedov
To: "Michael S. Tsirkin"
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, peter.maydell@linaro.org
Subject: Re: [Qemu-devel] [PATCH v4 0/7] Fix QEMU crash during memory
 hotplug with vhost=on
Message-ID: <20150716114236.67615189@nial.brq.redhat.com>
In-Reply-To: <20150716103150-mutt-send-email-mst@redhat.com>
References: <1436442444-132020-1-git-send-email-imammedo@redhat.com>
 <20150715171201.05f6bc0d@nial.brq.redhat.com>
 <20150715192933-mutt-send-email-mst@redhat.com>
 <20150716092621.5ff1bd40@nial.brq.redhat.com>
 <20150716103150-mutt-send-email-mst@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Thu, 16 Jul 2015 10:35:33 +0300
"Michael S. Tsirkin" wrote:

> On Thu, Jul 16, 2015 at 09:26:21AM +0200, Igor Mammedov wrote:
> > On Wed, 15 Jul 2015 19:32:31 +0300
> > "Michael S. Tsirkin" wrote:
> > > On Wed, Jul 15, 2015 at 05:12:01PM +0200, Igor Mammedov wrote:
> > > > On Thu, 9 Jul 2015 13:47:17 +0200
> > > > Igor Mammedov wrote:
> > > >
> > > > There is also yet another issue with vhost-user: it has a very
> > > > low limit on the number of memory regions (8, if I recall
> > > > correctly), and it's possible to trigger it even without memory
> > > > hotplug. One just needs to start QEMU with several
> > > > -numa memdev= options to create enough memory regions to hit
> > > > the limit.
> > > >
> > > > A low-risk option to fix it would be increasing the limit in
> > > > the vhost-user backend.
> > > >
> > > > Another option is disabling vhost and falling back to virtio,
> > > > but I don't know enough about vhost to say whether it could be
> > > > switched off without losing packets the guest was sending at
> > > > that moment, or whether it would work at all.
> > >
> > > With vhost-user you can't fall back to virtio: it's not an
> > > accelerator, it's the backend.
> > >
> > > Updating the protocol to support a bigger table is possible, but
> > > old remotes won't be able to support it.
> > >
> > It looks like increasing the limit is the only option left.
> >
> > It's not ideal that old remotes /with a hardcoded limit/ might not
> > be able to handle a bigger table, but at least new ones, and ones
> > that handle the VhostUserMsg payload dynamically, would be able to
> > work without crashing.
>
> I think we need a way for hotplug to fail gracefully. As long as we
> don't implement the hva trick, it's needed for old kernels with
> vhost in kernel, too.

I don't see a reliable way to fail hotplug, though. In the hotplug
case the failure path goes through the memory listener, which by
design can't fail; yet it does fail in the vhost case, i.e. the vhost
side doesn't follow the protocol.
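To illustrate why the listener can't report failure, here is an
abbreviated sketch of QEMU's MemoryListener (the real definition is
in include/exec/memory.h): every callback returns void, so vhost's
region_add has no channel to propagate an error back to the hotplug
path.

    /* Abbreviated sketch, not the full structure: all MemoryListener
     * callbacks return void, so a listener such as vhost's cannot
     * reject a newly added region -- it can only abort or misbehave. */
    typedef struct MemoryRegionSection MemoryRegionSection;

    typedef struct MemoryListener {
        void (*region_add)(struct MemoryListener *listener,
                           MemoryRegionSection *section);
        void (*region_del)(struct MemoryListener *listener,
                           MemoryRegionSection *section);
        /* ... further void callbacks elided ... */
    } MemoryListener;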
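And for reference, the hardcoded table discussed above: the
vhost-user SET_MEM_TABLE message carries a fixed-size region array,
roughly as sketched below (after hw/virtio/vhost-user.c; check the
source for the exact layout). A remote built against such a header
can't accept more regions than the array holds, no matter how many
the guest actually has.

    #include <stdint.h>

    /* Sketch of the vhost-user memory-table payload: the region
     * array is sized at compile time, which is where the limit of 8
     * comes from. */
    #define VHOST_MEMORY_MAX_NREGIONS 8

    typedef struct VhostUserMemoryRegion {
        uint64_t guest_phys_addr;
        uint64_t memory_size;
        uint64_t userspace_addr;
        uint64_t mmap_offset;
    } VhostUserMemoryRegion;

    typedef struct VhostUserMemory {
        uint32_t nregions;
        uint32_t padding;
        VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
    } VhostUserMemory;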
We have already considered the idea of querying vhost for its limit
from the memory hotplug handler, before the memory region is mapped,
but it has drawbacks:

1. The number of memory ranges changes over the guest's lifecycle as
   it initializes different devices, so we could end up able to
   hotplug more pc-dimms than we can cold-plug. That in turn makes a
   guest with hotplugged pc-dimms impossible to migrate, since the
   target QEMU won't start with the source's number of dimms due to
   hitting the limit.

2. From a modeling point of view it's an ugly hack to query a random
   'vhost' entity when plugging a dimm device, but we can live with
   it if it helps QEMU not to crash.

If it's acceptable to break/ignore drawback 1, I can post the related
QEMU patches that I have; at least QEMU won't crash with old vhost
backends.
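For concreteness, the check would be along the lines of the sketch
below, placed before the dimm's memory region is mapped. Treat
vhost_has_free_slot() and the call site as illustrative names rather
than the final interface.

    /* Hypothetical sketch: fail pc-dimm hotplug cleanly when a vhost
     * backend has no free memory slots, instead of crashing later in
     * the memory listener.  vhost_has_free_slot() is illustrative;
     * error_setg() is QEMU's usual Error-reporting helper from
     * qapi/error.h. */
    #include <stdbool.h>
    #include "qapi/error.h"

    bool vhost_has_free_slot(void); /* would ask all vhost backends */

    static void pc_dimm_plug_check(Error **errp)
    {
        if (!vhost_has_free_slot()) {
            error_setg(errp, "a used vhost backend has no free"
                       " memory slots left");
            return;
        }
        /* ... safe to proceed with mapping the memory region ... */
    }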