All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	jiangkunkun@huawei.com, kvm@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Keqian Zhu <zhukeqian1@huawei.com>,
	Marc Zyngier <maz@kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	kvmarm@lists.cs.columbia.edu, wanghaibin.wang@huawei.com,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Alexios Zavras <alexios.zavras@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-arm-kernel@lists.infradead.org,
	Cornelia Huck <cohuck@redhat.com>,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	James Morse <james.morse@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH 1/5] vfio/iommu_type1: Fixes vfio_dma_populate_bitmap to avoid dirty lose
Date: Wed, 13 Jan 2021 08:13:05 -0700	[thread overview]
Message-ID: <20210113081305.1df7de8d@omen.home.shazbot.org> (raw)
In-Reply-To: <3f4f9a82-0934-b114-8bd8-452e9e56712f@nvidia.com>

On Wed, 13 Jan 2021 18:05:43 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:

> On 1/13/2021 2:50 AM, Alex Williamson wrote:
> > On Thu, 7 Jan 2021 17:28:57 +0800
> > Keqian Zhu <zhukeqian1@huawei.com> wrote:
> >   
> >> Defer checking whether vfio_dma is of fully-dirty in update_user_bitmap
> >> is easy to lose dirty log. For example, after promoting pinned_scope of
> >> vfio_iommu, vfio_dma is not considered as fully-dirty, then we may lose
> >> dirty log that occurs before vfio_iommu is promoted.
> >>
> >> The key point is that pinned-dirty is not a real dirty tracking way, it
> >> can't continuously track dirty pages, but just restrict dirty scope. It
> >> is essentially the same as fully-dirty. Fully-dirty is of full-scope and
> >> pinned-dirty is of pinned-scope.
> >>
> >> So we must mark pinned-dirty or fully-dirty after we start dirty tracking
> >> or clear dirty bitmap, to ensure that dirty log is marked right away.  
> > 
> > I was initially convinced by these first three patches, but upon
> > further review, I think the premise is wrong.  AIUI, the concern across
> > these patches is that our dirty bitmap is only populated with pages
> > dirtied by pinning and we only take into account the pinned page dirty
> > scope at the time the bitmap is retrieved by the user.  You suppose
> > this presents a gap where if a vendor driver has not yet identified
> > with a page pinning scope that the entire bitmap should be considered
> > dirty regardless of whether that driver later pins pages prior to the
> > user retrieving the dirty bitmap.
> > 
> > I don't think this is how we intended the cooperation between the iommu
> > driver and vendor driver to work.  By pinning pages a vendor driver is
> > not declaring that only their future dirty page scope is limited to
> > pinned pages, instead they're declaring themselves as a participant in
> > dirty page tracking and take responsibility for pinning any necessary
> > pages.  For example we might extend VFIO_IOMMU_DIRTY_PAGES_FLAG_START
> > to trigger a blocking notification to groups to not only begin dirty
> > tracking, but also to synchronously register their current device DMA
> > footprint.  This patch would require a vendor driver to possibly perform
> > a gratuitous page pinning in order to set the scope prior to dirty
> > logging being enabled, or else the initial bitmap will be fully dirty.
> > 
> > Therefore, I don't see that this series is necessary or correct.  Kirti,
> > does this match your thinking?
> >   
> 
> That's correct Alex and I agree with you.
> 
> > Thinking about these semantics, it seems there might still be an issue
> > if a group with non-pinned-page dirty scope is detached with dirty
> > logging enabled.    
> 
> Hot-unplug a device while migration process has started - is this 
> scenario supported?

It's not prevented, it would rely on a userspace policy, right?  The
kernel should do the right thing regardless.  Thanks,

Alex

> > It seems this should in fact fully populate the dirty
> > bitmaps at the time it's removed since we don't know the extent of its
> > previous DMA, nor will the group be present to trigger the full bitmap
> > when the user retrieves the dirty bitmap.  Creating fully populated
> > bitmaps at the time tracking is enabled negates our ability to take
> > advantage of later enlightenment though.  Thanks,
> > 
> > Alex
> >   
> >> Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking")
> >> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> >> ---
> >>   drivers/vfio/vfio_iommu_type1.c | 33 ++++++++++++++++++++++-----------
> >>   1 file changed, 22 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index bceda5e8baaa..b0a26e8e0adf 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -224,7 +224,7 @@ static void vfio_dma_bitmap_free(struct vfio_dma *dma)
> >>   	dma->bitmap = NULL;
> >>   }
> >>   
> >> -static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
> >> +static void vfio_dma_populate_bitmap_pinned(struct vfio_dma *dma, size_t pgsize)
> >>   {
> >>   	struct rb_node *p;
> >>   	unsigned long pgshift = __ffs(pgsize);
> >> @@ -236,6 +236,25 @@ static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
> >>   	}
> >>   }
> >>   
> >> +static void vfio_dma_populate_bitmap_full(struct vfio_dma *dma, size_t pgsize)
> >> +{
> >> +	unsigned long pgshift = __ffs(pgsize);
> >> +	unsigned long nbits = dma->size >> pgshift;
> >> +
> >> +	bitmap_set(dma->bitmap, 0, nbits);
> >> +}
> >> +
> >> +static void vfio_dma_populate_bitmap(struct vfio_iommu *iommu,
> >> +				     struct vfio_dma *dma)
> >> +{
> >> +	size_t pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
> >> +
> >> +	if (iommu->pinned_page_dirty_scope)
> >> +		vfio_dma_populate_bitmap_pinned(dma, pgsize);
> >> +	else if (dma->iommu_mapped)
> >> +		vfio_dma_populate_bitmap_full(dma, pgsize);
> >> +}
> >> +
> >>   static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
> >>   {
> >>   	struct rb_node *n;
> >> @@ -257,7 +276,7 @@ static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
> >>   			}
> >>   			return ret;
> >>   		}
> >> -		vfio_dma_populate_bitmap(dma, pgsize);
> >> +		vfio_dma_populate_bitmap(iommu, dma);
> >>   	}
> >>   	return 0;
> >>   }
> >> @@ -987,13 +1006,6 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   	unsigned long shift = bit_offset % BITS_PER_LONG;
> >>   	unsigned long leftover;
> >>   
> >> -	/*
> >> -	 * mark all pages dirty if any IOMMU capable device is not able
> >> -	 * to report dirty pages and all pages are pinned and mapped.
> >> -	 */
> >> -	if (!iommu->pinned_page_dirty_scope && dma->iommu_mapped)
> >> -		bitmap_set(dma->bitmap, 0, nbits);
> >> -
> >>   	if (shift) {
> >>   		bitmap_shift_left(dma->bitmap, dma->bitmap, shift,
> >>   				  nbits + shift);
> >> @@ -1019,7 +1031,6 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   	struct vfio_dma *dma;
> >>   	struct rb_node *n;
> >>   	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> >> -	size_t pgsize = (size_t)1 << pgshift;
> >>   	int ret;
> >>   
> >>   	/*
> >> @@ -1055,7 +1066,7 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   		 * pages which are marked dirty by vfio_dma_rw()
> >>   		 */
> >>   		bitmap_clear(dma->bitmap, 0, dma->size >> pgshift);
> >> -		vfio_dma_populate_bitmap(dma, pgsize);
> >> +		vfio_dma_populate_bitmap(iommu, dma);
> >>   	}
> >>   	return 0;
> >>   }  
> >   
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

WARNING: multiple messages have this Message-ID (diff)
From: Alex Williamson <alex.williamson@redhat.com>
To: Kirti Wankhede <kwankhede@nvidia.com>
Cc: kvm@vger.kernel.org, Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Joerg Roedel <joro@8bytes.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	kvmarm@lists.cs.columbia.edu,
	Alexios Zavras <alexios.zavras@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-arm-kernel@lists.infradead.org,
	Cornelia Huck <cohuck@redhat.com>,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH 1/5] vfio/iommu_type1: Fixes vfio_dma_populate_bitmap to avoid dirty lose
Date: Wed, 13 Jan 2021 08:13:05 -0700	[thread overview]
Message-ID: <20210113081305.1df7de8d@omen.home.shazbot.org> (raw)
In-Reply-To: <3f4f9a82-0934-b114-8bd8-452e9e56712f@nvidia.com>

On Wed, 13 Jan 2021 18:05:43 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:

> On 1/13/2021 2:50 AM, Alex Williamson wrote:
> > On Thu, 7 Jan 2021 17:28:57 +0800
> > Keqian Zhu <zhukeqian1@huawei.com> wrote:
> >   
> >> Defer checking whether vfio_dma is of fully-dirty in update_user_bitmap
> >> is easy to lose dirty log. For example, after promoting pinned_scope of
> >> vfio_iommu, vfio_dma is not considered as fully-dirty, then we may lose
> >> dirty log that occurs before vfio_iommu is promoted.
> >>
> >> The key point is that pinned-dirty is not a real dirty tracking way, it
> >> can't continuously track dirty pages, but just restrict dirty scope. It
> >> is essentially the same as fully-dirty. Fully-dirty is of full-scope and
> >> pinned-dirty is of pinned-scope.
> >>
> >> So we must mark pinned-dirty or fully-dirty after we start dirty tracking
> >> or clear dirty bitmap, to ensure that dirty log is marked right away.  
> > 
> > I was initially convinced by these first three patches, but upon
> > further review, I think the premise is wrong.  AIUI, the concern across
> > these patches is that our dirty bitmap is only populated with pages
> > dirtied by pinning and we only take into account the pinned page dirty
> > scope at the time the bitmap is retrieved by the user.  You suppose
> > this presents a gap where if a vendor driver has not yet identified
> > with a page pinning scope that the entire bitmap should be considered
> > dirty regardless of whether that driver later pins pages prior to the
> > user retrieving the dirty bitmap.
> > 
> > I don't think this is how we intended the cooperation between the iommu
> > driver and vendor driver to work.  By pinning pages a vendor driver is
> > not declaring that only their future dirty page scope is limited to
> > pinned pages, instead they're declaring themselves as a participant in
> > dirty page tracking and take responsibility for pinning any necessary
> > pages.  For example we might extend VFIO_IOMMU_DIRTY_PAGES_FLAG_START
> > to trigger a blocking notification to groups to not only begin dirty
> > tracking, but also to synchronously register their current device DMA
> > footprint.  This patch would require a vendor driver to possibly perform
> > a gratuitous page pinning in order to set the scope prior to dirty
> > logging being enabled, or else the initial bitmap will be fully dirty.
> > 
> > Therefore, I don't see that this series is necessary or correct.  Kirti,
> > does this match your thinking?
> >   
> 
> That's correct Alex and I agree with you.
> 
> > Thinking about these semantics, it seems there might still be an issue
> > if a group with non-pinned-page dirty scope is detached with dirty
> > logging enabled.    
> 
> Hot-unplug a device while migration process has started - is this 
> scenario supported?

It's not prevented, it would rely on a userspace policy, right?  The
kernel should do the right thing regardless.  Thanks,

Alex

> > It seems this should in fact fully populate the dirty
> > bitmaps at the time it's removed since we don't know the extent of its
> > previous DMA, nor will the group be present to trigger the full bitmap
> > when the user retrieves the dirty bitmap.  Creating fully populated
> > bitmaps at the time tracking is enabled negates our ability to take
> > advantage of later enlightenment though.  Thanks,
> > 
> > Alex
> >   
> >> Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking")
> >> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> >> ---
> >>   drivers/vfio/vfio_iommu_type1.c | 33 ++++++++++++++++++++++-----------
> >>   1 file changed, 22 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index bceda5e8baaa..b0a26e8e0adf 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -224,7 +224,7 @@ static void vfio_dma_bitmap_free(struct vfio_dma *dma)
> >>   	dma->bitmap = NULL;
> >>   }
> >>   
> >> -static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
> >> +static void vfio_dma_populate_bitmap_pinned(struct vfio_dma *dma, size_t pgsize)
> >>   {
> >>   	struct rb_node *p;
> >>   	unsigned long pgshift = __ffs(pgsize);
> >> @@ -236,6 +236,25 @@ static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
> >>   	}
> >>   }
> >>   
> >> +static void vfio_dma_populate_bitmap_full(struct vfio_dma *dma, size_t pgsize)
> >> +{
> >> +	unsigned long pgshift = __ffs(pgsize);
> >> +	unsigned long nbits = dma->size >> pgshift;
> >> +
> >> +	bitmap_set(dma->bitmap, 0, nbits);
> >> +}
> >> +
> >> +static void vfio_dma_populate_bitmap(struct vfio_iommu *iommu,
> >> +				     struct vfio_dma *dma)
> >> +{
> >> +	size_t pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
> >> +
> >> +	if (iommu->pinned_page_dirty_scope)
> >> +		vfio_dma_populate_bitmap_pinned(dma, pgsize);
> >> +	else if (dma->iommu_mapped)
> >> +		vfio_dma_populate_bitmap_full(dma, pgsize);
> >> +}
> >> +
> >>   static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
> >>   {
> >>   	struct rb_node *n;
> >> @@ -257,7 +276,7 @@ static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
> >>   			}
> >>   			return ret;
> >>   		}
> >> -		vfio_dma_populate_bitmap(dma, pgsize);
> >> +		vfio_dma_populate_bitmap(iommu, dma);
> >>   	}
> >>   	return 0;
> >>   }
> >> @@ -987,13 +1006,6 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   	unsigned long shift = bit_offset % BITS_PER_LONG;
> >>   	unsigned long leftover;
> >>   
> >> -	/*
> >> -	 * mark all pages dirty if any IOMMU capable device is not able
> >> -	 * to report dirty pages and all pages are pinned and mapped.
> >> -	 */
> >> -	if (!iommu->pinned_page_dirty_scope && dma->iommu_mapped)
> >> -		bitmap_set(dma->bitmap, 0, nbits);
> >> -
> >>   	if (shift) {
> >>   		bitmap_shift_left(dma->bitmap, dma->bitmap, shift,
> >>   				  nbits + shift);
> >> @@ -1019,7 +1031,6 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   	struct vfio_dma *dma;
> >>   	struct rb_node *n;
> >>   	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> >> -	size_t pgsize = (size_t)1 << pgshift;
> >>   	int ret;
> >>   
> >>   	/*
> >> @@ -1055,7 +1066,7 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   		 * pages which are marked dirty by vfio_dma_rw()
> >>   		 */
> >>   		bitmap_clear(dma->bitmap, 0, dma->size >> pgshift);
> >> -		vfio_dma_populate_bitmap(dma, pgsize);
> >> +		vfio_dma_populate_bitmap(iommu, dma);
> >>   	}
> >>   	return 0;
> >>   }  
> >   
> 

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

WARNING: multiple messages have this Message-ID (diff)
From: Alex Williamson <alex.williamson@redhat.com>
To: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Keqian Zhu <zhukeqian1@huawei.com>,
	<linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<iommu@lists.linux-foundation.org>, <kvm@vger.kernel.org>,
	<kvmarm@lists.cs.columbia.edu>, Cornelia Huck <cohuck@redhat.com>,
	"Will Deacon" <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	James Morse <james.morse@arm.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Joerg Roedel <joro@8bytes.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexios Zavras <alexios.zavras@intel.com>,
	<wanghaibin.wang@huawei.com>, <jiangkunkun@huawei.com>
Subject: Re: [PATCH 1/5] vfio/iommu_type1: Fixes vfio_dma_populate_bitmap to avoid dirty lose
Date: Wed, 13 Jan 2021 08:13:05 -0700	[thread overview]
Message-ID: <20210113081305.1df7de8d@omen.home.shazbot.org> (raw)
In-Reply-To: <3f4f9a82-0934-b114-8bd8-452e9e56712f@nvidia.com>

On Wed, 13 Jan 2021 18:05:43 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:

> On 1/13/2021 2:50 AM, Alex Williamson wrote:
> > On Thu, 7 Jan 2021 17:28:57 +0800
> > Keqian Zhu <zhukeqian1@huawei.com> wrote:
> >   
> >> Defer checking whether vfio_dma is of fully-dirty in update_user_bitmap
> >> is easy to lose dirty log. For example, after promoting pinned_scope of
> >> vfio_iommu, vfio_dma is not considered as fully-dirty, then we may lose
> >> dirty log that occurs before vfio_iommu is promoted.
> >>
> >> The key point is that pinned-dirty is not a real dirty tracking way, it
> >> can't continuously track dirty pages, but just restrict dirty scope. It
> >> is essentially the same as fully-dirty. Fully-dirty is of full-scope and
> >> pinned-dirty is of pinned-scope.
> >>
> >> So we must mark pinned-dirty or fully-dirty after we start dirty tracking
> >> or clear dirty bitmap, to ensure that dirty log is marked right away.  
> > 
> > I was initially convinced by these first three patches, but upon
> > further review, I think the premise is wrong.  AIUI, the concern across
> > these patches is that our dirty bitmap is only populated with pages
> > dirtied by pinning and we only take into account the pinned page dirty
> > scope at the time the bitmap is retrieved by the user.  You suppose
> > this presents a gap where if a vendor driver has not yet identified
> > with a page pinning scope that the entire bitmap should be considered
> > dirty regardless of whether that driver later pins pages prior to the
> > user retrieving the dirty bitmap.
> > 
> > I don't think this is how we intended the cooperation between the iommu
> > driver and vendor driver to work.  By pinning pages a vendor driver is
> > not declaring that only their future dirty page scope is limited to
> > pinned pages, instead they're declaring themselves as a participant in
> > dirty page tracking and take responsibility for pinning any necessary
> > pages.  For example we might extend VFIO_IOMMU_DIRTY_PAGES_FLAG_START
> > to trigger a blocking notification to groups to not only begin dirty
> > tracking, but also to synchronously register their current device DMA
> > footprint.  This patch would require a vendor driver to possibly perform
> > a gratuitous page pinning in order to set the scope prior to dirty
> > logging being enabled, or else the initial bitmap will be fully dirty.
> > 
> > Therefore, I don't see that this series is necessary or correct.  Kirti,
> > does this match your thinking?
> >   
> 
> That's correct Alex and I agree with you.
> 
> > Thinking about these semantics, it seems there might still be an issue
> > if a group with non-pinned-page dirty scope is detached with dirty
> > logging enabled.    
> 
> Hot-unplug a device while migration process has started - is this 
> scenario supported?

It's not prevented, it would rely on a userspace policy, right?  The
kernel should do the right thing regardless.  Thanks,

Alex

> > It seems this should in fact fully populate the dirty
> > bitmaps at the time it's removed since we don't know the extent of its
> > previous DMA, nor will the group be present to trigger the full bitmap
> > when the user retrieves the dirty bitmap.  Creating fully populated
> > bitmaps at the time tracking is enabled negates our ability to take
> > advantage of later enlightenment though.  Thanks,
> > 
> > Alex
> >   
> >> Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking")
> >> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> >> ---
> >>   drivers/vfio/vfio_iommu_type1.c | 33 ++++++++++++++++++++++-----------
> >>   1 file changed, 22 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index bceda5e8baaa..b0a26e8e0adf 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -224,7 +224,7 @@ static void vfio_dma_bitmap_free(struct vfio_dma *dma)
> >>   	dma->bitmap = NULL;
> >>   }
> >>   
> >> -static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
> >> +static void vfio_dma_populate_bitmap_pinned(struct vfio_dma *dma, size_t pgsize)
> >>   {
> >>   	struct rb_node *p;
> >>   	unsigned long pgshift = __ffs(pgsize);
> >> @@ -236,6 +236,25 @@ static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
> >>   	}
> >>   }
> >>   
> >> +static void vfio_dma_populate_bitmap_full(struct vfio_dma *dma, size_t pgsize)
> >> +{
> >> +	unsigned long pgshift = __ffs(pgsize);
> >> +	unsigned long nbits = dma->size >> pgshift;
> >> +
> >> +	bitmap_set(dma->bitmap, 0, nbits);
> >> +}
> >> +
> >> +static void vfio_dma_populate_bitmap(struct vfio_iommu *iommu,
> >> +				     struct vfio_dma *dma)
> >> +{
> >> +	size_t pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
> >> +
> >> +	if (iommu->pinned_page_dirty_scope)
> >> +		vfio_dma_populate_bitmap_pinned(dma, pgsize);
> >> +	else if (dma->iommu_mapped)
> >> +		vfio_dma_populate_bitmap_full(dma, pgsize);
> >> +}
> >> +
> >>   static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
> >>   {
> >>   	struct rb_node *n;
> >> @@ -257,7 +276,7 @@ static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
> >>   			}
> >>   			return ret;
> >>   		}
> >> -		vfio_dma_populate_bitmap(dma, pgsize);
> >> +		vfio_dma_populate_bitmap(iommu, dma);
> >>   	}
> >>   	return 0;
> >>   }
> >> @@ -987,13 +1006,6 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   	unsigned long shift = bit_offset % BITS_PER_LONG;
> >>   	unsigned long leftover;
> >>   
> >> -	/*
> >> -	 * mark all pages dirty if any IOMMU capable device is not able
> >> -	 * to report dirty pages and all pages are pinned and mapped.
> >> -	 */
> >> -	if (!iommu->pinned_page_dirty_scope && dma->iommu_mapped)
> >> -		bitmap_set(dma->bitmap, 0, nbits);
> >> -
> >>   	if (shift) {
> >>   		bitmap_shift_left(dma->bitmap, dma->bitmap, shift,
> >>   				  nbits + shift);
> >> @@ -1019,7 +1031,6 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   	struct vfio_dma *dma;
> >>   	struct rb_node *n;
> >>   	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> >> -	size_t pgsize = (size_t)1 << pgshift;
> >>   	int ret;
> >>   
> >>   	/*
> >> @@ -1055,7 +1066,7 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   		 * pages which are marked dirty by vfio_dma_rw()
> >>   		 */
> >>   		bitmap_clear(dma->bitmap, 0, dma->size >> pgshift);
> >> -		vfio_dma_populate_bitmap(dma, pgsize);
> >> +		vfio_dma_populate_bitmap(iommu, dma);
> >>   	}
> >>   	return 0;
> >>   }  
> >   
> 


WARNING: multiple messages have this Message-ID (diff)
From: Alex Williamson <alex.williamson@redhat.com>
To: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	jiangkunkun@huawei.com, kvm@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Keqian Zhu <zhukeqian1@huawei.com>,
	Marc Zyngier <maz@kernel.org>, Joerg Roedel <joro@8bytes.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	kvmarm@lists.cs.columbia.edu, wanghaibin.wang@huawei.com,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Alexios Zavras <alexios.zavras@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-arm-kernel@lists.infradead.org,
	Cornelia Huck <cohuck@redhat.com>,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	James Morse <james.morse@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH 1/5] vfio/iommu_type1: Fixes vfio_dma_populate_bitmap to avoid dirty lose
Date: Wed, 13 Jan 2021 08:13:05 -0700	[thread overview]
Message-ID: <20210113081305.1df7de8d@omen.home.shazbot.org> (raw)
In-Reply-To: <3f4f9a82-0934-b114-8bd8-452e9e56712f@nvidia.com>

On Wed, 13 Jan 2021 18:05:43 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:

> On 1/13/2021 2:50 AM, Alex Williamson wrote:
> > On Thu, 7 Jan 2021 17:28:57 +0800
> > Keqian Zhu <zhukeqian1@huawei.com> wrote:
> >   
> >> Defer checking whether vfio_dma is of fully-dirty in update_user_bitmap
> >> is easy to lose dirty log. For example, after promoting pinned_scope of
> >> vfio_iommu, vfio_dma is not considered as fully-dirty, then we may lose
> >> dirty log that occurs before vfio_iommu is promoted.
> >>
> >> The key point is that pinned-dirty is not a real dirty tracking way, it
> >> can't continuously track dirty pages, but just restrict dirty scope. It
> >> is essentially the same as fully-dirty. Fully-dirty is of full-scope and
> >> pinned-dirty is of pinned-scope.
> >>
> >> So we must mark pinned-dirty or fully-dirty after we start dirty tracking
> >> or clear dirty bitmap, to ensure that dirty log is marked right away.  
> > 
> > I was initially convinced by these first three patches, but upon
> > further review, I think the premise is wrong.  AIUI, the concern across
> > these patches is that our dirty bitmap is only populated with pages
> > dirtied by pinning and we only take into account the pinned page dirty
> > scope at the time the bitmap is retrieved by the user.  You suppose
> > this presents a gap where if a vendor driver has not yet identified
> > with a page pinning scope that the entire bitmap should be considered
> > dirty regardless of whether that driver later pins pages prior to the
> > user retrieving the dirty bitmap.
> > 
> > I don't think this is how we intended the cooperation between the iommu
> > driver and vendor driver to work.  By pinning pages a vendor driver is
> > not declaring that only their future dirty page scope is limited to
> > pinned pages, instead they're declaring themselves as a participant in
> > dirty page tracking and take responsibility for pinning any necessary
> > pages.  For example we might extend VFIO_IOMMU_DIRTY_PAGES_FLAG_START
> > to trigger a blocking notification to groups to not only begin dirty
> > tracking, but also to synchronously register their current device DMA
> > footprint.  This patch would require a vendor driver to possibly perform
> > a gratuitous page pinning in order to set the scope prior to dirty
> > logging being enabled, or else the initial bitmap will be fully dirty.
> > 
> > Therefore, I don't see that this series is necessary or correct.  Kirti,
> > does this match your thinking?
> >   
> 
> That's correct Alex and I agree with you.
> 
> > Thinking about these semantics, it seems there might still be an issue
> > if a group with non-pinned-page dirty scope is detached with dirty
> > logging enabled.    
> 
> Hot-unplug a device while migration process has started - is this 
> scenario supported?

It's not prevented, it would rely on a userspace policy, right?  The
kernel should do the right thing regardless.  Thanks,

Alex

> > It seems this should in fact fully populate the dirty
> > bitmaps at the time it's removed since we don't know the extent of its
> > previous DMA, nor will the group be present to trigger the full bitmap
> > when the user retrieves the dirty bitmap.  Creating fully populated
> > bitmaps at the time tracking is enabled negates our ability to take
> > advantage of later enlightenment though.  Thanks,
> > 
> > Alex
> >   
> >> Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking")
> >> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> >> ---
> >>   drivers/vfio/vfio_iommu_type1.c | 33 ++++++++++++++++++++++-----------
> >>   1 file changed, 22 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index bceda5e8baaa..b0a26e8e0adf 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -224,7 +224,7 @@ static void vfio_dma_bitmap_free(struct vfio_dma *dma)
> >>   	dma->bitmap = NULL;
> >>   }
> >>   
> >> -static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
> >> +static void vfio_dma_populate_bitmap_pinned(struct vfio_dma *dma, size_t pgsize)
> >>   {
> >>   	struct rb_node *p;
> >>   	unsigned long pgshift = __ffs(pgsize);
> >> @@ -236,6 +236,25 @@ static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
> >>   	}
> >>   }
> >>   
> >> +static void vfio_dma_populate_bitmap_full(struct vfio_dma *dma, size_t pgsize)
> >> +{
> >> +	unsigned long pgshift = __ffs(pgsize);
> >> +	unsigned long nbits = dma->size >> pgshift;
> >> +
> >> +	bitmap_set(dma->bitmap, 0, nbits);
> >> +}
> >> +
> >> +static void vfio_dma_populate_bitmap(struct vfio_iommu *iommu,
> >> +				     struct vfio_dma *dma)
> >> +{
> >> +	size_t pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
> >> +
> >> +	if (iommu->pinned_page_dirty_scope)
> >> +		vfio_dma_populate_bitmap_pinned(dma, pgsize);
> >> +	else if (dma->iommu_mapped)
> >> +		vfio_dma_populate_bitmap_full(dma, pgsize);
> >> +}
> >> +
> >>   static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
> >>   {
> >>   	struct rb_node *n;
> >> @@ -257,7 +276,7 @@ static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
> >>   			}
> >>   			return ret;
> >>   		}
> >> -		vfio_dma_populate_bitmap(dma, pgsize);
> >> +		vfio_dma_populate_bitmap(iommu, dma);
> >>   	}
> >>   	return 0;
> >>   }
> >> @@ -987,13 +1006,6 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   	unsigned long shift = bit_offset % BITS_PER_LONG;
> >>   	unsigned long leftover;
> >>   
> >> -	/*
> >> -	 * mark all pages dirty if any IOMMU capable device is not able
> >> -	 * to report dirty pages and all pages are pinned and mapped.
> >> -	 */
> >> -	if (!iommu->pinned_page_dirty_scope && dma->iommu_mapped)
> >> -		bitmap_set(dma->bitmap, 0, nbits);
> >> -
> >>   	if (shift) {
> >>   		bitmap_shift_left(dma->bitmap, dma->bitmap, shift,
> >>   				  nbits + shift);
> >> @@ -1019,7 +1031,6 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   	struct vfio_dma *dma;
> >>   	struct rb_node *n;
> >>   	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> >> -	size_t pgsize = (size_t)1 << pgshift;
> >>   	int ret;
> >>   
> >>   	/*
> >> @@ -1055,7 +1066,7 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> >>   		 * pages which are marked dirty by vfio_dma_rw()
> >>   		 */
> >>   		bitmap_clear(dma->bitmap, 0, dma->size >> pgshift);
> >> -		vfio_dma_populate_bitmap(dma, pgsize);
> >> +		vfio_dma_populate_bitmap(iommu, dma);
> >>   	}
> >>   	return 0;
> >>   }  
> >   
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-01-13 15:13 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-07  9:28 [PATCH 0/5] vfio/iommu_type1: Some fixes about dirty tracking Keqian Zhu
2021-01-07  9:28 ` Keqian Zhu
2021-01-07  9:28 ` Keqian Zhu
2021-01-07  9:28 ` Keqian Zhu
2021-01-07  9:28 ` [PATCH 1/5] vfio/iommu_type1: Fixes vfio_dma_populate_bitmap to avoid dirty lose Keqian Zhu
2021-01-07  9:28   ` Keqian Zhu
2021-01-07  9:28   ` Keqian Zhu
2021-01-07  9:28   ` Keqian Zhu
2021-01-12 21:20   ` Alex Williamson
2021-01-12 21:20     ` Alex Williamson
2021-01-12 21:20     ` Alex Williamson
2021-01-12 21:20     ` Alex Williamson
2021-01-13 12:35     ` Kirti Wankhede
2021-01-13 12:35       ` Kirti Wankhede
2021-01-13 12:35       ` Kirti Wankhede
2021-01-13 12:35       ` Kirti Wankhede
2021-01-13 15:13       ` Alex Williamson [this message]
2021-01-13 15:13         ` Alex Williamson
2021-01-13 15:13         ` Alex Williamson
2021-01-13 15:13         ` Alex Williamson
2021-01-14 13:05     ` Keqian Zhu
2021-01-14 13:05       ` Keqian Zhu
2021-01-14 13:05       ` Keqian Zhu
2021-01-14 13:05       ` Keqian Zhu
2021-01-14 17:14       ` Alex Williamson
2021-01-14 17:14         ` Alex Williamson
2021-01-14 17:14         ` Alex Williamson
2021-01-14 17:14         ` Alex Williamson
2021-01-15  9:41         ` Keqian Zhu
2021-01-15  9:41           ` Keqian Zhu
2021-01-15  9:41           ` Keqian Zhu
2021-01-15  9:41           ` Keqian Zhu
2021-01-07  9:28 ` [PATCH 2/5] vfio/iommu_type1: Populate dirty bitmap for new vfio_dma Keqian Zhu
2021-01-07  9:28   ` Keqian Zhu
2021-01-07  9:28   ` Keqian Zhu
2021-01-07  9:28   ` Keqian Zhu
2021-01-07  9:28 ` [PATCH 3/5] vfio/iommu_type1: Populate dirty bitmap when attach group Keqian Zhu
2021-01-07  9:28   ` Keqian Zhu
2021-01-07  9:28   ` Keqian Zhu
2021-01-07  9:28   ` Keqian Zhu
2021-01-07  9:29 ` [PATCH 4/5] vfio/iommu_type1: Carefully use unmap_unpin_all during dirty tracking Keqian Zhu
2021-01-07  9:29   ` Keqian Zhu
2021-01-07  9:29   ` Keqian Zhu
2021-01-07  9:29   ` Keqian Zhu
2021-01-11 21:49   ` Alex Williamson
2021-01-11 21:49     ` Alex Williamson
2021-01-11 21:49     ` Alex Williamson
2021-01-11 21:49     ` Alex Williamson
2021-01-12 12:04     ` Keqian Zhu
2021-01-12 12:04       ` Keqian Zhu
2021-01-12 12:04       ` Keqian Zhu
2021-01-12 12:04       ` Keqian Zhu
2021-01-12 19:53       ` Alex Williamson
2021-01-12 19:53         ` Alex Williamson
2021-01-12 19:53         ` Alex Williamson
2021-01-12 19:53         ` Alex Williamson
2021-01-14 12:11         ` Keqian Zhu
2021-01-14 12:11           ` Keqian Zhu
2021-01-14 12:11           ` Keqian Zhu
2021-01-14 12:11           ` Keqian Zhu
2021-01-07  9:29 ` [PATCH 5/5] vfio/iommu_type1: Move sanity_check_pfn_list to unmap_unpin_all Keqian Zhu
2021-01-07  9:29   ` Keqian Zhu
2021-01-07  9:29   ` Keqian Zhu
2021-01-07  9:29   ` Keqian Zhu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210113081305.1df7de8d@omen.home.shazbot.org \
    --to=alex.williamson@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexios.zavras@intel.com \
    --cc=catalin.marinas@arm.com \
    --cc=cohuck@redhat.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=james.morse@arm.com \
    --cc=jiangkunkun@huawei.com \
    --cc=julien.thierry.kdev@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=kwankhede@nvidia.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tglx@linutronix.de \
    --cc=wanghaibin.wang@huawei.com \
    --cc=will@kernel.org \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.