LKML Archive mirror
* [PATCH v1 1/3] KVM: arm64: Consistently use free_removed_table() for stage-2
@ 2023-06-08 22:05 Chun-Tse Shao
  2023-06-08 22:05 ` [PATCH v1 2/3] KVM: arm64: Only initiate walk if page_count() > 1 in free_removed_table() Chun-Tse Shao
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Chun-Tse Shao @ 2023-06-08 22:05 UTC (permalink / raw)
  To: linux-kernel, yuzhao, oliver.upton
  Cc: Chun-Tse Shao, Marc Zyngier, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Ben Gardon, Gavin Shan,
	linux-arm-kernel, kvmarm

From: Oliver Upton <oliver.upton@linux.dev>

free_removed_table() is essential to the RCU-protected parallel walking
scheme, as behind the scenes the cleanup is deferred until an RCU grace
period. Nonetheless, the stage-2 unmap path calls put_page() directly,
which leads to table memory being freed inline with the table walk.

This is safe for the time being, as the stage-2 unmap walker is called
while holding the write lock. A future change to KVM will further relax
the locking mechanics around the stage-2 page tables to allow lock-free
walkers protected only by RCU. As such, switch to the RCU-safe mechanism
for freeing table memory.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
---
 arch/arm64/kvm/hyp/pgtable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 5282cb9ca4cf..cc1af0286755 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1042,7 +1042,7 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
 					       kvm_granule_size(ctx->level));
 
 	if (childp)
-		mm_ops->put_page(childp);
+		mm_ops->free_removed_table(childp, ctx->level);
 
 	return 0;
 }
-- 
2.41.0.162.gfafddb0af9-goog



* [PATCH v1 2/3] KVM: arm64: Only initiate walk if page_count() > 1 in free_removed_table()
  2023-06-08 22:05 [PATCH v1 1/3] KVM: arm64: Consistently use free_removed_table() for stage-2 Chun-Tse Shao
@ 2023-06-08 22:05 ` Chun-Tse Shao
  2023-06-08 22:05 ` [PATCH v1 3/3] KVM: arm64: Using rcu_read_lock() for kvm_pgtable_stage2_mkyoung() Chun-Tse Shao
  2023-06-08 22:13 ` [PATCH v1 1/3] KVM: arm64: Consistently use free_removed_table() for stage-2 Yu Zhao
  2 siblings, 0 replies; 7+ messages in thread
From: Chun-Tse Shao @ 2023-06-08 22:05 UTC (permalink / raw)
  To: linux-kernel, yuzhao, oliver.upton
  Cc: Chun-Tse Shao, Marc Zyngier, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Ben Gardon, Gavin Shan,
	linux-arm-kernel, kvmarm

A page table walk is unnecessary when free_removed_table() is called from
the stage-2 unmap path and the PTEs in the table are empty. This case can
be fast-pathed by only initiating a walk if page_count() > 1.

The original discussion can be found at:
https://lore.kernel.org/kvmarm/ZHfWzX04GlcNngdU@linux.dev/

Suggested-by: Yu Zhao <yuzhao@google.com>
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
---
 arch/arm64/kvm/hyp/pgtable.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index cc1af0286755..d8e570263388 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1319,5 +1319,6 @@ void kvm_pgtable_stage2_free_removed(struct kvm_pgtable_mm_ops *mm_ops, void *pg
 		.end	= kvm_granule_size(level),
 	};
 
-	WARN_ON(__kvm_pgtable_walk(&data, mm_ops, ptep, level + 1));
+	if (mm_ops->page_count(pgtable) > 1)
+		WARN_ON(__kvm_pgtable_walk(&data, mm_ops, ptep, level + 1));
 }
-- 
2.41.0.162.gfafddb0af9-goog



* [PATCH v1 3/3] KVM: arm64: Using rcu_read_lock() for kvm_pgtable_stage2_mkyoung()
  2023-06-08 22:05 [PATCH v1 1/3] KVM: arm64: Consistently use free_removed_table() for stage-2 Chun-Tse Shao
  2023-06-08 22:05 ` [PATCH v1 2/3] KVM: arm64: Only initiate walk if page_count() > 1 in free_removed_table() Chun-Tse Shao
@ 2023-06-08 22:05 ` Chun-Tse Shao
  2023-06-09  7:44   ` Marc Zyngier
  2023-06-09 14:51   ` Oliver Upton
  2023-06-08 22:13 ` [PATCH v1 1/3] KVM: arm64: Consistently use free_removed_table() for stage-2 Yu Zhao
  2 siblings, 2 replies; 7+ messages in thread
From: Chun-Tse Shao @ 2023-06-08 22:05 UTC (permalink / raw)
  To: linux-kernel, yuzhao, oliver.upton
  Cc: Chun-Tse Shao, Marc Zyngier, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Ben Gardon, Gavin Shan,
	linux-arm-kernel, kvmarm

The access bit is RCU-safe and can be set without taking kvm->mmu_lock.
Replace the existing kvm->mmu_lock with rcu_read_lock() for better
performance.

The original discussion can be found at:
https://lore.kernel.org/kvmarm/CAOUHufZrfnfcbrqSzmHkejR5MA2gmGKZ3LMRhbLHV+1427z=Tw@mail.gmail.com/

Suggested-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
---
 arch/arm64/kvm/mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 3b9d4d24c361..0f7ea66fb894 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1437,10 +1437,10 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 
 	trace_kvm_access_fault(fault_ipa);
 
-	read_lock(&vcpu->kvm->mmu_lock);
+	rcu_read_lock();
 	mmu = vcpu->arch.hw_mmu;
 	pte = kvm_pgtable_stage2_mkyoung(mmu->pgt, fault_ipa);
-	read_unlock(&vcpu->kvm->mmu_lock);
+	rcu_read_unlock();
 
 	if (kvm_pte_valid(pte))
 		kvm_set_pfn_accessed(kvm_pte_to_pfn(pte));
-- 
2.41.0.162.gfafddb0af9-goog



* Re: [PATCH v1 1/3] KVM: arm64: Consistently use free_removed_table() for stage-2
  2023-06-08 22:05 [PATCH v1 1/3] KVM: arm64: Consistently use free_removed_table() for stage-2 Chun-Tse Shao
  2023-06-08 22:05 ` [PATCH v1 2/3] KVM: arm64: Only initiate walk if page_count() > 1 in free_removed_table() Chun-Tse Shao
  2023-06-08 22:05 ` [PATCH v1 3/3] KVM: arm64: Using rcu_read_lock() for kvm_pgtable_stage2_mkyoung() Chun-Tse Shao
@ 2023-06-08 22:13 ` Yu Zhao
  2 siblings, 0 replies; 7+ messages in thread
From: Yu Zhao @ 2023-06-08 22:13 UTC (permalink / raw)
  To: Chun-Tse Shao
  Cc: linux-kernel, oliver.upton, Marc Zyngier, James Morse,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Ben Gardon, Gavin Shan, linux-arm-kernel, kvmarm

On Thu, Jun 8, 2023 at 4:06 PM Chun-Tse Shao <ctshao@google.com> wrote:
>
> From: Oliver Upton <oliver.upton@linux.dev>
>
> free_removed_table() is essential to the RCU-protected parallel walking
> scheme, as behind the scenes the cleanup is deferred until an RCU grace
> period. Nonetheless, the stage-2 unmap path calls put_page() directly,
> which leads to table memory being freed inline with the table walk.
>
> This is safe for the time being, as the stage-2 unmap walker is called
> while holding the write lock. A future change to KVM will further relax
> the locking mechanics around the stage-2 page tables to allow lock-free
> walkers protected only by RCU. As such, switch to the RCU-safe mechanism
> for freeing table memory.
>
> Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
> Signed-off-by: Chun-Tse Shao <ctshao@google.com>

Acked-by: Yu Zhao <yuzhao@google.com>


* Re: [PATCH v1 3/3] KVM: arm64: Using rcu_read_lock() for kvm_pgtable_stage2_mkyoung()
  2023-06-08 22:05 ` [PATCH v1 3/3] KVM: arm64: Using rcu_read_lock() for kvm_pgtable_stage2_mkyoung() Chun-Tse Shao
@ 2023-06-09  7:44   ` Marc Zyngier
  2023-06-09 22:58     ` Chun-Tse Shao
  2023-06-09 14:51   ` Oliver Upton
  1 sibling, 1 reply; 7+ messages in thread
From: Marc Zyngier @ 2023-06-09  7:44 UTC (permalink / raw)
  To: Chun-Tse Shao
  Cc: linux-kernel, yuzhao, oliver.upton, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Ben Gardon, Gavin Shan,
	linux-arm-kernel, kvmarm

On Thu, 08 Jun 2023 23:05:41 +0100,
Chun-Tse Shao <ctshao@google.com> wrote:
> 
> The access bit is RCU-safe and can be set without taking kvm->mmu_lock.

Please explain why. What happens when the page tables are *not* RCU
controlled, such as in the pKVM case?

> Replace the existing kvm->mmu_lock with rcu_read_lock() for better
> performance.

Please define "better performance", quote workloads, figures, HW setup
and point to a reproducer. Please add a cover letter to your patch
series explaining the context this happens in.

Also, I'm getting increasingly annoyed by the lack of coordination
between seemingly overlapping patch series (this, Yu's, Anish's and
Vipin's), all from a single company.

Surely you can talk to each other and devise a coordinated approach?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v1 3/3] KVM: arm64: Using rcu_read_lock() for kvm_pgtable_stage2_mkyoung()
  2023-06-08 22:05 ` [PATCH v1 3/3] KVM: arm64: Using rcu_read_lock() for kvm_pgtable_stage2_mkyoung() Chun-Tse Shao
  2023-06-09  7:44   ` Marc Zyngier
@ 2023-06-09 14:51   ` Oliver Upton
  1 sibling, 0 replies; 7+ messages in thread
From: Oliver Upton @ 2023-06-09 14:51 UTC (permalink / raw)
  To: Chun-Tse Shao
  Cc: linux-kernel, yuzhao, Marc Zyngier, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Ben Gardon, Gavin Shan,
	linux-arm-kernel, kvmarm

On Thu, Jun 08, 2023 at 03:05:41PM -0700, Chun-Tse Shao wrote:
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 3b9d4d24c361..0f7ea66fb894 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1437,10 +1437,10 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>  
>  	trace_kvm_access_fault(fault_ipa);
>  
> -	read_lock(&vcpu->kvm->mmu_lock);
> +	rcu_read_lock();
>  	mmu = vcpu->arch.hw_mmu;
>  	pte = kvm_pgtable_stage2_mkyoung(mmu->pgt, fault_ipa);
> -	read_unlock(&vcpu->kvm->mmu_lock);
> +	rcu_read_unlock();

What is the point of acquiring the RCU read lock here?
kvm_pgtable_walk_{begin,end}() already do the exact same for any
'shared' walk.

I agree with Marc that this warrants some very clear benchmark data
showing the value of the change. As I had mentioned to Yu, I already
implemented this for my own purposes, but wasn't able to see a
significant improvement over acquiring the MMU lock for read.

-- 
Thanks,
Oliver


* Re: [PATCH v1 3/3] KVM: arm64: Using rcu_read_lock() for kvm_pgtable_stage2_mkyoung()
  2023-06-09  7:44   ` Marc Zyngier
@ 2023-06-09 22:58     ` Chun-Tse Shao
  0 siblings, 0 replies; 7+ messages in thread
From: Chun-Tse Shao @ 2023-06-09 22:58 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-kernel, yuzhao, oliver.upton, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Ben Gardon, Gavin Shan,
	linux-arm-kernel, kvmarm

On Fri, Jun 9, 2023 at 12:44 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Thu, 08 Jun 2023 23:05:41 +0100,
> Chun-Tse Shao <ctshao@google.com> wrote:
> >
> > The access bit is RCU-safe and can be set without taking kvm->mmu_lock.
>
> Please explain why. What happens when the page tables are *not* RCU
> controlled, such as in the pKVM case?
>
> > Replace the existing kvm->mmu_lock with rcu_read_lock() for better
> > performance.
>
> Please define "better performance", quote workloads, figures, HW setup
> and point to a reproducer. Please add a cover letter to your patch
> series explaining the context this happens in.

Thanks for the suggestion. We are currently running the performance
tests in parallel and will update after gathering more data.

>
> Also, I'm getting increasingly annoyed by the lack of coordination
> between seemingly overlapping patch series (this, Yu's, Anish's and
> Vipin's), all from a single company.
>
> Surely you can talk to each other and devise a coordinated approach?

Sure, I will set up internal meetings as needed.

>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

Thanks,
CT

