loongarch.lists.linux.dev archive mirror
From: Ackerley Tng <ackerleytng@google.com>
To: Takahiro Itazuri <itazur@amazon.com>,
	fvdl@google.com, seanjc@google.com, ljs@kernel.org
Cc: Liam.Howlett@oracle.com, agordeev@linux.ibm.com,
	ajones@ventanamicro.com,  akpm@linux-foundation.org,
	alex@ghiti.fr, andrii@kernel.org,  aou@eecs.berkeley.edu,
	ast@kernel.org, baolu.lu@linux.intel.com,
	 borntraeger@linux.ibm.com, bp@alien8.de, bpf@vger.kernel.org,
	 catalin.marinas@arm.com, chenhuacai@kernel.org, corbet@lwn.net,
	 coxu@redhat.com, daniel@iogearbox.net,
	dave.hansen@linux.intel.com,  david@kernel.org,
	derekmn@amazon.com, dev.jain@arm.com, eddyz87@gmail.com,
	 gerald.schaefer@linux.ibm.com, gor@linux.ibm.com,
	haoluo@google.com,  hca@linux.ibm.com, hpa@zytor.com,
	itazur@amazon.co.uk, jackabt@amazon.co.uk,  jackmanb@google.com,
	jannh@google.com, jgg@ziepe.ca, jgross@suse.com,
	 jhubbard@nvidia.com, jiayuan.chen@shopee.com,
	jmattson@google.com,  joey.gouly@arm.com,
	john.fastabend@gmail.com, jolsa@kernel.org,
	 jthoughton@google.com, kalyazin@amazon.co.uk, kas@kernel.org,
	 kernel@xen0n.name, kpsingh@kernel.org, kvm@vger.kernel.org,
	 kvmarm@lists.linux.dev, lenb@kernel.org,
	linux-arm-kernel@lists.infradead.org,  linux-doc@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,  linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org,  linux-mm@kvack.org,
	linux-pm@vger.kernel.org, linux-riscv@lists.infradead.org,
	 linux-s390@vger.kernel.org, loongarch@lists.linux.dev,
	 lorenzo.stoakes@oracle.com, luto@kernel.org,
	maobibo@loongson.cn,  martin.lau@linux.dev, maz@kernel.org,
	mhocko@suse.com, mingo@redhat.com,  mlevitsk@redhat.com,
	nikita.kalyazin@linux.dev, oupton@kernel.org,
	 palmer@dabbelt.com, patrick.roy@linux.dev, pavel@kernel.org,
	 pbonzini@redhat.com, peterx@redhat.com, peterz@infradead.org,
	 pfalcato@suse.de, pjw@kernel.org, prsampat@amd.com,
	rafael@kernel.org,  riel@surriel.com, rppt@kernel.org,
	ryan.roberts@arm.com, sdf@fomichev.me,
	 shijie@os.amperecomputing.com, skhan@linuxfoundation.org,
	song@kernel.org,  surenb@google.com, suzuki.poulose@arm.com,
	svens@linux.ibm.com,  tabba@google.com, tglx@kernel.org,
	thuth@redhat.com, urezki@gmail.com,  vannapurve@google.com,
	vbabka@kernel.org, will@kernel.org,  willy@infradead.org,
	wu.fei9@sanechips.com.cn, x86@kernel.org,
	 yang@os.amperecomputing.com, yangyicong@hisilicon.com,
	 yonghong.song@linux.dev, yosry@kernel.org,
	yu-cheng.yu@intel.com,  yuzenghui@huawei.com,
	zhengqi.arch@bytedance.com, zulinx86@gmai.com
Subject: Re: [PATCH v12 10/16] KVM: guest_memfd: Add flag to remove from direct map
Date: Thu, 14 May 2026 09:45:19 -0700
Message-ID: <CAEvNRgG07EMrx-SpMaO3gHmdGVwOb75XNy7_RARBo0chidn7Yg@mail.gmail.com>
In-Reply-To: <20260508081812.12345-1-itazur@amazon.com>

Takahiro Itazuri <itazur@amazon.com> writes:

>
> [...snip...]
>

Brought this topic up on the guest_memfd biweekly today!

>
> Agreed with both of you.  I'll adopt the filemap-level approach:
>
> - Move the zap/restore hooks from guest_memfd into filemap_add_folio()
>   / filemap_remove_folio().
> - Tighten AS_NO_DIRECT_MAP semantics so that, for folios in such a
>   mapping, the direct map is invalid for the entire time the folio
>   resides in the page cache.
> - Drop the per-folio KVM_GMEM_FOLIO_NO_DIRECT_MAP bookkeeping in
>   folio->private, since the existence of the folio in the mapping is
>   itself the state.
>
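A minimal userspace model of the filemap-level invariant described in the quoted list above, not code from the series. Every name here (model_mapping, model_filemap_add(), and so on) is hypothetical; the bool direct_map_valid merely stands in for the kernel direct-map PTEs, and the zap-before-publish ordering is an assumption of the sketch. It shows why no per-folio flag is needed: being present in a no_direct_map mapping is itself the state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, none come from the series. */
#define MODEL_NR_SLOTS 8

struct model_folio {
	bool direct_map_valid;	/* stands in for the kernel direct-map PTEs */
};

struct model_mapping {
	bool no_direct_map;				/* stands in for AS_NO_DIRECT_MAP */
	struct model_folio *slots[MODEL_NR_SLOTS];	/* stands in for the page cache */
};

/* Models the zap hook living in filemap_add_folio(). */
static int model_filemap_add(struct model_mapping *m, size_t idx,
			     struct model_folio *f)
{
	assert(m && f);
	if (idx >= MODEL_NR_SLOTS || m->slots[idx])
		return -1;
	if (m->no_direct_map)
		f->direct_map_valid = false;	/* zap before the folio is visible */
	m->slots[idx] = f;
	return 0;
}

/* Models the restore hook living in filemap_remove_folio(). */
static struct model_folio *model_filemap_remove(struct model_mapping *m,
						size_t idx)
{
	struct model_folio *f;

	assert(m);
	if (idx >= MODEL_NR_SLOTS || !m->slots[idx])
		return NULL;
	f = m->slots[idx];
	m->slots[idx] = NULL;
	if (m->no_direct_map)
		f->direct_map_valid = true;	/* restore once the folio has left */
	return f;
}

/* Invariant: resident in a no_direct_map mapping implies direct map zapped. */
static bool model_invariant_holds(const struct model_mapping *m)
{
	assert(m);
	for (size_t i = 0; i < MODEL_NR_SLOTS; i++)
		if (m->no_direct_map && m->slots[i] &&
		    m->slots[i]->direct_map_valid)
			return false;
	return true;
}
```

Adding a folio to a mapping with no_direct_map set and then calling model_invariant_holds() should return true at every point until the folio is removed again.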
> On each guest memory population path,
>
> - memcpy-based population from userspace goes through the userspace
>   mapping of guest_memfd, not through the kernel direct map, so the
>   filemap-level invariant doesn't affect it.  But this is slow, which
>   is what motivated the write() syscall support.
>
> - write(): meant to speed up the userspace-memcpy case above by doing
>   the copy in the kernel.  I believe Brendan's __GFP_UNMAPPED/mermap
>   work [1] would give us a low-overhead way to get temporary kernel
>   access to an AS_NO_DIRECT_MAP.  Landing mermap may take a while, but
>   this series does not introduce the write() path, so mermap is not a
>   blocker for now.
>
> - kvm_gmem_populate(): this is a TDX/SNP-only path, and NO_DIRECT_MAP
>   is not available on those VM types —
>   kvm_arch_gmem_supports_no_direct_map() returns false for
>   KVM_X86_TDX_VM and KVM_X86_SNP_VM, which are its only callers
>   today.  So it doesn't interact with the filemap invariant IIUC.
>

I'm a little bit uncomfortable with this statement, since it seems to say
TDX and SNP aren't taken care of. I'd just like to discuss this (for
a line of sight to SNP and TDX support):

For non-in-place population where the source physical page is different
from the destination physical page,

+ TDX: the TDX module does the population and works with physical
  addresses, so no issue with populate? Other parts of TDX may have
  trouble though, but that can be handled later.
+ SNP: sev_gmem_post_populate() does a memcpy() after using
  kmap_local_page()

Would mermap be a drop-in replacement for kmap_local_page() here? Would
guest_memfd need to force a TLB flush after mermap+memcpy?

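For context on the quoted kvm_gmem_populate() bullet, the reasoning that populate never meets a direct-map-removed folio today can be modeled as below. This is only a sketch: the enum and function names are hypothetical stand-ins for the KVM_X86_*_VM types and for kvm_arch_gmem_supports_no_direct_map(), and the assumption that every other VM type supports the flag is made purely for illustration. It also makes the concern visible: the property holds only because the two sets of VM types are disjoint, so it goes away as soon as SNP or TDX gains NO_DIRECT_MAP support.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the KVM_X86_*_VM types; values are illustrative. */
enum model_vm_type {
	MODEL_DEFAULT_VM,
	MODEL_SW_PROTECTED_VM,
	MODEL_SNP_VM,
	MODEL_TDX_VM,
	MODEL_NR_VM_TYPES,
};

/* Models the described gating: false for SNP and TDX (others assumed true). */
static bool model_supports_no_direct_map(enum model_vm_type t)
{
	return t != MODEL_SNP_VM && t != MODEL_TDX_VM;
}

/* Models the statement that populate is an SNP/TDX-only path. */
static bool model_uses_populate(enum model_vm_type t)
{
	return t == MODEL_SNP_VM || t == MODEL_TDX_VM;
}

/* True if populate could ever see a folio whose direct map was removed. */
static bool model_populate_meets_no_direct_map(enum model_vm_type t)
{
	return model_uses_populate(t) && model_supports_no_direct_map(t);
}

/* Checks the property for every modeled VM type. */
static void model_check_all(void)
{
	for (int t = 0; t < MODEL_NR_VM_TYPES; t++)
		assert(!model_populate_meets_no_direct_map((enum model_vm_type)t));
}
```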
> So, unless I'm missing any path, adopting the filemap-level approach in
> this series should be fine.
>
>
> I'd like to consult with you folks on how to proceed in advance.  In a
> separate reply on the cover letter thread [2], Lorenzo and Sean
> suggested that the mm pieces should go through the mm subsystem:
>
> On Tue, Apr 21, 2026 at 04:36:00PM +0000, Sean Christopherson wrote:
>> Yeah, when the time comes, the mm pieces definitely need to go through the mm
>> tree.  Ideally, I think this would be merged in two separate parts, with all mm
>> changes going through the mm tree, and then the KVM changes through the KVM tree
>> using a stable topic branch/tag from Andrew.
>
> I see two reasonable paths to get there, and would appreciate your
> input on which you prefer:
>
> Path A — validate on KVM side first, then split:
>   - Post v13 as a single series on the KVM list, gather feedback and
>     make sure the design is acceptable to KVM reviewers.
>   - Once v13 looks good ("the time comes"), do the MM/KVM split,
>     rebase the MM part onto the appropriate MM branch, and post the
>     MM part to linux-mm to build consensus with MM maintainers.
>
> Path B — split early and seek MM consensus in parallel:
>   - With the filemap rework already in place, do the MM/KVM split
>     now and post the MM part to linux-mm directly.  The KVM part follows
>     on top of a stable topic from MM.
>
> Which of the two would you rather see?  Happy to go either way.
>

Vlastimil pointed out that there's currently a temporary limitation: the
mm tree cannot provide stable branches shared between trees.

I think it depends on how quickly you plan to refresh this series, but
Path A wouldn't be blocked by the temporary limitation.

My opinion would be to go ahead with a new revision (Path A) to fully
address comments before splitting the series. Any Reviewed-bys can be
carried over to the split series anyway :)

Alternatively you could wait till conversion lands :P Either one of us
will need to do more work for conversion wrt direct map removal.

>
> [1] https://lore.kernel.org/all/20260320-page_alloc-unmapped-v2-0-28bf1bd54f41@google.com/
> [2] https://lore.kernel.org/all/20260506080753.14517-1-itazur@amazon.com/
>
> Takahiro


Thread overview: 27+ messages
2026-04-10 15:17 [PATCH v12 00/16] Direct Map Removal Support for guest_memfd Kalyazin, Nikita
2026-04-10 15:17 ` [PATCH v12 01/16] set_memory: set_direct_map_* to take address Kalyazin, Nikita
2026-04-21 14:43   ` Lorenzo Stoakes
2026-04-10 15:18 ` [PATCH v12 02/16] set_memory: add folio_{zap,restore}_direct_map helpers Kalyazin, Nikita
2026-04-10 15:18 ` [PATCH v12 03/16] mm/secretmem: make use of folio_{zap,restore}_direct_map Kalyazin, Nikita
2026-04-10 15:18 ` [PATCH v12 04/16] mm/gup: drop secretmem optimization from gup_fast_folio_allowed Kalyazin, Nikita
2026-04-10 15:18 ` [PATCH v12 05/16] mm/gup: drop local variable in gup_fast_folio_allowed Kalyazin, Nikita
2026-04-10 15:18 ` [PATCH v12 06/16] mm: introduce AS_NO_DIRECT_MAP Kalyazin, Nikita
2026-04-10 15:19 ` [PATCH v12 07/16] KVM: guest_memfd: Add stub for kvm_arch_gmem_invalidate Kalyazin, Nikita
2026-04-10 15:19 ` [PATCH v12 08/16] KVM: x86: define kvm_arch_gmem_supports_no_direct_map() Kalyazin, Nikita
2026-04-10 15:19 ` [PATCH v12 09/16] KVM: arm64: " Kalyazin, Nikita
2026-04-21 16:55   ` Marc Zyngier
2026-04-10 15:19 ` [PATCH v12 10/16] KVM: guest_memfd: Add flag to remove from direct map Kalyazin, Nikita
2026-04-21 16:31   ` Sean Christopherson
2026-04-21 17:08     ` Frank van der Linden
2026-05-08  8:18       ` Takahiro Itazuri
2026-05-14 16:45         ` Ackerley Tng [this message]
2026-04-10 15:19 ` [PATCH v12 11/16] KVM: selftests: load elf via bounce buffer Kalyazin, Nikita
2026-04-10 15:19 ` [PATCH v12 12/16] KVM: selftests: set KVM_MEM_GUEST_MEMFD in vm_mem_add() if guest_memfd != -1 Kalyazin, Nikita
2026-04-10 15:20 ` [PATCH v12 13/16] KVM: selftests: Add guest_memfd based vm_mem_backing_src_types Kalyazin, Nikita
2026-04-10 15:20 ` [PATCH v12 14/16] KVM: selftests: cover GUEST_MEMFD_FLAG_NO_DIRECT_MAP in existing selftests Kalyazin, Nikita
2026-04-10 15:20 ` [PATCH v12 15/16] KVM: selftests: stuff vm_mem_backing_src_type into vm_shape Kalyazin, Nikita
2026-04-10 15:20 ` [PATCH v12 16/16] KVM: selftests: Test guest execution from direct map removed gmem Kalyazin, Nikita
2026-04-21 13:40 ` [PATCH v12 00/16] Direct Map Removal Support for guest_memfd Lorenzo Stoakes
2026-04-21 16:36   ` Sean Christopherson
2026-05-06  8:07     ` Takahiro Itazuri
2026-05-26 16:27       ` Lorenzo Stoakes
