Linux-Fsdevel Archive mirror
 help / color / mirror / Atom feed
From: "Huang, Kai" <kai.huang@intel.com>
To: "Christopherson,, Sean" <seanjc@google.com>
Cc: "Li, Xiaoyao" <xiaoyao.li@intel.com>,
	"kvm-riscv@lists.infradead.org" <kvm-riscv@lists.infradead.org>,
	"mic@digikod.net" <mic@digikod.net>,
	"liam.merwick@oracle.com" <liam.merwick@oracle.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Yamahata, Isaku" <isaku.yamahata@intel.com>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"david@redhat.com" <david@redhat.com>,
	"linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>,
	"amoorthy@google.com" <amoorthy@google.com>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"tabba@google.com" <tabba@google.com>,
	"kvmarm@lists.linux.dev" <kvmarm@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"oliver.upton@linux.dev" <oliver.upton@linux.dev>,
	"michael.roth@amd.com" <michael.roth@amd.com>,
	"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
	"palmer@dabbelt.com" <palmer@dabbelt.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"chenhuacai@kernel.org" <chenhuacai@kernel.org>,
	"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
	"mpe@ellerman.id.au" <mpe@ellerman.id.au>,
	"Annapurve, Vishal" <vannapurve@google.com>,
	"vbabka@suse.cz" <vbabka@suse.cz>,
	"mail@maciej.szmigiero.name" <mail@maciej.szmigiero.name>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"maz@kernel.org" <maz@kernel.org>,
	"willy@infradead.org" <willy@infradead.org>,
	"dmatlack@google.com" <dmatlack@google.com>,
	"anup@brainfault.org" <anup@brainfault.org>,
	"yu.c.zhang@linux.intel.com" <yu.c.zhang@linux.intel.com>,
	"Xu, Yilun" <yilun.xu@intel.com>,
	"qperret@google.com" <qperret@google.com>,
	"brauner@kernel.org" <brauner@kernel.org>,
	"isaku.yamahata@gmail.com" <isaku.yamahata@gmail.com>,
	"ackerleytng@google.com" <ackerleytng@google.com>,
	"jarkko@kernel.org" <jarkko@kernel.org>,
	"paul.walmsley@sifive.com" <paul.walmsley@sifive.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Wang, Wei W" <wei.w.wang@intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: [PATCH v13 09/35] KVM: Add KVM_EXIT_MEMORY_FAULT exit to report faults to userspace
Date: Thu, 2 Nov 2023 09:35:35 +0000	[thread overview]
Message-ID: <32cb71700aedcbd1f65276cf44a601760ffc364b.camel@intel.com> (raw)
In-Reply-To: <64e3764e36ba7a00d94cc7db1dea1ef06b620aaf.camel@intel.com>

On Thu, 2023-11-02 at 03:17 +0000, Huang, Kai wrote:
> On Wed, 2023-11-01 at 10:36 -0700, Sean Christopherson wrote:
> > On Wed, Nov 01, 2023, Kai Huang wrote:
> > > 
> > > > +7.34 KVM_CAP_MEMORY_FAULT_INFO
> > > > +------------------------------
> > > > +
> > > > +:Architectures: x86
> > > > +:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
> > > > +
> > > > +The presence of this capability indicates that KVM_RUN will fill
> > > > +kvm_run.memory_fault if KVM cannot resolve a guest page fault VM-Exit, e.g. if
> > > > +there is a valid memslot but no backing VMA for the corresponding host virtual
> > > > +address.
> > > > +
> > > > +The information in kvm_run.memory_fault is valid if and only if KVM_RUN returns
> > > > +an error with errno=EFAULT or errno=EHWPOISON *and* kvm_run.exit_reason is set
> > > > +to KVM_EXIT_MEMORY_FAULT.
> > > 
> > > IIUC returning -EFAULT or whatever -errno is sort of KVM internal
> > > implementation.
> > 
> > The errno that is returned to userspace is ABI.  In KVM, it's a _very_ poorly
> > defined ABI for the vast majority of ioctls(), but it's still technically ABI.
> > KVM gets away with being cavalier with errno because the vast majority of errors
> > are considered fatal by userespace, i.e. in most cases, userspace simply doesn't
> > care about the exact errno.
> > 
> > A good example is KVM_RUN with -EINTR; if KVM were to return something other than
> > -EINTR on a pending signal or vcpu->run->immediate_exit, userspace would fall over.
> > 
> > > Is it better to relax the validity of kvm_run.memory_fault when
> > > KVM_RUN returns any -errno?
> > 
> > Not unless there's a need to do so, and if there is then we can update the
> > documentation accordingly.  If KVM's ABI is that kvm_run.memory_fault is valid
> > for any errno, then KVM would need to purge kvm_run.exit_reason super early in
> > KVM_RUN, e.g. to prevent an -EINTR return due to immediate_exit from being
> > misinterpreted as KVM_EXIT_MEMORY_FAULT.  And purging exit_reason super early is
> > subtly tricky because KVM's (again, poorly documented) ABI is that *some* exit
> > reasons are preserved across KVM_RUN with vcpu->run->immediate_exit (or with a
> > pending signal).
> > 
> > https://lore.kernel.org/all/ZFFbwOXZ5uI%2Fgdaf@google.com
> > 
> > 
> 
> Agreed with not to relax to any errno.  However using -EFAULT as part of ABI
> definition seems a little bit dangerous, e.g., someone could accidentally or
> mistakenly return -EFAULT in KVM_RUN at early time and/or in a completely
> different code path, etc.  -EINTR has well defined meaning, but -EFAULT (which
> is "Bad address") seems doesn't but I am not sure either. :-)
> 
> One example is, for backing VMA with VM_IO | VM_PFNMAP, hva_to_pfn() returns
> KVM_PFN_ERR_FAULT when the kernel cannot get a valid PFN (e.g. when SGX vepc
> fault handler failed to allocate EPC) and kvm_handle_error_pfn() will just
> return -EFAULT.  If kvm_run.exit_reason isn't purged early then is it possible
> to have some issue here?
> 

Also, regardless whether -EFAULT is too ambiguous to be part of ABI, could you
elaborate the EHWPOISON part?  IIUC KVM can already handle the case of poisoned
page by sending signal to user app: 

	static int kvm_handle_error_pfn(struct kvm_vcpu *vcpu, 
			struct kvm_page_fault *fault)                                               
	{       
		...

       		if (fault->pfn == KVM_PFN_ERR_HWPOISON) {
              		kvm_send_hwpoison_signal(fault->slot, fault->gfn);
                	return RET_PF_RETRY;                                          
        	}
	}

And (sorry to hijack) I am thinking whether "SGX vepc unable to allocate EPC"
can also use this memory_fault mechanism.  Currently as mentioned above when
vepc fault handler cannot allocate EPC page KVM returns -EFAULT to Qemu, and
Qemu prints ...

	...: Bad address
	<dump guest cpu registers>

... which is nonsense.

If we can use memory_fault.flags (or is 'fault_reason' a better name?) to carry
a specific value for EPC to let Qemu know and Qemu can then do more reasonable
things.


  reply	other threads:[~2023-11-02  9:35 UTC|newest]

Thread overview: 148+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-27 18:21 [PATCH v13 00/35] KVM: guest_memfd() and per-page attributes Sean Christopherson
2023-10-27 18:21 ` [PATCH v13 01/35] KVM: Tweak kvm_hva_range and hva_handler_t to allow reusing for gfn ranges Sean Christopherson
2023-11-01 12:46   ` Fuad Tabba
2023-10-27 18:21 ` [PATCH v13 02/35] KVM: Assert that mmu_invalidate_in_progress *never* goes negative Sean Christopherson
2023-10-30 16:27   ` Paolo Bonzini
2023-11-01 12:46   ` Fuad Tabba
2023-10-27 18:21 ` [PATCH v13 03/35] KVM: Use gfn instead of hva for mmu_notifier_retry Sean Christopherson
2023-10-30 16:30   ` Paolo Bonzini
2023-10-30 16:53   ` David Matlack
2023-10-30 17:00     ` Paolo Bonzini
2023-10-30 18:21       ` David Matlack
2023-10-30 18:19     ` David Matlack
2023-11-01 15:31   ` Xu Yilun
2023-10-27 18:21 ` [PATCH v13 04/35] KVM: WARN if there are dangling MMU invalidations at VM destruction Sean Christopherson
2023-10-30 16:32   ` Paolo Bonzini
2023-11-01 12:50   ` Fuad Tabba
2023-10-27 18:21 ` [PATCH v13 05/35] KVM: PPC: Drop dead code related to KVM_ARCH_WANT_MMU_NOTIFIER Sean Christopherson
2023-10-30 16:34   ` Paolo Bonzini
2023-11-01 12:51   ` Fuad Tabba
2023-10-27 18:21 ` [PATCH v13 06/35] KVM: PPC: Return '1' unconditionally for KVM_CAP_SYNC_MMU Sean Christopherson
2023-10-27 18:21 ` [PATCH v13 07/35] KVM: Convert KVM_ARCH_WANT_MMU_NOTIFIER to CONFIG_KVM_GENERIC_MMU_NOTIFIER Sean Christopherson
2023-10-30 16:37   ` Paolo Bonzini
2023-11-01 12:54   ` Fuad Tabba
2023-10-27 18:21 ` [PATCH v13 08/35] KVM: Introduce KVM_SET_USER_MEMORY_REGION2 Sean Christopherson
2023-10-30 16:41   ` Paolo Bonzini
2023-10-30 20:25     ` Sean Christopherson
2023-10-30 22:12       ` Sean Christopherson
2023-10-30 23:22       ` Paolo Bonzini
2023-10-31  0:18         ` Sean Christopherson
2023-10-31  2:26   ` Xiaoyao Li
2023-10-31 14:04     ` Sean Christopherson
2023-11-01 14:19   ` Fuad Tabba
2023-10-27 18:21 ` [PATCH v13 09/35] KVM: Add KVM_EXIT_MEMORY_FAULT exit to report faults to userspace Sean Christopherson
2023-10-30 17:22   ` Paolo Bonzini
2023-11-01  7:30   ` Binbin Wu
2023-11-01 10:52   ` Huang, Kai
2023-11-01 17:36     ` Sean Christopherson
2023-11-02  2:19       ` Xiaoyao Li
2023-11-02 15:51         ` Sean Christopherson
2023-11-02  3:17       ` Huang, Kai
2023-11-02  9:35         ` Huang, Kai [this message]
2023-11-02 11:03           ` Paolo Bonzini
2023-11-02 15:44             ` Sean Christopherson
2023-11-02 18:35               ` Huang, Kai
2023-11-02 15:56         ` Sean Christopherson
2023-11-02 11:01       ` Paolo Bonzini
2023-11-03  4:09   ` Xu Yilun
2023-10-27 18:21 ` [PATCH v13 10/35] KVM: Add a dedicated mmu_notifier flag for reclaiming freed memory Sean Christopherson
2023-10-30 17:11   ` Paolo Bonzini
2023-11-02 13:55   ` Fuad Tabba
2023-10-27 18:21 ` [PATCH v13 11/35] KVM: Drop .on_unlock() mmu_notifier hook Sean Christopherson
2023-10-30 17:18   ` Paolo Bonzini
2023-11-02 13:55   ` Fuad Tabba
2023-10-27 18:21 ` [PATCH v13 12/35] KVM: Prepare for handling only shared mappings in mmu_notifier events Sean Christopherson
2023-10-30 17:21   ` Paolo Bonzini
2023-10-30 22:07     ` Sean Christopherson
2023-11-02  5:59   ` Binbin Wu
2023-11-02 11:14     ` Paolo Bonzini
2023-11-02 14:01   ` Fuad Tabba
2023-11-02 14:41     ` Sean Christopherson
2023-11-02 14:57       ` Fuad Tabba
2023-10-27 18:21 ` [PATCH v13 13/35] KVM: Introduce per-page memory attributes Sean Christopherson
2023-10-30  8:11   ` Chao Gao
2023-10-30 16:10     ` Sean Christopherson
2023-10-30 22:05       ` Sean Christopherson
2023-10-31 16:43   ` David Matlack
2023-11-02  3:01   ` Huang, Kai
2023-11-02 10:32     ` Paolo Bonzini
2023-11-02 10:55       ` Huang, Kai
2023-10-27 18:21 ` [PATCH v13 14/35] mm: Add AS_UNMOVABLE to mark mapping as completely unmovable Sean Christopherson
2023-10-30 17:24   ` Paolo Bonzini
2023-10-27 18:21 ` [PATCH v13 15/35] fs: Export anon_inode_getfile_secure() for use by KVM Sean Christopherson
2023-10-30 17:30   ` Paolo Bonzini
2023-11-02 16:24   ` Christian Brauner
2023-11-03 10:40     ` Paolo Bonzini
2023-10-27 18:21 ` [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory Sean Christopherson
2023-10-31  2:27   ` Xiaoyao Li
2023-10-31  6:30   ` Chao Gao
2023-10-31 14:10     ` Sean Christopherson
2023-10-31 15:05   ` Fuad Tabba
2023-10-31 22:13     ` Sean Christopherson
2023-10-31 22:18       ` Paolo Bonzini
2023-11-01 10:51       ` Fuad Tabba
2023-11-01 21:55         ` Sean Christopherson
2023-11-02 13:52           ` Fuad Tabba
2023-11-03 23:17             ` Sean Christopherson
2023-10-31 18:24   ` David Matlack
2023-10-31 21:36     ` Sean Christopherson
2023-10-31 22:39       ` David Matlack
2023-11-02 15:48         ` Paolo Bonzini
2023-11-02 16:03           ` Sean Christopherson
2023-11-02 16:28             ` David Matlack
2023-11-02 17:37               ` Sean Christopherson
2023-11-03  9:42   ` Fuad Tabba
2023-11-04 10:26   ` Xu Yilun
2023-11-06 15:43     ` Sean Christopherson
2023-10-27 18:21 ` [PATCH v13 17/35] KVM: Add transparent hugepage support for dedicated guest memory Sean Christopherson
2023-10-31  8:35   ` Xiaoyao Li
2023-10-31 14:16     ` Sean Christopherson
2023-11-01  7:25       ` Xiaoyao Li
2023-11-01 13:41         ` Sean Christopherson
2023-11-01 13:49           ` Paolo Bonzini
2023-11-01 16:36             ` Sean Christopherson
2023-11-01 22:28               ` Paolo Bonzini
2023-11-01 22:34                 ` Sean Christopherson
2023-11-01 23:17                   ` Paolo Bonzini
2023-11-02 15:38                     ` Sean Christopherson
2023-11-02 15:46                       ` Paolo Bonzini
2023-11-27 11:13                         ` Vlastimil Babka
2023-11-29 22:40                           ` Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 18/35] KVM: x86: "Reset" vcpu->run->exit_reason early in KVM_RUN Sean Christopherson
2023-10-30 17:31   ` Paolo Bonzini
2023-11-02 14:16   ` Fuad Tabba
2023-10-27 18:22 ` [PATCH v13 19/35] KVM: x86: Disallow hugepages when memory attributes are mixed Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 20/35] KVM: x86/mmu: Handle page fault for private memory Sean Christopherson
2023-11-02 14:34   ` Fuad Tabba
2023-11-05 13:02   ` Xu Yilun
2023-11-05 16:19     ` Paolo Bonzini
2023-11-06 13:29       ` Xu Yilun
2023-11-06 15:56         ` Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 21/35] KVM: Drop superfluous __KVM_VCPU_MULTIPLE_ADDRESS_SPACE macro Sean Christopherson
2023-11-02 14:35   ` Fuad Tabba
2023-10-27 18:22 ` [PATCH v13 22/35] KVM: Allow arch code to track number of memslot address spaces per VM Sean Christopherson
2023-10-30 17:34   ` Paolo Bonzini
2023-11-02 14:52   ` Fuad Tabba
2023-10-27 18:22 ` [PATCH v13 23/35] KVM: x86: Add support for "protected VMs" that can utilize private memory Sean Christopherson
2023-10-30 17:36   ` Paolo Bonzini
2023-11-06 11:00   ` Fuad Tabba
2023-11-06 11:03     ` Paolo Bonzini
2023-10-27 18:22 ` [PATCH v13 24/35] KVM: selftests: Drop unused kvm_userspace_memory_region_find() helper Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 25/35] KVM: selftests: Convert lib's mem regions to KVM_SET_USER_MEMORY_REGION2 Sean Christopherson
2024-04-25 14:12   ` Dan Carpenter
2024-04-25 14:45     ` Shuah Khan
2024-04-25 15:09       ` Sean Christopherson
2024-04-25 16:22         ` Shuah Khan
2024-04-26  7:33         ` Jarkko Sakkinen
2023-10-27 18:22 ` [PATCH v13 26/35] KVM: selftests: Add support for creating private memslots Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 27/35] KVM: selftests: Add helpers to convert guest memory b/w private and shared Sean Christopherson
2023-11-06 11:26   ` Fuad Tabba
2023-10-27 18:22 ` [PATCH v13 28/35] KVM: selftests: Add helpers to do KVM_HC_MAP_GPA_RANGE hypercalls (x86) Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 29/35] KVM: selftests: Introduce VM "shape" to allow tests to specify the VM type Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 30/35] KVM: selftests: Add GUEST_SYNC[1-6] macros for synchronizing more data Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 31/35] KVM: selftests: Add x86-only selftest for private memory conversions Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 32/35] KVM: selftests: Add KVM_SET_USER_MEMORY_REGION2 helper Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 33/35] KVM: selftests: Expand set_memory_region_test to validate guest_memfd() Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 34/35] KVM: selftests: Add basic selftest for guest_memfd() Sean Christopherson
2023-10-27 18:22 ` [PATCH v13 35/35] KVM: selftests: Test KVM exit behavior for private memory/access Sean Christopherson
2023-10-30 17:39 ` [PATCH v13 00/35] KVM: guest_memfd() and per-page attributes Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=32cb71700aedcbd1f65276cf44a601760ffc364b.camel@intel.com \
    --to=kai.huang@intel.com \
    --cc=ackerleytng@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=amoorthy@google.com \
    --cc=anup@brainfault.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=brauner@kernel.org \
    --cc=chao.p.peng@linux.intel.com \
    --cc=chenhuacai@kernel.org \
    --cc=david@redhat.com \
    --cc=dmatlack@google.com \
    --cc=isaku.yamahata@gmail.com \
    --cc=isaku.yamahata@intel.com \
    --cc=jarkko@kernel.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm-riscv@lists.infradead.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=liam.merwick@oracle.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mail@maciej.szmigiero.name \
    --cc=maz@kernel.org \
    --cc=mic@digikod.net \
    --cc=michael.roth@amd.com \
    --cc=mpe@ellerman.id.au \
    --cc=oliver.upton@linux.dev \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=pbonzini@redhat.com \
    --cc=qperret@google.com \
    --cc=seanjc@google.com \
    --cc=tabba@google.com \
    --cc=vannapurve@google.com \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wei.w.wang@intel.com \
    --cc=willy@infradead.org \
    --cc=xiaoyao.li@intel.com \
    --cc=yilun.xu@intel.com \
    --cc=yu.c.zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).