From: Catalin Marinas <catalin.marinas@arm.com>
To: Yang Shi <yang@os.amperecomputing.com>
Cc: peterx@redhat.com, will@kernel.org, scott@os.amperecomputing.com,
cl@gentwo.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] arm64: mm: force write fault for atomic RMW instructions
Date: Fri, 17 May 2024 18:25:42 +0100
Message-ID: <ZkeTFiF_OOy80stO@arm.com>
In-Reply-To: <570c686c-6aa1-43f0-ba31-3597a329e037@os.amperecomputing.com>

On Fri, May 17, 2024 at 09:30:23AM -0700, Yang Shi wrote:
> On 5/14/24 3:39 AM, Catalin Marinas wrote:
> > It would be good to understand why openjdk is doing this instead of a
> > plain write. Is it because it may be racing with some other threads
> > already using the heap? That would be a valid pattern.
>
> Yes, you are right. I think I quoted the JVM justification in earlier email,
> anyway they said "permit use of memory concurrently with pretouch".
Ah, sorry, I missed that. This seems like a valid reason.
> > A point Will raised was on potential ABI changes introduced by this
> > patch. The ESR_EL1 reported to user remains the same as per the hardware
> > spec (read-only), so from a SIGSEGV we may have some slight behaviour
> > changes:
> >
> > 1. PTE invalid:
> >
> > a) vma is VM_READ && !VM_WRITE permission - SIGSEGV reported with
> > ESR_EL1.WnR == 0 in sigcontext with your patch. Without this
> > patch, the PTE is mapped as PTE_RDONLY first and a subsequent
> > fault will report SIGSEGV with ESR_EL1.WnR == 1.
>
> I think I can do something like the below conceptually:
>
> if is_el0_atomic_instr && !is_write_abort
> force_write = true
>
> if VM_READ && !VM_WRITE && force_write == true
Nit: write implies read, so you only need to check !write.
> vm_flags = VM_READ
> mm_flags &= ~FAULT_FLAG_WRITE
>
> Then we just fallback to read fault. The following write fault will trigger
> SIGSEGV with consistent ABI.
I think this should work. So instead of reporting the write fault
directly in case of a read-only vma, we let the core code handle the
read fault first and then retry the atomic instruction.
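To make the fallback concrete, here is a minimal user-space sketch of the decision discussed above. It is not the patch itself: the SK_* constants, struct fault_plan and plan_fault() are hypothetical names with illustrative values, and the real arm64 fault path has more cases than this models.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the values are illustrative, not kernel header values. */
#define SK_VM_READ          0x1u
#define SK_VM_WRITE         0x2u
#define SK_FAULT_FLAG_WRITE 0x1u

struct fault_plan {
	unsigned int required_vm_flags;	/* access the vma must permit */
	unsigned int mm_flags;		/* flags handed to the core fault code */
};

static struct fault_plan plan_fault(bool is_write_abort,
				    bool is_el0_atomic_instr,
				    unsigned int vma_flags)
{
	/* Default: treat the fault as a read. */
	struct fault_plan plan = { SK_VM_READ, 0 };
	/* An atomic RMW that faulted as a read gets promoted to a write. */
	bool force_write = is_el0_atomic_instr && !is_write_abort;

	if (is_write_abort || force_write) {
		plan.required_vm_flags = SK_VM_WRITE;
		plan.mm_flags |= SK_FAULT_FLAG_WRITE;
	}

	/*
	 * Fallback: write implies read, so only !VM_WRITE is checked.
	 * A forced write on a non-writable vma is handled as the read
	 * fault the hardware reported; the retried instruction then takes
	 * a genuine write fault, which produces the SIGSEGV.
	 */
	if (force_write && !(vma_flags & SK_VM_WRITE)) {
		plan.required_vm_flags = SK_VM_READ;
		plan.mm_flags &= ~SK_FAULT_FLAG_WRITE;
	}

	return plan;
}
```

In this model only a forced write is ever downgraded; a fault that the hardware already reported as a write keeps its write semantics.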
> > b) vma is !VM_READ && !VM_WRITE permission - SIGSEGV reported with
> > ESR_EL1.WnR == 0, so no change from current behaviour, unless we
> > fix the patch for (1.a) to fake the WnR bit which would change the
> > current expectations.
> >
> > 2. PTE valid with PTE_RDONLY - we get a normal writeable fault in
> > hardware, no need to fix ESR_EL1 up.
> >
> > The patch would have to address (1) above but faking the ESR_EL1.WnR bit
> > based on the vma flags looks a bit fragile.
>
> I think we don't need to fake the ESR_EL1.WnR bit with the fallback.
I agree, with your approach above we don't need to fake WnR.
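For reference, a small sketch of how a consumer of the reported ESR value might test the WnR bit that this part of the thread is about. The bit positions follow the architectural ESR_ELx layout for data aborts (EC in bits [31:26], WnR in bit 6); the SK_* macros and function names are hypothetical, and the sketch deliberately ignores other syndrome bits such as the cache-maintenance (CM) bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions per the ESR_ELx data abort encoding. */
#define SK_ESR_EC_SHIFT		26
#define SK_ESR_EC_MASK		UINT64_C(0x3f)
#define SK_ESR_EC_DABT_LOW	UINT64_C(0x24)	/* data abort from a lower EL */
#define SK_ESR_WNR		(UINT64_C(1) << 6)	/* Write-not-Read */

/* True if the syndrome describes a data abort taken from EL0. */
static bool sk_esr_is_el0_data_abort(uint64_t esr)
{
	return ((esr >> SK_ESR_EC_SHIFT) & SK_ESR_EC_MASK) == SK_ESR_EC_DABT_LOW;
}

/* True if the hardware reported the EL0 data abort as a write (WnR == 1). */
static bool sk_esr_reports_write(uint64_t esr)
{
	return sk_esr_is_el0_data_abort(esr) && (esr & SK_ESR_WNR) != 0;
}
```

With the fallback approach the value seen by user space stays whatever the hardware reported, so a check like this keeps returning the same answer as before the patch.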
> > Similarly, we have userfaultfd that reports the fault to user. I think
> > in scenario (1) the kernel will report UFFD_PAGEFAULT_FLAG_WRITE with
> > your patch but no UFFD_PAGEFAULT_FLAG_WP. Without this patch, there are
> > indeed two faults, with the second having both UFFD_PAGEFAULT_FLAG_WP
> > and UFFD_PAGEFAULT_FLAG_WRITE set.
>
> I don't quite get what the problem is. IIUC, uffd just needs a signal from
> the kernel that this area will be written. It does not seem to break the semantics.
> Added Peter Xu in this loop, who is the uffd developer. He may shed some
> light.
Not really familiar with uffd but just looking at the code, if a handler
is registered for both MODE_MISSING and MODE_WP, currently the atomic
instruction signals a user fault without UFFD_PAGEFAULT_FLAG_WRITE (the
do_anonymous_page() path). If the page is mapped by the uffd handler as
the zero page, a restart of the instruction would signal
UFFD_PAGEFAULT_FLAG_WRITE and UFFD_PAGEFAULT_FLAG_WP (the do_wp_page()
path).

With your patch, we get the equivalent of UFFD_PAGEFAULT_FLAG_WRITE on
the first attempt, just like having a STR instruction instead of
separate LDR + STR (as the atomics behave from a fault perspective).
However, I don't think that's a problem, the uffd handler should cope
with an STR anyway, so it's not some unexpected combination of flags.
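As an illustration of why a handler that already copes with a plain STR should not be surprised, here is a rough sketch of how a uffd handler registered for both MODE_MISSING and MODE_WP might classify the fault flags. The SK_* flag values are meant to mirror UFFD_PAGEFAULT_FLAG_WRITE and UFFD_PAGEFAULT_FLAG_WP from <linux/userfaultfd.h> and are redefined here only to keep the sketch self-contained; the enum and sk_classify_fault() are hypothetical.

```c
#include <stdint.h>

/* Local copies for the sketch; real code would include <linux/userfaultfd.h>. */
#define SK_UFFD_PAGEFAULT_FLAG_WRITE	(UINT64_C(1) << 0)
#define SK_UFFD_PAGEFAULT_FLAG_WP	(UINT64_C(1) << 1)

enum sk_uffd_action {
	SK_RESOLVE_MISSING_READ,	/* missing page, read access */
	SK_RESOLVE_MISSING_WRITE,	/* missing page, write access */
	SK_RESOLVE_WP,			/* write to a write-protected page */
};

/* Decide how to resolve a fault from the flags in the uffd message. */
static enum sk_uffd_action sk_classify_fault(uint64_t flags)
{
	if (flags & SK_UFFD_PAGEFAULT_FLAG_WP)
		return SK_RESOLVE_WP;
	if (flags & SK_UFFD_PAGEFAULT_FLAG_WRITE)
		return SK_RESOLVE_MISSING_WRITE;
	return SK_RESOLVE_MISSING_READ;
}
```

In this model the atomic instruction currently shows up as SK_RESOLVE_MISSING_READ followed by SK_RESOLVE_WP, while with the patch the first attempt is SK_RESOLVE_MISSING_WRITE, the same case a store instruction already exercises.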
--
Catalin