From: Johannes Berg <johannes@sipsolutions.net>
To: Hajime Tazaki <thehajime@gmail.com>
Cc: hch@infradead.org, linux-um@lists.infradead.org,
ricarkol@google.com, Liam.Howlett@oracle.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v13 00/13] nommu UML
Date: Tue, 25 Nov 2025 10:58:53 +0100
Message-ID: <defcec3945fbc37e90070b030bf1596b11b6d926.camel@sipsolutions.net>
In-Reply-To: <m2bjl7y6mv.wl-thehajime@gmail.com> (sfid-20251112_095303_672501_9A7DDF36)
On Wed, 2025-11-12 at 17:52 +0900, Hajime Tazaki wrote:
> > > What is it for ?
> > > ================
> > >
> > > - Alleviate syscall hook overhead implemented with ptrace(2)
> > > - To exercises nommu code over UML (and over KUnit)
> > > - Less dependency to host facilities
> >
> > FWIW, in some way, this order of priorities is exactly why this hasn't
> > been going anywhere, and every time I looked at it I got somewhat
> > annoyed by what seems to me like choices made to support especially the
> > first bullet.
>
> over the past versions, I've been emphasized that the 2nd bullet (testing)
> is the primary usecase as I saw several actually cases from mm folks,
>
> https://lists.infradead.org/pipermail/maple-tree/2024-November/003775.html
> https://lore.kernel.org/all/cb1cf0be-871d-4982-9a1b-5fdd54deec8d@lucifer.local/
>
> and I think this is not limited to mm code.

Not sure there's much value in testing much else in no-MMU, but sure,
I'll give you that it's useful for testing.

> other 2 bullets are additional benefits which we observed in a
> comment, and our experience.

But are they really _worthwhile_ benefits? A lot of this design adds
complexity, and it doesn't really seem necessary for the testing use
case. Making it faster is nice, but it's not as if the speedup really
is 20x for arbitrary tests; that's just for corner cases like "sit in a
loop of gettimeofday()". And for kunit there's no syscall boundary at
all, so there's no speedup.

> > I suspect that the first and third bullet are not even really true any
> > more, since you moved to seccomp (per our request), yet I think design
> > choices influenced by them persist.
>
> this observation is not true; the first bullet is still true even
> using seccomp. please look at the benchmark result in the patch
> [12/13], quoted below.
> [snip]

So thanks for the correction. If that's the case, however, it means the
speedup can't be due to the syscall boundary itself (seccomp) but must
rather be due to some pagefault/mapping handling issue? That would be
inherent in no-MMU, even with an approach that uses two host processes
rather than embedding everything into one.

> > However, I'm not yet convinced that all of the complexities presented in
> > this patchset (such as completely separate seccomp implementation) are
> > actually necessary in support of _just_ the second bullet. These seem to
> > me like design choices necessary to support the _first_ bullet [1].
>
> separate seccomp implementation is indeed needed due to the design
> choice we made, to use a single process to host a (um) userspace.

That sounds misleading or even wrong to me; I'd say it's due to putting
the (um) userspace in the same host process as the kernel space?

> I don't see why you see this as a _complexity_, as functionally both
> seccomp handling don't interfere each other.

The complexity isn't so much in the separate code, which is a small
factor, but in the "put everything into the same process" aspect of it.
That has consequences for host context state handling: things we didn't
really need to consider before suddenly become crucially important. In
the current (with-MMU) design, we only need to worry about being able
to correctly switch between userspace tasks/threads within a userspace
mm (host) process. With the no-MMU design you propose, we also need to
be able to correctly switch between kernel and userspace tasks within
the same single (host) process.

I think this is a pretty significant difference, and saying "there's no
complexity here" is simply pretending it isn't a relevant difference. I
believe you're not even handling this correctly right now in this patch
set, specifically wrt. the GS register, which has been pointed out
before, but I wouldn't say that I even have a complete picture in my
head of what state handling would be necessary and sufficient.

So yeah, I think this warrants taking another look at whether or not
the approach of putting everything into the same host process is even
worth it. I tend to believe that it isn't, given the use cases. And if
you say the speedup is still there with seccomp, that kills the speed
argument too.

> > I've thought about what would happen if we stuck to creating a (single)
> > separate process on the host to execute userspace, and just used
> > CLONE_VM for it. That way, it's still no-MMU with full memory access,
> > but there's some implicit isolation between the kernel and userspace
> > processes which will likely remove complexities around FP/SSE/AVX
> > handling, may completely remove the need for a separate seccomp
> > implementation, etc.
>
> this would be doable I think, but we went the different way, as
> using separate host processes (with ptrace/seccomp) is slow and add
> complexity by the synchronization between processes, which we think
> it's not easy to maintain in the future.

Which one is it then, slow or not? Not sure I follow. You just said you
do have seccomp when comparing speeds, so that in itself doesn't make it
slow. What synchronization? It'd (have to) be CLONE_VM, but that
actually _simplifies_ state transfer/synchronization, and we already
have (to have) state transfer between different userspace threads in the
same host process for the with-MMU case.
johannes