From: Jeff Xu <jeffxu@google.com>
To: Aleksa Sarai <cyphar@cyphar.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Shuah Khan <shuah@kernel.org>, Kees Cook <keescook@chromium.org>,
	Daniel Verkamp <dverkamp@chromium.org>,
	Christian Brauner <brauner@kernel.org>,
	Dominique Martinet <asmadeus@codewreck.org>,
	stable@vger.kernel.org, linux-api@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v2 4/5] memfd: replace ratcheting feature from vm.memfd_noexec with hierarchy
Date: Tue, 15 Aug 2023 22:13:18 -0700
Message-ID: <CALmYWFvxLee5+RyLh=vo6kpwMVS-_C7BJ9kmTPDa2tetgHOHPw@mail.gmail.com>
In-Reply-To: <20230814-memfd-vm-noexec-uapi-fixes-v2-4-7ff9e3e10ba6@cyphar.com>

On Mon, Aug 14, 2023 at 1:41 AM Aleksa Sarai <cyphar@cyphar.com> wrote:
>
> This sysctl has the very unusual behaviour of not allowing any user (even
> CAP_SYS_ADMIN) to reduce the restriction setting, meaning that if you
> were to set this sysctl to a more restrictive option in the host pidns
> you would need to reboot your machine in order to reset it.
>
> The justification given in [1] is that this is a security feature and
> thus it should not be possible to disable. Aside from the fact that we
> have plenty of security-related sysctls that can be disabled after being
> enabled (fs.protected_symlinks for instance), the protection provided by
> the sysctl is to stop users from being able to create a binary and then
> execute it. A user with CAP_SYS_ADMIN can trivially do this without
> memfd_create(2):
>
>   % cat mount-memfd.c
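>   #define _GNU_SOURCE           /* asprintf() */
>   #include <assert.h>
>   #include <fcntl.h>
>   #include <string.h>
>   #include <stdio.h>
>   #include <stdlib.h>
>   #include <unistd.h>
>   #include <sys/mount.h>        /* fsopen(), fsconfig(), fsmount(); glibc >= 2.36 */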
>
>   #define SHELLCODE "#!/bin/echo this file was executed from this totally private tmpfs:"
>
>   int main(void)
>   {
>         int fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
>         assert(fsfd >= 0);
>         assert(!fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0));
>
>         int dfd = fsmount(fsfd, FSMOUNT_CLOEXEC, 0);
>         assert(dfd >= 0);
>
>         int execfd = openat(dfd, "exe", O_CREAT | O_RDWR | O_CLOEXEC, 0755);
>         assert(execfd >= 0);
>         assert(write(execfd, SHELLCODE, strlen(SHELLCODE)) == strlen(SHELLCODE));
>         assert(!close(execfd));
>
>         char *execpath = NULL;
>         char *argv[] = { "bad-exe", NULL }, *envp[] = { NULL };
>         execfd = openat(dfd, "exe", O_PATH | O_CLOEXEC);
>         assert(execfd >= 0);
>         assert(asprintf(&execpath, "/proc/self/fd/%d", execfd) > 0);
>         assert(!execve(execpath, argv, envp));
>   }
>   % ./mount-memfd
>   this file was executed from this totally private tmpfs: /proc/self/fd/5
>   %
>
> Given that it is possible for CAP_SYS_ADMIN users to create executable
> binaries without memfd_create(2) and without touching the host
> filesystem (not to mention the many other things a CAP_SYS_ADMIN process
> would be able to do that would be equivalent or worse), it seems strange
> to cause a fair amount of headache to admins when there doesn't appear
> to be an actual security benefit to blocking this. There appear to be
> concerns about confused-deputy-esque attacks[2] but a confused deputy that
> can write to arbitrary sysctls is a bigger security issue than
> executable memfds.
>
Something to point out: the demo code might be enough to prove your
case on other distributions, but on ChromeOS you can't run this code.
Executables on ChromeOS all come from known sources and are verified
at boot.

If an attacker could run this code on ChromeOS, the attacker would
already have acquired arbitrary code execution by other means. At that
point, the attacker no longer needs to create or find an executable
memfd; they already have the vehicle. You can't use an example of an
attacker already running arbitrary code to prove that disallowing
downgrades is useless.

I agree that an attacker being able to modify a sysctl is already a
big problem. But assuming this happens by controlling the arguments
passed into sysctl, the attacker might not have full arbitrary code
execution yet at that point, which is why the original design is so
restrictive.

Best regards,
-Jeff
