On 2024-04-27, Stas Sergeev <stsp2@yandex.ru> wrote:
> This patch-set implements the OA2_CRED_INHERIT flag for openat2() syscall.
> It is needed to perform an open operation with the creds that were in
> effect when the dir_fd was opened, if the dir was opened with O_CRED_ALLOW
> flag. This allows the process to pre-open some dirs and switch eUID
> (and other UIDs/GIDs) to the less-privileged user, while still retaining
> the possibility to open/create files within the pre-opened directory set.
> 
> The sand-boxing is security-oriented: symlinks leading outside of a
> sand-box are rejected. /proc magic links are rejected. fds opened with
> O_CRED_ALLOW are always closed on exec() and cannot be passed via unix
> socket.
> The more detailed description (including security considerations)
> is available in the log messages of individual patches.

(I meant to reply last week but I couldn't get my mail server to send
mail...)

It seems to me that this can already be implemented using
MOUNT_ATTR_IDMAP, without creating a new form of credential overriding
within the filesystem (and with such a deceptively simple
implementation...)

If you are a privileged process which plans to change users, you can
create a detached tree with a user mapping that gives that user access
to only that tree. This is far more effective at restricting possible
attacks because id-mapped mounts don't override credentials during VFS
operations (an approach where missing a single case means you have a
big problem); instead they only affect uid-related operations within the
filesystem
for that mount. Since this implementation does not inherit
CAP_DAC_OVERRIDE, being able to rewrite uid/gids is all you need.
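
A rough, untested sketch of what that could look like from userspace,
using open_tree(2) and mount_setattr(2) through raw syscalls. The helper
names idmap_attr() and idmapped_tree() are made up for illustration, and
the sketch assumes userns_fd already refers to a user namespace with the
desired mapping (setting that up is not shown). It needs recent kernel
headers and enough privilege to create and id-map a detached mount.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/mount.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: build the mount_attr that asks for an id-mapping
 * taken from the user namespace referred to by userns_fd.
 */
struct mount_attr idmap_attr(int userns_fd)
{
	struct mount_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.attr_set = MOUNT_ATTR_IDMAP;
	attr.userns_fd = (__u64)userns_fd;
	return attr;
}

/*
 * Hypothetical helper: clone the mount at path into a detached tree and
 * apply the id-mapping to it. Returns an fd for the detached mount, or
 * -1 with errno set.
 */
int idmapped_tree(const char *path, int userns_fd)
{
	struct mount_attr attr = idmap_attr(userns_fd);
	int tree_fd;

	/* Detached copy of the mount; it is not attached anywhere yet. */
	tree_fd = (int)syscall(SYS_open_tree, AT_FDCWD, path,
			       OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC);
	if (tree_fd < 0)
		return -1;

	/* Apply the id-mapping to the detached mount itself. */
	if (syscall(SYS_mount_setattr, tree_fd, "", AT_EMPTY_PATH,
		    &attr, sizeof(attr)) < 0) {
		int saved_errno = errno;

		close(tree_fd);
		errno = saved_errno;
		return -1;
	}
	return tree_fd;
}
```

The returned fd can then be used as the dirfd for later openat()-style
calls after the process has switched users.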

A new attack I just thought of while writing this mail is that because
there is no RESOLVE_NO_XDEV requirement, it should be possible for the
process to get an arbitrary write primitive by creating a new
userns+mountns and then bind-mounting / underneath the directory. Since
OA2_CRED_INHERIT uses override_creds(), it doesn't care about whether
something about the O_CRED_ALLOW directory changed afterwards. Yes, you
can "just fix this" by adding a RESOLVE_NO_XDEV requirement too, but
given that there have been 2-3 security issues with this design found
already, it makes me feel really uneasy. Using id-mapped mounts avoids
this issue because the new mount will not have the id-mapping applied
and thus there is no security issue.
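
For reference, the semantics of the RESOLVE_NO_XDEV restriction
mentioned above can be seen from userspace with plain openat2(2). This
is only an untested sketch of what the flag does for a caller, not of
how the patch would enforce it in the kernel; the helper names
confined_how() and open_confined() and the exact set of resolve flags
are assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/openat2.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: build an open_how whose lookup may not leave the
 * directory (RESOLVE_BENEATH), may not cross a mount point
 * (RESOLVE_NO_XDEV) and may not follow /proc magic links
 * (RESOLVE_NO_MAGICLINKS).
 * Note: openat2() rejects a non-zero mode unless open_flags contains
 * O_CREAT or O_TMPFILE.
 */
struct open_how confined_how(int open_flags, mode_t mode)
{
	struct open_how how;

	memset(&how, 0, sizeof(how));
	how.flags = (__u64)open_flags;
	how.mode = mode;
	how.resolve = RESOLVE_BENEATH | RESOLVE_NO_XDEV |
		      RESOLVE_NO_MAGICLINKS;
	return how;
}

/*
 * Hypothetical helper: open path relative to dirfd under the
 * restrictions above. Returns an fd, or -1 with errno set (EXDEV when
 * the path would escape dirfd or cross onto another mount).
 */
int open_confined(int dirfd, const char *path, int open_flags, mode_t mode)
{
	struct open_how how = confined_how(open_flags, mode);

	return (int)syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
}
```

With these flags, a bind mount placed underneath the directory is a
mount crossing, so a lookup through it fails with EXDEV instead of
succeeding.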

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>