From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from second.openwall.net (second.openwall.net [193.110.157.125]) by smtp.lore.kernel.org (Postfix) with SMTP id 68629C07545 for ; Tue, 24 Oct 2023 15:56:19 +0000 (UTC) Received: (qmail 11689 invoked by uid 550); 24 Oct 2023 15:56:08 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Received: (qmail 11639 invoked from network); 24 Oct 2023 15:56:06 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sempervictus-com.20230601.gappssmtp.com; s=20230601; t=1698162955; x=1698767755; darn=lists.openwall.com; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=E94RLmyMd1HiweNv7aByojK3JqrvIFEoVQvzRulxw0o=; b=nmdmWsQkzlJGxjFL43Pjcyh2y4tB/NkSTfefZ6BGpMaY/Ml46EI8/Pbttf9NooBPd8 phlOSdZw2WYsgLrCauitX9GxGtWps/jiPoRH1LlZpUFciVLSgrG8QqnMUO0R3id+9utI 4X/e9IwN4Hx/52ji55VobOqhW6uyk0Pg4cdftbWzcsiXJ2uGU7+EhZps5/5XCrGKy+H6 n/uD9Zxq/16ZJHt1DwXJqpv8BGOBhUKmF3W3SLP6E5spVQfz/o3DfPdOmEFLTsTdvzfK 1Hvcr1AgZeboFu38SknfudwnWbdeaEVgAR822lw4kkWQPxPMdZRVNXq3i9xkiIp97unI 4O1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698162955; x=1698767755; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=E94RLmyMd1HiweNv7aByojK3JqrvIFEoVQvzRulxw0o=; b=CUD1WVEBYvuRuogTWkLvKJrB+L+USXSafRXGn7UW7Xt1wn51C4ytmIEOhqrTpXKdE1 BlDJVFb32d9XiswEOhhkXCNXyjmHfLYFVLxchHQPVlaN2qFNDAhZsMMnFDYbWLtT61Aj 0IHyoy90uJzTS+ca6xZtqoWfJBmr47jAsfgENzSLRlc8mKixDhOkrAQCx2sOrwkjPbKr G9WW8ogmgzvT35Ox5EXb7U3OhqCNTQ94yj83O8L6X0J6mwE40JUdSeLVbKB0pC4aq1X8 hdbQd8cMlvH9eXrDfs3X/fQQfFp4Gg/P9Vm8Mva0KXamc9L4YeacSos9dkzqZ6JhPQQK fHdQ== X-Gm-Message-State: AOJu0Yzyu54ezIk0T5IUd65PzhViXSgiYX3cw0LFsXgXXCTsO9WxJbtb 4auEMuq9/ux4DOI0EEWjQp9aXGzsrWu+cBfZHMqvHQ== X-Google-Smtp-Source: AGHT+IF0tmf8XNk/2f4CcL+8SuwkOoHkvNg7YiK34SZRP0YXoEriNpE1GN52TzGlhrzjr3vtL5EBfTK7s8sFKcKPZz8= X-Received: by 2002:a17:90a:fb48:b0:27d:4259:b7ef with SMTP id iq8-20020a17090afb4800b0027d4259b7efmr13729067pjb.23.1698162954818; Tue, 24 Oct 2023 08:55:54 -0700 (PDT) MIME-Version: 1.0 References: <20231024134608.GC320399@mail.hallyn.com> <20231024141512.GA321218@mail.hallyn.com> In-Reply-To: <20231024141512.GA321218@mail.hallyn.com> From: Boris Lukashev Date: Tue, 24 Oct 2023 11:55:43 -0400 Message-ID: Subject: Re: Isolating abstract sockets To: "Serge E. Hallyn" Cc: kernel-hardening@lists.openwall.com, Stefan Bavendiek , linux-hardening@vger.kernel.org Content-Type: multipart/alternative; boundary="000000000000b16f210608785f10" --000000000000b16f210608785f10 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Good point: from the "resources granted to a user" perspective, that does help bound their consumption. The nomenclature distinction seems like a good one to have, but if "network namespaces" *change the meaning of the term *and the original definition becomes "network device namespaces," then there would be a period where older and newer kernels have very different functions mapped to the same conceptual name. Might this make a bit more sense as "network namespaces" meaning what they do now - "network device namespaces," effectively; while the new concept would be "socket namespaces" to account for the various socket style interfaces provided? Thanks -Boris On Tue, Oct 24, 2023 at 10:15=E2=80=AFAM Serge E. Hallyn = wrote: > Thanks for the reply. Do you have any papers which came out of this r&d > phase? Sounds very interesting. > > > Multiple NS' sharing an IP stack would exhaust ephemeral ranges faster > > Yes, but that could be a feature. I think of it as: I'm unprivileged > user serge, and I want to fire off firefox in a whatzit-namespace so > that I can redirect or forbid some connections. In this case, the > admins have not agreed to let me double my resource usage, so the fact > that the new namespace is sharing mine is a feature. And this lets > me use network-namespace-like features completely unprivileged, without > having to use a setuid-root helper to hook up a bridge. > > But, I didn't send this reply to advocate this approach. My main point > was to mention that "network namespaces are network device namespaces" > and hope that others would bring other suggestions for alternatives. > > -serge > > On Tue, Oct 24, 2023 at 10:05:29AM -0400, Boris Lukashev wrote: > > Namespacing at OSI4 seems a bit fraught as the underlying route, mac, > and physdev fall outside the callers control. Multiple NS' sharing an IP > stack would exhaust ephemeral ranges faster (likely asymmetrically too) a= nd > have bound socket collisions opaque to each other requiring handling > outside the NS/containers purview. We looked at this sort of thing during > the r&d phase of our assured comms work (namespaces were young) and found= a > bunch of overhead and collision concerns. Not saying it can't be done, bu= t > getting consumers to play nice enough with such an approach may be a heav= y > lift. > > > > Thanks, > > -Boris > > > > > > On October 24, 2023 9:46:08 AM EDT, "Serge E. Hallyn" > wrote: > > >On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote: > > >> When building userspace application sandboxes, one issue that does > not seem trivial to solve is the isolation of abstract sockets. > > > > > >Veeery late reply. Have you had any productive discussions about this > in > > >other threads or venues? > > > > > >> While most IPC mechanism can be isolated by mechanisms like mount > namespaces, abstract sockets are part of the network namespace. > > >> It is possible to isolate abstract sockets by using a new network > namespace, however, unprivileged processes can only create a new empty > network namespace, which removes network access as well and makes this > useless for network clients. > > >> > > >> Same linux sandbox projects try to solve this by bridging the > existing network interfaces into the new namespace or use something like > slirp4netns to archive this, but this does not look like an ideal solutio= n > to this problem, especially since sandboxing should reduce the kernel > attack surface without introducing more complexity. > > >> > > >> Aside from containers using namespaces, sandbox implementations base= d > on seccomp and landlock would also run into the same problem, since > landlock only provides file system isolation and seccomp cannot filter th= e > path argument and therefore it can only be used to block new unix domain > socket connections completely. > > >> > > >> Currently there does not seem to be any way to disable network > namespaces in the kernel without also disabling unix domain sockets. > > >> > > >> The question is how to solve the issue of abstract socket isolation > in a clean and efficient way, possibly even without namespaces. > > >> What would be the ideal way to implement a mechanism to disable > abstract sockets either globally or even better, in the context of a > process. > > >> And would such a patch have a realistic chance to make it into the > kernel? > > > > > >Disabling them altogether would break lots of things depending on them= , > > >like X :) (@/tmp/.X11-unix/X0). The other path is to reconsider > network > > >namespaces. There are several directions this could lead. For one, a= s > > >Dinesh Subhraveti often points out, the current "network" namespace is > > >really a network device namespace. If we instead namespace at the > > >bind/connect/etc calls, we end up with much different abilities. You > > >can implement something like this today using seccomp-filter. > > > > > >-serge > --=20 Boris Lukashev Systems Architect Semper Victus --000000000000b16f210608785f10 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Good point: from the "resources granted to a use= r" perspective, that does help bound their consumption. The nomenclatu= re distinction seems like a good one to have, but if "network namespac= es" change the meaning of the term and the original definition = becomes "network device namespaces," then there would be a period= where older and newer kernels have very different functions mapped to the = same conceptual name. Might this make a bit more sense as "network nam= espaces" meaning what they do now - "network device namespaces,&q= uot; effectively; while the new concept would be "socket namespaces&qu= ot; to account for the various socket style interfaces provided?
=
Thanks
-Boris

On Tue, Oct 24, 2023 at 10:= 15=E2=80=AFAM Serge E. Hallyn <serge= @hallyn.com> wrote:
Thanks for the reply.=C2=A0 Do you have any papers which came ou= t of this r&d
phase?=C2=A0 Sounds very interesting.

> Multiple NS' sharing an IP stack would exhaust ephemeral ranges fa= ster

Yes, but that could be a feature.=C2=A0 I think of it as:=C2=A0 I'm unp= rivileged
user serge, and I want to fire off firefox in a whatzit-namespace so
that I can redirect or forbid some connections.=C2=A0 In this case, the
admins have not agreed to let me double my resource usage, so the fact
that the new namespace is sharing mine is a feature.=C2=A0 And this lets me use network-namespace-like features completely unprivileged, without
having to use a setuid-root helper to hook up a bridge.

But, I didn't send this reply to advocate this approach.=C2=A0 My main = point
was to mention that "network namespaces are network device namespaces&= quot;
and hope that others would bring other suggestions for alternatives.

-serge

On Tue, Oct 24, 2023 at 10:05:29AM -0400, Boris Lukashev wrote:
> Namespacing at OSI4 seems a bit fraught as the underlying route, mac, = and physdev fall outside the callers control. Multiple NS' sharing an I= P stack would exhaust ephemeral ranges faster (likely asymmetrically too) a= nd have bound socket collisions opaque to each other requiring handling out= side the NS/containers purview. We looked at this sort of thing during the = r&d phase of our assured comms work (namespaces were young) and found a= bunch of overhead and collision concerns. Not saying it can't be done,= but getting consumers to play nice enough with such an approach may be a h= eavy lift.
>
> Thanks,
> -Boris
>
>
> On October 24, 2023 9:46:08 AM EDT, "Serge E. Hallyn" <serge@hallyn.com>= ; wrote:
> >On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote:<= br> > >> When building userspace application sandboxes, one issue that= does not seem trivial to solve is the isolation of abstract sockets.
> >
> >Veeery late reply.=C2=A0 Have you had any productive discussions a= bout this in
> >other threads or venues?
> >
> >> While most IPC mechanism can be isolated by mechanisms like m= ount namespaces, abstract sockets are part of the network namespace.
> >> It is possible to isolate abstract sockets by using a new net= work namespace, however, unprivileged processes can only create a new empty= network namespace, which removes network access as well and makes this use= less for network clients.
> >>
> >> Same linux sandbox projects try to solve this by bridging the= existing network interfaces into the new namespace or use something like s= lirp4netns to archive this, but this does not look like an ideal solution t= o this problem, especially since sandboxing should reduce the kernel attack= surface without introducing more complexity.
> >>
> >> Aside from containers using namespaces, sandbox implementatio= ns based on seccomp and landlock would also run into the same problem, sinc= e landlock only provides file system isolation and seccomp cannot filter th= e path argument and therefore it can only be used to block new unix domain = socket connections completely.
> >>
> >> Currently there does not seem to be any way to disable networ= k namespaces in the kernel without also disabling unix domain sockets.
> >>
> >> The question is how to solve the issue of abstract socket iso= lation in a clean and efficient way, possibly even without namespaces.
> >> What would be the ideal way to implement a mechanism to disab= le abstract sockets either globally or even better, in the context of a pro= cess.
> >> And would such a patch have a realistic chance to make it int= o the kernel?
> >
> >Disabling them altogether would break lots of things depending on = them,
> >like X :)=C2=A0 (@/tmp/.X11-unix/X0).=C2=A0 The other path is to r= econsider network
> >namespaces.=C2=A0 There are several directions this could lead.=C2= =A0 For one, as
> >Dinesh Subhraveti often points out, the current "network"= ; namespace is
> >really a network device namespace.=C2=A0 If we instead namespace a= t the
> >bind/connect/etc calls, we end up with much different abilities.= =C2=A0 You
> >can implement something like this today using seccomp-filter.
> >
> >-serge


--
Boris Lukashev
Systems A= rchitect
Semp= er Victus
--000000000000b16f210608785f10--