I think in the upstream kernel, LSMs are also still the only way to prevent a process from creating child namespaces in which it has privileges?
E.g. if you can get CAP_NET_ADMIN, even within a restricted namespace, you have access to huge amounts of horribly broken kernel code. From there it's easy (for people who know how to exploit kernel bugs) to escalate privileges.
Distros have their own fixes for this issue, so namespaces definitely aren't useless for sandboxing in practice. But the basic mechanism just isn't that well suited to it.
Ah, I didn't know about that. So you can block the child from creating a userns completely... That seems like an unnecessarily big hammer, but it probably works fine in 95% of cases?
I think what we probably want is an inherited mask of the capabilities you can get in child namespaces. I think I heard that someone proposed this upstream, but I haven't seen the patches.
NO_NEW_PRIVS is quite irritating in a lot of contexts, since it breaks distant dependencies. For example, you can't run `ping`, so good luck debugging your networking!
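To make the NO_NEW_PRIVS behaviour concrete, here is a rough sketch (Linux only) that sets the flag in a forked child through `prctl` and reads it back. The helper names are made up for illustration; the two constants are the ones from `<linux/prctl.h>`. Once the flag is set, `execve` no longer grants setuid bits or file capabilities, which is why a `ping` binary that relies on either stops working.

```python
import ctypes
import os

# Constants from <linux/prctl.h>.
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39

_libc = ctypes.CDLL(None, use_errno=True)


def _prctl(option: int, arg2: int = 0) -> int:
    """Call prctl(2) with the unused arguments explicitly zeroed."""
    zero = ctypes.c_ulong(0)
    res = _libc.prctl(option, ctypes.c_ulong(arg2), zero, zero, zero)
    if res < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return res


def set_no_new_privs() -> None:
    """Set no_new_privs for the calling thread; this cannot be undone."""
    _prctl(PR_SET_NO_NEW_PRIVS, 1)


def get_no_new_privs() -> int:
    """Return 1 if no_new_privs is set for the calling thread, else 0."""
    return _prctl(PR_GET_NO_NEW_PRIVS)


def no_new_privs_in_child() -> int:
    """Fork, set the flag in the child only, and return the child's flag value."""
    pid = os.fork()
    if pid == 0:
        try:
            set_no_new_privs()
            os._exit(get_no_new_privs())
        except OSError:
            os._exit(255)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

The fork is only there so the parent stays unaffected; the flag is inherited by every child and survives `execve`.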
> For example, you can't run `ping`, so good luck debugging your networking!
Sending ICMP Echo from unprivileged userspace (through a datagram ICMP socket rather than a raw one) is a thing on Linux. From experience, on the public Internet it is better, where possible, to rely on TLS connects (then TCP or UDP, and only then ICMP) to ascertain connectivity, lest some middlebox meddle with IP or transport-layer replies.
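For reference, a minimal sketch of the unprivileged variant, assuming a Linux box where the `net.ipv4.ping_group_range` sysctl includes your group (otherwise creating the socket fails and the function just returns False). The function names are illustrative, not from any library. With this socket type the kernel fills in the ICMP identifier and checksum itself, so the values computed here mostly matter if you reuse the packet builder with a raw socket.

```python
import os
import socket
import struct


def inet_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum over data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"ping") -> bytes:
    """Build an ICMPv4 Echo Request (type 8, code 0)."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = inet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


def ping_once(host: str, timeout: float = 1.0) -> bool:
    """Send one echo request through an unprivileged ICMP datagram socket."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                           socket.IPPROTO_ICMP) as s:
            s.settimeout(timeout)
            s.sendto(build_echo_request(os.getpid() & 0xFFFF, 1), (host, 0))
            reply, _ = s.recvfrom(1024)
            # On Linux the reply starts at the ICMP header; type 0 is Echo Reply.
            return reply[0] == 0
    except OSError:
        # Covers timeouts, unreachable hosts, and a missing ping_group_range grant.
        return False
```

Something like `ping_once("192.0.2.1")` then gives a yes/no answer without any setuid binary or CAP_NET_RAW, so it keeps working under NO_NEW_PRIVS.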
Namespaces can also be used for sandboxing, but they come with a series of problems. Most importantly, they require more substantial changes to the program that wants to sandbox itself, and that program has to jump through a number of hoops to get everything into the right state. It is possible, but the resulting program environment ends up more unusual, and the mechanisms for enabling unprivileged namespaces make them difficult to use for smaller use cases. (It involves re-executing the program that wants to sandbox itself, whereas with Landlock, a small program can just install a Landlock policy during an early startup phase and carry on from there.)
Controlling the rules through a separate process is not currently possible, but it was proposed earlier this month on the kernel mailing list:
https://lore.kernel.org/all/cover.1741047969.git.m@maowtm.or...
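As an illustration of the "install a policy during early startup" flow, here is a rough sketch that calls the three Landlock syscalls directly through ctypes. It assumes Linux on x86-64 or arm64 (where the syscall numbers below apply) and only uses access rights from Landlock ABI v1; `landlock_abi_version` and `restrict_writes_to` are hypothetical helper names, and the choice of which rights to handle is just an example policy.

```python
import ctypes
import os

# Syscall numbers on x86-64 and arm64.
SYS_landlock_create_ruleset = 444
SYS_landlock_add_rule = 445
SYS_landlock_restrict_self = 446

# Values from <linux/landlock.h> and <linux/prctl.h>.
LANDLOCK_CREATE_RULESET_VERSION = 1 << 0
LANDLOCK_RULE_PATH_BENEATH = 1
PR_SET_NO_NEW_PRIVS = 38
ACCESS_FS_WRITE = (
    (1 << 1)    # LANDLOCK_ACCESS_FS_WRITE_FILE
    | (1 << 4)  # LANDLOCK_ACCESS_FS_REMOVE_DIR
    | (1 << 5)  # LANDLOCK_ACCESS_FS_REMOVE_FILE
    | (1 << 7)  # LANDLOCK_ACCESS_FS_MAKE_DIR
    | (1 << 8)  # LANDLOCK_ACCESS_FS_MAKE_REG
)

_libc = ctypes.CDLL(None, use_errno=True)
_libc.syscall.restype = ctypes.c_long


class RulesetAttr(ctypes.Structure):
    _fields_ = [("handled_access_fs", ctypes.c_uint64)]


class PathBeneathAttr(ctypes.Structure):
    _pack_ = 1  # the kernel struct is declared packed
    _fields_ = [("allowed_access", ctypes.c_uint64),
                ("parent_fd", ctypes.c_int32)]


def _syscall(nr, *args):
    res = _libc.syscall(ctypes.c_long(nr), *args)
    if res < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return res


def landlock_abi_version() -> int:
    """Return the kernel's Landlock ABI version, or 0 if Landlock is unavailable."""
    try:
        return _syscall(SYS_landlock_create_ruleset, None, ctypes.c_size_t(0),
                        ctypes.c_uint32(LANDLOCK_CREATE_RULESET_VERSION))
    except OSError:
        return 0


def restrict_writes_to(directory: str) -> None:
    """Deny the handled write-type accesses everywhere except beneath directory."""
    attr = RulesetAttr(ACCESS_FS_WRITE)
    ruleset_fd = _syscall(SYS_landlock_create_ruleset, ctypes.byref(attr),
                          ctypes.c_size_t(ctypes.sizeof(attr)), ctypes.c_uint32(0))
    try:
        parent_fd = os.open(directory, os.O_PATH | os.O_CLOEXEC)
        try:
            rule = PathBeneathAttr(ACCESS_FS_WRITE, parent_fd)
            _syscall(SYS_landlock_add_rule, ctypes.c_int(ruleset_fd),
                     ctypes.c_int(LANDLOCK_RULE_PATH_BENEATH),
                     ctypes.byref(rule), ctypes.c_uint32(0))
        finally:
            os.close(parent_fd)
        # Unprivileged callers must set no_new_privs before restricting themselves.
        zero = ctypes.c_ulong(0)
        if _libc.prctl(PR_SET_NO_NEW_PRIVS, ctypes.c_ulong(1), zero, zero, zero) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        _syscall(SYS_landlock_restrict_self, ctypes.c_int(ruleset_fd),
                 ctypes.c_uint32(0))
    finally:
        os.close(ruleset_fd)
```

A program would check `landlock_abi_version() > 0` and then call something like `restrict_writes_to("/tmp/scratch")` (path made up) before touching untrusted input. No re-exec is needed, and the restriction is irreversible and inherited by children. Reads are left alone here because no read rights are listed in the handled set.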