Get the evil out – don’t run containers as root

Containers are not hot and shiny anymore. They are a common piece of the puzzle when it comes to building great software. But a common pitfall, even now, is that many containers run as root. This poses a security risk. So, get the evil out – don’t run containers as root.

Why containers run as root

Eager developers love to learn new things. Unlearning old habits, by contrast, is far more difficult. The reason containers (still) run as root is analogous. Although there is rarely a need to run as root, there are several reasons why containers still do so:

  • The root user (UID 0) is the default user inside a container. If you don’t specify a non-root user, the container runs as root. Since the majority of container images in public repositories do not swap out the root user, this situation drags on.
  • Operating systems like Linux traditionally start services as the root user and then quickly drop to a non-root user. One reason to do so is to support binding to ports below 1024. Systemd can also start processes as non-root, but in most cases, it still assumes root.
  • Containers can use port forwarding to expose a different port outside the container than the one used inside. Unfortunately, many people are attached to conventions such as “port 80 for the web server”, so they don’t use this approach and remain exposed to the security risk of running as root.
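The port argument can be sketched in a few lines of Python. This is a minimal illustration, assuming a traditional Linux host where ports below 1024 are privileged by default. The port 8080 and the helper names needs_root and open_listener are illustrative choices, not part of any particular tool.

```python
import socket

# On a traditional Linux host, ports below 1024 are privileged by default.
PRIVILEGED_PORT_LIMIT = 1024


def needs_root(port: int) -> bool:
    """Return True if binding to this port normally requires elevated privileges."""
    return 0 < port < PRIVILEGED_PORT_LIMIT


def open_listener(port: int = 8080) -> socket.socket:
    """Open a TCP listener on an unprivileged port (8080 is an assumed example)."""
    if needs_root(port):
        raise ValueError(f"port {port} is privileged; pick a port >= 1024")
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("0.0.0.0", port))
    listener.listen()
    return listener
```

With the application listening on 8080 inside the container, Docker’s port publishing (for example docker run -p 80:8080 <image>) can still expose it on port 80 outside the container, so the process itself never needs root.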

These considerations stem from old habits, and the old tools still work, so it takes time and effort to change them. The technology is ready; now it’s time for developers and operators to act.

What is the security problem?

Before we dive deeper into the actual problem: if everyone applied the “least privilege principle” and stuck to it as much as possible, this problem would be less serious. Back to reality.

Privileges inside

One of the key arguments against running a container as root is to prevent privilege escalation. A root user inside a container can run basically every command that a root user can run on a traditional host system. Think of installing software packages, starting services, creating users, etc. From an application perspective, this is undesirable. When running an application on a virtual machine, you should not run it as a root user either. The same applies to containers.

Privilege escalation

In case of a container breakout, the root user can access and execute anything on the underlying host as a highly privileged user as well. This puts filesystem mounts at risk, along with usernames and passwords that are configured on the host to connect to other services (yet another reason to use a secrets manager solution). An attacker could also install malware or access other cloud resources. Remember the article Let your Pods do the talking? A breakout can also lead to the creation of cloud resources that you are responsible for operating and paying for. This is why you need to limit the potential blast radius.

Risk management

Often, people have to balance the time and money it takes to investigate, control and mitigate a security issue against doing nothing and accepting the risk. This is also the case here. Proper risk management strategies aim to find the balance between money spent on extra security measures and the potential business loss in case the issue is exploited. It becomes extra complex since accepting risk is a business decision, not a technical one. And this problem is a tricky one for the business representatives to (fully) understand.

Root inside and outside

To shed some light on the problem, let’s take a closer look at what happens inside of the container and what happens outside.

Root outside and root inside

Consider a container which has been started as the root user (the default when using Docker) and internally also runs as root. The user mapping 0:0 (root:root) means that the root user inside the container also has root privileges on the host. If the container mounts a sensitive filesystem path from the underlying host read-write (say /etc/passwd, but it does not matter which path), the root user inside the container is able to overwrite the /etc/passwd file on the host itself. It is possible to mount this file read-only, but it is still bad practice.
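To check which mapping applies to a running process, Linux exposes it in /proc/<pid>/uid_map, where each line holds three numbers: the first ID inside the namespace, the first ID outside, and the length of the range. The sketch below parses that format and reports whether root inside corresponds to root on the host. The names IdMapping, parse_uid_map and root_maps_to_host_root are hypothetical helpers for illustration.

```python
from typing import List, NamedTuple


class IdMapping(NamedTuple):
    inside_start: int   # first ID as seen inside the namespace
    outside_start: int  # first ID as seen from the parent namespace (host)
    length: int         # number of consecutive IDs in the range


def parse_uid_map(text: str) -> List[IdMapping]:
    """Parse the contents of a uid_map file into a list of mappings."""
    mappings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        mappings.append(IdMapping(*(int(field) for field in fields)))
    return mappings


def root_maps_to_host_root(mappings: List[IdMapping]) -> bool:
    """Return True if UID 0 inside the namespace is UID 0 outside as well."""
    for mapping in mappings:
        if mapping.inside_start == 0 and mapping.length > 0:
            return mapping.outside_start == 0
    return False
```

Inside a container, reading /proc/self/uid_map and passing its contents to parse_uid_map shows the situation at a glance: a single line such as "0 0 4294967295" is the 0:0 mapping described above, while a line such as "0 100000 65536" indicates that root inside is an unprivileged UID outside.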

Start container with non-root user

When using Docker as a container runtime environment, it is possible to start a container with the --user <non-root-user> flag. This overrides the default root user inside the container. However, this often breaks down if the container image was built around a fixed username. Files and directories inside the image are then owned by the static UID and GID of that user, so a process started under a different UID may not be able to read or write them. Starting your container this way does not solve the problem.

Non-root user inside

A better way is to switch to a non-root user inside the container itself. It’s best to create a non-root user inside the image with a fixed UID. This number should be 1001 or higher and is then mapped to a user on the host. Avoid mappings to the root user or to an existing user on the host system.

See the following snippet for an example:

FROM alpine:3.9
# Alpine ships BusyBox addgroup/adduser rather than groupadd/useradd
# Create a new group "fred" with GID 9901
RUN addgroup -g 9901 fred
# Create a new user "fred" with UID 9901 in group "fred"
# -D: no password, -H: no home directory, -s /bin/false: no login shell
RUN adduser -D -H -s /bin/false -u 9901 -G fred fred
# Change user to fred
USER fred

The container now runs as user “fred”.
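As an extra safeguard, the application itself can verify at startup that it is not running as UID 0. The following is a minimal sketch of such a guard, not something the image requires. The names RunningAsRootError and ensure_not_root are hypothetical.

```python
import os
from typing import Optional


class RunningAsRootError(RuntimeError):
    """Raised when the process is started with root privileges."""


def ensure_not_root(euid: Optional[int] = None) -> int:
    """Return the effective UID, or raise if the process runs as root (UID 0)."""
    if euid is None:
        euid = os.geteuid()  # effective UID of the current process
    if euid == 0:
        raise RunningAsRootError("refusing to start: process runs as root (UID 0)")
    return euid
```

Calling ensure_not_root() as the first step of the application’s entry point makes a misconfigured deployment fail fast instead of silently running with more privileges than intended.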

Considerations

While the last example greatly reduces security concerns, it comes with some caveats. First of all, think of parent images. If you select a parent image that runs as a non-root user, the derived image inherits this user. This makes it impossible to install new packages (using the package manager) in the downstream image. It is possible to switch (back) to the root user, but this causes problems of its own.

Even if you don’t need to install a package in the downstream image, a root user might still be needed, for example to upgrade a package in order to patch it or to remove a secret (e.g. a private key or a token) from the parent image. Your default Hadolint rules will complain about it. A container image hierarchy should be easy to patch and maintain to fully reap the benefits. If one system or another gets in the way of every change, it is less beneficial.

Dealing with exceptions

The devil is always in the details. Some containers have to run as root to function correctly. See the following use cases for examples.

Azure Pod Identity

  • The MIC controller container and the NMI component both require root. NMI uses iptables to route traffic to and from the container in order to intercept requests which are sent to Azure Active Directory. The MIC controller needs to access various files and directories on the Kubernetes control plane. As this is a managed service, you can’t change that.

Kubernetes

  • In Kubernetes, it is possible to specify a user ID and group ID for the Pod as well as for the containers inside the Pod. Besides this, you can add a “security context” which adds or removes Linux “capabilities”. Neither option solves the Azure Pod Identity scenario.
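For workloads that do not need root, these settings can be sketched as a Pod manifest. The example below builds the manifest as a Python dictionary and prints it as JSON, which Kubernetes accepts alongside YAML. The Pod name, the image reference and the UID/GID 9901 are placeholders, and non_root_pod is a hypothetical helper.

```python
import json


def non_root_pod(name: str, image: str, uid: int = 9901, gid: int = 9901) -> dict:
    """Build a Pod manifest that runs as a fixed non-root user and drops all capabilities."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Pod-level settings apply to every container in the Pod.
            "securityContext": {
                "runAsUser": uid,
                "runAsGroup": gid,
                "runAsNonRoot": True,
            },
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Container-level settings: no escalation, no extra capabilities.
                    "securityContext": {
                        "allowPrivilegeEscalation": False,
                        "capabilities": {"drop": ["ALL"]},
                    },
                }
            ],
        },
    }


# Placeholder name and image reference, for illustration only.
manifest = non_root_pod("demo", "registry.example.com/demo:1.0")
print(json.dumps(manifest, indent=2))
```

With runAsNonRoot set, the kubelet refuses to start a container whose image would run as UID 0, which turns the convention into an enforced rule.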

Runtime protection

Another example comes from container runtime security tools. A container that runs as part of your security system (on a single host or an orchestrator) and has to check other containers and the underlying infrastructure itself also runs as root. Even worse: it (probably) has to run in privileged mode. You should definitely check the contents of this container (image) and closely monitor it.

So how would you deal with these kinds of exceptions when you can’t switch to a non-root user?

Solutions

  • Keep the number of containers that run as root as low as possible. Build a “trust-group” of accepted containers that are based on a fixed base image. Use the SHA256 hash value of the image, not the image name (anyone can create a new image with the same name and tag and overwrite it). Of course, these images should come from your trusted registry.
  • Carefully monitor and log running containers that are started based on this image. Take special care to retain the logs even after the container has been replaced or disposed of. A centralized logging solution like Amazon CloudWatch or Container Insights can help. And if you’re even more paranoid, consider using a container forensics tool to dig into suspicious behavior.
  • Create a whitelist of commands which are allowed to be executed inside these containers. If the container attempts to execute a command that is not on the list, for example “mount”, you need to get an alert. Stop the container if you’re even more strict. But keep in mind your application would be down by then.
  • Consider signing your container images so you know they have not been tampered with between build time and runtime. This is a complicated topic both from a technical and from an organizational point of view. Take a look at Notary to start.
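Two of these measures are easy to sketch in code: checking that an image reference is pinned to a SHA256 digest, and checking a command against a whitelist. This is a simplified illustration. The regular expression does not cover the full image reference grammar, the command set is a made-up example, and both function names are hypothetical.

```python
import re
import shlex

# Simplified pattern: anything without "@" or whitespace, then a sha256 digest.
DIGEST_REF = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")

# Hypothetical whitelist; a real one depends on what the container must do.
ALLOWED_COMMANDS = frozenset({"ls", "cat", "ps"})


def is_digest_pinned(image_ref: str) -> bool:
    """Return True if the image is referenced by digest rather than by tag."""
    return DIGEST_REF.match(image_ref) is not None


def is_command_allowed(command_line: str, allowed=ALLOWED_COMMANDS) -> bool:
    """Return True if the executable of the command line is on the whitelist."""
    argv = shlex.split(command_line)
    if not argv:
        return False
    executable = argv[0].rsplit("/", 1)[-1]  # strip a leading path such as /bin/
    return executable in allowed
```

A deployment gate could reject any root container whose image fails is_digest_pinned, and a runtime monitor could raise an alert whenever is_command_allowed returns False, as it does for “mount /dev/sda1 /mnt”.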

New tools arrive every day, and rootless containers are no exception.

Rootless containers

Since most of these issues stem from the fact that the container runtime runs as a root user, it would make a great difference to run this process as a non-root user. Rootless containers can help here.

Podman

Red Hat maintains a tool called “Podman” to enable rootless containers. They have published a number of articles that zoom in on this topic. From Podman’s point of view, you can have any of the following configurations:

  • Run podman as a root user and the processes inside the container as root
  • Run podman as a non-root user, but run the processes inside the container as root
  • Run podman as a root user, but run the processes inside the container as non-root
  • Run podman as a non-root user and also run the processes inside the container as non-root

Developers benefit from this since they can choose one of these four configurations when they deploy their applications on OpenStack. Bear in mind that supported container images cannot be pulled from Docker Hub.

Buildkit & Kaniko

BuildKit has the option to run as a non-root user. For nested containers, you need to disable the Seccomp and AppArmor security frameworks. Kaniko, an image build tool from Google, builds container images without needing a Docker daemon. It does not need to run nested containers, so there is no need to disable the security frameworks. Jenkins X also uses Kaniko to build images, so this approach is gaining more traction.

Other solutions

It’s impossible to list all tools which are focused on running containers without the need for a highly privileged user. Be sure to check out the websites of Bazel rules_docker, buildah, FTL, IMG, orca-build and umoci.

Wrap-up

While root is still the default user to run containers and to run processes in the containers themselves, mitigating this security concern is becoming more common. Luckily, multiple tools help to achieve this and to find the balance between developer experience and security. I hope this article has inspired you to address the problem and to look at the exceptions, so you get an idea of how to deal with them if you still require root.
