
What is this thing called DistroLess?

Learn about distroless containers, essential for security, performance, and cost savings. Discover their advantages and start implementing them now!

CTO

João Brito

You have probably come across this concept somewhere and wondered what it is. The time has come to explain, in plain English, what "DistroLess" means!


The idea of distroless containers comes from the goal of building ever smaller containers with fewer unnecessary packages (a container only needs the packages required to run the application) and, as a result, a smaller potential attack surface.


One of the big concerns in container-based distributed systems is the base image of those containers, because it carries the packages needed to run your application. At least, that is how it should be!


As time has taught us, this is an established good practice, followed by Google itself and by other major container and Kubernetes players. Google maintains its own project on the subject, which is a good place to learn more.


Containers spread and multiply very quickly within a company, which typically leads to problems ranging from scalability to vulnerabilities.


But is reducing size really that important? Yes, for three main reasons: security, performance, and money.


Security: The more packages you have to manage, the more complex the environment becomes and the harder it is to control and guarantee the immutability and provenance of each part of your image. Updates also become increasingly tiring and hard to manage, generating extra and often unnecessary effort. Of course, the opposite approach, maintaining a clean, small base image tailored to your needs, takes hard work as well.


In short, there is no easy way out in either case, but the effort pays off on several fronts: not only in security but also in performance.


Performance: Performance is directly affected, and not only because a smaller container is transferred faster and becomes available on other nodes sooner. The effect shows up across your entire system, since you will have an abundance of containers and therefore an abundance of images "circulating" through your cluster, being updated, scaled, and moved from one node to another. Our systems today are no longer static; they are alive and constantly updating. This "Cloud Native" essence mainly brings network and transport challenges, which affect the core of a distributed environment: the distribution itself.


Money: Of course, cost is affected by all these factors, because your favorite cloud is good at many things, but what it does best is charging! Data transfer and storage, for example, are directly impacted by the size of your images and, consequently, of your system, which increases your costs. Getting this right is of utmost importance for keeping a healthy operation.
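
To make the size argument concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (image sizes, node count, deploy frequency) is a made-up assumption for illustration, not a measurement. It also assumes that every node pulls the whole image on every deploy, ignoring layer caching, so treat the result as an upper bound.

```python
def monthly_pull_traffic_gb(image_size_mb: float, nodes: int,
                            deploys_per_day: int, days: int = 30) -> float:
    """Estimate the data pulled from the registry in a month, in GB.

    Simplifying assumption: every node pulls the whole image on every
    deploy (no layer cache hits), so this is an upper bound.
    """
    total_mb = image_size_mb * nodes * deploys_per_day * days
    return total_mb / 1000  # decimal GB


# Hypothetical figures, chosen only to show the order of magnitude.
full_image = monthly_pull_traffic_gb(image_size_mb=600, nodes=50, deploys_per_day=10)
small_image = monthly_pull_traffic_gb(image_size_mb=60, nodes=50, deploys_per_day=10)

print(f"full base image:  {full_image:.0f} GB/month")   # 9000 GB/month
print(f"small base image: {small_image:.0f} GB/month")  # 900 GB/month
```

The traffic scales linearly with image size, so under these assumptions an image ten times smaller moves ten times less data, before you even count storage.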


So, what actually are distroless containers?


In practice, "distroless", just like "serverless", is misleading if you read it literally. Linux is still Linux, but a container is not a VM that needs countless packages and applications for us to work with, mainly because that is not what containers are for. A distroless image still ships what the application needs at runtime, but leaves out the shell, the package manager, and the other utilities of a full distribution. As mentioned earlier, a container should ideally have a single purpose, which is normally to run your application.


In one example vulnerability scan of a Java base image, we found over 600 vulnerabilities in its "latest" version, close to 50 in its "slim" version, and ZERO in its "distroless" version. Right there we have a strong indicator of the attack surface and of the number of packages we would have to update or remove.


With that in mind, your containers do not need, for example, Bash, grep, or even find in order to run your Go, Node, or even Java applications. If you are used to connecting to a container to debug it, know that this is highly discouraged, and under the distroless model it becomes impossible. Debugging will of course still be possible and necessary, but that is no reason to keep tools sitting alongside your application 100% of the time just in case a problem shows up in production. In Kubernetes, we have a powerful native ally called "debug containers", which removes the excuse of "what am I going to do without the xyz tool during the investigation?".
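
As a minimal sketch of what this looks like in practice, the Python helper below only builds a `kubectl debug` command line that attaches a temporary debug container to a running pod. The pod name, container name, and debug image are hypothetical placeholders, and the helper itself is an illustration, not part of any tool.

```python
import shlex


def kubectl_debug_command(pod: str, target_container: str,
                          debug_image: str = "busybox",
                          namespace: str = "default") -> list[str]:
    """Build a `kubectl debug` invocation that attaches an ephemeral
    debug container to a running pod and targets one of its containers."""
    return [
        "kubectl", "debug", "-it", pod,
        "--namespace", namespace,
        f"--image={debug_image}",
        f"--target={target_container}",
    ]


# Hypothetical pod and container names.
cmd = kubectl_debug_command("my-app-7d4b9c", "app")
print(shlex.join(cmd))
```

The debugging tools live in the debug image and exist only for the duration of the investigation, so the application image itself can stay free of them.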


Of course, removing packages is of little use if your application still needs to run as privileged in your cluster! As always, security is built in layers, and each one deserves attention. This takes a lot of work, but it is certainly better than relying on "luck".
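
One of those layers can be checked mechanically. The sketch below, in Python, walks a pod manifest that has already been parsed into a dictionary and lists the containers that set `securityContext.privileged` to true. The manifest, image names, and function name are hypothetical, and a real policy check would cover far more than this single field.

```python
def privileged_containers(pod_manifest: dict) -> list[str]:
    """Return the names of containers that set securityContext.privileged: true."""
    spec = pod_manifest.get("spec", {})
    found = []
    for key in ("initContainers", "containers"):
        for container in spec.get(key, []):
            security_context = container.get("securityContext") or {}
            if security_context.get("privileged") is True:
                found.append(container["name"])
    return found


# Hypothetical pod manifest, already parsed into a dict.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo"},
    "spec": {
        "containers": [
            {"name": "app", "image": "example.com/app:1.0",
             "securityContext": {"privileged": True}},
            {"name": "sidecar", "image": "example.com/sidecar:1.0"},
        ]
    },
}

print(privileged_containers(pod))  # ['app']
```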


Conclusion


As a good senior SRE, I will end by saying that here, everything "depends"! It depends on your scenario, your applications, and the time and expertise you have available to manage all of this. My recommendation is to keep this topic on your radar and start putting it into practice in some project. Give security the importance it deserves before it is too late! Start with a small project, if possible, to get the hang of it, stumble a little, and then grow. After all, nobody besides Tony Stark starts flying before walking!

