Lead Image © it studiom1, 123RF.com

Building sustainably safe containers

Build by Number

Article from ADMIN 61/2021
By
The basic container images on which you base your work can often be out of date. We show you how to solve this problem and create significantly leaner containers.

Among other things, my job involves developing applications in the field of network automation on the basis of the Spring Boot framework, which requires a running Java environment. At the same time, some infrastructure applications are required, such as DNS servers.

Before containers existed, infrastructure services ran in minimal change root environments, containing only the necessary binaries (e.g., chroot/named), configuration files, and libraries. This setup reduced the number of potential attack vectors for exposed services. For example, an attempt by the attacker to call /bin/sh would fail because the environment would not have a shell.

Classical Docker build files, which use FROM ubuntu to include a complete Ubuntu environment, are the exact opposite of the approach just described. The resulting container is easier to debug because, for example, a shell is available. However, it is also far larger and less secure because an attacker could find and use the shell binary.

Manufacturers keep their official containers up to date, which means that when the container is rebuilt, an updated Ubuntu would also be dragged in. However, no mechanism automatically triggers such a rebuild. One of my goals was therefore to rebuild automatically all containers that contain components for which patches are available. At the same time, I wanted the containers to be leaner.

Dockerfiles

Docker can import the compressed tarball of a change root environment directly, but a build process based on this is hard to maintain. It makes more sense to use a Dockerfile, which lists the components of the image and also lets you import single files from other images. A Dockerfile can run scripts or entire installations, as well. You then create the container image with docker build. To begin, copy an archive (usually a .tar.gz) into a folder and create a file named Dockerfile:

FROM dockerrepo.matrix.dev/gentoo-java:latest-amd64
ADD webapp.tar.gz /
ENTRYPOINT ["java", "-jar", "mywebapp.jar"]
EXPOSE 8080/tcp

The first line names a base image whose filesystem forms the starting point of the new image. In this case, it's a Gentoo Linux-based image (see the "Why Gentoo?" box) that provides a runnable Java environment. The next line unpacks the contents of webapp.tar.gz into the root directory of the container. The third line ensures that java -jar mywebapp.jar is executed automatically when the container is started with docker run; any arguments given on the command line are appended to this call. The last line documents that the application listens on TCP port 8080. EXPOSE on its own does not publish the port on the host, so you still need -p 8080:8080 (or -P, which publishes all exposed ports) in the docker run call to reach the application from outside.

Why Gentoo?

The system presented here would also work with other distributions. I chose Gentoo because the distribution compiles applications locally from source files. Therefore, you can easily archive and document the sources of the binaries for a later audit of each version of each container. Because admins compile the binaries themselves and the compiler sources are also available, the chain of documentation can be traced back to the source code. Only an infection of the build host would offer an attack vector, and this risk can be mitigated by appropriate protection.

The Docker build process is organized hierarchically. The images that provide the binaries in the container build on each other. Starting with a base image, which is initially created as an empty image with FROM scratch, each image can completely import another one. This creates the layers that are downloaded one by one from the registry. If a layer remains unchanged, no download is required, saving time and bandwidth.

The referenced image, gentoo-java, includes the GNU C library (glibc) image and (because the Java binaries require it) the zlib library and some GNU compiler collection (GCC) libraries. However, only the necessary shared libraries are copied from zlib and GCC, not the complete images. Finally, the glibc image uses a base image in its FROM line, which contains a minimal filesystem with the /etc, /dev, and /tmp directories. Thanks to this hierarchical structure, the build system, described later, can update individual layers of the image separately.

The source files for the images are available as tar.gz archives, which are created from cleaned up file lists of packages. In the container, for example, neither man pages nor sample configurations are needed. Building up with one image per package might sound complex, but it only requires more work in the first step. The application images at the end of the chain can be exported as a single file and integrated into other registries if required.

Practical Implementation

The first step in creating a container image from a package is to collect the files from the operating environment. To help me keep track, I first defined a folder structure. Each container has a folder with a name that follows the <distribution>-<package name> pattern, resulting in folders such as gentoo-glibc or gentoo-gcc. Each of these folders contains the respective Dockerfile and the tar.gz archive created for the package.

GNU Make is used as the build tool because it makes it relatively easy to map dependencies to files by timestamps. If a package was updated since the last creation date of the tar.gz archive, the timestamp of the files is newer and Make triggers an action.
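The timestamp rule can be sketched in a few lines of shell. This is only an illustration of the comparison Make performs, not a replacement for it; the function name needs_rebuild is hypothetical, and the -nt test is assumed to be available (it is in bash, dash, and BusyBox sh).

```shell
# needs_rebuild TARGET DEP...
# Succeeds (exit 0) if TARGET does not exist or if any DEP is newer
# than TARGET; fails (exit 1) if TARGET is up to date.
needs_rebuild() {
    target=$1
    shift
    # A missing target always has to be built.
    [ -e "$target" ] || return 0
    for dep in "$@"; do
        # -nt compares modification times.
        [ "$dep" -nt "$target" ] && return 0
    done
    return 1
}
```

A call such as needs_rebuild gentoo-libuv/gentoo-libuv.tar.gz /usr/lib64/libuv.so.1 then answers the same question Make asks before running a recipe.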

A list of files is necessary to create the archive. The easiest way for an admin on Gentoo to create this list is to run the q files <package> command. To discard unnecessary files, use grep filters and pass the resulting list to a tar command that reads the list of files to archive from standard input. For most packages that only deliver shared libraries, the Makefile section looks like the one for the libuv package:

gentoo-libuv/gentoo-libuv.tar.gz: /usr/lib64/libuv.so.1
   q files dev-libs/libuv | grep /usr/lib | tar -c -T - -v -z -f $@

Some packages need more files, so the grep filters have to be adapted to include or exclude the right paths. The example also shows the dependency: The archive is only rebuilt if the /usr/lib64/libuv.so.1 file has changed. The manual work for each package now consists of identifying a file that can serve as an indicator for a patch and deciding which files are actually needed in the archive.
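The filter-and-pack pattern can be wrapped in a small helper. The following is a minimal sketch, assuming a tar that accepts -T - to read file names from standard input (GNU tar does); the function name pack_filtered and its arguments are hypothetical.

```shell
# pack_filtered ARCHIVE INCLUDE [EXCLUDE]
# Reads one path per line on stdin, keeps lines that match the extended
# regular expression INCLUDE, drops lines that match EXCLUDE (if given),
# and writes the remaining files to the gzip-compressed tar ARCHIVE.
pack_filtered() {
    archive=$1
    include=$2
    exclude=${3:-}
    if [ -n "$exclude" ]; then
        grep -E -- "$include" | grep -E -v -- "$exclude"
    else
        grep -E -- "$include"
    fi | tar -c -z -f "$archive" -T -
}
```

In a recipe, the file list from the package manager would be piped in, for example q files dev-libs/mpc | pack_filtered gentoo-mpc/gentoo-mpc.tar.gz lib doc.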

My environment has two Makefiles: one to create the tar.gz archives and one that then triggers the Docker build processes. Listing 1 shows the Makefile for the archives.

Listing 1

Makefile for Archives

# Each archive depends on an indicator file that changes when the package is updated.
all: gentoo-glibc/gentoo-glibc.tar.gz gentoo-gcc/gentoo-gcc.tar.gz gentoo-java/gentoo-java.tar.gz gentoo-gmp/gentoo-gmp.tar.gz gentoo-mpc/gentoo-mpc.tar.gz gentoo-mpfr/gentoo-mpfr.tar.gz gentoo-zlib/gentoo-zlib.tar.gz
# glibc, GCC, and Java are packed by helper scripts.
gentoo-glibc/gentoo-glibc.tar.gz: /usr/include/libintl.h
	sh createglibctar.sh
gentoo-gcc/gentoo-gcc.tar.gz: /usr/bin/gcc
	sh creategcctar.sh
gentoo-java/gentoo-java.tar.gz: /usr/lib/jvm/icedtea-bin-8 createjavatar.sh
	sh createjavatar.sh
# Library-only packages: filter the package file list and pack it directly.
gentoo-gmp/gentoo-gmp.tar.gz: /usr/lib64/pkgconfig/gmp.pc
	q files dev-libs/gmp | grep usr/lib | tar czvf $@ -T -
gentoo-mpc/gentoo-mpc.tar.gz: /usr/lib64/libmpc.so
	q files dev-libs/mpc | grep lib | grep -v doc | tar czvf $@ -T -
gentoo-mpfr/gentoo-mpfr.tar.gz: /usr/lib64/libmpfr.so
	q files dev-libs/mpfr | grep lib | grep -v doc | tar czvf $@ -T -
gentoo-zlib/gentoo-zlib.tar.gz: /usr/lib64/pkgconfig/zlib.pc
	q files sys-libs/zlib | grep /lib64 | tar czvf $@ -T -

For GCC and Java, a small shell script handles the task of assembling the archives, because these packages rely on softlinks that would otherwise be missing. The base container is not included in the Makefile, because it is generated statically and not from packages.

After an upgrade, you now just need to call Make to recreate the archives where necessary, and the containers are then built. Immediately after building they are uploaded to the local registry with the latest tag.

The sticking point here was the modification date. Although it is possible to query the modification date of existing containers in the registry or on the local host with an API call, this is difficult to do in a Makefile. I therefore cheated and simply added && touch buildtime to the docker build call and && touch pushtime after docker push. Each file is only touched if the preceding step was successful, and pushtime serves as the target in the Makefile.
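The stamp-file trick is not specific to Docker. As a minimal sketch, the following helper runs any command and touches a marker file only if the command succeeded; the function name stamp_on_success is hypothetical.

```shell
# stamp_on_success STAMP COMMAND [ARG...]
# Runs COMMAND with its arguments and updates the timestamp of the file
# STAMP only if COMMAND exits with status 0. Returns COMMAND's status
# on failure, so callers can chain further steps with &&.
stamp_on_success() {
    stamp=$1
    shift
    "$@" && touch "$stamp"
}
```

With a hypothetical $image variable holding the image name, the two steps would read stamp_on_success buildtime docker build -t "$image" . && stamp_on_success pushtime docker push "$image".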

To map the hierarchy of the containers in the Makefile, the pushtime files of all images that a container needs for its build are included in its dependencies. The Makefile section in Listing 2 illustrates this.

Listing 2

Managing Dependencies

gentoo-java/pushtime: gentoo-java/gentoo-java.tar.gz gentoo-glibc/pushtime gentoo-zlib/pushtime gentoo-gcc/pushtime
	cd gentoo-java; docker build -t dockerrepo.matrix.dev/gentoo-java:latest-amd64 . && touch buildtime && docker push dockerrepo.matrix.dev/gentoo-java:latest-amd64 && touch pushtime

The Java image is based on the glibc image, but also copies files from zlib and GCC, which means you have to build and upload these images before the Java image can be created. Listing 3 (abridged) shows the call to Make and its screen output after patches for glibc were released, triggering a rebuild of all containers.

Listing 3

Make After glibc Update (Abridged)

# make -f Makefile.docker
cd gentoo-glibc; docker build -t dockerrepo.matrix.dev/gentoo-glibc:latest-amd64 . && touch buildtime && docker push dockerrepo.matrix.dev/gentoo-glibc:latest-amd64 && touch pushtime
Sending build context to Docker daemon  21.12MB
Step 1/2 : FROM dockerrepo.matrix.dev/gentoo-base:latest
 ---> 22fe37b24ebe
Step 2/2 : ADD gentoo-glibc.tar.gz /
 ---> 4e800333acbd
Successfully built 4e800333acbd
Successfully tagged dockerrepo.matrix.dev/gentoo-glibc:latest-amd64
The push refers to repository [dockerrepo.matrix.dev/gentoo-glibc]
22bac475857f: Pushed
636634f1308a: Layer already exists
[...]
Step 2/8 : FROM dockerrepo.matrix.dev/gentoo-glibc:latest-amd64
[...]
Step 8/8 : ADD gentoo-gcc.tar.gz /
 ---> b89e1b4ab2ba
Successfully built b89e1b4ab2ba
Successfully tagged dockerrepo.matrix.dev/gentoo-gcc:latest-amd64
The push refers to repository [dockerrepo.matrix.dev/gentoo-gcc]
794c152bde4c: Pushed
[...]
22bac475857f: Mounted from gentoo-bind
636634f1308a: Layer already exists
latest-amd64: digest: sha256:667609580127bd14d287204eaa00f4844d9a5fd2847118a6025e386969fc88d5 size: 1996
cd gentoo-java; docker build -t dockerrepo.matrix.dev/gentoo-java:latest-amd64 . && touch buildtime && docker push dockerrepo.matrix.dev/gentoo-java:latest-amd64 && touch pushtime
Sending build context to Docker daemon  66.12MB
Step 1/6 : FROM dockerrepo.matrix.dev/gentoo-glibc:latest-amd64
 ---> 4e800333acbd
Step 2/6 : COPY --from=dockerrepo.matrix.dev/gentoo-zlib:latest-amd64 /lib64/* /lib64/
 ---> aaf3f557c027
Step 3/6 : COPY --from=dockerrepo.matrix.dev/gentoo-gcc:latest-amd64 /usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/lib* /lib64/
 ---> 6f7d7264921c
Step 4/6 : ADD gentoo-java.tar.gz /
 ---> afb2d5612109
Step 5/6 : ENV JAVA_HOME /opt/icedtea-bin-3.16.0
[...]
441dec54d0dd: Pushed
22bac475857f: Mounted from gentoo-glibc
636634f1308a: Layer already exists
latest-amd64: digest: sha256:965aeac1b1cd78cde11aec58d6077f69190954ff59f5064900ae12285e170836 size: 1371

The trickiest task in this approach is that of resolving all the dependencies. Minimizing the container means finding all the necessary shared libraries, and the tool of choice is ldd, which lists the referenced shared libraries of a binary.
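The output of ldd is easy to post-process. The following sketch, assuming the usual glibc ldd output format ("name => /path (address)", a bare loader path, or "name => not found"), extracts the resolved paths and the unresolved names; the function names are hypothetical.

```shell
# resolved_libs
# Reads ldd output on stdin and prints the absolute path of every
# library that was resolved, including the dynamic loader.
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3; next }
         $1 ~ /^\//               { print $1 }'
}

# missing_libs
# Reads ldd output on stdin and prints the names of all libraries
# that ldd reported as "not found".
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

For example, ldd /usr/lib64/libuv.so.1 | resolved_libs lists the files that also have to end up in the image, whereas a non-empty result from missing_libs points to a package that still needs to be added.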

Instead of running the binary directly in the finished container, it makes sense to launch it first in a change root environment built from the same files, which makes it easier to find out which library is missing. A run with strace, which reveals missing configuration files, for example, is also easier to handle this way. Be aware that if several binaries or libraries are involved, the program might launch without complaint but still throw an error later when a certain function is called.
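A test directory for such a change root run can be populated from a plain list of paths. The following is a minimal sketch with a hypothetical function name, stage_files; it copies each file to the same location below a new root and follows softlinks so that the directory contains real files.

```shell
# stage_files ROOT
# Reads absolute paths on stdin (one per line) and copies each file to
# the same path below ROOT. cp -L dereferences softlinks, so a link such
# as libz.so.1 ends up as a regular file with the link's name.
stage_files() {
    root=$1
    while IFS= read -r path; do
        mkdir -p "$root$(dirname "$path")"
        cp -L "$path" "$root$path"
    done
}
```

The list could consist of the binary itself plus the paths reported by ldd; afterward, root can start the binary with chroot in the staged directory (the directory name is up to you).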

Developers also need to keep in mind that shared libraries occasionally change their version numbers and therefore their file names. If the file used to determine whether the archive needs to be rebuilt is /usr/lib64/libdb-5.3.so, and if version 5.4 is installed by the updates, then the indicator file no longer exists and the Make run fails. This possibility must be taken into account when selecting the indicator files.
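One way to notice vanished indicator files early is to check them before Make runs. The following sketch is an assumption about how such a check could look, not part of the build system described here; the function name missing_indicators is hypothetical.

```shell
# missing_indicators FILE...
# Prints a line for every FILE that does not exist and returns 1 if at
# least one is missing; returns 0 if all indicator files are present.
missing_indicators() {
    status=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            printf 'missing indicator: %s\n' "$f"
            status=1
        fi
    done
    return "$status"
}
```

A wrapper script could call missing_indicators /usr/lib64/libuv.so.1 /usr/lib64/libmpc.so || exit 1 and only then start Make, so that a renamed library shows up as a clear message.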

Debugging Containers?

If the container does not work even though all libraries are present, it would normally be possible to find the error by starting a shell in the container; however, the lean containers built with this approach do not contain a shell. Instead, a debug container can be built very easily. In the first step, you need to create a container for the BusyBox package and then the debug container with the following Dockerfile:

FROM dockerrepo/applicationcontainer:latest-amd64
COPY --from=dockerrepo/gentoo-busybox:latest-amd64 /bin/ /bin/

In the busybox container, a softlink named /bin/sh needs to point to /bin/busybox, which gives developers a version of the container with an interactive shell. However, this is a separate debug container, which means it is less likely to end up in production by mistake.
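The softlink can be created before the BusyBox archive is packed. The following sketch assumes a staging directory that already contains bin/busybox; both the staging directory and the function name add_sh_link are hypothetical.

```shell
# add_sh_link ROOT
# Creates ROOT/bin/sh as a relative softlink to busybox. Fails with
# status 1 if ROOT/bin/busybox does not exist.
add_sh_link() {
    root=$1
    [ -e "$root/bin/busybox" ] || return 1
    # A relative link target keeps the link valid inside the container.
    ln -s -f busybox "$root/bin/sh"
}
```

Because the debug Dockerfile copies the whole /bin/ directory, the link travels into the debug container together with the binary.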
