Buildah: A new way to build container images

A previous post covered a few different strategies for building container images. The first, building container images in place, is what everyone is familiar with from a traditional Docker build. The second strategy, injecting code into a pre-built image, allows developers to add their code to a pre-built environment without really messing with the setup itself. And finally, Asset Generation Pipelines use containers to compile assets that are then included during a subsequent image build, eventually implemented natively by Docker as Multi-Stage Builds. With the introduction of Project Atomic’s new Buildah tool for creating container images, it has become easier to implement a new build strategy that exists as a hybrid of the other three: using development tools installed elsewhere to build or compile code directly into an image.

Segregating build dependencies from production images

Buildah makes it easy to “expose” a working container to the build system, allowing tools on the build system to modify the container’s filesystem directly. The container can then be committed to a container image suitable for use with Docker, Runc, etc. This keeps the build tools from being installed in the image, resutling in a smaller, leaner image.

Using the ever-helpful GNU Hello as an example, consider the following Dockerfile:

FROM fedora:25
LABEL maintainer Chris Collins <[email protected]> RUN dnf install -y tar gzip gcc make

RUN curl http://ftp.gnu.org/gnu/hello/hello-2.10.tar.gz | tar xvz -C /opt

WORKDIR /opt/hello-2.10

RUN ./configure
RUN make
RUN make install

ENTRYPOINT "/usr/local/bin/hello"

This is a relatively straightforward Dockerfile. Hello needs gcc and make to compile, and the container needs tar and gzip to extract the source tarball containing the code. None of these packages are required for Hello to work once it has been built, though. Nor does Hello need any of the dependency packages installed alongside these four, binutils, cpp, gc, glibc-devel, glibc-header, guile, isl, kernel-headers, libatomic_ops, libgomp, libmpc, libstdc++, or libtool-ltdl, or updates to glibc, glibc-common, glibc-langpack-en, libcrypt-nss or libgcc. These packages add an extra 48M of data to the resulting image that isn’t needed to run GNU Hello. The extracted source files for Hello itself are another 3.7M.

With Buildah, an image can be built without any extra packages or source files making it into the final image.

#!/usr/bin/env bash
set -o errexit

# Create a container
container=$(buildah from fedora:25)

# Mount the container filesystem
mountpoint=$(buildah mount $container)

# A Buildah-native command to set the maintainer label
buildah config -label maintainer="Chris Collins <[email protected]>" $container

# Download & extract the source files to the host machine
curl http://ftp.gnu.org/gnu/hello/hello-2.10.tar.gz | tar xvz -C /tmp
pushd /tmp/hello-2.10

# Compile the code using make, gcc and their
# dependencies installed on the host machine
./configure
make

# Install Hello into the filesystem of the container
make install DESTDIR=${mountpoint}
popd

# Test that Hello works from inside the container filesystem
chroot $mountpoint bash -c "/usr/local/bin/hello -v"

# Set the entrypoint
buildah config -entrypoint "/usr/local/bin/hello" $container

# Save the container to an image
buildah commit -format docker $container hello

# Cleanup
buildah unmount $container
buildah rm $container

After using Buildah to create a container and mount its filesystem, the source files are extracted to the host. Hello is compiled using development packages from the host, and then make install DESTDIR=${mountpoint} installs the resulting compiled software to the container’s filesystem. Hello can be run to validate that it works from within the container by using chroot to change to the root of the container before running.

In addition to basic shell commands, a couple of Buildah commands are used to add container-specific information to the working container: buildah config --label is used to add the “maintainer” label, and buildah config --entrypoint sets the entrypoint.

Finally buildah commit --format docker saves the container to a Docker compatible container image.

This is a simple example, but it gets the general idea across. Of course some software has not only build dependencies, but runtime dependencies, as well. For those use cases, packages can be installed directly into the container’s filesystem with the host’s package manager. For example: dnf install -y --installroot=${mountpoint}.

Drawbacks to this method

Building images this way has some drawbacks, though. By removing the development tools from the image, the compilation of the software is no longer entirely contained in the image itself. The constant refrain of the container evangelists - “Build and Run Anywhere!” - is no longer true 1. When the devel tools are moved to the host, obviously, they must exist on the host. A stock Atomic host has no *-devel packages, so using the method above to build images that require these packages is not practical 2. The container images are no longer reliably reproducible.

A whole new world … er … container

These problems can be solved by using another container to build the image. Rather than installing development tools - or even Buildah - on the host, they can be built into a “builder” image that’s tailored to the type of image being created. For example, a builder image with make, gcc, and any other dependencies can be created to compile GNU Hello. Another image could include php and composer to compile assets for a PHP-based project. A Ruby builder image can be used for Ruby-on-Rails projects. This makes the build environment both portable and reproducible. Any project can contain not only its source code, but also code to create its build environment and production image.

Continuing with the GNU Hello example, a container image with Buildah, make, gcc, gzip, and tar pre-installed can be run, mounting the host’s /var/lib/containers directory and the buildah script from above:

docker run -volume /var/lib/containers:/var/lib/containers:z \
  --volume $(pwd)/buildah-hello.sh:/build/buildah-hello.sh:z \
  --rm \
  --interactive \
  --privileged \
  --tty buildah-hello:latest /build/buildah-hello.sh

But there’s a catch, at least for now. As of August 2017, using Buildah in the container but not on the host creates an image that is difficult to interact with. The image is not available to the Docker daemon by default, because it’s in /var/lib/containers. Additionally, Buildah itself doesn’t yet support pushing to private registries that require authentication, so it’s challenging to get the image out of the container.

Skopeo, Buildah’s sister tool for moving images around, would be ideal for this. Afterall, that’s the Skopeo project’s …ahem… scope. Unfortunately, Buildah has a known issue that prevents Skopeo from pushing Buildah images to other locations, despite the fact that Skopeo can read and inspect the images.

There are some possible workarounds for now, though. First, if Buildah is installed on the host system, it will be able to read from /var/lib/containers (mounted into the container in the example above, allowing the resulting image to persist on the host), and the buildah push command from the host can copy the image to a local Docker daemon’s storage:

buildah push IMAGE:TAG docker-daemon:IMAGE:TAG

Optionally, if Docker is installed on the host system and in the build container, the host’s Docker socket can be mounted into the container, allowing Buildah to push to the host’s Docker daemon storage.

Buildah builds three ways

So, Buildah can be used to interact directly with the container using tools on the host system, but Buildah also supports other ways of building images. Using buildah bud or “build-using-dockerfile”, an image can be created as simply as using docker build. This method does not have the benefit of segregating development tools from the resulting production image; it’s doing the same exact things Docker would do. On the other hand, Buildah does not create and save intermediate images for each step, so builds are slightly to significantly faster using buildah bud over docker build (depending on the number of external blockers, ie: checking yum mirrors, waiting for code to compile, etc).

Buildah also has its own native commands for interacting with a container, such as buildah run, buildah add, and buildah copy, each generally equivalent to their Docker counterparts. In the examples above, buildah config has been used to set container settings such as labels and the entrypoint. These native commands make it easy to build containers without a Dockerfile, using whatever tool works best for the job - bash, make, etc - but without the full complexity of modifying the container filesystem directly as in the examples above.

Buildah FTW

Buildah is a solid alternative to Docker for building container images, and, as shown, makes it easy to create a container image that includes only the code and packages needed for production. The resulting images are smaller, builds are quicker, and there is less surface area for attack should the container be compromised3.

Using Buildah inside a container with development tools installed adds another layer of portability, allowing images to be built on any host with Runc, and optionally Docker, installed. With this model, the “build anywhere” model of the Dockerfile is maintained while still segregating all the build tools from the resulting image.

Overall, Buildah is a great new way to build container images, and makes it easy to build images faster and leaner. With its build-from-dockerfile support, Buildah makes it easy to be a drop-in replacement for the Docker daemon in build pipelines, and makes gradual migration to more sophisticated build practices less painless.

1: It’s not entirely true anyway, but by removing the build itself from inside the image, now it’s REALLY not true.

2: For the Atomic host example, you could take advantage of package layering to install the tools you need.

3: For whatever that buys you. It’s arguable that not including tools like make or gcc, etc, just adds a hurdle for an attacker but doesn’t actively make it any safer per se.

Header Image: By Pelf at en.wikipedia - Originally from en.wikipedia; description page is/was here., Public Domain, https://commons.wikimedia.org/w/index.php?curid=2747463, edited by Chris Collins