I’ve noticed that the way I set up new Docker images has shifted the more I work with the technology. For example, when I first started with Docker, I put almost all my configuration into the Dockerfile. This is easy – and the way Docker suggests it on their site – and the biggest benefit is that each command ends up being its own layer, which can be cached for quick rebuilding if you make a mistake. However, it gets tedious trying to manage tons of bash commands or copy a bunch of files using the RUN and ADD commands. Also, each line counts against the layer limit (though hopefully that will be fixed in a newer version of Docker). The kicker, though, is that complex bash commands are just hard to pull off inside the Dockerfile, and lack the real flexibility that running a script offers. Summary: complex bash in a Dockerfile is complex.
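A minimal sketch of what this everything-in-the-Dockerfile style tends to look like. The Ubuntu base image, the package names, and the config file names are all placeholders chosen for illustration, not taken from a real project:

```dockerfile
# Assumed base image; any distribution image would do.
FROM ubuntu:22.04

# Each instruction below becomes its own cached layer.
# Package names are illustrative only.
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends rsyslog postfix \
    && rm -rf /var/lib/apt/lists/*

# Config files are copied in one at a time (hypothetical file names).
ADD rsyslog.conf /etc/rsyslog.conf
ADD main.cf /etc/postfix/main.cf

# Inline shell logic needs line continuations and careful quoting,
# which is where this approach starts to get awkward.
RUN if [ ! -d /var/spool/app ]; then \
        mkdir -p /var/spool/app && chmod 0750 /var/spool/app; \
    fi
```

If one of the later instructions fails, a rebuild reuses the cached layers above it and only re-runs from the failing line onward.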
So next I moved to having my Dockerfile copy in a large bash script and just run that script. This method has the advantage of easy configuration (Hey – 2 whole lines!) and a lot of flexibility. Unfortunately, there’s really only a single checkpoint layer to use with the Docker cache – any single change to the script and the entire image has to be rebuilt from scratch. This makes development time considerably longer, and quick development is one of the big selling points of Docker. Don’t get me wrong – spinning up a container is much quicker than configuring a whole server, but with this method of building, I end up spending a lot of time staring at the screen as my images build. The big, single script also contained a lot of code that ended up being very similar between different images. I’d copy whole swaths of useful base configs out into another giant script to run in another container.
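Roughly, the single-script version comes down to something like this. The base image and the `setup.sh` name are assumptions made for the sketch:

```dockerfile
# Assumed base image.
FROM ubuntu:22.04

# The "2 whole lines": copy the script in, then run it.
# setup.sh is a hypothetical name for the one large bash script.
ADD setup.sh /tmp/setup.sh
RUN bash /tmp/setup.sh
```

Because the ADD layer is keyed on the contents of `setup.sh`, editing even one line of the script invalidates that layer and the RUN after it, so the whole script executes again.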
So that brought me to the current way of doing things: copying the entire image directory (the one that contains the Dockerfile) to /build on the container, and then having the Dockerfile run a few scripts to build the image (3, in fact). Each of THOSE scripts, in turn, runs other scripts specifically designed to set up one aspect of the image (one for email, if needed, or syslog, etc). This has the advantage of letting me drop in only those scripts I need for that particular image, and makes the scripts themselves a little friendlier to look at. Unfortunately, it doesn’t solve the long build times. The gotcha is that I’m copying everything into /build so that I don’t have to enumerate every file I’m using within the Dockerfile. But that means that changing any one script rewinds Docker back to the ADD where it’s copied in, and every script has to run again from scratch without the cache.
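A sketch of that layout, with made-up names for the three top-level scripts (the post does not name them) and an assumed base image:

```dockerfile
# Assumed base image.
FROM ubuntu:22.04

# Copy the whole image directory, Dockerfile included, into /build.
# Any change to any file in the directory invalidates this layer.
ADD . /build

# Three top-level scripts (hypothetical names); each one calls smaller
# per-feature scripts such as email or syslog setup.
RUN bash /build/prepare.sh
RUN bash /build/services.sh
RUN bash /build/cleanup.sh
```

The three RUN layers are cached individually, but since they all sit after the single `ADD .`, touching any script discards all three at once.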
This is now slowly leading me to have a separate script for every piece except that which distinguishes the container’s main function. This makes for very portable, easy to organize configurations, but requires that each image has dozens of scripts with it. Managing all these scripts is usually accomplished (in other technology worlds) with a central version control repository like git. That way you can clone the scripts you need, and they’re all maintained in one location for easy updating. This method has an inherent drawback too, though. Each Dockerfile and its resulting image are dependent on an entirely separate repo to work, or even to build.
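One way the shared-repo idea could look, purely as a sketch. The repository URL, the `v1.0` tag, and every script name are hypothetical, and so is the choice to clone at build time rather than beforehand:

```dockerfile
# Assumed base image.
FROM ubuntu:22.04

# git is needed inside the image to fetch the shared scripts.
RUN apt-get update \
    && apt-get install -y --no-install-recommends git ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Hypothetical repository and tag holding the reusable scripts.
RUN git clone --depth 1 --branch v1.0 https://example.com/ops/docker-scripts.git /build/scripts

# Shared pieces, one cached layer each (hypothetical script names).
RUN bash /build/scripts/syslog.sh
RUN bash /build/scripts/email.sh

# Only the script for this container's main function lives next to the Dockerfile.
ADD main-function.sh /build/main-function.sh
RUN bash /build/main-function.sh
```

Docker caches the `git clone` step based on the instruction text, so new commits in the repo are not picked up until the tag in the Dockerfile changes or the build is run with `docker build --no-cache`. The build also fails outright if the repo is unreachable, which is the dependency problem described above.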
I thought about making a base image, similar to what the Phusion guys are doing with phusion/baseimage-docker. This would allow me to put all the bits that I copy into each image every time (the email, syslog, etc) into a single image, and then add on only the relevant bits for each container’s main function. This makes managing each image easier, but also makes them less portable, since you have to have both the Dockerfile for the image you want and the base image for it to pull from when it builds.
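With a base image, the per-image Dockerfile could shrink to something like the following. The image name `mycompany/base:1.0` and the script name are invented for the example; the base is assumed to have been built and tagged separately with the shared email and syslog setup already in it:

```dockerfile
# Hypothetical base image that already contains the shared setup.
FROM mycompany/base:1.0

# Only the pieces specific to this container's main function are added here
# (hypothetical script name).
ADD app-setup.sh /build/app-setup.sh
RUN bash /build/app-setup.sh
```

The build only works where the base image can be pulled or is already present locally, which is the portability cost mentioned above.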
It’s not yet clear what the best way to accomplish this will be. I imagine there will be best practices for any given situation.
Fully generic, basic demo image shared with the world? All in the Dockerfile.
Complicated, multi-service images shared with the world? Many little scripts, or a single large one.
Many images with custom configs more easily managed for $WORK? Base Image.
I am looking forward to seeing how other people use Docker, and what evolves as the technology, and the community using it, matures.