Statement of GPG Key Transition

Hash: SHA1,SHA512

Fri Dec 9 11:49:22 EST 2016


In order to replace my older DSA-1024 key, I have set up a new OpenPGP key, and will be transitioning away from my old key.

The old key will continue to be valid until 2017-06-01, but future correspondence should come to the new key. I would like the new key to be integrated into the web of trust.

This message is signed by both keys to certify the transition.

The old key was:

pub dsa1024/B5EE841627F7BF37 2008-08-26 Christopher Collins
Primary key fingerprint: 69E6 0653 A1A3 0600 ADB2 B3AD B5EE 8416 27F7 BF37

And the new key is:

pub rsa2048/F5752BA146234FD4 2016-12-09 Christopher L. Collins
Primary key fingerprint: 923E 0218 77DB 3F70 F614 6F62 F575 2BA1 4623 4FD4

To fetch my key from a public key server, you can do:

gpg --keyserver <keyserver> --recv-key F5752BA146234FD4

If you have my old key, you can verify the new key is signed by the old one:

gpg --check-sigs F5752BA146234FD4

To double-check the fingerprint against the one above:

gpg --fingerprint F5752BA146234FD4
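
As a rough cross-check, and assuming both keys are OpenPGP v4 keys (where the long key ID is the last 16 hex digits of the fingerprint), the IDs and fingerprints quoted above can be compared with a few lines of Python. The helper name is made up for illustration; it is no substitute for verifying the signatures with gpg itself.

```python
def key_id_matches(fingerprint: str, long_key_id: str) -> bool:
    """Check that a 16-digit long key ID is the tail of a 40-digit fingerprint."""
    normalized = fingerprint.replace(" ", "").upper()
    return len(normalized) == 40 and normalized.endswith(long_key_id.upper())


# Fingerprints and key IDs copied from the statement above.
NEW_FPR = "923E 0218 77DB 3F70 F614 6F62 F575 2BA1 4623 4FD4"
OLD_FPR = "69E6 0653 A1A3 0600 ADB2 B3AD B5EE 8416 27F7 BF37"

print(key_id_matches(NEW_FPR, "F5752BA146234FD4"))
print(key_id_matches(OLD_FPR, "B5EE841627F7BF37"))
```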

Finally, once you are satisfied this key represents me and the UIDs match what you expect, please sign my key, if you don’t mind:

gpg --sign-key F5752BA146234FD4

Thank you, and sorry for any inconvenience.




[UPDATE] Disk partition scheme for RHEL Atomic Hosts

Back in February of this year, I wrote a short piece about the partition scheme I was considering for our RHEL Atomic Host systems at work.  Six months on, I’ve got some real-world experience with the scheme – and it’s time to re-think some things.


Most of the assumptions remain valid from the last outing in this area – ie: Atomic hosts using a versioned filesystem tree (OSTree), the recommended practice of using direct-lvm storage for containers and images on production systems, and the need for persistent data.  Really, all that has changed is a better understanding of the size of the storage and how it should be allocated, based on our usage in production over the last six months.

The biggest incorrect assumption from back in the golden days of blissful pre-production was the assertion that the root partition would only need 6G of storage by default.  At the time, the Atomic hosts were shipping with a little less than 3G of packages and whatnot in the OSTree, so I reasoned that double that amount would be fine, allowing us to download a new tree and reboot into it.  Since the majority of the filesystem is read-only and most of the activity was going to occur in the container Thin Pool storage or the persistent data partition, that’s all I thought we’d need.

That, it turns out, was a naive assumption.  The OSTree used by the Atomic hosts is larger now, and that in and of itself would be enough to tip the scales, but we’ve also had problems with a lack of log space for host logs, and container files that aren’t necessarily stored in the Thin Pool (their own logs, for a start*).

* Note: The default logging service for containers in the latest versions of Atomic defaults to the host journald now, so individual logs are no longer an issue, but the point stands as they’re logged to the host journals now.

I also assumed that the 4G LVM Thin Pool allocation was enough, since it could be expanded as needed.  At the time, most of our containerized services were small, but we quickly started deploying larger services, and it seemed like the Ops guys were being paged every day to add disks to our thin pools.

The only thing I really got dead-on was the persistent storage.  Our services VERY rarely need 15G, and they fit comfortably, but not overly spaciously, in that space.  In the original scheme, though, I put this into its own Volume Group, which ended up making it less convenient to expand the storage.  Being in its own VG prevents us from adding a single disk and expanding both persistent storage and the root or Thin Pool allocation.  This led to a ridiculous number of relatively small virtual disks attached to each system.

Finally, a department-wide decision to increase the default storage size of all new virtual machines from 25G to 50G removed the need to justify using larger disks, and let me design a scheme that makes use of the default size.

The Partition Scheme

That experience has led to our Partition Scheme v2, which makes better use of LVM and is less concerned with the physical disks:

Physical Disks

50G in total

  • /dev/sda (10G)
  • /dev/sdb (40G)


One single Volume Group – “atomicos” (the default on RHEL Atomic hosts out of the box) – with three logical volumes:

  • 15G atomicos-root ( / )
  • 15G atomicos-srv ( /var/srv, for persistent data)
  • Thin-provisioned atomicos-docker-pool (LVM Thin Pool)

I’m still using the Atomic vSphere image, and as before, the disk size within the image is 10G – where /dev/sda comes from.   It’s easy enough to add a 40G additional disk, and use it to expand the default “atomicos” Volume Group to 50G.
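
The arithmetic behind the scheme can be sketched in Python. This is a back-of-the-envelope estimate only: it assumes the whole of both disks ends up in the volume group, and ignores the boot partition and LVM metadata overhead, so the real figure will be somewhat lower. The function name is made up for illustration.

```python
def remaining_for_thin_pool(disks_gb: dict, volumes_gb: dict) -> int:
    """Return the space (in G) left for the thin pool after fixed-size volumes."""
    total = sum(disks_gb.values())
    allocated = sum(volumes_gb.values())
    if allocated > total:
        raise ValueError("logical volumes do not fit in the volume group")
    return total - allocated


# Sizes taken from the scheme above.
disks = {"/dev/sda": 10, "/dev/sdb": 40}
volumes = {"atomicos-root": 15, "atomicos-srv": 15}

print(remaining_for_thin_pool(disks, volumes))  # 20
```

Roughly 20G is therefore left for atomicos-docker-pool, five times the 4G the first scheme allowed.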

The Method

Ansible supplants Docker-Storage-Setup

I initially used the docker-storage-setup tool to modify the size of the root partition and configure the Thin Pool.  I was focused on using cloud-init for all of the host configuration, and this was the easiest method.  Now, however, I’ve built out an Ansible infrastructure to do the initial configuration of our Atomic hosts, and use cloud-init only to pass in the SSH keys used to run Ansible.  This ended up being much more convenient, as we could re-run the playbooks to update the hosts’ configurations as needed.

The out-of-the-box disk configuration for RHEL Atomic Host takes care of the thin pool setup, so we only need to add /dev/sdb to the VG, and create/expand/format the LVM partitions.  This is easily accomplished with just a few lines of code:


 ## Disk Config
 # Add the second disk to the default "atomicos" volume group.
 # pvs takes a comma-separated list of devices.
 - name: expand vg with extra disks
   lvg: vg=atomicos pvs=/dev/sda1,/dev/sdb

 # Grow the root logical volume, then the filesystem on it
 - name: expand the lvm
   lvol: vg=atomicos lv=root size=15g

 - name: grow fs for root
   filesystem: fstype=xfs dev=/dev/mapper/atomicos-root resizefs=yes

 # Create, format and mount the persistent data volume
 - name: create srv lvm
   lvol: vg=atomicos lv=srv size=15g

 - name: format fs for srv
   filesystem: fstype=xfs dev=/dev/mapper/atomicos-srv resizefs=no

 - name: mount srv
   mount: name=/var/srv src=/dev/mapper/atomicos-srv fstype=xfs state=mounted opts='defaults'

 ## This is a workaround for XFS bug (only grows if mounted)
 - name: grow fs for srv
   filesystem: fstype=xfs dev=/dev/mapper/atomicos-srv resizefs=yes


Back to the Future (and back again)

Plus ça change, plus c’est la même chose
 Jean-Baptiste Alphonse Karr

This is the plan for RHEL Atomic hosts for the near future.  At the moment, our services are being deployed on individual, small-sized hosts and managed by directly talking to the remote Docker daemon’s API.  We’re using an orchestration tool we developed in-house early on in our container journey.

However… Orchestration is King now.  Containers are hard to work with as a human being, once you get to complexity or scale, and a variety of orchestration tools have come into their own in the last few years.  And orchestration naturally lends itself to clustering.  And clustering naturally lends itself to GINORMOUS servers running lots of services.

Everything old is new again, and it is quite possible that in the near future we’ll be dealing with a few hundred clustered servers managed by some more standardized orchestration tool.  In this case, it’s likely that a lot of this partitioning becomes less and less important, and more efficient.  We’d still make use of a small 15-ishG root partition, but the thin pools and persistent storage would be considerably larger.

Or, does that even work that way at scale?  If the container images share layers, then at scale, each container’s images would be a much smaller fraction of the total.  100 containers sharing 90% of the layers could still fit into a small-ish size.  Perhaps at scale, the thin pool would be only a dozen or so gigabytes larger in size.
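
To put rough numbers on that, here is a small Python sketch. The 1000 MB image size is purely hypothetical, and the model ignores each container's writable layer and the thin pool's metadata, so treat it as an illustration of the trend and not a sizing tool.

```python
def pool_usage_mb(containers: int, image_mb: int, shared_percent: int) -> int:
    """Estimate thin pool usage when part of every image is shared layers."""
    shared_mb = image_mb * shared_percent // 100   # stored once for everyone
    unique_mb = image_mb - shared_mb               # stored once per container
    return shared_mb + containers * unique_mb


# Hypothetical numbers: 100 containers, 1000 MB images, 90% shared layers.
without_sharing = 100 * 1000
with_sharing = pool_usage_mb(100, 1000, 90)

print(without_sharing, with_sharing)  # 100000 10900
```

Under those assumptions the pool needs about 11G where 100G would otherwise be required, which is in line with the "dozen or so gigabytes" guess.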

Persistent storage ends up becoming a more important matter, and less and less likely to exist on the host at scale.  This would be the time to explore NFS mounts, or Ceph storage, and remove the persistence from the host entirely.  And realistically, with Gluster or Ceph storage drivers for your container engine, even the Thin Pool may not be necessary.  Are we looking at 25G storage attached to 100GB RAM systems managed by OpenShift/Kubernetes in our near future?  It seems likely.

Like its predecessor, v2 of our partition scheme is likely to change.



When Systemd, Docker and Golang Butt Heads


Image by Marius Kallhardt from near Bremen, Germany
Creative Commons Attribution-Share Alike 2.0 Generic License

We ran into a fun little bug this week at work that took a good while to track down to its source.  Imagine this scenario:

We start to receive reports that Docker has restarted, causing containers running on the hosts to restart, sporadically across our development and testing environments.  After some investigation, we tie this to when puppet has run on these servers.  It’s not immediately apparent why, and successive puppet runs don’t cause the same behavior.

Eventually we realize that it was related to a change we made for our systemd-journald configuration, that was being pushed out during the problematic puppet runs.

Our RHEL7 servers have not been configured to maintain a persistent journal. That is, by default, the journal is written to /run/log/journal, and is refreshed (lost) with every reboot.  We decided to configure the journal to maintain logs for several boots, and did so by setting it up in puppet and pushing out the change, complete with a notify to systemd-journald, to restart the service.  This was pushed to dev, and shortly after, test.
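
For reference, with journald's default Storage=auto setting, the journal is kept across reboots only if the /var/log/journal directory exists. The Python sketch below checks for that directory; it assumes the default setting, does not read journald.conf, and runs against a scratch directory so nothing on the live system is touched. The function name is made up for illustration.

```python
import os
import tempfile


def journal_mode(root: str = "/") -> str:
    """Guess journald's storage mode under Storage=auto for a filesystem root."""
    persistent_dir = os.path.join(root, "var/log/journal")
    return "persistent" if os.path.isdir(persistent_dir) else "volatile"


# Demonstrate against a scratch directory instead of the real root.
with tempfile.TemporaryDirectory() as scratch:
    before = journal_mode(scratch)
    os.makedirs(os.path.join(scratch, "var/log/journal"))
    after = journal_mode(scratch)

print(before, after)  # volatile persistent
```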

However, despite the fact that we knew it was related to the journald change, we could not reliably cause it to happen.  Converting a box to persistent journals and restarting journald wouldn’t immediately cause Docker to fall over – it would take a few minutes before the service died.

Then it got even weirder.  We realized that no changes actually had to happen – we only needed to restart systemd-journald to cause the issues with Docker.  And interestingly, we could get Docker to crash by sending any three Docker commands.  One `docker ps`?  Everything is fine.  Two?  No problem.  Three? KA-BOOM!

After this behavior was finally identified (and it took a while – it’s hard to troubleshoot something when it only fails the *third time* you try it), some Googling led us to a bug report already filed with Docker where Dan Walsh (@rhatdan) explained:

…when you run a unit file systemd hooks up stdout/stderr to the journal, if the journal goes away.<sic> These sockets will get closed, and when you write to them you will get a sigpipe…

…Golang [less than v 1.6] has some wacky code that ignores SIGPIPE 10 times and then dies, no matter what you do in your code.

There’s the three times-ish.  STDOUT and STDERR are written to by Docker when you issue a Docker command, and three commands cause Docker to crash.  And yes, I know my math adds up to nine, not ten.  From what I can tell our automation was also calling to the Docker API during the time we were testing, which was why we were seeing three as the limit.
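
The failure mode in the quote can be reproduced in miniature. The Python sketch below is a toy model and not the Go runtime's actual code: it uses an ordinary pipe as a stand-in for the journal socket, and the limit of 10 is taken from the quote and is not verified against the Go source. Python ignores SIGPIPE by default, so the failed write surfaces as a BrokenPipeError instead of killing the process.

```python
import os

SIGPIPE_LIMIT = 10  # figure from the quote above; an assumption, not checked against Go


def write_fails_without_reader() -> bool:
    """Write to a pipe whose read end is closed; return True on a broken pipe."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader (the "journal") goes away
    try:
        os.write(write_fd, b"log line\n")
    except BrokenPipeError:
        return True
    finally:
        os.close(write_fd)
    return False


def failed_writes_before_exit(limit: int = SIGPIPE_LIMIT, max_attempts: int = 1000) -> int:
    """Count broken-pipe writes until the modelled process would give up."""
    failures = 0
    for _ in range(max_attempts):
        if write_fails_without_reader():
            failures += 1
        if failures >= limit:
            break
    return failures


print(write_fails_without_reader())
print(failed_writes_before_exit())
```

Every write after the reader disappears fails, so how quickly the limit is reached depends only on how often something writes to stdout or stderr. That fits the observation that other automation calling the Docker API brought the visible threshold down to three commands.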

The good news is there appear to be a plethora of patches making their way into the world.  A fix/workaround was added to Systemd by Lennart Poettering back in January, Golang 1.6 will not suffer from the issue, and Red Hat has apparently patched Docker 1.9 and will be pushing that out, hopefully, early in April.


Disk partition scheme for RHEL Atomic Hosts

I’ve been working on what will likely be the production disk partition system for our RHEL Atomic Host systems at work.  There’s a bit of a balancing act to this setup, with three things to take into consideration.


First, since these are Atomic hosts, most of the system is made up of a versioned filesystem tree (OSTree).  The OSTree manages all the packages on the system and so there is not much need to mess with the root partition.  It does not take up much space by default – about 1.6 G with the current and last OSTree.

Second, Atomic hosts are designed to run Docker containers.  Docker recommends using direct-lvm on production systems.  An LVM thin pool is created on block devices directly and used to store the image layers.  Each layer is a snapshot created from their parent images, including container layers – they are snapshots of their parent images as well.  Some free space is needed with which to create this thin pool.

Finally, for many services hosted in containers, there has to be a way to store persistent data.  What is considered persistent data varies by the type of service.  Consider, for example, user-uploaded content for a Drupal website, or custom configuration files telling a proxy server how it works, or database data files.  This persistent data needs to live somewhere.

The Partition Scheme

Given all this, it seems the best partition scheme for our use is the following:


  • /dev/sda1 – / (6G)
  • LVM Thin Pool – /var/lib/docker (4G †)


  • /dev/sdb1 – /var/srv (symlinked to /srv in Atomic, 15G †)

† sizes of these disks could be expanded as needed
‡ /dev/sdb could be replaced with an NFS mount at /var/srv

Our environment is based on the Atomic vSphere image and new Atomic hosts are created from this image.  The disk size within the image is 10G, which is where the size of /dev/sda comes from.  This could be expanded using vmkfstools before the VM is powered on, if needed.  In practice however, 10G covers a lot of the minor services that are deployed, and if more space is needed, the LVM pool can be expanded onto another disk while the system is online, and provide more space for images.

The default size of the root partition in Atomic is 3G.  With two OSTrees installed, almost half of that is used up.  It’s useful to expand this to provide some headroom to store the last tree and some logs and incidental data.


Luckily a helper tool, docker-storage-setup, is included in the docker rpm to not only expand the root partition, but also set up the thin pool and configure Docker to use direct-lvm. Docker-storage-setup is a service that runs prior to the Docker service.  To expand the root size to 6G, add the following to /etc/sysconfig/docker-storage-setup.

# /etc/sysconfig/docker-storage-setup
ROOT_SIZE=6G

This file is read by docker-storage-setup each time it runs.  It can be used to specify the default root size, which block devices or volume groups are to be included in the thin pool, how much space is reserved for data and metadata in the thin pool, etc.

(More information about these options can be found in /usr/bin/docker-storage-setup.)

By only setting ROOT_SIZE, docker-storage-setup is allowed to expand the root partition to 6G, and use the rest of /dev/sda for the thin pool.

Persistent Data

Persistent data is special.  It is arguably the only important data on the entire host.  The host itself is completely throw-away;  a new one can be spun up, configured and put into service in less than 10 minutes.  They are designed for nothing more in life than hosting containers.

Images and containers are similarly unimportant.  New images can be pulled quickly from a registry in minutes or seconds, and they contain immutable data in any case.

Containers could be considered more important, but if their ephemeral nature is preserved – ie.  nothing important goes into a container – all persistent data is mounted in or stored elsewhere – then they, too are truly unimportant.

So the persistent data lives on another physical disk, and is mounted as a volume into the Docker containers.  It could go somewhere in the root partition, but since the root partition is managed by the OSTree, it’s essentially generic and disposable.  By mounting a dedicated disk for persistent data, we can treat it separately from the rest of the system.

We use the second physical disk so we can then move the disk around to any other Atomic host and the service can be immediately available on the new host.  We can rip out a damaged or compromised root partition and attach the persistent data disk to a fresh install within a few minutes.  Effectively, the persistent data is completely divorced from the host.

The second physical disk can also be left out completely, and an NFS share (or other file store) mounted in its place, allowing for load-balancing and automatic scaling.  The NFS share makes it possible to present the data to customers without giving them access to the host directly.

LVM for Change

No battle plan ever survives contact with the enemy.
Helmuth von Moltke the Elder

As always happens, things change.  What works now may not work in a year.  The root filesystem and Docker image thin pools are created with LVM by Atomic, allowing us to expand them easily as necessary.  The second physical disk is given its own volume group and logical volume, to allow it to also be expanded easily if we run out of space for persistent data.  Every part of the Atomic host uses LVM – it’s a key to making the whole system extremely flexible.

A Word of Caution

So far the system is relatively painless to use with a single exception:  measuring the data usage of the thin pool.  It is important to keep track of how much free space is left in the thin pool for both the data and the metadata.  According to Red Hat:

If the LVM thin pool runs out of space it will lead to a failure because the XFS file system underlying the LVM thin pool will be retrying indefinitely in response to any I/O errors.

You should be able to see the amount of space used by the thin pool with the `lvs` command.  However, with the systems I’ve tried (both Atomic and standard RHEL7), the data is left blank:


I have not yet been able to figure out why this is the case. As a workaround, though, `docker info` can be used to gather the information.  Note the “Data Space Used” and “Metadata Space Used” in the image below.

[Screenshot of `docker info` output, 2016-02-16]
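
The same two figures can be pulled out of the `docker info` text with a small script. The Python sketch below only looks for the two field names mentioned above. The sample text and its values are invented for illustration and are not taken from a real host.

```python
def thin_pool_usage(docker_info_text: str) -> dict:
    """Extract the thin pool usage fields from `docker info` output."""
    wanted = ("Data Space Used", "Metadata Space Used")
    usage = {}
    for line in docker_info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in wanted:
            usage[key] = value.strip()
    return usage


# Invented sample output; real values will differ from host to host.
SAMPLE = """\
Storage Driver: devicemapper
 Data Space Used: 1.2 GB
 Metadata Space Used: 0.5 MB
"""

print(thin_pool_usage(SAMPLE))
```

On a real host the text could come from `subprocess.run(["docker", "info"], capture_output=True, text=True).stdout`, which makes it easy to feed the numbers into whatever monitoring is already in place.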




Quick Tip – Docker ENV variables

It took me a little while to notice what was happening here, so I’m writing it down in case someone else needs it.


Consider this example Dockerfile:

FROM centos:centos7
MAINTAINER Chris Collins

ENV VAR1="foo"
ENV VAR2="bar"

It’s common practice to collapse the ENV lines into a single line, to save a layer:

FROM centos:centos7
MAINTAINER Chris Collins

ENV VAR1="foo" \
 VAR2="bar"

And after building an image from either of these Dockerfiles, the variables are available inside the container:

$ docker run -it envtest bash
# echo $VAR1
foo
# echo $VAR2
bar

I’ve also tried to use ENV vars to create other variables, like you can do with bash:

FROM centos:centos7
MAINTAINER Chris Collins

ENV VAR1="foo" \
 VAR2="Var 1 was set to: ${VAR1}"

This doesn’t work, though.  I assume $VAR1 is not set yet when Docker builds the layer, so it cannot be used in $VAR2.

$ docker run -it envtest bash
# echo $VAR1
foo
# echo $VAR2
Var 1 was set to:

Using a single line for each ENV does work, though, as the previous layer has been parsed and added to the environment.

FROM centos:centos7
MAINTAINER Chris Collins
ENV VAR1="foo" 
ENV VAR2="Var 1 was set to: ${VAR1}"

$ docker run -it envtest bash
# echo $VAR1
foo
# echo $VAR2
Var 1 was set to: foo
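
The difference between the collapsed and the separate ENV forms can be modelled in a few lines of Python. This is a simplified sketch of the behaviour and not Docker's implementation: it assumes every `${NAME}` reference within one ENV instruction is resolved against the environment as it stood before that instruction, and it only handles the `${NAME}` form.

```python
import re


def apply_env_instruction(env: dict, assignments: list) -> dict:
    """Model a single ENV instruction made of (name, value) assignments."""
    snapshot = dict(env)  # what ${NAME} references can see in this instruction
    result = dict(env)
    for name, raw_value in assignments:
        result[name] = re.sub(
            r"\$\{(\w+)\}",
            lambda match: snapshot.get(match.group(1), ""),
            raw_value,
        )
    return result


# One collapsed ENV instruction: VAR1 is not yet visible to VAR2.
collapsed = apply_env_instruction(
    {}, [("VAR1", "foo"), ("VAR2", "Var 1 was set to: ${VAR1}")]
)

# Two separate ENV instructions: VAR1 is set before VAR2 is evaluated.
first = apply_env_instruction({}, [("VAR1", "foo")])
separate = apply_env_instruction(first, [("VAR2", "Var 1 was set to: ${VAR1}")])

print(repr(collapsed["VAR2"]))
print(repr(separate["VAR2"]))
```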

So, while it makes sense to try to collapse ENV lines, to save layers**, there are definitely cases where you’d want to separate them.  I am using this in a Ruby-on-Rails image:

ENV RUBYPKGS='ruby2.1 mod_passenger rubygem-passenger ruby-devel mysql-devel libxml2-devel libxslt-devel gcc gcc-c++' \
    PATH="/opt/ruby-2.1/bin:$PATH" \

ENV APPENV='test' \
    APPDIR='/var/www/current' \
    LOGDIR='/var/log/rails' \


A logical separation of sections is helpful here – the first ENV is for system stuff, the second for generic application setup on the host, and the third to set the application environments themselves.

**I have heard rumblings that in future versions of Docker, the ENV stuff will not be a layer – more like metadata, I think.  If that is the case, the need to collapse the lines will be obsoleted.

Apache HTTPS configuration – June 2015

HTTPS is HTTP over TLS.  It allows you to encrypt traffic to and from your web server, providing privacy and security for your clients.  As of this writing, the world is moving ever closer to HTTPS everywhere: thanks to the Snowden documents, there’s been a big push for more privacy and security.  Major companies like Google and Mozilla are securing traffic by default for all their applications.  Cloudflare is offering free HTTPS encryption between clients and their servers.  Let’s Encrypt, a new Certificate Authority offering free, secure certificates, is scheduled to open its doors in September.
If you run a webserver, you should be offering HTTPS, and perhaps even forcing HTTPS-only traffic.  This article is about how to configure Apache for HTTPS authentication, supporting modern cipher suites and TLS protocols.  The goal is an “A” rating by the SSLLabs test (

Note:  These are recommended HTTPS configurations as of June 2015.  If you’re reading this more than six months later, it’s almost certainly out of date.

There are four categories to the SSLLabs test: Certificate, Protocol Support, Key Exchange, Cipher Strength.  We’ll cover best practices for each in order.


Most of the certificate configuration information is relatively well known.  I’ve included it for completeness if you want or need to read it in Appendix 3: General Certificate Information.

One of the more common stumbling blocks for the Certificate section is the certificate chain, so I’ll keep this up here:

Have a complete certificate chain

This can be a tricky part for people new to HTTPS.  Due to the nature of certificates, each Certificate Authority (CA) is verified as trusted by their own CA.  This forms a trust chain from the Root CA certificate, down through each intermediate CA certificate, to your own certificate.  If this chain is broken, your browser cannot verify whether or not your certificate is trusted.  Fortunately, most of the Root and many of the intermediate Certificate Authorities’ certificates are usually included in the CA Bundle for your server by default.  Sometimes, however, you may need to add your CA’s intermediate certificate to the chain to complete it.  You can do this by copying the intermediate certificate to your server (your CA can provide it to you), and using the Apache “SSLCACertificateFile” directive:

SSLCACertificateFile /path/to/the/intermediate/cert
This can be added to your SSL configuration file (/etc/httpd/conf.d/ssl.conf on Red Hat-based systems), or individual Virtual Hosts if they have their own separate SSL configurations.

Protocol Support

Protocol Support is relatively straightforward.  Each TLS Protocol describes how the cryptographic algorithms are used between the client and the server.  As time has gone by, some of these protocols have been found to be insecure, so in order to protect your data in transit, and also receive a good score on the SSLLabs test, you must enable the “good” protocols and disable the insecure ones.
To do this with Apache, use the “SSLProtocol” directive, and add it to your SSL configuration file:
SSLProtocol +TLSv1 +TLSv1.1 +TLSv1.2 -SSLv2 -SSLv3
This enables TLS versions 1.0, 1.1 and 1.2, and disables the known-insecure SSLv2 and SSLv3 protocols.

Note: It’s possible to get a higher score on the SSLLabs test, and remove the slightly less secure TLS 1.0 protocol by changing +TLSv1 to -TLSv1.  However, as of June 2015, about 30% of browsers out there still support only TLS 1.0, namely Android < 4.4, and IE < 11.  This means users with those browsers will be unable to connect to your server if you disable TLS 1.0.  Hopefully the use of those older browsers will be reduced quickly.

Key Exchange

The best way to get a good score for the Key Exchange category and add security to your HTTPS connection is to use a key with a length of at least 4096 bits, not allow anonymous key exchange and not use a weak (Debian OpenSSL flaw) key.

4096 Bit Key

This is easy.  Generate your key with 4096 bits.  If you’re doing it manually, with the OpenSSL command, you’d simply specify 4096 as the key length.
openssl genrsa -out  <name for your key file> 4096

Disable Anonymous Key Exchange

Covered in the Cipher Strength section below.

No Weak Key (Debian OpenSSL flaw)

This is an older bug in Debian’s OpenSSL package. If you’re using a Debian-based system, update to the latest OpenSSL package before generating your key, and you’re good to go.

Cipher Strength

There are dozens of Ciphers supported by the OpenSSL packages.  In order to secure your traffic, you should enable only the most secure ciphers available in your OpenSSL package.  The easiest way to get a list of trusted ciphers is to follow’s recommendations for the Modern Compatibility Cipher Suites (
In order to configure Apache to use the recommended ciphers as of June 2015, modify the “SSLCipherSuite” directive in your SSL configuration as follows:
This supports the majority of modern browsers.  As with the SSLProtocol above, you can take it a step further and remove some of the less secure ciphers from this list to get a better score and better protect your traffic, but a larger portion of browsers will be unable to connect to your server.   If you’ve already disabled TLS 1.0, then that may not be an issue for you.


This information covers the basic configurations for setting up an Apache server with HTTPS support, and making sure it’s acceptably secure.  Using insecure HTTPS settings is effectively just as bad as using no HTTPS – maybe more so if you lull your clients into a false sense of security – so making sure you stay up-to-date with vulnerabilities is extremely important.
As mentioned previously, this is valid as of June 2015.  The older this article gets, the more out of date these recommendations are.  By 2016, you should probably verify the information here to make sure it’s accurate.

Appendix 1: Perfect Forward Secrecy

Another beneficial security feature, and one required for an “A” grade from SSLLabs, is Perfect Forward Secrecy (PFS).  PFS is a property of the key exchange that protects data transmission in the event that one of the keys used is compromised in the future.  (Check the “Further Reading” section for more specific details).
Until relatively recently, the version of OpenSSL shipped with some of the modern distributions of Linux did not support the ciphers required for PFS.  The list of cipher suites in the Cipher Strength section includes ciphers that support PFS, but in order to make sure it’s used, you have to require that the cipher order is honored (ie: use the best first; lesser only if the client cannot interact with the best).  To do that with Apache, set the SSLHonorCipherOrder directive in your SSL configuration file:
SSLHonorCipherOrder on
If your version of OpenSSL does not support the more secure ciphers, this will not break anything – they just will not be used.  However, your server will not support Perfect Forward Secrecy either.

Appendix 2: Server Name Indication

Server Name Indication (SNI) is an extension of the TLS protocol that allows a client to send a request to the server that informs the server of the hostname the browser is attempting to connect to, without the server having to find a TLS key with which to decrypt the traffic first.
Before SNI, there was no way to differentiate what host the client was attempting to connect to before the TLS decryption occurred, so Apache could not tell which host to direct traffic to.  This meant each HTTPS enabled site had to have its own IP address, so traffic was routed via IP instead.
Functionally, this allows Apache to host more than a single HTTPS enabled site per IP address.
If you are using SNI, it’s worth noting that SSLLabs does a check for “Incorrect SNI Alerts”.  These alerts are sent by the server if an SNI-enabled server sends a certificate which contains Subject or Subject Alternative Names for which the server or its virtual hosts are not configured.
For example:  If your certificate included “” and “”, and was used with a Virtual Host with no ServerName or ServerAlias directives setup for “” or “”, this would trigger the “Incorrect SNI Alert”.
The same thing would happen if your host was configured with a ServerName for just one of the two Subject names included in the certificate.

Note: This is not the same thing as the certificate not matching the domain.  That is a separate issue, and discussed in Appendix 3: General Certificate Information.

To fix Incorrect SNI Alerts, the Virtual Host or server responding to the SSL request MUST have the ServerName directive set for the primary Subject name, and ServerAlias directives for ALL of the other Subject Alternative Names in the certificate.
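
That rule boils down to a set comparison, sketched below in Python. The host names are hypothetical placeholders, the function name is made up, and wildcard certificate names are not handled; the point is only to show which names would still need a ServerAlias.

```python
def uncovered_names(cert_names, server_name, server_aliases=()):
    """Return certificate names not matched by ServerName or any ServerAlias."""
    configured = {server_name.lower(), *(alias.lower() for alias in server_aliases)}
    return sorted(name.lower() for name in cert_names if name.lower() not in configured)


# Hypothetical certificate with one Subject name and one Subject Alternative Name.
cert_names = ["example.com", "www.example.com"]

# Only ServerName is set: the alternative name is left uncovered.
print(uncovered_names(cert_names, "example.com"))

# ServerName plus a ServerAlias for the other name: nothing is left uncovered.
print(uncovered_names(cert_names, "example.com", ["www.example.com"]))
```

An empty result means every name in the certificate is accounted for in the Virtual Host.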

Appendix 3:  General Certificate Information

The certificate section is probably the easiest to get set up correctly.  To score well, you need to meet a couple of criteria.  The certificate must:

Match the domain name of the site it’s used on

This simply means you must use a certificate that matches your domain name.  A certificate for “” does NOT match the “” domain, and vice versa.
A certificate CAN have multiple subjects, through the use of Subject Alternative Names, so your cert can include both “” and “”, or more.

Not be expired, revoked, or not yet valid

This is easy.  When you get a cert, it will be valid for a specific period of time.  Chances are it won’t be valid starting in the future, so you’re OK there.  As long as you replace it with a new one before it expires, and don’t use a certificate that’s been revoked, that should cover the rest.

Be signed by a trusted Certificate Authority

A trusted Certificate Authority is one that’s included in trust stores by general community consent.  Your Certificate Authority derives its trust from its own Certificate Authority, and on up the line.  If you are unsure how to find a trusted Certificate Authority, use Let’s Encrypt – their certificates are also signed by IdenTRUST (

Use a secure certificate signature

Your Certificate Authority should sign your certificate with a secure signature  (not MD2 or MD5, etc).   If they do not, find another CA.

Using Docker and AWS to Survive an Outage

Last week at $WORK, we suffered from an outage that slowed down a large part of our network and took down our main website for both internal and external customers.  We were under a distributed denial of service attack focused on the website itself.  The site is load-balanced, and this resulted in slowdowns or outages for all the services behind the load balancers, as well.

While folks were bouncing ideas around on how to bring the site up again while still struggling with the outage, I mentioned that I could pretty quickly migrate the site over to Amazon Web Services and run it in Docker containers there. The higher-ups gave me the go-ahead and a credit card (very important, heh) and told me to get it set up.  The idea was to have it there so we could fail over to the cloud if we were unable to resolve the outage in a reasonable time.

TL;DR – I did, it was easy, and we failed over all external traffic to the cloud. Details below.

Amazon Web Services

Despite having a credit card and a pretty high blanket “OK”, I wanted to make sure we didn’t spend any money unless it was absolutely necessary. To that end, I created three of the “free tier” EC2 instances (1GB RAM, 1 CPU, 10GB Storage) rather than one or more larger instances. After all, these servers were going to be doing one thing and one thing only – running Docker. I took all the defaults, except two. First, I opted to use RHEL7 as the OS. We use Red Hat at work, so I’m familiar with it (and let’s be honest, it works really well), especially where setting up Docker comes in. Second, I set up a security group that allowed only HTTP/HTTPS traffic to the EC2 instances, and SSH access only from $WORK. Security groups are like a logical firewall, I guess – run by Amazon in front of the servers themselves.

The EC2 instances started almost immediately, and I logged in via SSH using the key pair I created for this project. The first thing I did was augment the security group by setting the IPTables firewall on the hosts themselves to match: SSH from $WORK only, drop everything else, even pings.  You know, just in case.
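Host rules along those lines might look like the sketch below. It is an illustration rather than the exact rule set used: the address range is a hypothetical placeholder for $WORK, and the loopback and established-connection rules are additions that most hosts need to keep working. By default it prints the rules instead of applying them.

```shell
# Sketch only: WORK_CIDR is a hypothetical placeholder for the $WORK range.
WORK_CIDR="${WORK_CIDR:-203.0.113.0/24}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the rules, anything else = apply them (needs root)

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

host_firewall() {
  # Keep loopback traffic working.
  run iptables -A INPUT -i lo -j ACCEPT
  # Allow replies to connections the host itself opened (yum, DNS, and so on).
  run iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
  # SSH from the office range only.
  run iptables -A INPUT -p tcp -s "$WORK_CIDR" --dport 22 -j ACCEPT
  # Set the default policy last, so the current SSH session is not cut off.
  run iptables -P INPUT DROP
}

host_firewall
```

The DROP policy on INPUT is what silences pings as well, since no rule accepts ICMP.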

Note: Since I was planning to use Docker to run the website, I didn’t need to add IPTables rules for HTTP/HTTPS. Docker uses the FORWARD chain, since it NATs from the host IP to the containers, and Docker has the ability to add and remove rules from the chain itself as needed.

Next, I ran a quick *yum update* to get the latest patches on the EC2 instance. It wasn’t terribly out of date, so this was quick.

Now to the meat of things. I didn’t really want to muck about with repos or try to find which one was required to install the Docker RPM, so I just copied the RPM for Docker from our local repository. The RPM is packaged upstream by Red Hat, and includes Docker 1.2.1. Even though I wanted to use Docker 1.4.1, the older RPM version is no big deal – I just installed it to get the basic config files – systemd service files, sysconfig, etc. Once the RPM was installed, I downloaded the Docker 1.4.1 binary and used it to replace the 1.2.1 binary from the RPM. Presto! Latest Docker with the handy *docker exec* command! At this point, the server itself was basically done, and I moved on to setting up the Docker image.

Time spent so far: About 5 minutes


Now, I didn’t have an image for our website ready to go or anything – I was going to have to build it from scratch. However, I’ve been lucky enough to be allowed to play around with Docker at $WORK, and had already built some generic images for web stacks for our public DockerDemos project, so I was familiar with what I’d need to build the image for our site. I wrote a Dockerfile and built the image on my local laptop to test it. I went through a few revisions to get it perfect, but it only took about 15 minutes to write it from scratch. Once that was ready, I copied the Dockerfile and supporting files up to the EC2 servers, and built the images there. With the magic that is Docker and Linux containers, everything functioned exactly as it did on my laptop, and in a few seconds all three EC2 instances had the website image ready to go.

The final step was to run the container from the image. On all three of the EC2 instances, I ran:

docker run --name website -p 80:80 -p 443:443 -d website && \
docker logs -f website

The first command immediately started up the web servers inside the containers and started to sync their content, and the second followed the container’s log output (its STDOUT) so I could watch the progress. In a minute or two the sync was done, and the servers were online!

Note: The “sync” I’m talking about is part of how our website works, not something related to Docker itself.

Total time spent: About 25 minutes

So, in one fell swoop – about a half hour – I was able to create three servers running a Docker image to serve our main website, from scratch. It’s a good thing, too. It wasn’t long before we made the call to fail over, and currently all of our external traffic to the site is being served by these three containers.

That seems cool, no?  But check this out:

Sunday night, I needed to add more servers to the rotation. It was late. I was cranky to have been called after hours. I logged into AWS and used the EC2 “Create Image” feature to commit one of the running instances to a custom image (took about a minute). Then, I spun up three more EC2 instances from that image. They started up as quickly as a normal EC2 instance, and contained all the work I’d already done to set up the first servers, including the Docker package, binary, and image. Once they were up, all I had to do was run the *docker run* command again, and they were ready to go. Elapsed time?

2 minutes

It took longer for the 5-minute time-to-live on our DNS entry to expire.
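The same image-and-clone steps can be scripted with the AWS CLI instead of the web console. This is a sketch under assumptions: every ID and name is a hypothetical placeholder, and the instance type is a guess at the free-tier size. By default it prints the commands instead of calling AWS.

```shell
# Sketch only: every ID and name below is a hypothetical placeholder.
SOURCE_INSTANCE="${SOURCE_INSTANCE:-i-0123456789abcdef0}"
IMAGE_NAME="${IMAGE_NAME:-website-docker-host}"
KEY_NAME="${KEY_NAME:-website-key}"
SG_NAME="${SG_NAME:-website-sg}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, anything else = execute

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Commit a running instance to a custom machine image (AMI).
create_image() {
  run aws ec2 create-image --instance-id "$SOURCE_INSTANCE" --name "$IMAGE_NAME"
}

# Launch COUNT new instances from an AMI (t2.micro is an assumed instance type).
launch_from_image() {  # usage: launch_from_image AMI_ID COUNT
  run aws ec2 run-instances --image-id "$1" --count "$2" \
    --instance-type t2.micro --key-name "$KEY_NAME" --security-groups "$SG_NAME"
}

create_image
launch_from_image ami-0123456789abcdef0 3   # placeholder for the AMI ID that create-image returns
```

In a real run, the AMI ID comes from the output of `create-image`. Once the new instances are up, the *docker run* command shown earlier starts the container on each one.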

Docker is Awesome. With AWS, it’s Awesome-er. I’m trying to convince folks that we should leave all of our external traffic to be served by Docker in AWS, and to migrate more sites out there. At the very least, it’s extremely flexible and allows us to respond to issues on a whole different timescale than we could before.

Oh, and an added bonus? All of our external monitoring (in multiple sites across the country) reports that our page load speeds have improved 3x compared to what they were on the servers hosted in-house with regular non-Docker setups. I’m investigating what is giving us that increase this week.

Oh, and a second added bonus? For the last five days, our bill from Amazon for hosting our main website is a whopping *$4.69*. That’s a cup of crappy venti mochachino soy caramel crumble arabian dark coffee (or whatever) at the local coffee chain. And I can do without the calories.


Well, it’s been six months since this little adventure took place. Since then, this solution has worked so well that we left all of our external traffic pointing to these instances at AWS. Arguably, things have gotten even easier the more I work with both Docker and AWS. To that point:

  1. The Docker approach worked so well that we replaced all of the servers hosting our website internally with basic RHEL7 servers running Docker containers.  The servers are considerably more lightweight than they used to be, and as such we can get better performance out of them.
  2. I’ve since added the new(-ish) Docker flag --restart=always to the deploy command.  This saves me the step of even having to start the containers on reboot.
  3. I set up all the hosts to use the Docker API, and TLS authentication, so I can upload new images and start and stop containers on each host without even needing to log in to them.  (This required the opening of port 2376 to $WORK in the security group and host firewall, fyi.)
  4. I wrote a couple of simple bash scripts to re-build the image as needed, and deploy locally for testing.  With the portable nature of Docker images, it’s extremely easy for me to test all changes before I push them out.
  5. Rotating the instances in and out of production at AWS is extremely simple with Amazon’s Elastic IP Addresses.  We are able to rotate a host out of service and instantly replace it with another, allowing us to patch them all with zero downtime.
  6. Amazon’s API is a wonderful thing.  I can manage the entire thing with some python scripts on my laptop, or the convenient Amazon CLI package.
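To illustrate points 2 and 3 together, here is a sketch of a remote deploy over the TLS-protected Docker API with the restart flag added. The host names and certificate directory are hypothetical placeholders, and this is not the actual deploy script. By default it prints the commands instead of running them.

```shell
# Sketch only: HOSTS and CERT_DIR are hypothetical placeholders.
HOSTS="${HOSTS:-web1.example.com web2.example.com web3.example.com}"
CERT_DIR="${CERT_DIR:-$HOME/.docker}"   # holds ca.pem, cert.pem, key.pem
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands, anything else = execute

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Run a docker client command against one remote daemon over TLS (port 2376).
remote_docker() {  # usage: remote_docker HOST DOCKER_ARGS...
  host="$1"; shift
  run docker --tlsverify \
    --tlscacert="$CERT_DIR/ca.pem" --tlscert="$CERT_DIR/cert.pem" --tlskey="$CERT_DIR/key.pem" \
    -H "tcp://$host:2376" "$@"
}

# Start the website container on every host, restarting it automatically after reboots.
deploy() {
  for h in $HOSTS; do
    remote_docker "$h" run --name website --restart=always -p 80:80 -p 443:443 -d website
  done
}

deploy
```

The same helper works for other client commands, for example `remote_docker web1.example.com ps` to list the containers on one host.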

Docker and AWS have proven themselves to me and, through this process, to the higher-ups at $WORK. We’re embracing Docker whole-heartedly in our datacenters here, and we’ve moved a number of services to AWS now, as well. The ease and flexibility of both is a boon to us, and to our clients, and it’s starting to transform the way we do things in IT – the way we do everything in IT.