I recently upgraded our OpenShift Origin dev cluster from version 3.7 to 3.9, and for the most part things went smoothly. Version 3.9 brings with it a new version of Kubernetes, a whole host of bugfixes, and other goodies. A little while after the upgrade, however, one of the engineers doing testing for us noticed that he was unable to deploy any applications, noting:
Nothing is running. They all have events that say:
12:38:58 PM Warning Failed Scheduling 0/6 nodes are available: 6 MatchNodeSelector.
I’m unsure whether this was a consequence of the upgrade itself or something else, but I was able to confirm the problem with my own test application. Every deployment failed with the same error.
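If you want to see this kind of scheduling failure yourself, the pod's events are the place to look. A minimal sketch (the pod name here is a placeholder, and these commands assume you're logged in to the cluster with `oc`):

```shell
# List recent events in the current project, sorted by time;
# scheduling failures show up as "FailedScheduling" warnings
oc get events --sort-by='.lastTimestamp'

# Or describe a stuck pod directly (my-app-1-deploy is a placeholder name)
oc describe pod my-app-1-deploy
```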
After diving in a bit and doing some research, I found that there was a defaultNodeSelector set in the projectConfig section of the master-config.yml:

```yaml
projectConfig:
  defaultNodeSelector: node-role.kubernetes.io/compute=true
```
defaultNodeSelector does exactly what it says: it sets the default node selector (think: label) for each project. This means any pod without its own nodeSelector will only be scheduled onto an OpenShift node whose labels match the default selector.
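For contrast, a pod that sets its own nodeSelector sidesteps the project default entirely. A minimal sketch (the pod name and image are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod              # hypothetical name
spec:
  nodeSelector:
    region: primary              # only schedulable on nodes carrying this label
  containers:
  - name: app
    image: example/app:latest    # hypothetical image
```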
Unfortunately for us, none of our nodes had a label matching node-role.kubernetes.io/compute=true. During our initial install of OpenShift Origin 3.7, we used the suggested example labels from the Configuring Node Host Labels section of the Advanced Install documentation, e.g. region=primary for our standard nodes and region=infra for infrastructure nodes, with the intention that we’d change the region for nodes in other datacenters when we deployed them, or add extra labels to define special nodes (for compliance, etc.).
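An alternative fix, had we wanted to keep the new default selector instead of changing it, would have been to add the expected label to the nodes themselves. A sketch, using one of our node names (this assumes cluster-admin access):

```shell
# Label a node so it matches the default selector set in master-config.yml
oc label node node-dev-03 node-role.kubernetes.io/compute=true
```

We chose to change the selector rather than relabel every node, but either approach resolves the MatchNodeSelector failures.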
I was able to verify the labels we did have applied to our nodes with the oc get nodes --show-labels command:
```
$ oc get nodes --show-labels
NAME            STATUS    AGE    VERSION             LABELS
master-dev-01   Ready     54d    v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,datacenter=lab,kubernetes.io/hostname=master-dev-01,node-role.kubernetes.io/master=true,openshift-infra=apiserver,region=primary
node-dev-01     Ready     11d    v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,datacenter=lab,kubernetes.io/hostname=node-dev-01,region=infra
node-dev-02     Ready     11d    v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,datacenter=lab,kubernetes.io/hostname=node-dev-02,region=infra
node-dev-03     Ready     11d    v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,datacenter=lab,kubernetes.io/hostname=node-dev-03,region=primary
node-dev-04     Ready     11d    v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,datacenter=lab,kubernetes.io/hostname=node-dev-04,region=primary
node-dev-05     Ready     11d    v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,datacenter=lab,kubernetes.io/hostname=node-dev-05,region=primary
```
Some posts I found regarding defaultNodeSelector suggested setting it to a blank string, but I decided I’d rather go with the region=primary label so we don’t accidentally get pods deployed onto new nodes that we want to spin up in the future. (Disclaimer: I am not 100% sure that’s how the empty string works – I need to do further research.)
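The relevant section of master-config.yml after the change looks like this (a sketch showing only our chosen value; any other keys under projectConfig stay as they were):

```yaml
projectConfig:
  defaultNodeSelector: region=primary
```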
After changing the master-config.yml file to use our chosen value, it was just a matter of restarting the Origin master services:
```
systemctl restart origin-master-controllers origin-master-api
```
With that done, I was able to kick off a new deploy and watch as pods were scheduled onto the nodes carrying the region=primary label.
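One quick way to confirm where pods actually landed is the wide pod listing, which includes a NODE column:

```shell
# Show pods in the current project along with the node each one is running on
oc get pods -o wide
```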
I am a little uneasy following all of this. I never was able to find out what caused the change in the master-config.yml. It’s unlikely that value was already set before the upgrade (though I cannot say for sure), and nothing in the OpenShift-Ansible playbooks references it. One possibility is that the master-config.yml was replaced during the upgrade with a default from the Origin master container image², and then updated by the Ansible playbooks.
One of the drawbacks to the otherwise excellent OpenShift-Ansible advanced install process is that it isn’t conducive to configuration management of the config files, since it generates new ones. I suppose the Ansible playbooks themselves should serve as the configuration management – that’s what would be done with a standard Ansible-managed host – but it feels different somehow. Maybe that’s just me.
Finally, a semi-related point that was brought up by all of this is the need to have some better, more descriptive labels. region=primary works for now, but we’ll be better off in the long run with labels that reflect more about the hosts themselves. Chalk that up to just getting a dev cluster up and running. Now we know what we need for production.
2. We’re running a fully containerized install of Origin on RHEL Atomic hosts. The install process copies files from inside the container to the host filesystem, then mounts those files back into the container so they can be managed like they would be on a traditional host.