Initial node becomes a single point of failure.
As of this writing, two scheduler pods and two controller pods are spun up on the first node, and if that node fails the entire cluster becomes unusable. I found that out the hard way.
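For the record, you can confirm where everything landed with something like this (assuming the node name is its IP, as in the rest of this proposal):
kubectl -n kube-system get pods -o wide --field-selector "spec.nodeName=${MASTER_IP}"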
I propose removing the master node after everything is booted. Something like:
kubectl cordon "${MASTER_IP}" # prevent new pods from getting scheduled on the old master; cordon rather than drain, so we can move the schedulers and controllers one at a time below
PODS_ON_MASTER=$(kubectl -n kube-system get pods -o json | jq -r --arg ip "${MASTER_IP}" '.items[] | select(.status.hostIP == $ip) | .metadata.name') # yeahh check out that jq query - single quotes don't expand shell variables, so MASTER_IP goes in via --arg
# First delete just one scheduler; the deployment will recreate it and the surviving scheduler will place it on another node
SCHEDULER=$(echo "$PODS_ON_MASTER" | grep scheduler | head -n 1)
kubectl -n kube-system delete pod "$SCHEDULER"
# Wait for the kube-scheduler deployment to come back fully
kubectl -n kube-system rollout status -w deployment/kube-scheduler
# Same thing with controllers
CONTROLLER=$(echo "$PODS_ON_MASTER" | grep controller | head -n 1)
kubectl -n kube-system delete pod "$CONTROLLER"
# Wait for kube-controller-manager deployment to come back fully
kubectl -n kube-system rollout status -w deployment/kube-controller-manager
# Delete all the other pods on the node - is this needed? Every time I try to delete a node it's a mess.
echo "$PODS_ON_MASTER" | xargs kubectl -n kube-system delete pod --ignore-not-found # xargs splits the list into separate arguments; --ignore-not-found skips the scheduler and controller we already deleted
kubectl delete node "${MASTER_IP}"
# Delete the DNS entries
# Remove the VM in Vultr (may have to wait if the node hasn't been running for 5 minutes)
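A rough sketch of those last two manual steps against the Vultr v2 API. VULTR_API_KEY, DOMAIN, RECORD_ID, and INSTANCE_ID are placeholders you'd have to look up first (via the corresponding GET endpoints), so treat this as an assumption about the API rather than tested code:
# delete the node's DNS record, then destroy the instance
curl -s -X DELETE -H "Authorization: Bearer ${VULTR_API_KEY}" "https://api.vultr.com/v2/domains/${DOMAIN}/records/${RECORD_ID}"
curl -s -X DELETE -H "Authorization: Bearer ${VULTR_API_KEY}" "https://api.vultr.com/v2/instances/${INSTANCE_ID}"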
Admittedly, it would be perfectly reasonable to just run kubectl uncordon against the node after moving one scheduler and one controller, and not delete the node at all. Consider doing that.
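If we go that route, everything from the bulk pod delete down collapses to a single command:
kubectl uncordon "${MASTER_IP}"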