TL;DR

Here's a handy one-liner to pin a running pod to the node it's currently on:

kubectl patch deployment -n $NAMESPACE $DEPLOYMENT -p '{"spec": {"template": {"spec": {"nodeSelector": {"kubernetes.io/hostname": "'$(kubectl get pods -n $NAMESPACE -o jsonpath='{ ..nodeName }')'"}}}}}' || (echo Failed to identify current node of $DEPLOYMENT pod; exit 1)

The long version

I've been supporting the Portainer team with a Helm chart for their new v2, Kubernetes-supporting version. Recently the boss told me:

"Sometimes, when using one of these small/development, multi-node Kubernetes clusters like k3s or microk8s, Kubernetes will schedule the pod to a particular node, but when the pod moves to a different node, the data is lost. Find a way to ensure that the pod always remains on the same node"!

"Nonsense", I replied. "The Kubernetes storage provisioner will be smart enough to ensure that an allocated PV doesn't just move to a different node". And to prove how smart I was, I illustrated by creating a multi-node KinD cluster:

❯ cat kind.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker

❯ kind create cluster --config kind.yaml
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.19.1)
 ✓ Preparing nodes
 ✓ Writing configuration
 ✓ Starting control-plane
 ✓ Installing CNI
 ✓ Installing StorageClass
 ✓ Joining worker nodes
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Thanks for using kind!
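
At this point a quick sanity check should show the control-plane node plus the two workers. The node names are KinD's defaults; ages and exact columns will vary:

❯ kubectl get nodes
NAME                 STATUS   ROLES    AGE   VERSION
kind-control-plane   Ready    master   2m    v1.19.1
kind-worker          Ready    <none>   90s   v1.19.1
kind-worker2         Ready    <none>   90s   v1.19.1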

I created the namespace, added the helm repo, and deployed the chart:


❯ kubectl create namespace portainer
namespace/portainer created
❯ helm repo add portainer https://portainer.github.io/k8s/
❯ helm repo update
❯ helm upgrade --install -n portainer portainer portainer/portainer
Release "portainer" does not exist. Installing it now.
NAME: portainer
LAST DEPLOYED: Wed Dec  9 21:08:09 2020
NAMESPACE: portainer
STATUS: deployed
REVISION: 1
NOTES:
1. Get the application URL by running these commands:
  export NODE_PORT=$(kubectl get --namespace portainer -o jsonpath="{.spec.ports[0].nodePort}" services portainer)
  export NODE_IP=$(kubectl get nodes --namespace portainer -o jsonpath="{.items[0].status.addresses[0].address}")
  echo http://$NODE_IP:$NODE_PORT
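
Before looking at storage, it's worth checking which node the pod actually landed on; -o wide shows that (the pod name suffix, IP and age below are illustrative):

❯ kubectl get pods -n portainer -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP           NODE          NOMINATED NODE   READINESS GATES
portainer-6c4b5ffd7b-x7k2p   1/1     Running   0          60s   10.244.1.2   kind-worker   <none>           <none>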

I examined the PV created by the deployment and saw, as expected, a node affinity constraint pinning it to the node it was provisioned on:

❯ kubectl get pv -o yaml
<snip>
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - kind-worker

"Boom!", I said. "There's no problem, because Kubernetes won't let the pod run on a different node, due to the nodeSelector".

Not so fast!

"Try microk8s", the boss said, "it happens all the time..."

So I did. Grumbling about how much harder it is to set up a multi-node microk8s environment, I used Multipass to create two Ubuntu 20.04 VMs, and then followed the instructions for setting up a microk8s cluster.

Sure enough, when I examined the microk8s PV, there was no node affinity at all. Microk8s, it turns out, uses a simple hostPath-type provisioner!
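
From memory, the relevant part of that PV looked roughly like this (the path and claim name here are illustrative), with no nodeAffinity block anywhere in sight:

❯ kubectl get pv -o yaml
<snip>
  hostPath:
    path: /var/snap/microk8s/common/default-storage/<pvc-name>
  storageClassName: microk8s-hostpath
<snip>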

Where's my data?

So this presents a problem for any application deployed on a multi-node microk8s cluster, as well as on any other cluster using a hostPath-based storage provisioner. We came up with what I think is an elegant solution though...

This command will return the current node of a pod (provided that pod has been scheduled):

kubectl get pods <podname> -o jsonpath='{ ..nodeName }'
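
For example, run against the single Portainer pod it returns just the node name (pod and node names below are illustrative). Note that if you drop the pod name, as the one-liner at the top does, you get the nodeName of every pod in the namespace, space-separated, which is only safe because the Portainer namespace contains just the one pod:

❯ kubectl get pods -n portainer portainer-6c4b5ffd7b-x7k2p -o jsonpath='{ ..nodeName }'
node1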

And this command will patch a deployment, adding a nodeSelector:

kubectl patch deployments <deploymentname> -p '{"spec": {"template": {"spec": {"nodeSelector": {"kubernetes.io/hostname": "<nodename>"}}}}}'
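
After patching, the deployment's pod template should carry the selector, which you can confirm with something along these lines (node name again illustrative):

❯ kubectl get deployment <deploymentname> -o yaml | grep -A1 nodeSelector
      nodeSelector:
        kubernetes.io/hostname: node1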

Combined, we get this neat little command, a variation of which is now featured in the Portainer install docs:

kubectl patch deployment -n $NAMESPACE $DEPLOYMENT -p '{"spec": {"template": {"spec": {"nodeSelector": {"kubernetes.io/hostname": "'$(kubectl get pods -n $NAMESPACE -o jsonpath='{ ..nodeName }')'"}}}}}' || (echo Failed to identify current node of $DEPLOYMENT pod; exit 1)
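
For Portainer's own chart, that boils down to something like this (adjust the namespace and deployment name to suit):

export NAMESPACE=portainer
export DEPLOYMENT=portainer

...and then paste the one-liner above. Bear in mind the patch changes the pod template, so it triggers a rollout; the replacement pod lands back on the same node, thanks to the new selector.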

It should be noted that pinning a pod to a node obviously reduces resiliency in the event of a node failure, so something like this shouldn't be attempted in serious production. If you're using microk8s though, you're probably not running serious production, so go wild!

BTW, this is what I do, all day, every day. I enjoy it, and I'm good at it. If this sort of stuff is what you need, I'd be interested to work with you.
