Kubic:Upgrading kubeadm clusters

This page explains how to upgrade a Kubic Kubernetes cluster created with kubeadm from version 1.21.x to version 1.22.x.

This process is based on the upstream documentation but has been modified to take into account specific functionality only available in openSUSE Tumbleweed & Kubic

Before you begin

You need to have a kubeadm Kubernetes cluster running version 1.21.0 or later.
Swap must be disabled.
The cluster should use a static control plane and etcd pods or external etcd.
Make sure you read the release notes carefully.
Make sure to back up any important components, such as app-level state stored in a database. kubeadm upgrade does not touch your workloads, only components internal to Kubernetes, but backups are always a best practice.

Additional Information

All containers are restarted after upgrade, because the container spec hash value is changed.
You only can upgrade from one MINOR version to the next MINOR version, or between PATCH versions of the same MINOR. That is, you cannot skip MINOR versions when you upgrade. For example, you can upgrade from 1.y to 1.y+1, but not from 1.y to 1.y+2.

Upgrading control plane (aka master) nodes

Upgrade the first control plane node

Any Kubic Control Plane nodes running 1.21.x running Kubic snapshot version 20210901 or later will already have kubeadm 1.22.x installed

1. Verify kubeadm is 1.22.1 or later:

kubeadm version

2. Drain the control plane node:

# replace <cp-node-name> with the name of your control plane node
kubectl drain <cp-node-name> --ignore-daemonsets

3. On the control plane node run:

sudo kubeadm upgrade plan

You should see an output similar to this:

[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.21.4
[upgrade/versions] kubeadm version: v1.22.1
[upgrade/versions] Latest stable version: v1.22.1
[upgrade/versions] Latest version in the v1.21 series: v1.21.4

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT             AVAILABLE
Kubelet     1 x v1.21.4   v1.22.1

Upgrade to the latest version in the v1.18 series:

COMPONENT            CURRENT   AVAILABLE
API Server           v1.21.4   v1.22.1
Controller Manager   v1.21.4   v1.22.1
Scheduler            v1.21.4   v1.22.1
Kube Proxy           v1.21.4   v1.22.1
CoreDNS              1.7.0     1.8.4
Etcd                 3.4.13-0     3.5.0

You can now apply the upgrade by executing the following command:

    kubeadm upgrade apply v1.22.1

_____________________________________________________________________

This command checks that your cluster can be upgraded, and fetches the versions you can upgrade to.

4. Choose a version to upgrade to, and run the appropriate command. For example:

# replace x with the patch version you picked for this upgrade
sudo kubeadm upgrade apply v1.22.x

You should see output similar to this:

[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.22.1"
[upgrade/versions] Cluster version: v1.21.4
[upgrade/versions] kubeadm version: v1.22.1
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler etcd]
[upgrade/prepull] Prepulling image for component etcd.
[upgrade/prepull] Prepulling image for component kube-apiserver.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[upgrade/prepull] Prepulling image for component kube-scheduler.
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[upgrade/prepull] Prepulled image for component etcd.
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.22.1"...
Static pod: kube-apiserver-myhost hash: 2cc222e1a577b40a8c2832320db54b46
Static pod: kube-controller-manager-myhost hash: f7ce4bc35cb6e646161578ac69910f18
Static pod: kube-scheduler-myhost hash: e3025acd90e7465e66fa19c71b916366
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests308527012"
W0308 18:48:14.535122    3082 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-03-08-18-48-14/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-myhost hash: 2cc222e1a577b40a8c2832320db54b46
Static pod: kube-apiserver-myhost hash: 609429acb0d71dce6725836dd97d8bf4
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-03-08-18-48-14/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-controller-manager-myhost hash: f7ce4bc35cb6e646161578ac69910f18
Static pod: kube-controller-manager-myhost hash: c7a1232ba2c5dc15641c392662fe5156
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-03-08-18-48-14/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-myhost hash: e3025acd90e7465e66fa19c71b916366
Static pod: kube-scheduler-myhost hash: b1b721486ae0ac504c160dcdc457ab0d
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.22" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.22.1". Enjoy!

[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.

5. Manually upgrade your CNI provider plugin. (This step is not required on additional control plane nodes)

6. Uncordon the control plane node:

# replace <cp-node-name> with the name of your control plane node
kubectl uncordon <cp-node-name>

Upgrade additional control plane nodes

If you have additional master nodes, follow the same steps as above but there is no need to use kubeadm upgrade plan and run kubeadm upgrade node instead of kubeadm upgrade apply

Upgrade kubelet

Now on each control plane node, configure kubelet to use the new version and then restart kubelet

sed -i 's/KUBELET_VER=1.21/KUBELET_VER=1.22/' /etc/sysconfig/kubelet
systemctl restart kubelet

Upgrading worker nodes

The upgrade procedure on worker nodes should be executed one node at a time or few nodes at a time, without compromising the minimum required capacity for running your workloads.

kubeadm

Any Kubic nodes running Kubic snapshot version 20210901 or later will already have kubeadm 1.22.x installed

kubeadm version

Drain the Node

# replace <node-name> with the name of your worker node
kubectl drain <node-name> --ignore-daemonsets

Upgrade the kubelet config

kubeadm upgrade node

Run new kubelet

sed -i 's/KUBELET_VER=1.21/KUBELET_VER=1.22/' /etc/sysconfig/kubelet
systemctl restart kubelet

Uncorden the node

# replace <node-name> with the name of your node
kubectl uncordon <node-name>

Verify the status of the cluster

After the kubelet is upgraded on all nodes verify that all nodes are available again by running the following command from anywhere kubectl can access the cluster:

kubectl get nodes

The STATUS column should show Ready for all your nodes, and the version number should be updated.

Recovering from a failure state

If kubeadm upgrade fails and does not roll back, for example because of an unexpected shutdown during execution, you can run kubeadm upgrade again. This command is idempotent and eventually makes sure that the actual state is the desired state you declare.

To recover from a bad state, you can also run kubeadm upgrade apply --force without changing the version that your cluster is running.

During upgrade kubeadm writes the following backup folders under /etc/kubernetes/tmp

kubeadm-backup-etcd-<date>-
kubeadm-backup-manifests-<date>-

kubeadm-backup-etcd contains a backup of the local etcd member data for this control-plane Node. In case of an etcd upgrade failure and if the automatic rollback does not work, the contents of this folder can be manually restored in /var/lib/etcd. In case external etcd is used this backup folder will be empty.

kubeadm-backup-manifests contains a backup of the static Pod manifest files for this control-plane Node. In case of a upgrade failure and if the automatic rollback does not work, the contents of this folder can be manually restored in /etc/kubernetes/manifests. If for some reason there is no difference between a pre-upgrade and post-upgrade manifest file for a certain component, a backup file for it will not be written.