In the following post, we are going to talk about how to perform a node replacement (control plane or worker node) on OCP v4.10.25.
Prerequisites
OCP v4.10.x installed using the Assisted Installer or IPI.
Step 1. How to change the unmanaged state of the nodes after the Assisted Installer finishes
These operations are performed as a day-2 task to bring the BareMetalHost object state of each node of the OCP v4.10 cluster from unmanaged to externally provisioned.
To highlight this, we are going to use a compact cluster (3 control nodes + 0 worker nodes).
oc get nodes
NAME STATUS ROLES AGE VERSION
cu-master1 Ready master,worker 138m v1.23.5+012e945
cu-master2 Ready master,worker 152m v1.23.5+012e945
cu-master3 Ready master,worker 159m v1.23.5+012e945
The BMH object status:
oc get bmh -n openshift-machine-api
NAME STATE CONSUMER ONLINE ERROR AGE
cu-master1 unmanaged test-cluster-xkmdh-master-0 true 3h12m
cu-master2 unmanaged test-cluster-xkmdh-master-1 true 3h12m
cu-master3 unmanaged test-cluster-xkmdh-master-2 true 3h12m
As you can observe in the BMH objects above, the STATE of each node is unmanaged. We will have to update these objects so the hosts are managed and the cluster is ready for the node replacement.
We will have to create the BMC secret, update the address and credentialsName fields, and set disableCertificateVerification: true for each individual node.
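The username and password values in the Secret objects below are base64-encoded. A quick way to generate them (assuming, as in this example, BMC credentials of root/calvin) is:
echo -n 'root' | base64     # cm9vdA==
echo -n 'calvin' | base64   # Y2Fsdmlu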
---
apiVersion: v1
data:
  password: Y2Fsdmlu
  username: cm9vdA==
kind: Secret
metadata:
  name: cu-master1-bmc-secret
  namespace: openshift-machine-api
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: cu-master1
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: metadata
  bmc:
    address: idrac-virtualmedia://192.168.34.230/redfish/v1/Systems/System.Embedded.1
    credentialsName: cu-master1-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: b0:7b:25:d4:e8:20
  bootMode: UEFI
  online: true
---
apiVersion: v1
data:
  password: Y2Fsdmlu
  username: cm9vdA==
kind: Secret
metadata:
  name: cu-master2-bmc-secret
  namespace: openshift-machine-api
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: cu-master2
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: metadata
  bmc:
    address: idrac-virtualmedia://192.168.34.231/redfish/v1/Systems/System.Embedded.1
    credentialsName: cu-master2-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: b0:7b:25:dd:ce:be
  bootMode: UEFI
  online: true
---
apiVersion: v1
data:
  password: Y2Fsdmlu
  username: cm9vdA==
kind: Secret
metadata:
  name: cu-master3-bmc-secret
  namespace: openshift-machine-api
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: cu-master3
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: metadata
  bmc:
    address: idrac-virtualmedia://192.168.34.232/redfish/v1/Systems/System.Embedded.1
    credentialsName: cu-master3-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: b0:7b:25:d4:59:80
  bootMode: UEFI
  online: true
NOTE: In the above example, the hardware type is very important:
for Dell servers, the address field uses idrac-virtualmedia://ip_address_of_the_bmh_interface/redfish/v1/Systems/System.Embedded.1
for HPE servers, the address field uses redfish-virtualmedia://ip_address_of_the_bmh_interface/redfish/v1/Systems/1
For more information, see the Red Hat official documentation page.
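Before applying the objects, it can be worth confirming that each BMC actually answers on the Redfish endpoint you are about to configure. A minimal sketch for the first host in this example (skipping certificate verification, just like the BMH object does, and assuming the root/calvin credentials from the secret):
curl -k -u 'root:calvin' https://192.168.34.230/redfish/v1/Systems/System.Embedded.1
A valid JSON response describing the system indicates that the address and credentials are usable.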
Apply the objects for each node:
oc apply -f update-masterX-bmh.yaml
secret/cu-masterX-bmc-secret created
Warning: resource baremetalhosts/cu-masterX is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by oc apply. oc apply should only be used on resources created declaratively by either oc create --save-config or oc apply. The missing annotation will be patched automatically.
baremetalhost.metal3.io/cu-masterX configured
Step 2. Verify that the BMH object has been updated
Verify that the secrets associated with each node have been created:
oc get secrets -n openshift-machine-api
NAME TYPE DATA AGE
...
cu-master1-bmc-secret Opaque 2 104m
cu-master2-bmc-secret Opaque 2 102m
cu-master3-bmc-secret Opaque 2 15s
...
Verify that the BMH object state has been updated:
oc get bmh -n openshift-machine-api
NAME STATE CONSUMER ONLINE ERROR AGE
cu-master1 externally provisioned test-cluster-xkmdh-master-0 true 3h25m
cu-master2 externally provisioned test-cluster-xkmdh-master-1 true 3h25m
cu-master3 externally provisioned test-cluster-xkmdh-master-2 true 3h25m
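If the STATE column has not changed yet, you can keep watching the BMH objects until they settle, for example:
oc get bmh -n openshift-machine-api -w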
Once these steps are completed without errors, you can proceed to use the cluster as usual.
Step 3. ETCD cluster back-up procedure
For more information regarding the ETCD cluster back-up procedure, see the Red Hat documentation.
Please note that the ETCD cluster back-up can be performed when the cluster is healthy and also when one of the nodes has been lost. In the following example, we are going to cover the case where one of the nodes has been lost, but the steps do not change in the other scenario.
Display the nodes in the cluster:
oc get nodes
NAME STATUS ROLES AGE VERSION
kni1-master-0.cloud.lab.eng.bos.redhat.com Ready master,worker 8d v1.21.11+6b3cbdd
kni1-master-1.cloud.lab.eng.bos.redhat.com Ready master,worker 8d v1.21.11+6b3cbdd
kni1-master-2.cloud.lab.eng.bos.redhat.com NotReady master,worker 99m v1.21.11+6b3cbdd
Connect to one of the nodes in the cluster:
oc debug node/kni1-master-0.cloud.lab.eng.bos.redhat.com
Starting pod/kni1-master-0cloudlabengbosredhatcom-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.19.138.11
If you don't see a command prompt, try pressing enter.
sh-4.4#
sh-4.4#
sh-4.4# chroot /host
sh-4.4# /bin/bash
[systemd]
Failed Units: 1
NetworkManager-wait-online.service
[root@kni1-master-0 /]#
At this point we can run the back-up script available on the node:
[root@kni1-master-0 /]# /usr/local/bin/cluster-backup.sh /home/core/assets/backup
found latest kube-apiserver: /etc/kubernetes/static-pod-resources/kube-apiserver-pod-15
found latest kube-controller-manager: /etc/kubernetes/static-pod-resources/kube-controller-manager-pod-9
found latest kube-scheduler: /etc/kubernetes/static-pod-resources/kube-scheduler-pod-8
found latest etcd: /etc/kubernetes/static-pod-resources/etcd-pod-11
2355b836ed2da7166a4deada628681d110131ccf58e6695e30a2e005a075d041
etcdctl version: 3.4.14
API version: 3.4
{"level":"info","ts":1654790640.5072594,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"/home/core/assets/backup/snapshot_2022-06-09_160359.db.part"}
{"level":"info","ts":"2022-06-09T16:04:00.515Z","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1654790640.515958,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"https://10.19.138.13:2379"}
{"level":"info","ts":"2022-06-09T16:04:01.395Z","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1654790641.4325762,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"https://10.19.138.13:2379","size":"138 MB","took":0.925249543}
{"level":"info","ts":1654790641.4326785,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"/home/core/assets/backup/snapshot_2022-06-09_160359.db"}
Snapshot saved at /home/core/assets/backup/snapshot_2022-06-09_160359.db
{"hash":2784831220,"revision":9953578,"totalKey":14626,"totalSize":137506816}
snapshot db and kube resources are successfully saved to /home/core/assets/backup
Validate that the backup files have been created and exist:
[root@kni1-master-0 /]# ls -l /home/core/assets/backup/
total 134364
-rw-------. 1 root root 137506848 Jun 9 16:04 snapshot_2022-06-09_160359.db
-rw-------. 1 root root     76154 Jun 9 16:03 static_kuberesources_2022-06-09_160359.tar.gz
At this stage we have made sure that a current ETCD cluster backup is available on a running node; it can be exported to your laptop or to external storage for data persistence. In the next steps, we are going to proceed with removing the unhealthy ETCD member from the cluster.
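For example, assuming SSH access to the node as the core user (note that the backup files are created root-owned, so you may first need to adjust their ownership or copy them from within the debug shell), the directory could be pulled off the host with something like:
scp -r core@kni1-master-0.cloud.lab.eng.bos.redhat.com:/home/core/assets/backup ./etcd-backup-kni1-master-0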
Step 4. Remove unhealthy ETCD cluster member
For more information regarding the removal of an unhealthy ETCD cluster member, see the Red Hat documentation.
oc get nodes
NAME STATUS ROLES AGE VERSION
kni1-master-0.cloud.lab.eng.bos.redhat.com Ready master,worker 8d v1.21.11+6b3cbdd
kni1-master-1.cloud.lab.eng.bos.redhat.com Ready master,worker 8d v1.21.11+6b3cbdd
kni1-master-2.cloud.lab.eng.bos.redhat.com NotReady master,worker 113m v1.21.11+6b3cbdd
Checking the ETCD pods on the cluster:
oc get pods -n openshift-etcd | grep -v etcd-quorum-guard | grep etcd
etcd-kni1-master-0.cloud.lab.eng.bos.redhat.com 4/4 Running 0 53m
etcd-kni1-master-1.cloud.lab.eng.bos.redhat.com 4/4 Running 0 56m
etcd-kni1-master-2.cloud.lab.eng.bos.redhat.com 4/4 Running 0 56m
Connect to a pod that is not running on the node with STATUS NotReady:
oc project openshift-etcd
Already on project "openshift-etcd" on server "https://api.kni1.cloud.lab.eng.bos.redhat.com:6443".
oc rsh etcd-kni1-master-0.cloud.lab.eng.bos.redhat.com
Defaulted container "etcdctl" out of: etcdctl, etcd, etcd-metrics, etcd-health-monitor, setup (init), etcd-ensure-env-vars (init), etcd-resources-copy (init)
sh-4.4#
Check the ETCD member list:
sh-4.4# etcdctl member list -w table
+------------------+---------+--------------------------------------------+---------------------------+---------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+--------------------------------------------+---------------------------+---------------------------+------------+
| 5b080c81fee5526d | started | kni1-master-0.cloud.lab.eng.bos.redhat.com | https://10.19.138.11:2380 | https://10.19.138.11:2379 | false |
| a863d615322849cb | started | kni1-master-1.cloud.lab.eng.bos.redhat.com | https://10.19.138.12:2380 | https://10.19.138.12:2379 | false |
| c05aa2adbc319032 | started | kni1-master-2.cloud.lab.eng.bos.redhat.com | https://10.19.138.13:2380 | https://10.19.138.13:2379 | false |
+------------------+---------+--------------------------------------------+---------------------------+---------------------------+------------+
sh-4.4#
Take note of the ID and the name of the unhealthy etcd member, because those values are needed later in the procedure.
sh-4.4# etcdctl endpoint health
{ "level" :"warn" ,"ts" :"2022-06-09T16:23:51.165Z" ,"caller" :"clientv3/retry_interceptor.go:62" ,"msg" :"retrying of unary invoker failed" ,"target" :"endpoint://client-1a51aaca-cf9d-4421-855c-fb19cfbfe07d/10.19.138.13:2379" ,"attempt" :0,"error" :"rpc error: code = DeadlineExceeded desc = latest balancer error: all SubConns are in TransientFailure, latest connection error: connection error: desc = \" transport: Error while dialing dial tcp 10.19.138.13:2379: connect: connection refused \" " }
https://10.19.138.11:2379 is healthy: successfully committed proposal: took = 12.718254ms
https://10.19.138.12:2379 is healthy: successfully committed proposal: took = 12.879825ms
https://10.19.138.13:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
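If you prefer to capture the member ID programmatically rather than copying it from the table, a small sketch (assuming the default comma-separated output of etcdctl member list, and that grep and cut are available in the etcdctl container) is:
sh-4.4# etcdctl member list | grep kni1-master-2.cloud.lab.eng.bos.redhat.com | cut -d',' -f1
c05aa2adbc319032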
Remove the unhealthy member:
sh-4.4# etcdctl member remove c05aa2adbc319032
View the member list again and verify that the member was removed:
sh-4.4# etcdctl member list -w table
+------------------+---------+--------------------------------------------+---------------------------+---------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+--------------------------------------------+---------------------------+---------------------------+------------+
| 5b080c81fee5526d | started | kni1-master-0.cloud.lab.eng.bos.redhat.com | https://10.19.138.11:2380 | https://10.19.138.11:2379 | false |
| a863d615322849cb | started | kni1-master-1.cloud.lab.eng.bos.redhat.com | https://10.19.138.12:2380 | https://10.19.138.12:2379 | false |
+------------------+---------+--------------------------------------------+---------------------------+---------------------------+------------+
Once this is validated, exit the pod shell.
Next, remove the old secrets for the unhealthy etcd member that was removed. First, list them:
oc get secrets -n openshift-etcd | grep kni1-master-2.cloud.lab.eng.bos.redhat.com
etcd-peer-kni1-master-2.cloud.lab.eng.bos.redhat.com kubernetes.io/tls 2 81m
etcd-serving-kni1-master-2.cloud.lab.eng.bos.redhat.com kubernetes.io/tls 2 81m
etcd-serving-metrics-kni1-master-2.cloud.lab.eng.bos.redhat.com kubernetes.io/tls 2 81m
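Then delete them (the names below match this example; adjust them to your environment):
oc delete secret -n openshift-etcd etcd-peer-kni1-master-2.cloud.lab.eng.bos.redhat.com
oc delete secret -n openshift-etcd etcd-serving-kni1-master-2.cloud.lab.eng.bos.redhat.com
oc delete secret -n openshift-etcd etcd-serving-metrics-kni1-master-2.cloud.lab.eng.bos.redhat.com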
Force the etcd redeployment:
oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "single-master-recovery-'"$(date --rfc-3339=ns)"'"}}' --type=merge
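You can follow the rollout by re-running the earlier pod listing until the etcd static pods on the remaining control plane nodes are back in a Running state with a fresh age:
oc get pods -n openshift-etcd | grep -v etcd-quorum-guard | grep etcd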
Step 5. Removing the unhealthy control node
Check that the baremetal cluster operator is available:
oc get clusteroperator baremetal
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
baremetal 4.10.25 True False False 7h24m
Check the BareMetalHost object:
oc get bmh -n openshift-machine-api
Remove the old BareMetalHost:
oc delete bmh -n openshift-machine-api kni1-master-2
Remove the old Machine objects:
oc delete machine -n openshift-machine-api kni1-master-2
Check the Machine objects status:
oc get machine -n openshift-machine-api
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get machines.machine.openshift.io)
NOTE: You should wait 5-10 minutes for the cluster to transition to a more stable state before proceeding further.
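A simple way to keep an eye on this is to periodically re-check the nodes, the Machine objects and the cluster operators until the API answers consistently again, for example:
oc get nodes
oc get machine -n openshift-machine-api
oc get clusteroperators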
Create the new BareMetalHost object and the secret to store the BMC credentials:
---
apiVersion: v1
kind: Secret
metadata:
  name: kni1-master-2
  namespace: openshift-machine-api
data:
  username: "cm9vdA=="
  password: "MTAwTWdtdC0="
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: kni1-master-2
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: disabled
  bootMACAddress: ec:f4:bb:ed:6f:e8
  rootDeviceHints:
    deviceName: "/dev/sdb"
  bmc:
    address: idrac-virtualmedia+https://10.19.136.24/redfish/v1/Systems/System.Embedded.1
    credentialsName: kni1-master-2 # must match the Secret name above
    disableCertificateVerification: true
  online: true
Apply the Secret and BareMetalHost objects:
oc create -f new-master-bmh.yaml
secret/kni1-master-2 created
baremetalhost.metal3.io/kni1-master-2 created
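From here, the provisioning of the replacement host can be followed by re-running the listings used earlier until the new kni1-master-2 host and its node are reported again:
oc get bmh -n openshift-machine-api
oc get nodes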