etcd is a distributed key-value store that provides a reliable way to store data across a cluster of machines. It is open source and available on GitHub. etcd gracefully handles leader elections during network partitions and tolerates machine failure, including failure of the leader.
In a stacked etcd topology, each control plane node creates a local etcd member, and this etcd member communicates only with the kube-apiserver of this node.
ETCD BACKUP -
There are two ways to perform the backup and restoration:
1. Query the kube-apiserver using kubectl and save the resource configuration.
e.g. $ kubectl get all --all-namespaces -o yaml > /tmp/All_resource.yaml
2. Back up etcd and restore it.
To use etcdctl for tasks such as backup and restore, make sure that ETCDCTL_API is set to 3.
This can be achieved by exporting the variable before using the etcdctl client (export ETCDCTL_API=3), or by prefixing each etcdctl command with ETCDCTL_API=3.
Run etcdctl snapshot save -h and note the mandatory global options.
If the etcd database is TLS-enabled, the options below are mandatory:
--cacert verify certificates of TLS-enabled secure servers using this CA bundle
--cert identify secure client using this TLS certificate file
--endpoints=[127.0.0.1:2379] This is the default, as etcd runs on the master node and is exposed on localhost port 2379.
--key identify secure client using this TLS key file
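Before passing these options to etcdctl, it can help to confirm that the files they point at are actually readable. The following is a minimal sketch, assuming the kubeadm default certificate directory /etc/kubernetes/pki/etcd and the file names ca.crt, server.crt and server.key; the function name check_etcd_tls_files is hypothetical.

```shell
# Sketch: confirm the TLS files etcdctl needs are readable.
# The default directory and file names are assumptions (kubeadm defaults).
check_etcd_tls_files() {
  pki_dir="${1:-/etc/kubernetes/pki/etcd}"
  for f in ca.crt server.crt server.key; do
    if [ ! -r "${pki_dir}/${f}" ]; then
      echo "missing or unreadable: ${pki_dir}/${f}" >&2
      return 1
    fi
  done
  echo "TLS files present in ${pki_dir}"
}

# Usage on the master node:
#   check_etcd_tls_files
#   check_etcd_tls_files /path/to/other/pki/dir
```

If the check fails, the certificate locations your cluster really uses can normally be read from the etcd static pod manifest.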
On the Master Node:
## TAKE A BACKUP OF THE ETCD database using the etcdctl command. Save the snapshot in path - /opt/XX.....
## The --cacert and --cert paths below assume the kubeadm default certificate locations.
$ ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key snapshot save /opt/snapshot-pre-boot.db
## Check the etcd backup status.
$ ETCDCTL_API=3 etcdctl snapshot status <snapshot name>
## Stop the API server before restoring.
## RESTORE THE ETCD database backup using the etcdctl command. Restore from the backup path /opt/snapshot*
$ ETCDCTL_API=3 etcdctl --data-dir /var/lib/etcd-from-backup \
snapshot restore /opt/snapshot-pre-boot.db
/var/lib/etcd-from-backup --> This will be the new data directory.
## After the restore is completed, restart the kube-apiserver.
** Now the cluster should be back to its original state.
BACKUP STEPS SAMPLE COMMAND EXECUTION :
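The backup steps can be sketched as a small shell function that saves a snapshot and then prints its status. This is only an illustration: the function name backup_etcd is hypothetical, and the default certificate directory assumes a kubeadm-style layout under /etc/kubernetes/pki/etcd.

```shell
# Sketch: save an etcd snapshot, then show its status.
# Defaults follow the paths used in this post; the PKI directory is an
# assumption (kubeadm default) and may differ on your cluster.
backup_etcd() {
  snapshot="${1:-/opt/snapshot-pre-boot.db}"
  pki_dir="${2:-/etc/kubernetes/pki/etcd}"
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert="${pki_dir}/ca.crt" \
    --cert="${pki_dir}/server.crt" \
    --key="${pki_dir}/server.key" \
    snapshot save "${snapshot}" || return 1
  # Print hash, revision, key count and size of the saved snapshot.
  ETCDCTL_API=3 etcdctl snapshot status "${snapshot}" --write-out=table
}

# Usage on the master node:
#   backup_etcd /opt/snapshot-pre-boot.db
```

Stopping on a failed save keeps the status check from reporting on a stale or partial file.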
RESTORE STEPS SAMPLE COMMANDS EXECUTION :-
Restore the old etcd backup from /opt/snapshot-pre-boot.db. Here, we are restoring to a new directory: /var/lib/etcd-from-backup
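A minimal sketch of this restore step follows. The function name restore_etcd is hypothetical, and the check for an existing target directory is a precaution added for this sketch, so that a previous restore is never written over by accident.

```shell
# Sketch: restore a snapshot into a fresh data directory.
# Defaults are the paths used in this post.
restore_etcd() {
  snapshot="${1:-/opt/snapshot-pre-boot.db}"
  data_dir="${2:-/var/lib/etcd-from-backup}"
  # Precaution (not part of the original steps): refuse to reuse a directory.
  if [ -e "${data_dir}" ]; then
    echo "refusing to restore: ${data_dir} already exists" >&2
    return 1
  fi
  ETCDCTL_API=3 etcdctl --data-dir "${data_dir}" snapshot restore "${snapshot}"
}

# Usage on the master node:
#   restore_etcd /opt/snapshot-pre-boot.db /var/lib/etcd-from-backup
```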
Now, we need to update the /etc/kubernetes/manifests/etcd.yaml:
We have now restored the etcd snapshot to a new path on the controlplane: /var/lib/etcd-from-backup. Therefore, the only change to be made in the YAML file is to change the hostPath for the volume called etcd-data from the old directory (/var/lib/etcd) to the new directory (/var/lib/etcd-from-backup).
BEFORE IMAGE (etcd.yaml)-
AFTER IMAGE (etcd.yaml):
With this change, /var/lib/etcd in the container points to /var/lib/etcd-from-backup on the controlplane.
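The hostPath change can also be scripted instead of edited by hand. The sketch below assumes the manifest has the usual kubeadm shape, where the etcd-data volume contains a line that reads exactly "path: /var/lib/etcd"; the function name rewrite_etcd_data_hostpath is hypothetical. It prints the rewritten manifest to stdout and leaves the --data-dir flag and the mountPath line untouched.

```shell
# Sketch: print the manifest with the etcd-data hostPath pointed at a new dir.
# Only a line of the form "<indent>path: /var/lib/etcd" is rewritten, so
# "mountPath: /var/lib/etcd" and "--data-dir=/var/lib/etcd" stay as they are.
rewrite_etcd_data_hostpath() {
  manifest="${1:-/etc/kubernetes/manifests/etcd.yaml}"
  new_dir="${2:-/var/lib/etcd-from-backup}"
  sed "s|^\([[:space:]]*path: \)/var/lib/etcd\$|\1${new_dir}|" "${manifest}"
}

# Usage on the master node (review the output before replacing the manifest):
#   rewrite_etcd_data_hostpath > /tmp/etcd.yaml
#   diff /etc/kubernetes/manifests/etcd.yaml /tmp/etcd.yaml
#   mv /tmp/etcd.yaml /etc/kubernetes/manifests/etcd.yaml
```

Writing the new file outside the manifests directory first is deliberate: the kubelet may treat any extra file left in that directory as another static pod manifest.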
When this file is updated, the etcd pod is automatically re-created, because it is a static pod placed under the /etc/kubernetes/manifests directory.
1. Because the etcd pod has changed, it restarts automatically, as do kube-controller-manager and kube-scheduler. Wait 1-2 minutes for these pods to restart. You can run watch "docker ps | grep etcd" to see when the etcd pod has restarted.
2. If the etcd pod does not reach Ready 1/1, restart it with kubectl delete pod -n kube-system etcd-controlplane and wait 1 minute.
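Rather than watching by eye, the wait can be sketched as a polling loop around kubectl wait. The function name wait_for_etcd and its attempt and interval parameters are hypothetical. The loop retries because the API server may itself be briefly unreachable while etcd restarts.

```shell
# Sketch: poll until the etcd pod reports Ready, or give up.
# Arguments: pod name, number of attempts, seconds to sleep between attempts.
wait_for_etcd() {
  pod="${1:-etcd-controlplane}"
  attempts="${2:-12}"
  interval="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if kubectl wait --for=condition=Ready "pod/${pod}" \
        -n kube-system --timeout=10s >/dev/null 2>&1; then
      echo "${pod} is Ready"
      return 0
    fi
    i=$((i + 1))
    sleep "${interval}"
  done
  echo "${pod} not Ready after ${attempts} attempts" >&2
  return 1
}

# Usage:
#   wait_for_etcd etcd-controlplane || kubectl delete pod -n kube-system etcd-controlplane
```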
At what address can you reach the ETCD cluster from the controlplane node?
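One way to answer this on a kubeadm-style cluster is to read the --listen-client-urls flag from the etcd static pod manifest. The sketch below assumes the flag appears as a "- --listen-client-urls=..." list item in /etc/kubernetes/manifests/etcd.yaml; the function name etcd_client_urls is hypothetical.

```shell
# Sketch: print the client URLs etcd listens on, as set in the manifest.
etcd_client_urls() {
  manifest="${1:-/etc/kubernetes/manifests/etcd.yaml}"
  sed -n 's|^[[:space:]]*- --listen-client-urls=||p' "${manifest}"
}

# Usage on the controlplane node:
#   etcd_client_urls
```

The output is a comma-separated list; the localhost entry on port 2379 is the address the etcdctl commands in this post use.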
- CHECK ETCD LOGS -
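A possible way to check the logs is sketched below. It tries kubectl first and falls back to the container runtime when the API server is unavailable. The function name etcd_logs is hypothetical, the pod name etcd-controlplane is the one used earlier in this post, and the name filter k8s_etcd assumes the container naming used by Docker-based kubelets.

```shell
# Sketch: show recent etcd logs via kubectl, falling back to docker.
etcd_logs() {
  pod="${1:-etcd-controlplane}"
  lines="${2:-50}"
  # Preferred path: ask the API server for the pod logs.
  kubectl logs -n kube-system "${pod}" --tail="${lines}" 2>/dev/null && return 0
  # Fallback: find the etcd container directly (naming is an assumption).
  cid="$(docker ps -q --filter name=k8s_etcd | head -n 1)"
  if [ -z "${cid}" ]; then
    echo "no etcd container found" >&2
    return 1
  fi
  docker logs --tail "${lines}" "${cid}"
}

# Usage on the controlplane node:
#   etcd_logs
#   etcd_logs etcd-controlplane 200
```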