etcd Component and Common etcdctl Commands

1. Kubernetes component: etcd

etcd, developed by CoreOS, is currently the default key-value data store used by Kubernetes to hold all cluster data. etcd supports distributed clustering, and when it is used in production a regular backup mechanism should be provided for the etcd data.

etcd has the following properties:

    Fully replicated: every node in the cluster has access to the complete data store

    Highly available: etcd can be used to avoid single points of hardware failure or network problems

    Consistent: every read returns the latest write across multiple hosts

    Simple: includes a well-defined, user-facing API (gRPC)

    Secure: implements automatic TLS with optional client-certificate authentication

    Fast: benchmarked at 10,000 writes per second

    Reliable: uses the Raft algorithm to achieve properly distributed storage

How etcd works

View the etcd service file

root@k8s-master:~# cat /etc/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd                    # data directory
ExecStart=/usr/local/bin/etcd \                   # path to the etcd binary
  --name=etcd-172.31.7.2 \                        # name of the current node
  --cert-file=/etc/kubernetes/ssl/etcd.pem \
  --key-file=/etc/kubernetes/ssl/etcd-key.pem \
  --peer-cert-file=/etc/kubernetes/ssl/etcd.pem \
  --peer-key-file=/etc/kubernetes/ssl/etcd-key.pem \
  --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --initial-advertise-peer-urls=https://172.31.7.2:2380 \   # advertised cluster (peer) URL
  --listen-peer-urls=https://172.31.7.2:2380 \               # port used for communication between cluster members
  --listen-client-urls=https://172.31.7.2:2379,http://127.0.0.1:2379 \   # client access addresses
  --advertise-client-urls=https://172.31.7.2:2379 \          # advertised client URL
  --initial-cluster-token=etcd-cluster-0 \                   # token used when creating the cluster; must match on all nodes of the same cluster
  --initial-cluster=etcd-172.31.7.2=https://172.31.7.2:2380 \   # all nodes in the cluster
  --initial-cluster-state=new \                              # "new" when creating a cluster, "existing" when joining an existing one
  --data-dir=/var/lib/etcd \                                 # data directory path
  --wal-dir= \
  --snapshot-count=50000 \
  --auto-compaction-retention=1 \
  --auto-compaction-mode=periodic \
  --max-request-bytes=10485760 \
  --quota-backend-bytes=8589934592
Restart=always
RestartSec=15
LimitNOFILE=65536
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target
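After changing the unit file, the usual next step is to reload systemd and confirm that the member answers health checks. The following is a minimal sketch that reuses the node IP and certificate paths from the unit file above; the commands are standard systemctl/etcdctl usage rather than something taken from the original article.

# Reload systemd, restart etcd, then query its health endpoint
systemctl daemon-reload
systemctl restart etcd
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=https://172.31.7.2:2379 \
  --cacert=/etc/kubernetes/ssl/ca.pem \
  --cert=/etc/kubernetes/ssl/etcd.pem \
  --key=/etc/kubernetes/ssl/etcd-key.pem \
  endpoint health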

2. Common etcdctl commands: viewing member information

etcd has several API versions. The v1 API has been deprecated. etcd v2 and v3 are essentially two independent applications that share the same Raft protocol code: their interfaces differ, their storage differs, and their data is isolated from each other. In other words, after upgrading from etcd v2 to etcd v3, data created through the v2 API can still only be accessed through the v2 API, and data created through the v3 API can only be accessed through the v3 API.

WARNING:

Environment variable ETCDCTL_API is not set; defaults to etcdctl v2.   # the v2 API is used by default
Set environment variable ETCDCTL_API=3 to use v3 API or ETCDCTL_API=2 to use v2 API.   # set the API version
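To avoid typing the prefix on every command, the API version can simply be exported for the current shell session; a minimal sketch (standard shell usage, not taken from the article):

# Default to the v3 API for this shell
export ETCDCTL_API=3
etcdctl version        # should report an API version of 3.x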

Use --help to view the built-in documentation

root@k8s-master:~# etcdctl --help
root@k8s-master:~# etcdctl member --help
NAME:
        member - Membership related commands

USAGE:
        etcdctl member <subcommand> [flags]

API VERSION:
        3.5

COMMANDS:
        add     Adds a member into the cluster
        list    Lists all members in the cluster
        promote Promotes a non-voting member in the cluster
        remove  Removes a member from the cluster
        update  Updates a member in the cluster

OPTIONS:
  -h, --help[=false]    help for member

root@k8s-master:~# etcdctl member list
8f8475b6e63056ee, started, etcd-172.31.7.2, https://172.31.7.2:2380, https://172.31.7.2:2379, false

# Check the health of multiple endpoints
~# export NODE_IPS="172.31.7.101 172.31.7.102 172.31.7.103"
~# for ip in ${NODE_IPS}; do
     ETCDCTL_API=3 /usr/local/bin/etcdctl \
       --endpoints=https://${ip}:2379 \
       --cacert=/etc/kubernetes/ssl/ca.pem \
       --cert=/etc/kubernetes/ssl/etcd.pem \
       --key=/etc/kubernetes/ssl/etcd-key.pem \
       endpoint health
   done

etcd cluster member list

Display detailed node status in table format

root@k8s-master:~# ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table member list \
  --endpoints=https://172.31.7.2:2379 \
  --cacert=/etc/kubernetes/ssl/ca.pem \
  --cert=/etc/kubernetes/ssl/etcd.pem \
  --key=/etc/kubernetes/ssl/etcd-key.pem
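For more detail per node (leader flag, DB size, Raft term and index), the `etcdctl endpoint status` subcommand can be combined with the same table output. A sketch assuming the NODE_IPS variable and certificate paths used earlier in this section:

for ip in ${NODE_IPS}; do
  ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table endpoint status \
    --endpoints=https://${ip}:2379 \
    --cacert=/etc/kubernetes/ssl/ca.pem \
    --cert=/etc/kubernetes/ssl/etcd.pem \
    --key=/etc/kubernetes/ssl/etcd-key.pem
done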

View etcd data:

~# ETCDCTL_API=3 etcdctl get / --prefix --keys-only   # list all keys, shown as paths

Pod information

~# ETCDCTL_API=3 etcdctl get / --prefix --keys-only | grep pod

Namespace information:

root@k8s-master:~# etcdctl get / --prefix --keys-only | grep namespaces
/registry/namespaces/default
/registry/namespaces/kube-node-lease
/registry/namespaces/kube-public
/registry/namespaces/kube-system
/registry/namespaces/kubernetes-dashboard

Controller (Deployment) information:

root@k8s-master:~# ETCDCTL_API=3 etcdctl get / --prefix --keys-only | grep deployment

Calico component information:

root@k8s-master:~# ETCDCTL_API=3 etcdctl get / --prefix --keys-only | grep calico
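The keys above hold the API objects themselves, so an individual object can also be read directly. A small sketch using one of the namespace keys listed earlier (note that Kubernetes stores objects in a binary protobuf encoding, so the printed value is only partially human-readable):

# Read a single object stored under /registry
ETCDCTL_API=3 etcdctl get /registry/namespaces/default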

3. Adding, querying, updating and deleting data in etcd:

Add data

root@k8s-master:~# etcdctl put /name "etcd_test"
OK

Query data

root@k8s-master:~# etcdctl get /name
/name
etcd_test

Update data

# Overwriting an existing key updates its value
root@k8s-master:~# etcdctl put /name "etcd_test_01"
OK
root@k8s-master:~# etcdctl get /name
/name
etcd_test_01

Delete data

root@k8s-master:~# etcdctl del /name
1
root@k8s-master:~# etcdctl get /name
root@k8s-master:~#

root@k8s-master:~# kubectl get pods
NAME        READY   STATUS    RESTARTS      AGE
net-test1   1/1     Running   3 (35h ago)   7d23h
net-test2   1/1     Running   3 (35h ago)   7d23h
root@k8s-master:~# etcdctl del /registry/pods/default/net-test1
1
root@k8s-master:~# kubectl get pods
NAME        READY   STATUS    RESTARTS      AGE
net-test2   1/1     Running   3 (35h ago)   7d23h
root@k8s-master:~#

4. The etcd data watch mechanism:

The watch mechanism continuously monitors data and proactively notifies the client whenever a change occurs. etcd v3 supports watching a single fixed key as well as watching a key range. On etcd node1, watch a key (the key does not have to exist yet; it can be created later): etcdctl watch /data. Then open another terminal, modify the data, and verify that etcd node1 detects the change, as in the sketch below.
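A minimal two-terminal demonstration (the key name /data comes from the text above; the values are arbitrary):

# Terminal 1: start watching the key (it may not exist yet)
etcdctl watch /data

# Terminal 2: change the key; terminal 1 prints a PUT/DELETE event for each change
etcdctl put /data "v1"
etcdctl put /data "v2"
etcdctl del /data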

5. Backing up and restoring data with the etcd v3 API:

WAL is short for write-ahead log: as the name suggests, a log that is written before the actual write operation is performed. The wal directory stores these write-ahead logs; their main purpose is to record the complete history of every data change. In etcd, every data modification must be written to the WAL before it is committed.

Note: the backup described here covers the entire Kubernetes cluster. If you only need to back up the data of a single namespace, refer to the article "Velero结合minio实现kubernetes业务数据备份与恢复" (using Velero with MinIO to back up and restore Kubernetes workload data).

Backing up data (v3 API):

root@k8s-master:~# etcdctl snapshot save --help
NAME:
        snapshot save - Stores an etcd node backend snapshot to a given file

USAGE:
        etcdctl snapshot save <filename> [flags]

root@k8s-master:~# etcdctl snapshot save /data/etcd_backup/etcd_backup_202212102227.db
{"level":"info","ts":1670682626.0701733,"caller":"snapshot/v3_snapshot.go:68","msg":"created temporary db file","path":"/data/etcd_backup/etcd_backup_202212102227.db.part"}
{"level":"info","ts":1670682626.071042,"logger":"client","caller":"v3/maintenance.go:211","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1670682626.0721836,"caller":"snapshot/v3_snapshot.go:76","msg":"fetching snapshot","endpoint":"127.0.0.1:2379"}
{"level":"info","ts":1670682626.0888252,"logger":"client","caller":"v3/maintenance.go:219","msg":"completed snapshot read; closing"}
{"level":"info","ts":1670682626.0915186,"caller":"snapshot/v3_snapshot.go:91","msg":"fetched snapshot","endpoint":"127.0.0.1:2379","size":"3.4 MB","took":"now"}
{"level":"info","ts":1670682626.0915732,"caller":"snapshot/v3_snapshot.go:100","msg":"saved","path":"/data/etcd_backup/etcd_backup_202212102227.db"}
Snapshot saved at /data/etcd_backup/etcd_backup_202212102227.db
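The saved file can be sanity-checked before it is relied on; the `etcdctl snapshot status` subcommand prints its hash, revision, total keys and size. Shown here against the file created above:

ETCDCTL_API=3 etcdctl snapshot status /data/etcd_backup/etcd_backup_202212102227.db --write-out=table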

# Automated backup
~# mkdir -p /data/etcd-backup-dir/
~# cat script.sh
#!/bin/bash
source /etc/profile
DATE=`date +%Y-%m-%d_%H-%M-%S`
ETCDCTL_API=3 /usr/bin/etcdctl snapshot save /data/etcd-backup-dir/etcd-snap-${DATE}.db
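To run this on a schedule, a crontab entry can call the script; the script path and the daily 02:00 schedule below are assumptions for illustration only:

# crontab -e (as root): back up etcd every day at 02:00
0 2 * * * /bin/bash /root/script.sh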

Restoring data (v3 API)

root@k8s-master:~# etcdctl snapshot restore --help
NAME:
        snapshot restore - Restores an etcd member snapshot to an etcd directory

USAGE:
        etcdctl snapshot restore <filename> [options] [flags]

DESCRIPTION:
        Moved to `etcdutl snapshot restore ...`

# Restore the data into a new, non-existent directory
root@k8s-master:~# etcdctl snapshot restore /data/etcd_backup/etcd_backup_202212102227.db --data-dir="/data/etcddir/"

After the data has been restored, remember to update the data directory path in etcd.service accordingly, as in the excerpt below.
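A hypothetical excerpt of the lines that would change in /etc/systemd/system/etcd.service after restoring into /data/etcddir/, followed by the restart; the surrounding options are those shown in section 1 and the elided flags are unchanged:

WorkingDirectory=/data/etcddir
ExecStart=/usr/local/bin/etcd \
  ... \
  --data-dir=/data/etcddir \
  ...

# Then reload systemd and restart etcd
systemctl daemon-reload
systemctl restart etcd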

6. Using the etcd cluster backup and restore functionality built into the kubeasz project

View the pod resources

root@k8s-master:~/yaml/1202# kubectl get pods -A
NAMESPACE              NAME                                         READY   STATUS    RESTARTS   AGE
default                net-test1                                    1/1     Running   0          6h46m
default                net-test2                                    1/1     Running   0          10m
kube-system            calico-kube-controllers-754966f84c-mb7rr     1/1     Running   0          22m
kube-system            calico-node-m2q6c                            1/1     Running   0          22m
kube-system            calico-node-qg57t                            1/1     Running   0          22m
kube-system            calico-node-vf6cp                            1/1     Running   0          22m
kube-system            coredns-745884d567-b8xtx                     1/1     Running   0          15m
kubernetes-dashboard   dashboard-metrics-scraper-77d96d457f-crkjn   1/1     Running   0          15m
kubernetes-dashboard   kubernetes-dashboard-659fb5dc9-b9qjj         1/1     Running   0          15m
root@k8s-master:~/yaml/1202#

Back up the etcd cluster with playbook 94 (94.backup.yml)

# Playbooks shipped with kubeasz
-rw-rw-r-- 1 root root  422 Dec  2 22:19 01.prepare.yml
-rw-rw-r-- 1 root root   58 Jan  5  2022 02.etcd.yml
-rw-rw-r-- 1 root root  209 Jan  5  2022 03.runtime.yml
-rw-rw-r-- 1 root root  482 Jan  5  2022 04.kube-master.yml
-rw-rw-r-- 1 root root  218 Jan  5  2022 05.kube-node.yml
-rw-rw-r-- 1 root root  408 Jan  5  2022 06.network.yml
-rw-rw-r-- 1 root root   77 Jan  5  2022 07.cluster-addon.yml
-rw-rw-r-- 1 root root   34 Jan  5  2022 10.ex-lb.yml
-rw-rw-r-- 1 root root 3893 Jan  5  2022 11.harbor.yml
-rw-rw-r-- 1 root root 1567 Jan  5  2022 21.addetcd.yml
-rw-rw-r-- 1 root root 1520 Jan  5  2022 22.addnode.yml
-rw-rw-r-- 1 root root 1050 Jan  5  2022 23.addmaster.yml
-rw-rw-r-- 1 root root 3344 Jan  5  2022 31.deletcd.yml
-rw-rw-r-- 1 root root 2018 Jan  5  2022 32.delnode.yml
-rw-rw-r-- 1 root root 2071 Jan  5  2022 33.delmaster.yml
-rw-rw-r-- 1 root root 1891 Jan  5  2022 90.setup.yml
-rw-rw-r-- 1 root root 1054 Jan  5  2022 91.start.yml
-rw-rw-r-- 1 root root  934 Jan  5  2022 92.stop.yml
-rw-rw-r-- 1 root root 1042 Jan  5  2022 93.upgrade.yml
-rw-rw-r-- 1 root root 1786 Jan  5  2022 94.backup.yml
-rw-rw-r-- 1 root root  999 Jan  5  2022 95.restore.yml
-rw-rw-r-- 1 root root  337 Jan  5  2022 99.clean.yml

# View the help
root@k8s-master:/etc/kubeasz# ./ezctl --help
Usage: ezctl COMMAND [args]
-------------------------------------------------------------------------------------
Cluster setups:
    list                          to list all of the managed clusters
    checkout    <cluster>         to switch default kubeconfig of the cluster
    new         <cluster>         to start a new k8s deploy with name 'cluster'
    setup       <cluster> <step>  to setup a cluster, also supporting a step-by-step way
    start       <cluster>         to start all of the k8s services stopped by 'ezctl stop'
    stop        <cluster>         to stop all of the k8s services temporarily
    upgrade     <cluster>         to upgrade the k8s cluster
    destroy     <cluster>         to destroy the k8s cluster
    backup      <cluster>         to backup the cluster state (etcd snapshot)
    restore     <cluster>         to restore the cluster state from backups
    start-aio                     to quickly setup an all-in-one cluster with 'default' settings

Cluster ops:
    add-etcd    <cluster> <ip>    to add a etcd-node to the etcd cluster
    add-master  <cluster> <ip>    to add a master node to the k8s cluster
    add-node    <cluster> <ip>    to add a work node to the k8s cluster
    del-etcd    <cluster> <ip>    to delete a etcd-node from the etcd cluster
    del-master  <cluster> <ip>    to delete a master node from the k8s cluster
    del-node    <cluster> <ip>    to delete a work node from the k8s cluster

Extra operation:
    kcfg-adm    <cluster> <args>  to manage client kubeconfig of the k8s cluster

Use "ezctl help <command>" for more information about a given command.

# Start the backup
root@k8s-master:/etc/kubeasz# ./ezctl backup k8s-cluster-01
ansible-playbook -i clusters/k8s-cluster-01/hosts -e @clusters/k8s-cluster-......
......
PLAY RECAP *********************************************************************
localhost : ok=10  changed=6  unreachable=0  failed=0  skipped=0  rescued=0  ignored=0

At this point the kubeasz cluster directory contains a backup directory, and the etcd snapshots are stored there.

-rw------- 1 root root 2764832 Dec 10 23:18 snapshot_202212102318.db
-rw------- 1 root root 2764832 Dec 10 23:18 snapshot.db
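Judging from the listing, the backup keeps a date-stamped copy alongside snapshot.db. On the assumption that the 95.restore.yml playbook restores from backup/snapshot.db (verify this against your kubeasz version), rolling back to a particular point in time means copying the desired snapshot over it first:

# Choose which snapshot the restore playbook will use (path names from the listing above)
cp snapshot_202212102318.db snapshot.db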

Delete a pod to test the restore

root@k8s-master:/etc/kubeasz# kubectl delete pods -n default net-test1
pod "net-test1" deleted
root@k8s-master:/etc/kubeasz# kubectl get pods -A
NAMESPACE              NAME                                         READY   STATUS    RESTARTS   AGE
default                net-test2                                    1/1     Running   0          6h54m
kube-system            calico-kube-controllers-754966f84c-c8d2g     1/1     Running   0          7h7m
kube-system            calico-node-csnl7                            1/1     Running   0          7h7m
kube-system            calico-node-czwwf                            1/1     Running   0          7h7m
kube-system            calico-node-smmk4                            1/1     Running   0          7h7m
kube-system            calico-node-wlpl9                            1/1     Running   0          7h7m
kube-system            coredns-79688b6cb4-kqpgs                     1/1     Running   0          3h57m
kubernetes-dashboard   dashboard-metrics-scraper-799d786dbf-l52xb   1/1     Running   0          166m
kubernetes-dashboard   kubernetes-dashboard-fb8648fd9-p7qt6         1/1     Running   0          166m

Restore the etcd cluster with playbook 95 (95.restore.yml)

# Start the restore
root@k8s-master1-etcd1:/etc/kubeasz# ./ezctl restore k8s-cluster-01

# Verify
root@k8s-master:~/yaml/1202# kubectl get pods -A
NAMESPACE              NAME                                         READY   STATUS    RESTARTS   AGE
default                net-test1                                    1/1     Running   0          18m
default                net-test2                                    1/1     Running   0          18m
kube-system            calico-kube-controllers-754966f84c-mb7rr     1/1     Running   0          30m
kube-system            calico-node-m2q6c                            1/1     Running   0          30m
kube-system            calico-node-qg57t                            1/1     Running   0          30m
kube-system            calico-node-vf6cp                            1/1     Running   0          30m
kube-system            coredns-745884d567-b8xtx                     1/1     Running   0          23m
kubernetes-dashboard   dashboard-metrics-scraper-77d96d457f-crkjn   1/1     Running   0          23m
kubernetes-dashboard   kubernetes-dashboard-659fb5dc9-b9qjj         1/1     Running   0          23m

Summary:

When more than half of the etcd nodes in a cluster are down (for example, two out of three), the whole cluster becomes unavailable and the data must be recovered afterwards. The recovery procedure is as follows (a sketch of the per-node restore step appears after this list):

    Restore the server operating systems

    Redeploy the etcd cluster

    Stop kube-apiserver / controller-manager / scheduler / kubelet / kube-proxy

    Stop the etcd cluster

    Restore the same backup data on every etcd node

    Start the nodes and verify the etcd cluster

    Start kube-apiserver / controller-manager / scheduler / kubelet / kube-proxy

    Verify the Kubernetes master status and pod data
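A minimal sketch of the "restore the same backup data on every etcd node" step, assuming a three-node cluster; the node names, IPs, snapshot path and data directory below are illustrative and are not taken from the original article:

# Run on node etcd-172.31.7.101; repeat on each node with its own --name and peer URL.
# The target --data-dir must not already contain data, so move any old directory aside first.
ETCDCTL_API=3 etcdctl snapshot restore /data/etcd_backup/snapshot.db \
  --name=etcd-172.31.7.101 \
  --initial-advertise-peer-urls=https://172.31.7.101:2380 \
  --initial-cluster-token=etcd-cluster-0 \
  --initial-cluster=etcd-172.31.7.101=https://172.31.7.101:2380,etcd-172.31.7.102=https://172.31.7.102:2380,etcd-172.31.7.103=https://172.31.7.103:2380 \
  --data-dir=/var/lib/etcd
# Afterwards start etcd on every node, then bring the Kubernetes control-plane services back up.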
