etcd Status Checks, Compaction, and Defragmentation

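In etcd 3.x releases before 3.4, etcdctl talks to the etcd server over the v2 API by default for backward compatibility. To make etcdctl use the v3 API, the API version must be set to 3 through the ETCDCTL_API environment variable. From etcd 3.4 on, v3 is already the default, so setting the variable is harmless. I wrote these notes after seeing "Error: etcdserver: mvcc: database space exceeded" in the etcd log. The default etcd storage quota is 2GB and the suggested maximum is 8GB; the error above appears once the etcd database grows past the 2GB quota.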

1. Check the etcd version

export ETCDCTL_API=3
etcd --version
# Output
etcd Version: 3.4.13
Git SHA: ae9734ed2
Go Version: go1.12.17
Go OS/Arch: linux/amd64

2. List all etcd members and their status

ETCDCTL_API=3 etcdctl --write-out=table member list --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt
+------------------+---------+-----------------+------------------------------+------------------------------+------------+
|        ID        | STATUS  |      NAME       |          PEER ADDRS          |         CLIENT ADDRS         | IS LEARNER |
+------------------+---------+-----------------+------------------------------+------------------------------+------------+
| 42d99f3f857a5b15 | started | 172.168.102.119 | https://172.168.102.119:2380 | https://172.168.102.119:2379 |      false |
| 5e5770766d40c343 | started | 172.168.102.120 | https://172.168.102.120:2380 | https://172.168.102.120:2379 |      false |
| 91f48965ce76fb52 | started | 172.168.102.118 | https://172.168.102.118:2380 | https://172.168.102.118:2379 |      false |
+------------------+---------+-----------------+------------------------------+------------------------------+------------+
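Every command in these notes repeats the same three TLS flags. As a convenience they could be wrapped in a small shell function. The sketch below is an assumption, not part of the original setup: the function name `ectl` and the `ECTL_BIN` override (which only swaps in another binary, for example for a dry run) are hypothetical, while the certificate paths are the ones used above.

```shell
# Hypothetical wrapper: prepends the TLS flags used throughout these notes.
ETCD_PKI=/etc/kubernetes/pki/etcd

ectl() {
  # ECTL_BIN is an illustrative override; it defaults to the real etcdctl.
  ETCDCTL_API=3 "${ECTL_BIN:-etcdctl}" \
    --cacert="$ETCD_PKI/ca.crt" \
    --cert="$ETCD_PKI/server.crt" \
    --key="$ETCD_PKI/server.key" \
    "$@"
}
```

With this in place, the member listing above could be written as `ectl --write-out=table member list`.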

3. Find the leader of the etcd cluster

ETCDCTL_API=3 etcdctl --write-out=table endpoint status --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|    ENDPOINT    |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 127.0.0.1:2379 | 91f48965ce76fb52 |   3.4.7 |  361 MB |     false |      false |      6480 |   34152053 |           34152053 |        |
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

4. Check the health of etcd members

ETCDCTL_API=3 etcdctl --write-out=table endpoint health --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt
+----------------+--------+-------------+-------+
|    ENDPOINT    | HEALTH |    TOOK     | ERROR |
+----------------+--------+-------------+-------+
| 127.0.0.1:2379 |   true | 28.632652ms |       |
+----------------+--------+-------------+-------+

5. Query the status of every etcd node

export ENDPOINTS=10.24.21.89:2379,10.24.21.90:2379,10.24.21.91:2379   
ETCDCTL_API=3 etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt
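When the status table covers all endpoints, the leader is the row whose IS LEADER column reads true. A minimal sketch for picking it out automatically follows. It assumes the column layout of the etcd 3.4 table shown in section 3 (ENDPOINT first, IS LEADER fifth), and `find_leader` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `endpoint status --write-out=table` output on
# stdin and prints the ENDPOINT of every row whose IS LEADER column is true.
find_leader() {
  awk -F'|' 'NF > 6 {
    endpoint = $2; leader = $6          # $1 is the empty text before the first "|"
    gsub(/ /, "", endpoint); gsub(/ /, "", leader)
    if (leader == "true") print endpoint
  }'
}
```

It would be used by piping the command above into it, i.e. appending `| find_leader`.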

6. Get the current KV revision and compact old revisions

export ETCDCTL_API=3 
rev=$(etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --endpoints=https://127.0.0.1:2379 endpoint status --write-out="json" | grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+')
# Compact away all old KV revisions. Note that compaction does not return disk space to the operating system; the space stays allocated to etcd until the member is defragmented (see section 7)
etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --endpoints=https://127.0.0.1:2379 compact "$rev"
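The revision extraction can be isolated into a small filter, which makes it easy to check against a saved JSON sample before compacting a live cluster. This is a sketch under two assumptions: the JSON contains a `"revision":<digits>` field (which the pipeline above already relies on), and when several endpoints are queried, taking the first match is acceptable. `extract_revision` is a hypothetical name.

```shell
# Hypothetical helper: reads `endpoint status --write-out=json` output on
# stdin and prints the first revision number found.
extract_revision() {
  grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}
```

Usage would look like `rev=$(etcdctl ... endpoint status --write-out=json | extract_revision)`, with the same TLS and endpoint flags as above.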

7. Defragment etcd

When the etcd database runs out of storage space, check the alarms first

etcdctl alarm list
memberID:91f48965ce76fb52 alarm:NOSPACE

Then defragment

ETCDCTL_API=3 etcdctl defrag
Finished defragmenting etcd member[127.0.0.1:2379]

Then clear the alarm

ETCDCTL_API=3 etcdctl alarm disarm
memberID:91f48965ce76fb52 alarm:NOSPACE

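The steps above apply to a single member, and that member is unavailable while it is being defragmented. In a multi-member cluster, defragment the members one at a time, then check that etcd is back to normal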
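The one-member-at-a-time procedure can be sketched as a loop. This is an illustration, not a tested runbook: `defrag_each` and the `ECTL_BIN` override are hypothetical names, and the TLS flags are left out as in the commands above, so add `--cacert`, `--cert` and `--key` if your cluster requires them.

```shell
# Hypothetical helper: defragment members one by one and stop at the first
# failure, so that at most one member is unavailable at any time.
defrag_each() {
  # $1: comma-separated endpoint list, e.g. "$ENDPOINTS"
  for ep in $(printf '%s' "$1" | tr ',' ' '); do
    echo "defragmenting $ep"
    ETCDCTL_API=3 "${ECTL_BIN:-etcdctl}" --endpoints="$ep" defrag || return 1
    # Confirm the member is healthy again before moving on to the next one.
    ETCDCTL_API=3 "${ECTL_BIN:-etcdctl}" --endpoints="$ep" endpoint health || return 1
  done
}
```

It would be called as `defrag_each "$ENDPOINTS"`, followed by `etcdctl alarm disarm` once every member has been processed.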

8. Delete specific data from etcd

ETCDCTL_API=3 etcdctl --endpoints=$ENDPOINTS --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key del --prefix "/registry/events"

9. Count the entries stored in etcd by resource type

ETCDCTL_API=3 etcdctl --endpoints=$ENDPOINTS --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key \
 get / --prefix --keys-only --command-timeout=20s | awk -F/ 'NF {print $3}' | sort | uniq -c
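The counting step can also be written as a standalone filter that sorts the largest resource types first, which helps when looking for what is filling the database. This is a minimal sketch. It assumes keys of the form `/registry/<resource>/...`, as in the `/registry/events` prefix used in section 8, and `count_by_resource` is a hypothetical name.

```shell
# Hypothetical helper: reads one key per line on stdin (blank lines are
# ignored) and prints "<count> <resource>" sorted by count, largest first.
count_by_resource() {
  awk -F/ 'NF >= 3 { n[$3]++ } END { for (k in n) print n[k], k }' |
    sort -k1,1nr -k2,2
}
```

It would replace the `awk | sort | uniq -c` tail of the command above: `etcdctl ... get / --prefix --keys-only | count_by_resource`.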

10. List all keys

ETCDCTL_API=3 etcdctl  --endpoints=$ENDPOINTS --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt get "" --from-key

11. Change the etcd storage quota

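To prevent performance degradation or accidental overload of the key-value store, etcd enforces a configurable storage size quota that defaults to 2GB.
The suggested maximum quota for normal environments is 8GB. If the configured value exceeds it, etcd prints a warning at startup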

To raise the quota to 8GB in a Kubernetes deployment, edit /etc/kubernetes/manifests/etcd.yaml and add the following flag

- --quota-backend-bytes=8589934592  # set the storage quota to 8GB
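The flag takes a value in bytes, and 8589934592 is 8 × 1024 × 1024 × 1024. For other sizes the value can be computed with shell arithmetic. `gib_to_bytes` below is a hypothetical helper name.

```shell
# Hypothetical helper: convert a size in GiB to the byte count expected by
# --quota-backend-bytes.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 8   # prints 8589934592
```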

After the change, the status can be checked with the following command

ETCDCTL_API=3 etcdctl endpoint status --cluster -w table


Copyright notice: unless otherwise stated, all articles on this site are original.

When reposting, please credit the source: https://sulao.cn/post/835
