kubelet log reports "Failed to update stats for container"

The kubelet log contained many errors like the following:

Mar 22 21:09:15 k8s-node2 kubelet[9258]: W0322 21:09:15.343439    9258 pod_container_manager_linux.go:200] failed to delete cgroup paths for [kubepods burstable podf08457a1-be25-41ac-b10c-96ea195513cd] : unable to destroy cgroup paths for cgroup [kubepods burstable podf08457a1-be25-41ac-b10c-96ea195513cd] : Failed to remove paths: map[blkio:/sys/fs/cgroup/blkio/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice cpu:/sys/fs/cgroup/cpu,cpuacct/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice cpuacct:/sys/fs/cgroup/cpu,cpuacct/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice cpuset:/sys/fs/cgroup/cpuset/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice devices:/sys/fs/cgroup/devices/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice freezer:/sys/fs/cgroup/freezer/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice hugetlb:/sys/fs/cgroup/hugetlb/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice memory:/sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice net_cls:/sys/fs/cgroup/net_cls,net_prio/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice net_prio:/sys/fs/cgroup/net_cls,net_prio/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice perf_event:/sys/fs/cgroup/perf_event/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice pids:/sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice 
rdma:/sys/fs/cgroup/rdma/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice systemd:/sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice]
Mar 22 21:09:16 k8s-node2 kubelet[9258]: W0322 21:09:16.090057    9258 container.go:549] Failed to update stats for container "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08457a1_be25_41ac_b10c_96ea195513cd.slice/docker-45a20393fcc7d5af687f3213489221bcfa2d5b71a0b525fe20a5aad3eef77ca6.scope": unable to determine device info for dir: /mnt/data1/docker/overlay2/56d111b2425c1e412c1d9d1e68ea35380e05df2af63098f0fc5cf78a5c0cd18b/diff: stat failed on /mnt/data1/docker/overlay2/56d111b2425c1e412c1d9d1e68ea35380e05df2af63098f0fc5cf78a5c0cd18b/diff with error: no such file or directory, continuing to push stats
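To get a sense of how often these two warnings occur, the log can be piped through a small counter. This is a minimal sketch: the function name `count_kubelet_warnings` is made up for illustration, and the `journalctl` invocation in the comment assumes kubelet runs as a systemd unit named `kubelet`, as it does on this node.

```shell
#!/bin/sh
# Hypothetical helper: count the two kubelet warnings seen above
# in log text read from stdin.
count_kubelet_warnings() {
    awk '
        /Failed to update stats for container/ { stats++ }
        /failed to delete cgroup paths/        { cgroup++ }
        END { printf "stats=%d cgroup=%d\n", stats, cgroup }
    '
}

# Typical use on the node (assumes the systemd unit is named "kubelet"):
#   journalctl -u kubelet --since "1 hour ago" --no-pager | count_kubelet_warnings
```

Running the same command again after applying a fix gives a quick before-and-after comparison.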


The docker daemon log on the same node also showed many messages like these:

Mar 22 21:09:11 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:11.228331641+08:00" level=info msg="Container 88026023366b07e9a63ad3f364813475571301aaed6ad09819de9c6d0d7f0f54 failed to exit within 30 seconds of signal 15 - using the force"
Mar 22 21:09:21 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:21.231436373+08:00" level=info msg="Container fafdb0a9bd4c failed to exit within 10 seconds of kill - trying direct SIGKILL"
Mar 22 21:09:21 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:21.231455063+08:00" level=info msg="Container bd1c3e58bb2a failed to exit within 10 seconds of kill - trying direct SIGKILL"
Mar 22 21:09:21 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:21.232014231+08:00" level=info msg="Container e0ac3168fafa failed to exit within 10 seconds of kill - trying direct SIGKILL"
Mar 22 21:09:21 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:21.233111786+08:00" level=info msg="Container de56cb3f9802 failed to exit within 10 seconds of kill - trying direct SIGKILL"
Mar 22 21:09:21 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:21.233486045+08:00" level=info msg="Container 7719ede4a164 failed to exit within 10 seconds of kill - trying direct SIGKILL"
Mar 22 21:09:21 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:21.233658664+08:00" level=info msg="Container ce49925c8e6a failed to exit within 10 seconds of kill - trying direct SIGKILL"
Mar 22 21:09:21 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:21.239959958+08:00" level=info msg="Container b24e6752c6d0 failed to exit within 10 seconds of kill - trying direct SIGKILL"
Mar 22 21:09:21 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:21.244268760+08:00" level=info msg="Container 88026023366b failed to exit within 10 seconds of kill - trying direct SIGKILL"
Mar 22 21:09:21 k8s-node2 dockerd[2644]: time="2023-03-22T21:09:21.247945605+08:00" level=info msg="Container 75a1ef5bfd74 failed to exit within 10 seconds of kill - trying direct SIGKILL"
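The dockerd messages mix full and short container IDs, so it helps to reduce them to a unique list of short IDs before inspecting the affected containers. The sketch below assumes the message format shown above; `failed_exit_containers` is a hypothetical name, and the unit name `docker` in the comment is an assumption about this host.

```shell
#!/bin/sh
# Hypothetical helper: read dockerd log lines from stdin and print the
# unique 12-character IDs of containers that did not exit in time.
failed_exit_containers() {
    sed -n 's/.*msg="Container \([0-9a-f]\{12\}\)[0-9a-f]* failed to exit.*/\1/p' | sort -u
}

# Typical use (assumes the systemd unit is named "docker"):
#   journalctl -u docker --no-pager | failed_exit_containers
# Each ID can then be checked with:
#   docker inspect --format '{{.Name}} {{.State.Status}}' <id>
```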

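Judging from the logs above, the likely cause is that container stats were not being updated correctly, which in turn made the containers fail to exit when they were killed.

To fix this, a few extra flags need to be added to the kubelet configuration. Because the kubelet setup differs from one K8s environment to another, first check the 10-kubeadm.conf file: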

cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
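The drop-in pulls extra arguments from its `EnvironmentFile=` entries, and the leading `-` tells systemd to ignore a file that does not exist. As a rough sketch, the paths can be listed directly from the drop-in; `list_env_files` is a hypothetical helper name, and the path in the usage comment is the one from this node.

```shell
#!/bin/sh
# Hypothetical helper: print every EnvironmentFile path declared in a
# systemd unit or drop-in file, without the optional "-" prefix.
list_env_files() {
    sed -n 's/^EnvironmentFile=-\{0,1\}//p' "$1"
}

# Typical use:
#   list_env_files /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# "systemctl cat kubelet" shows the unit together with all of its drop-ins.
```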

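Based on this drop-in, either /etc/sysconfig/kubelet or /var/lib/kubelet/kubeadm-flags.env can be modified, since both are loaded into the kubelet command line.

Open one of these two files and add the flags there. Here /etc/sysconfig/kubelet is used; before the change it looks like this: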

cat /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"

After adding the flags, the line should look like this:

KUBELET_EXTRA_ARGS="--cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"
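On several nodes the same edit can be scripted. The following is only a sketch under stated assumptions: `add_kubelet_flags` is a hypothetical helper, it expects the file to contain a double-quoted `KUBELET_EXTRA_ARGS="..."` line as shown above (it does not handle `KUBELET_KUBEADM_ARGS` in kubeadm-flags.env), and it keeps a `.bak` copy of the original file.

```shell
#!/bin/sh
# Hypothetical helper: append the two cgroup flags from this post to the
# KUBELET_EXTRA_ARGS line of the given env file. Assumes the value is
# wrapped in double quotes; does nothing if the flags are already present.
add_kubelet_flags() {
    file=$1
    flags='--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice'

    # Already applied: leave the file untouched.
    if grep -q -- '--runtime-cgroups=' "$file"; then
        return 0
    fi

    # Keep a backup, then insert the flags before the closing quote.
    cp "$file" "$file.bak"
    sed "s|^\(KUBELET_EXTRA_ARGS=\"[^\"]*\)\"|\1 $flags\"|" "$file.bak" > "$file"
}

# Typical use (as root):
#   add_kubelet_flags /etc/sysconfig/kubelet
```

Because the function checks for `--runtime-cgroups=` first, running it twice does not duplicate the flags.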

Finally, reload systemd and restart the kubelet service:

systemctl daemon-reload
systemctl restart kubelet

Checking the kubelet log again afterwards, the errors above no longer appear.

Copyright notice: unless otherwise noted, all articles on this site are original.

When reposting, please credit the source: https://sulao.cn/post/913.html
