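Problem description:
I was in the middle of a lab experiment
when deleting a Pod suddenly hung (stuck in Terminating).
kubectl top failed with: Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
Checking the node status showed that k8s-node3 had lost contact with the cluster: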
[root@k8s-master1 ~]# kubectl get nodes
NAME          STATUS     ROLES    AGE   VERSION
k8s-master1   Ready      master   69d   v1.19.4
k8s-master2   Ready      master   69d   v1.19.4
k8s-master3   Ready      master   69d   v1.19.4
k8s-node1     Ready      <none>   69d   v1.19.4
k8s-node2     Ready      <none>   69d   v1.19.4
k8s-node3     NotReady   <none>   69d   v1.19.4
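On a larger cluster it is easy to miss a NotReady line by eye. The following is a minimal sketch of filtering the output above for unhealthy nodes. The function name not_ready_nodes is hypothetical, and the sketch assumes the default kubectl get nodes column layout, with the node name in column 1 and the status in column 2.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get nodes` output on stdin and prints
# the name of every node whose STATUS is not Ready.
# "Ready,SchedulingDisabled" (a cordoned but healthy node) still counts as Ready.
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready(,|$)/ { print $1 }'
}

# Typical use on a master node (requires cluster access):
#   kubectl get nodes | not_ready_nodes
```

Fed the listing above, this would print only k8s-node3.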
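Looking up the node that hosts metrics-server showed that it was also running on k8s-node3, and the Pod's events report that the node is not ready. No wonder kubectl top was broken: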
[root@k8s-master1 ~]# kubectl describe pods -n kube-system metrics-server-b989695d4-wx5wx
Name:                 metrics-server-b989695d4-wx5wx
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 k8s-node3/42.51.80.136
......
Events:
  Type     Reason        Age    From             Message
  ----     ------        ----   ----             -------
  Warning  NodeNotReady  8m41s  node-controller  Node is not ready
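If the full Pod name is not known in advance, the Pod-to-node lookup can be sketched as below. The function name pods_with_nodes is hypothetical. It relies on the custom-columns output format and the --no-headers flag of kubectl get, and matches Pods by name prefix rather than by label, since the labels on this deployment are not shown here.

```shell
#!/bin/sh
# Hypothetical helper: print "POD NODE" for every Pod in namespace $1
# whose name starts with the prefix $2.
pods_with_nodes() {
  ns=$1
  prefix=$2
  kubectl get pods -n "$ns" --no-headers \
    -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName |
    awk -v p="$prefix" 'index($1, p) == 1 { print $1, $2 }'
}

# Typical use (requires cluster access):
#   pods_with_nodes kube-system metrics-server
```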
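I logged in to k8s-node3 and checked the kubelet status: the service was stopped.
After starting the kubelet service and checking its status again, it was up and running.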
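The exact commands are not shown above. A minimal sketch, assuming the node manages kubelet as a systemd unit named kubelet (the usual setup for kubeadm-style installs, but an assumption here), might look like the following. The function name ensure_kubelet_running is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: start kubelet only if systemd reports it as not active,
# then show the resulting unit status. Run as root on the affected node.
ensure_kubelet_running() {
  if systemctl is-active --quiet kubelet; then
    echo "kubelet is already running"
  else
    systemctl start kubelet && systemctl status kubelet --no-pager
  fi
}

# Typical use on k8s-node3:
#   ensure_kubelet_running
# To see why the service stopped in the first place, the unit's logs can help:
#   journalctl -u kubelet --no-pager | tail -n 50
```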
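Back on the master node, checking the node status again showed that k8s-node3 was now Ready.
Next, delete the metrics-server Pod so that it gets rescheduled: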
[root@k8s-master1 ~]# kubectl delete pods -n kube-system metrics-server-b989695d4-wx5wx
pod "metrics-server-b989695d4-wx5wx" deleted
Checking metrics-server again showed that the new Pod had been scheduled onto k8s-master2.
The kubectl top command was working again as well.
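The ServiceUnavailable error from kubectl top comes from the aggregated metrics API, so one way to confirm the recovery is to check that API's Available condition before running top. This is a sketch under the assumption that metrics-server is registered as the APIService v1beta1.metrics.k8s.io, which is what its standard manifests use. The function name metrics_api_available is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the metrics APIService reports
# its "Available" condition as True.
metrics_api_available() {
  status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
  [ "$status" = "True" ]
}

# Typical use (requires cluster access):
#   metrics_api_available && kubectl top nodes
```

A freshly started metrics-server may need a short while to collect its first samples, so kubectl top can still fail briefly after the Pod is Running.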
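Summary:
Running into a failure like this is unpleasant, but don't let it intimidate you. Work through it one step at a time, translate any messages you can't read, and you will eventually find a fix.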