I set up a Kubernetes cluster using kubeadm. I installed Prometheus and node-exporter on it, based on:
The pods appear to be running fine:
kubectl get pods --namespace=monitoring -o wide
NAME                                     READY   STATUS    RESTARTS   AGE   IP             NODE     NOMINATED NODE   READINESS GATES
node-exporter-jk2sd                      1/1     Running   0          90m   192.168.5.20   work03   <none>           <none>
node-exporter-jldrx                      1/1     Running   0          90m   192.168.5.17   work04   <none>           <none>
node-exporter-mgtld                      1/1     Running   0          90m   192.168.5.15   work01   <none>           <none>
node-exporter-tq7bx                      1/1     Running   0          90m   192.168.5.41   work02   <none>           <none>
prometheus-deployment-5d79b5f65b-tkpd2   1/1     Running   0          91m   192.168.5.40   work02   <none>           <none>
I can also see the endpoints:
kubectl get endpoints -n monitoring
NAME ENDPOINTS AGE
node-exporter 192.168.5.15:9100,192.168.5.17:9100,192.168.5.20:9100 + 1 more... 5m3s
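I also ran: kubectl port-forward prometheus-deployment-5d79b5f65b-tkpd2 8080:9090 -n monitoring
When I go to Prometheus > Status > Targets, node-exporter is not listed there. When I start typing a query for a metric reported by node-exporter, it does not show up in the query editor's autocomplete.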
The logs from the Prometheus pod show a lot of errors:
kubectl logs prometheus-deployment-5d79b5f65b-tkpd2 -n monitoring
level=info ts=2021-08-11T16:24:21.743Z caller=main.go:428 msg="Starting Prometheus" version="(version=2.29.1, branch=HEAD, revision=dcb07e8eac34b5ea37cd229545000b857f1c1637)"
level=info ts=2021-08-11T16:24:21.743Z caller=main.go:433 build_context="(go=go1.16.7, user=root@364730518a4e, date=20210811-14:48:27)"
level=info ts=2021-08-11T16:24:21.743Z caller=main.go:434 host_details="(Linux 5.4.0-70-generic #78-Ubuntu SMP Fri Mar 19 13:29:52 UTC 2021 x86_64 prometheus-deployment-5d79b5f65b-tkpd2 (none))"
level=info ts=2021-08-11T16:24:21.743Z caller=main.go:435 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2021-08-11T16:24:21.743Z caller=main.go:436 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2021-08-11T16:24:21.745Z caller=web.go:541 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2021-08-11T16:24:21.745Z caller=main.go:812 msg="Starting TSDB ..."
level=info ts=2021-08-11T16:24:21.748Z caller=tls_config.go:191 component=web msg="TLS is disabled." http2=false
level=info ts=2021-08-11T16:24:21.753Z caller=head.go:815 component=tsdb msg="Replaying on-disk memory mappable chunks if any"
level=info ts=2021-08-11T16:24:21.753Z caller=head.go:829 component=tsdb msg="On-disk memory mappable chunks replay completed" duration=4.15µs
level=info ts=2021-08-11T16:24:21.753Z caller=head.go:835 component=tsdb msg="Replaying WAL, this may take a while"
level=info ts=2021-08-11T16:24:21.754Z caller=head.go:892 component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
level=info ts=2021-08-11T16:24:21.754Z caller=head.go:898 component=tsdb msg="WAL replay completed" checkpoint_replay_duration=75.316µs wal_replay_duration=451.769µs total_replay_duration=566.051µs
level=info ts=2021-08-11T16:24:21.756Z caller=main.go:839 fs_type=EXT4_SUPER_MAGIC
level=info ts=2021-08-11T16:24:21.756Z caller=main.go:842 msg="TSDB started"
level=info ts=2021-08-11T16:24:21.756Z caller=main.go:969 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2021-08-11T16:24:21.757Z caller=kubernetes.go:282 component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
level=info ts=2021-08-11T16:24:21.759Z caller=kubernetes.go:282 component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
level=info ts=2021-08-11T16:24:21.762Z caller=kubernetes.go:282 component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
level=info ts=2021-08-11T16:24:21.764Z caller=main.go:1006 msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml totalDuration=7.940972ms db_storage=607ns remote_storage=1.251µs web_handler=283ns query_engine=694ns scrape=227.668µs scrape_sd=6.081132ms notify=27.11µs notify_sd=16.477µs rules=648.58µs
level=info ts=2021-08-11T16:24:21.764Z caller=main.go:784 msg="Server is ready to receive web requests."
level=error ts=2021-08-11T16:24:51.765Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:24:51.765Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get \"https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:24:51.765Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get \"https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:24:51.766Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:24:51.766Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Node: failed to list *v1.Node: Get \"https://10.96.0.1:443/api/v1/nodes?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:22.587Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get \"https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:22.855Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:23.153Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get \"https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:23.261Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:23.335Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Node: failed to list *v1.Node: Get \"https://10.96.0.1:443/api/v1/nodes?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:54.814Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:55.282Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Node: failed to list *v1.Node: Get \"https://10.96.0.1:443/api/v1/nodes?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:55.516Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get \"https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:55.934Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get \"https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:25:56.442Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:26:30.058Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:26:30.204Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get \"https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:26:30.246Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get \"https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:26:30.879Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:26:31.479Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Node: failed to list *v1.Node: Get \"https://10.96.0.1:443/api/v1/nodes?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:27:09.673Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:27:09.835Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get \"https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:27:10.467Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:27:11.170Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get \"https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:27:12.684Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Node: failed to list *v1.Node: Get \"https://10.96.0.1:443/api/v1/nodes?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:27:55.324Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get \"https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:28:01.550Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:28:01.621Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get \"https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:28:04.801Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:28:05.598Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Node: failed to list *v1.Node: Get \"https://10.96.0.1:443/api/v1/nodes?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:28:57.256Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
level=error ts=2021-08-11T16:29:04.688Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0\": dial tcp 10.96.0.1:443: i/o timeout"
Is there a way to fix this so that node-exporter shows up in the targets?
Version details:
kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:10:43Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.9", GitCommit:"7a576bc3935a6b555e33346fd73ad77c925e9e4a", GitTreeState:"clean", BuildDate:"2021-07-15T20:56:38Z", GoVersion:"go1.15.14", Compiler:"gc", Platform:"linux/amd64"}
Edit: the cluster was set up as follows:
sudo kubeadm reset
sudo rm $HOME/.kube/config
sudo kubeadm init --pod-network-cidr=192.168.5.0/24
mkdir -p $HOME/.kube; sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config; sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
It uses flannel.
The flannel pods are running:
kube-flannel-ds-45qwf 1/1 Running 0 31h x.x.x.41 work01 <none> <none>
kube-flannel-ds-4rwzj 1/1 Running 0 31h x.x.x.40 mast01 <none> <none>
kube-flannel-ds-8fdtt 1/1 Running 24 31h x.x.x.43 work03 <none> <none>
kube-flannel-ds-8hl5f 1/1 Running 23 31h x.x.x.44 work04 <none> <none>
kube-flannel-ds-xqtrd 1/1 Running 0 31h x.x.x.42 work02 <none> <none>
Answered on 2021-08-11 18:22:34
This issue is related to the SDN not working properly.
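As a general rule when troubleshooting, we would check the SDN pods (Calico, Weave, or in this case flannel): are they healthy, and are there any errors in their logs?
Check the iptables (iptables -nL) and ipvs (ipvsadm -l -n) configuration on the nodes.
Restart the SDN pods, as well as kube-proxy, if you still haven't found anything.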
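The checklist above could be sketched roughly as follows. This is only an illustration: the function names are made up, and the namespace and label for flannel (kube-system, app=flannel) are assumptions based on older flannel manifests; newer manifests may use a different namespace, so adjust SDN_NS and SDN_LABEL to match your deployment. The kube-proxy label is the one kubeadm normally applies.

```shell
#!/bin/sh
# Assumptions: adjust these to match your flannel manifest.
SDN_NS="${SDN_NS:-kube-system}"
SDN_LABEL="${SDN_LABEL:-app=flannel}"

# 1. Are the SDN pods healthy, and do their logs show errors?
sdn_status() {
    kubectl get pods -n "$SDN_NS" -l "$SDN_LABEL" -o wide
    kubectl logs -n "$SDN_NS" -l "$SDN_LABEL" --tail=50 --prefix
}

# 2. Inspect packet-filter and IPVS state (run on each node, as root).
node_rules() {
    iptables -nL
    iptables -t nat -nL
    ipvsadm -l -n   # only meaningful when kube-proxy runs in IPVS mode
}

# 3. Restart the SDN pods and kube-proxy; their DaemonSets recreate them.
restart_sdn() {
    kubectl delete pods -n "$SDN_NS" -l "$SDN_LABEL"
    kubectl delete pods -n kube-system -l k8s-app=kube-proxy
}
```

Calling sdn_status first is the cheapest step; a high restart count or repeated errors in the SDN pod logs usually narrows things down before touching iptables.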
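Now, in this specific case, we are not dealing with an outage: the cluster is newly deployed, and it is likely the SDN never worked at all. This may not be obvious with a fresh kubeadm deployment, which ships with no pods other than the defaults, most of which use host networking.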
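One way to see which pods bypass the SDN is to list the hostNetwork flag next to each pod. This is a small sketch; the function name hostnet_pods is hypothetical. Pods showing "true" use the node's network directly and will keep working even when the pod network is broken, while pods showing "<none>" depend on the SDN.

```shell
#!/bin/sh
# List every pod with its hostNetwork setting and pod IP.
hostnet_pods() {
    kubectl get pods --all-namespaces \
        -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork,IP:.status.podIP'
}
```

If nearly everything that is healthy reports HOSTNETWORK as true, a "Running" status across the cluster says little about whether pod-to-service traffic actually works.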
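The kubeadm init command specifies a pod CIDR of 192.168.5.0/24, which calls for two remarks:
- With all SDNs: ... each range is statically allocated to them.
- Running the flannel SDN: the --pod-network-cidr argument of kubeadm init must match the subnet configured in the kube-flannel-cfg ConfigMap; see the net-conf.json key.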
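Both remarks can be checked from the command line. The sketch below is hedged: the function names are invented, the flannel ConfigMap is assumed to live in kube-system (override SDN_NS otherwise), and the per-node range is assumed to be a /24, which is the usual default but may differ in your cluster. To my knowledge the upstream flannel manifest has historically defaulted its Network to 10.244.0.0/16, which would not match the 192.168.5.0/24 passed to kubeadm init here.

```shell
#!/bin/sh
# How many per-node ranges of length $2 fit into a pod CIDR of length $1?
node_ranges() {
    echo $(( 1 << ( $2 - $1 ) ))
}

# Read the "Network" value from flannel's net-conf.json (JSON on stdin).
flannel_network() {
    sed -n 's/.*"Network"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Read podSubnet from kubeadm's ClusterConfiguration (YAML on stdin).
kubeadm_pod_subnet() {
    sed -n 's/^[[:space:]]*podSubnet:[[:space:]]*//p' | tr -d '"'
}

# Compare the two values; returns non-zero when they differ or are missing.
check_pod_cidr() {
    flannel="$(kubectl get configmap kube-flannel-cfg -n "${SDN_NS:-kube-system}" \
        -o jsonpath='{.data.net-conf\.json}' | flannel_network)"
    kubeadm="$(kubectl get configmap kubeadm-config -n kube-system \
        -o jsonpath='{.data.ClusterConfiguration}' | kubeadm_pod_subnet)"
    echo "flannel Network:   $flannel"
    echo "kubeadm podSubnet: $kubeadm"
    [ -n "$flannel" ] && [ "$flannel" = "$kubeadm" ]
}
```

Under the /24-per-node assumption, node_ranges 24 24 shows that a /24 pod CIDR leaves room for a single node, whereas node_ranges 16 24 gives 256. Running check_pod_cidr against the cluster then shows whether the flannel side and the kubeadm side agree.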
Although I am not familiar with the process of fixing this, there is an answer on ServerFault that gives some instructions and sounds right: https://serverfault.com/a/977401/293779
https://stackoverflow.com/questions/68745901