首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >问答首页 >Kubernetes Calico节点'XXXXXXXXXXX‘已经使用IPv4地址XXXXXXXXX,CrashLoopBackOff

Kubernetes Calico节点'XXXXXXXXXXX‘已经使用IPv4地址XXXXXXXXX,CrashLoopBackOff
EN

Stack Overflow用户
提问于 2018-09-18 15:57:37
回答 2查看 3.2K关注 0票数 2

我使用AWS在VPC和私有子网https://aws-quickstart.s3.amazonaws.com/quickstart-heptio/doc/heptio-kubernetes-on-the-aws-cloud.pdf中创建了Kubernetes集群。一段时间内一切都很好。我把Calico安装在我的Kubernetes集群上了。我有两个节点和一个主节点。主机上的棉荚运行良好,节点上的则处于崩溃回退状态:

代码语言:javascript
运行
复制
NAME                                                               READY     STATUS             RESTARTS   AGE
calico-etcd-ztwjj                                                  1/1       Running            1          55d
calico-kube-controllers-685755779f-ftm92                           1/1       Running            2          55d
calico-node-gkjgl                                                  1/2       CrashLoopBackOff   270        22h
calico-node-jxkvx                                                  2/2       Running            4          55d
calico-node-mxhc5                                                  1/2       CrashLoopBackOff   9          25m

描述其中一个崩溃的吊舱:

代码语言:javascript
运行
复制
ubuntu@ip-10-0-1-133:~$ kubectl describe pod calico-node-gkjgl -n kube-system
Name:           calico-node-gkjgl
Namespace:      kube-system
Node:           ip-10-0-0-237.us-east-2.compute.internal/10.0.0.237
Start Time:     Mon, 17 Sep 2018 16:56:41 +0000
Labels:         controller-revision-hash=185957727
                k8s-app=calico-node
                pod-template-generation=1
Annotations:    scheduler.alpha.kubernetes.io/critical-pod=
Status:         Running
IP:             10.0.0.237
Controlled By:  DaemonSet/calico-node
Containers:
  calico-node:
    Container ID:   docker://d89979ba963c33470139fd2093a5427b13c6d44f4c6bb546c9acdb1a63cd4f28
    Image:          quay.io/calico/node:v3.1.1
    Image ID:       docker-pullable://quay.io/calico/node@sha256:19fdccdd4a90c4eb0301b280b50389a56e737e2349828d06c7ab397311638d29
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 18 Sep 2018 15:14:44 +0000
      Finished:     Tue, 18 Sep 2018 15:14:44 +0000
    Ready:          False
    Restart Count:  270
    Requests:
      cpu:      250m
    Liveness:   http-get http://:9099/liveness delay=10s timeout=1s period=10s #success=1 #failure=6
    Readiness:  http-get http://:9099/readiness delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ETCD_ENDPOINTS:                     <set to the key 'etcd_endpoints' of config map 'calico-config'>  Optional: false
      CALICO_NETWORKING_BACKEND:          <set to the key 'calico_backend' of config map 'calico-config'>  Optional: false
      CLUSTER_TYPE:                       kubeadm,bgp
      CALICO_DISABLE_FILE_LOGGING:        true
      CALICO_K8S_NODE_REF:                 (v1:spec.nodeName)
      FELIX_DEFAULTENDPOINTTOHOSTACTION:  ACCEPT
      CALICO_IPV4POOL_CIDR:               192.168.0.0/16
      CALICO_IPV4POOL_IPIP:               Always
      FELIX_IPV6SUPPORT:                  false
      FELIX_IPINIPMTU:                    1440
      FELIX_LOGSEVERITYSCREEN:            info
      IP:                                 autodetect
      FELIX_HEALTHENABLED:                true
    Mounts:
      /lib/modules from lib-modules (ro)
      /var/lib/calico from var-lib-calico (rw)
      /var/run/calico from var-run-calico (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-cni-plugin-token-b7sfl (ro)
  install-cni:
    Container ID:  docker://b37e0ec7eba690473a4999a31d9f766f7adfa65f800a7b2dc8e23ead7520252d
    Image:         quay.io/calico/cni:v3.1.1
    Image ID:      docker-pullable://quay.io/calico/cni@sha256:dc345458d136ad9b4d01864705895e26692d2356de5c96197abff0030bf033eb
    Port:          <none>
    Host Port:     <none>
    Command:
      /install-cni.sh
    State:          Running
      Started:      Mon, 17 Sep 2018 17:11:52 +0000
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 17 Sep 2018 16:56:43 +0000
      Finished:     Mon, 17 Sep 2018 17:10:53 +0000
    Ready:          True
    Restart Count:  1
    Environment:
      CNI_CONF_NAME:       10-calico.conflist
      ETCD_ENDPOINTS:      <set to the key 'etcd_endpoints' of config map 'calico-config'>      Optional: false
      CNI_NETWORK_CONFIG:  <set to the key 'cni_network_config' of config map 'calico-config'>  Optional: false
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-cni-plugin-token-b7sfl (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  var-run-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/calico
    HostPathType:
  var-lib-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/calico
    HostPathType:
  cni-bin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:
  cni-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:
  calico-cni-plugin-token-b7sfl:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  calico-cni-plugin-token-b7sfl
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     :NoSchedule
                 :NoExecute
                 :NoSchedule
                 :NoExecute
                 CriticalAddonsOnly
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
Events:
  Type     Reason   Age                  From                                               Message
  ----     ------   ----                 ----                                               -------
  Warning  BackOff  4m (x6072 over 22h)  kubelet, ip-10-0-0-237.us-east-2.compute.internal  Back-off restarting failed container

同一吊舱的原木:

代码语言:javascript
运行
复制
ubuntu@ip-10-0-1-133:~$ kubectl logs calico-node-gkjgl -n kube-system -c calico-node
2018-09-18 15:14:44.605 [INFO][8] startup.go 251: Early log level set to info
2018-09-18 15:14:44.605 [INFO][8] startup.go 269: Using stored node name from /var/lib/calico/nodename
2018-09-18 15:14:44.605 [INFO][8] startup.go 279: Determined node name: ip-10-0-0-237.us-east-2.compute.internal
2018-09-18 15:14:44.609 [INFO][8] startup.go 101: Skipping datastore connection test
2018-09-18 15:14:44.610 [INFO][8] startup.go 352: Building new node resource Name="ip-10-0-0-237.us-east-2.compute.internal"
2018-09-18 15:14:44.610 [INFO][8] startup.go 367: Initialize BGP data
2018-09-18 15:14:44.614 [INFO][8] startup.go 564: Using autodetected IPv4 address on interface ens3: 10.0.0.237/19
2018-09-18 15:14:44.614 [INFO][8] startup.go 432: Node IPv4 changed, will check for conflicts
2018-09-18 15:14:44.618 [WARNING][8] startup.go 861: Calico node 'ip-10-0-0-237' is already using the IPv4 address 10.0.0.237.
2018-09-18 15:14:44.618 [WARNING][8] startup.go 1058: Terminating
Calico node failed to start

因此,似乎存在寻找节点IP地址的冲突,或者Calico似乎认为IP已经分配给了另一个节点。通过快速搜索,我找到了这个线程:https://github.com/projectcalico/calico/issues/1628。我看到,应该通过将IP_AUTODETECTION_METHOD设置为can-达达=目的来解决这一问题,我假设这将是“can-达达=10.0.0.237”。此配置是在calico/node容器上设置的环境变量。我一直试图将外壳放入容器本身,但是kubectl告诉我没有找到容器:

代码语言:javascript
运行
复制
ubuntu@ip-10-0-1-133:~$ kubectl exec calico-node-gkjgl --stdin --tty /bin/sh -c calico-node -n kube-system
error: unable to upgrade connection: container not found ("calico-node")

我怀疑这是因为Calico无法分配IP。因此,我登录到主机上,并尝试使用docker对容器进行shell:

代码语言:javascript
运行
复制
root@ip-10-0-0-237:~# docker exec -it k8s_POD_calico-node-gkjgl_kube-system_a6998e98-ba9a-11e8-a9fa-0a97f5a48ef4_1 /bin/bash
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "exec: \"/bin/bash\": stat /bin/bash: no such file or directory"

因此,我想容器中没有要执行的shell。库伯内特斯为什么不能执行这个任务。我尝试在外部运行命令以列出环境变量,但是我没有找到任何命令,但是我可能运行这些命令时出错了:

代码语言:javascript
运行
复制
root@ip-10-0-0-237:~# docker inspect -f '{{range $index, $value := .Config.Env}}{{$value}} {{end}}' k8s_POD_calico-node-gkjgl_kube-system_a6998e98-ba9a-11e8-a9fa-0a97f5a48ef4_1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

root@ip-10-0-0-237:~# docker exec -it k8s_POD_calico-node-gkjgl_kube-system_a6998e98-ba9a-11e8-a9fa-0a97f5a48ef4_1 printenv IP_AUTODETECTION_METHOD
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "exec: \"printenv\": executable file not found in $PATH"

root@ip-10-0-0-237:~# docker exec -it k8s_POD_calico-node-gkjgl_kube-system_a6998e98-ba9a-11e8-a9fa-0a97f5a48ef4_1 /bin/env
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "exec: \"/bin/env\": stat /bin/env: no such file or directory"

好吧,也许我走错路了。我是否应该尝试使用Kubernetes更改Calico配置文件并重新部署它?我在哪里能在我的系统上找到这些?我还没有找到设置环境变量的位置。

EN

回答 2

Stack Overflow用户

回答已采纳

发布于 2018-09-18 16:33:45

如果您查看卡里科博士IP_AUTODETECTION_METHOD已经默认为first-round

我的猜测是,上一次的“运行”并没有释放某些内容或IP地址,或者仅仅是v3.1.1版本中的一个bug。

尝试:

  1. 删除CrashBackOff循环中的Calico荚 kubectl -n kube-系统删除calico-node-gkjgl calico-node-mxhc5 5 你的吊舱将被重新创建,并希望初始化。
  2. 将Calico升级到v3.1.3或最新版本。遵循这些文档,我的猜测是,Heptio的Calico安装正在使用etcd数据存储。
  3. 试着了解Heptio的AMIs是如何工作的,看看他们是否有任何问题。这可能需要一些时间,这样你也可以联系他们的支持。
  4. 尝试另一种方法来用Calico安装Kubernetes。在https://kubernetes.io上有很好的文档
票数 2
EN

Stack Overflow用户

发布于 2022-06-16 16:21:07

对我来说,起作用的是移除节点上的码头网络。

我必须列出每个Node:docker network list上的当前网络,然后删除不需要的网络:docker network rm <networkName>

在做完之后,calico部署荚运行得很好。

票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/52390481

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档