
Lesson 10: A Practical Troubleshooting Flow for a Kubernetes Service That Cannot Be Accessed

Author: 辉哥 · Published 2021-11-24

Summary

While learning Kubernetes you will often run into a Service that cannot be accessed. This article collects the likely causes, and should help you pinpoint where the problem lies.

Walkthrough

To set the stage for this exercise, first deploy an application:

# kubectl create deployment web --image=nginx --replicas=3
deployment.apps/web created
# kubectl expose deployment web --port=8082 --type=NodePort
service/web exposed
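
One thing worth noting about the command above: the nginx image listens on port 80 by default, and when --target-port is omitted kubectl expose sets targetPort equal to --port, so this Service forwards to container port 8082, where nothing is listening. A sketch of an expose that maps the Service port onto the container's actual port:

# Forward Service port 8082 to container port 80 (nginx's default)
kubectl expose deployment web --port=8082 --target-port=80 --type=NodePort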

Make sure the Pods are running:

#  kubectl get pods,svc
NAME                      READY   STATUS    RESTARTS   AGE
pod/dnsutils              1/1     Running   25         25h
pod/mysql-5ws56           1/1     Running   0          20h
pod/mysql-fwpgc           1/1     Running   0          25h
pod/mysql-smggm           1/1     Running   0          20h
pod/myweb-8dc2n           1/1     Running   0          25h
pod/myweb-mfbpd           1/1     Running   0          25h
pod/myweb-zn8z2           1/1     Running   0          25h
pod/web-96d5df5c8-8fwsb   1/1     Running   0          69s
pod/web-96d5df5c8-g6hgp   1/1     Running   0          69s
pod/web-96d5df5c8-t7xzv   1/1     Running   0          69s

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
service/kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP          25h
service/mysql        ClusterIP   10.99.230.190    <none>        3306/TCP         25h
service/myweb        NodePort    10.105.77.88     <none>        8080:31330/TCP   25h
service/web          NodePort    10.103.246.193   <none>        8082:31303/TCP   17s

Problem 1: The Service cannot be accessed by name

If you are accessing the Service by its name, first make sure the CoreDNS service is deployed:

# kubectl get pods -n kube-system
NAME                                 READY   STATUS    RESTARTS   AGE
coredns-74ff55c5b-8q44c              1/1     Running   0          26h
coredns-74ff55c5b-f7j5g              1/1     Running   0          26h
etcd-k8s-master                      1/1     Running   2          26h
kube-apiserver-k8s-master            1/1     Running   2          26h
kube-controller-manager-k8s-master   1/1     Running   0          26h
kube-flannel-ds-f5tn6                1/1     Running   0          21h
kube-flannel-ds-ftfgf                1/1     Running   0          26h
kube-proxy-hnp7c                     1/1     Running   0          26h
kube-proxy-njw8l                     1/1     Running   0          21h
kube-scheduler-k8s-master            1/1     Running   0          26h

CoreDNS is deployed here; if its status were anything other than Running, you would check the container logs to dig further. We will use dnsutils to test name resolution. dnsutils.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
spec:
  containers:
  - name: dnsutils
    image: mydlqclub/dnsutils:1.3
    imagePullPolicy: IfNotPresent
    command: ["sleep","3600"]

Create the Pod and open a shell inside it:

# kubectl create -f dnsutils.yaml

# kubectl exec -it dnsutils sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.

/ # nslookup web
Server:     10.96.0.10
Address:    10.96.0.10#53

Name:   web.default.svc.cluster.local
Address: 10.103.246.193

If resolution fails, try qualifying the name with its namespace:

/ # nslookup web.default
Server:     10.96.0.10
Address:    10.96.0.10#53

Name:   web.default.svc.cluster.local
Address: 10.103.246.193

If that resolves, adjust the application to use the namespace-qualified name when it accesses Services across namespaces.

If resolution still fails, try the fully qualified name:

/ # nslookup web.default.svc.cluster.local
Server:     10.96.0.10
Address:    10.96.0.10#53

Name:   web.default.svc.cluster.local
Address: 10.103.246.193

Explanation: "default" is the namespace being operated in, "svc" marks the name as a Service, and "cluster.local" is the cluster domain.
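
The same scheme works for Services in other namespaces once the name is qualified; for example, the cluster DNS itself (kube-dns is a standard Service in kube-system on kubeadm clusters):

/ # nslookup kube-dns.kube-system.svc.cluster.local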

Now try the same lookup from one of the cluster's Nodes. Here it goes to the node's own DNS server rather than the cluster DNS (yours may differ; the cluster DNS IP can be checked with kubectl get svc -n kube-system):

#  nslookup web.default.svc.cluster.local
Server:     103.224.222.222
Address:    103.224.222.222#53

** server can't find web.default.svc.cluster.local: REFUSED

The lookup fails: the node's resolver knows nothing about cluster-internal names. Check whether /etc/resolv.conf is correct, and add the CoreDNS IP and search paths. Add:

nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

After editing (vim /etc/resolv.conf), the file reads:

# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 103.224.222.222
nameserver 103.224.222.223
nameserver 8.8.8.8
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

Explanation:

The nameserver line must include the CoreDNS Service IP; inside Pods, kubelet configures this automatically via the --cluster-dns flag.

The search line must contain an appropriate suffix so that Service names can be found. In this example it looks for Services in the local Namespace (default.svc.cluster.local), in all Namespaces (svc.cluster.local), and in the cluster as a whole (cluster.local).

The options line must set ndots high enough that the DNS client library prefers the search path; Kubernetes sets this to 5 by default. Note that the file edited above is the node's resolver configuration, while the resolv.conf a Pod uses is generated by kubelet; the Pod-side view can be checked directly, as shown below.
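
A quick check from the dnsutils Pod created earlier:

# Show the resolv.conf kubelet injected into the Pod
kubectl exec dnsutils -- cat /etc/resolv.conf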

Problem 2: The Service cannot be accessed by IP

Assuming access by Service name works (so CoreDNS is healthy), the next thing to test is whether the Service itself works. From one of the cluster's nodes, access the Service IP:

# curl -I 10.103.246.193
HTTP/1.1 200 OK
Server: Tengine
Date: Sun, 22 Aug 2021 13:04:15 GMT
Content-Type: text/html
Content-Length: 1326
Last-Modified: Wed, 26 Apr 2017 08:03:47 GMT
Connection: keep-alive
Vary: Accept-Encoding
ETag: "59005463-52e"
Accept-Ranges: bytes

In this cluster, however, it fails with a connection timeout:

# curl -I 10.103.246.193:8082
curl: (7) Failed to connect to 10.103.246.193 port 8082: Connection timed out

Check 1: Is the Service's port configuration correct?

Inspect the Service's configuration and the ports it uses:

# kubectl get svc web -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-08-22T04:04:11Z"
  labels:
    app: web
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:app: {}
      f:spec:
        f:externalTrafficPolicy: {}
        f:ports:
          .: {}
          k:{"port":8082,"protocol":"TCP"}:
            .: {}
            f:port: {}
            f:protocol: {}
            f:targetPort: {}
        f:selector:
          .: {}
          f:app: {}
        f:sessionAffinity: {}
        f:type: {}
    manager: kubectl-expose
    operation: Update
    time: "2021-08-22T04:04:11Z"
  name: web
  namespace: default
  resourceVersion: "118039"
  uid: fa5bbc6b-7a79-45a4-b6ba-e015340d2bab
spec:
  clusterIP: 10.103.246.193
  clusterIPs:
  - 10.103.246.193
  externalTrafficPolicy: Cluster
  ports:
  - nodePort: 31303
    port: 8082
    protocol: TCP
    targetPort: 8082
  selector:
    app: web
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

Explanation:

  • spec.ports[].port: the port served on the ClusterIP, 8082
  • targetPort: the target port, i.e. the port the service inside the container listens on, 8082
  • spec.nodePort: the port for access from outside the cluster, http://NodeIP:31303

Each of these is a separate access path worth probing; see the sketch below.
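
A sketch probing all three paths with this cluster's addresses (the node IP 192.168.0.3 appears later in the kube-proxy logs):

curl -I 10.103.246.193:8082   # ClusterIP:port      - the Service layer
curl -I 10.244.1.4:8082       # PodIP:targetPort    - bypasses the Service
curl -I 192.168.0.3:31303     # NodeIP:nodePort     - the external path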

Check 2: Is the Service associated with the right Pods?

Check which Pods the Service actually selects:

# kubectl get pods  -o wide -l app=web
NAME                  READY   STATUS    RESTARTS   AGE    IP           NODE        NOMINATED NODE   READINESS GATES
web-96d5df5c8-8fwsb   1/1     Running   0          4h9m   10.244.1.5   k8s-node2   <none>           <none>
web-96d5df5c8-g6hgp   1/1     Running   0          4h9m   10.244.1.6   k8s-node2   <none>           <none>
web-96d5df5c8-t7xzv   1/1     Running   0          4h9m   10.244.1.4   k8s-node2   <none>           <none>

The -l app=web argument is a label selector.


From k8s-node2 itself, though, the Pod is reachable:

root@k8s-node2:/data/k8s# curl -I 10.244.1.4
HTTP/1.1 200 OK
Server: nginx/1.21.1
Date: Sun, 22 Aug 2021 08:16:16 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 06 Jul 2021 14:59:17 GMT
Connection: keep-alive
ETag: "60e46fc5-264"
Accept-Ranges: bytes

All three of these Pods run on k8s-node2, not on the k8s-master node the query was issued from. Same-node access works while cross-node access does not, which strongly suggests a flannel problem.

Kubernetes runs a control loop that evaluates every Service's selector and saves the result into an Endpoints object.

root@k8s-master:/data/k8s# kubectl get endpoints web
NAME   ENDPOINTS                                         AGE
web    10.244.1.4:8082,10.244.1.5:8082,10.244.1.6:8082   4h14m

As the output shows, the endpoints controller has found Pods for the Service. That alone does not prove they are the right Pods, though; also confirm that the Service's spec.selector field matches the labels in the Deployment's Pod template.

root@k8s-master:/data/k8s# kubectl get svc web -o yaml
...
  selector:
    app: web
...

And fetch the Deployment's:

root@k8s-master:/data/k8s# kubectl get deployment web -o yaml

...
  selector:
    matchLabels:
      app: web
...
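
A compact way to compare the two selectors side by side; a sketch using jsonpath:

# Service selector and Deployment matchLabels -- these must agree
kubectl get svc web -o jsonpath='{.spec.selector}'; echo
kubectl get deployment web -o jsonpath='{.spec.selector.matchLabels}'; echo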

Check 3: Are the Pods themselves working?

Check whether the Pods work by bypassing the Service and hitting the Pod IPs directly:

root@k8s-master:/data/k8s# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE     IP           NODE         NOMINATED NODE   READINESS GATES
dnsutils              1/1     Running   29         29h     10.244.0.4   k8s-master   <none>           <none>
mysql-5ws56           1/1     Running   0          24h     10.244.1.3   k8s-node2    <none>           <none>
mysql-fwpgc           1/1     Running   0          29h     10.244.0.5   k8s-master   <none>           <none>
mysql-smggm           1/1     Running   0          24h     10.244.1.2   k8s-node2    <none>           <none>
myweb-8dc2n           1/1     Running   0          29h     10.244.0.7   k8s-master   <none>           <none>
myweb-mfbpd           1/1     Running   0          29h     10.244.0.6   k8s-master   <none>           <none>
myweb-zn8z2           1/1     Running   0          29h     10.244.0.8   k8s-master   <none>           <none>
web-96d5df5c8-8fwsb   1/1     Running   0          4h21m   10.244.1.5   k8s-node2    <none>           <none>
web-96d5df5c8-g6hgp   1/1     Running   0          4h21m   10.244.1.6   k8s-node2    <none>           <none>
web-96d5df5c8-t7xzv   1/1     Running   0          4h21m   10.244.1.4   k8s-node2    <none>           <none>

Pods deployed on the other node cannot be reached:

root@k8s-master:/data/k8s# curl -I 10.244.1.3:3306
curl: (7) Failed to connect to 10.244.1.3 port 3306: Connection timed out

Pods deployed on the local node can:

root@k8s-master:/data/k8s# curl -I 10.244.0.5:3306
5.7.35=H9A_)cÿÿ󿿕b.>,q#99~/~mysql_native_password!ÿ#08S01Got packets out of order

So once again the problem points at Pods on the two nodes being unable to communicate with each other.

Note: the Pod port (3306) is used here, not the Service port (which for mysql happens to be 3306 as well).

If a Pod does not respond on its own IP, the service inside the container is broken; use kubectl logs to read its logs, or kubectl exec to get into the Pod and inspect the service directly.
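
A sketch of both checks, using one of the web Pods listed above:

# Tail the container's log for errors
kubectl logs web-96d5df5c8-8fwsb --tail=20
# Open a shell inside the Pod and poke at the service from within
kubectl exec -it web-96d5df5c8-8fwsb -- sh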

Besides a fault in the service itself, the CNI network plugin may be misdeployed. The telltale symptom: out of 10 curl attempts only two or three succeed, and those succeed exactly when the chosen Pod happens to be on the current node, so the request never crossed the host network. If you see that pattern, check the network plugin's status and container logs:

root@k8s-master:/data/k8s# kubectl get pods -n kube-system
NAME                                 READY   STATUS    RESTARTS   AGE
coredns-74ff55c5b-8q44c              1/1     Running   0          29h
coredns-74ff55c5b-f7j5g              1/1     Running   0          29h
etcd-k8s-master                      1/1     Running   2          29h
kube-apiserver-k8s-master            1/1     Running   2          29h
kube-controller-manager-k8s-master   1/1     Running   0          29h
kube-flannel-ds-f5tn6                1/1     Running   0          24h
kube-flannel-ds-ftfgf                1/1     Running   0          29h
kube-proxy-hnp7c                     1/1     Running   0          29h
kube-proxy-njw8l                     1/1     Running   0          24h
kube-scheduler-k8s-master            1/1     Running   0          29h
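
If the flannel DaemonSet Pods report Running but cross-node traffic still fails, their logs are the next stop; a sketch using the Pod names above:

# Inspect the flannel Pod on each node for errors
kubectl logs kube-flannel-ds-f5tn6 -n kube-system --tail=50
kubectl logs kube-flannel-ds-ftfgf -n kube-system --tail=50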

Check 4: Is kube-proxy working properly?

If you have made it this far, your Service is running, it has Endpoints, and your Pods are serving. Next, check the component that implements Services: kube-proxy. Confirm its process is running:

root@k8s-master:/data/k8s# ps -ef |grep kube-proxy
root      8494  8469  0 Aug21 ?        00:00:15 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=k8s-master
root     24323 25972  0 16:34 pts/1    00:00:00 grep kube-proxy

If the process exists, next confirm that it is not hitting errors at runtime, such as failing to connect to the control plane. For that you need its logs, and how to get them depends on how the cluster was deployed; with kubeadm, read the Pod logs. Check the log on k8s-master:

root@k8s-master:/data/k8s# kubectl logs kube-proxy-hnp7c  -n kube-system
I0821 02:41:24.705408       1 node.go:172] Successfully retrieved node IP: 192.168.0.3
I0821 02:41:24.705709       1 server_others.go:142] kube-proxy node IP is an IPv4 address (192.168.0.3), assume IPv4 operation
W0821 02:41:24.740886       1 server_others.go:578] Unknown proxy mode "", assuming iptables proxy
I0821 02:41:24.740975       1 server_others.go:185] Using iptables Proxier.
I0821 02:41:24.742224       1 server.go:650] Version: v1.20.5
I0821 02:41:24.742656       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0821 02:41:24.742680       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0821 02:41:24.742931       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0821 02:41:24.742990       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0821 02:41:24.747556       1 config.go:315] Starting service config controller
I0821 02:41:24.748858       1 shared_informer.go:240] Waiting for caches to sync for service config
I0821 02:41:24.748901       1 config.go:224] Starting endpoint slice config controller
I0821 02:41:24.748927       1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
I0821 02:41:24.849006       1 shared_informer.go:247] Caches are synced for endpoint slice config 
I0821 02:41:24.849071       1 shared_informer.go:247] Caches are synced for service config 

Check the log on k8s-node2:

root@k8s-master:/data/k8s# kubectl logs kube-proxy-njw8l  -n kube-system
I0821 07:43:39.092419       1 node.go:172] Successfully retrieved node IP: 192.168.0.5
I0821 07:43:39.092475       1 server_others.go:142] kube-proxy node IP is an IPv4 address (192.168.0.5), assume IPv4 operation
W0821 07:43:39.108196       1 server_others.go:578] Unknown proxy mode "", assuming iptables proxy
I0821 07:43:39.108294       1 server_others.go:185] Using iptables Proxier.
I0821 07:43:39.108521       1 server.go:650] Version: v1.20.5
I0821 07:43:39.108814       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0821 07:43:39.109295       1 config.go:315] Starting service config controller
I0821 07:43:39.109304       1 shared_informer.go:240] Waiting for caches to sync for service config
I0821 07:43:39.109323       1 config.go:224] Starting endpoint slice config controller
I0821 07:43:39.109327       1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
I0821 07:43:39.209418       1 shared_informer.go:247] Caches are synced for endpoint slice config 
I0821 07:43:39.209418       1 shared_informer.go:247] Caches are synced for service config 

One line stands out, Unknown proxy mode "", assuming iptables proxy, which shows kube-proxy is running in iptables mode.

If the cluster was deployed from binaries instead:

journalctl -u kube-proxy

Check 5: Is kube-proxy writing iptables rules?

kube-proxy's main job is generating the load-balancing rules for Services, implemented with iptables by default. Check whether those rules have actually been written. The iptables entries on k8s-master:

root@k8s-master:/data/k8s# iptables-save |grep web
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/myweb" -m tcp --dport 31330 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/myweb" -m tcp --dport 31330 -j KUBE-SVC-FCM76ICS4D7Y4C5Y
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/web" -m tcp --dport 31303 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/web" -m tcp --dport 31303 -j KUBE-SVC-LOLE4ISW44XBNF3G
-A KUBE-SEP-KYOPKKRUSGN4EPOL -s 10.244.0.8/32 -m comment --comment "default/myweb" -j KUBE-MARK-MASQ
-A KUBE-SEP-KYOPKKRUSGN4EPOL -p tcp -m comment --comment "default/myweb" -m tcp -j DNAT --to-destination 10.244.0.8:8080
-A KUBE-SEP-MOKUSSRWIVOFT5Y7 -s 10.244.0.7/32 -m comment --comment "default/myweb" -j KUBE-MARK-MASQ
-A KUBE-SEP-MOKUSSRWIVOFT5Y7 -p tcp -m comment --comment "default/myweb" -m tcp -j DNAT --to-destination 10.244.0.7:8080
-A KUBE-SEP-V6Q53FEPJ64J3EJW -s 10.244.1.6/32 -m comment --comment "default/web" -j KUBE-MARK-MASQ
-A KUBE-SEP-V6Q53FEPJ64J3EJW -p tcp -m comment --comment "default/web" -m tcp -j DNAT --to-destination 10.244.1.6:8082
-A KUBE-SEP-YCBVNDXW4SG5UDC3 -s 10.244.1.5/32 -m comment --comment "default/web" -j KUBE-MARK-MASQ
-A KUBE-SEP-YCBVNDXW4SG5UDC3 -p tcp -m comment --comment "default/web" -m tcp -j DNAT --to-destination 10.244.1.5:8082
-A KUBE-SEP-YQ4MLBG6JI5O2LTN -s 10.244.0.6/32 -m comment --comment "default/myweb" -j KUBE-MARK-MASQ
-A KUBE-SEP-YQ4MLBG6JI5O2LTN -p tcp -m comment --comment "default/myweb" -m tcp -j DNAT --to-destination 10.244.0.6:8080
-A KUBE-SEP-ZNATZ23XMS7WU546 -s 10.244.1.4/32 -m comment --comment "default/web" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZNATZ23XMS7WU546 -p tcp -m comment --comment "default/web" -m tcp -j DNAT --to-destination 10.244.1.4:8082
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.105.77.88/32 -p tcp -m comment --comment "default/myweb cluster IP" -m tcp --dport 8080 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.105.77.88/32 -p tcp -m comment --comment "default/myweb cluster IP" -m tcp --dport 8080 -j KUBE-SVC-FCM76ICS4D7Y4C5Y
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.103.246.193/32 -p tcp -m comment --comment "default/web cluster IP" -m tcp --dport 8082 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.103.246.193/32 -p tcp -m comment --comment "default/web cluster IP" -m tcp --dport 8082 -j KUBE-SVC-LOLE4ISW44XBNF3G
-A KUBE-SVC-FCM76ICS4D7Y4C5Y -m comment --comment "default/myweb" -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-YQ4MLBG6JI5O2LTN
-A KUBE-SVC-FCM76ICS4D7Y4C5Y -m comment --comment "default/myweb" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-MOKUSSRWIVOFT5Y7
-A KUBE-SVC-FCM76ICS4D7Y4C5Y -m comment --comment "default/myweb" -j KUBE-SEP-KYOPKKRUSGN4EPOL
-A KUBE-SVC-LOLE4ISW44XBNF3G -m comment --comment "default/web" -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-ZNATZ23XMS7WU546
-A KUBE-SVC-LOLE4ISW44XBNF3G -m comment --comment "default/web" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-YCBVNDXW4SG5UDC3
-A KUBE-SVC-LOLE4ISW44XBNF3G -m comment --comment "default/web" -j KUBE-SEP-V6Q53FEPJ64J3EJW

The iptables entries on k8s-node2:

root@k8s-node2:/data/k8s# iptables-save |grep web
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/myweb" -m tcp --dport 31330 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/myweb" -m tcp --dport 31330 -j KUBE-SVC-FCM76ICS4D7Y4C5Y
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/web" -m tcp --dport 31303 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/web" -m tcp --dport 31303 -j KUBE-SVC-LOLE4ISW44XBNF3G
-A KUBE-SEP-KYOPKKRUSGN4EPOL -s 10.244.0.8/32 -m comment --comment "default/myweb" -j KUBE-MARK-MASQ
-A KUBE-SEP-KYOPKKRUSGN4EPOL -p tcp -m comment --comment "default/myweb" -m tcp -j DNAT --to-destination 10.244.0.8:8080
-A KUBE-SEP-MOKUSSRWIVOFT5Y7 -s 10.244.0.7/32 -m comment --comment "default/myweb" -j KUBE-MARK-MASQ
-A KUBE-SEP-MOKUSSRWIVOFT5Y7 -p tcp -m comment --comment "default/myweb" -m tcp -j DNAT --to-destination 10.244.0.7:8080
-A KUBE-SEP-V6Q53FEPJ64J3EJW -s 10.244.1.6/32 -m comment --comment "default/web" -j KUBE-MARK-MASQ
-A KUBE-SEP-V6Q53FEPJ64J3EJW -p tcp -m comment --comment "default/web" -m tcp -j DNAT --to-destination 10.244.1.6:8082
-A KUBE-SEP-YCBVNDXW4SG5UDC3 -s 10.244.1.5/32 -m comment --comment "default/web" -j KUBE-MARK-MASQ
-A KUBE-SEP-YCBVNDXW4SG5UDC3 -p tcp -m comment --comment "default/web" -m tcp -j DNAT --to-destination 10.244.1.5:8082
-A KUBE-SEP-YQ4MLBG6JI5O2LTN -s 10.244.0.6/32 -m comment --comment "default/myweb" -j KUBE-MARK-MASQ
-A KUBE-SEP-YQ4MLBG6JI5O2LTN -p tcp -m comment --comment "default/myweb" -m tcp -j DNAT --to-destination 10.244.0.6:8080
-A KUBE-SEP-ZNATZ23XMS7WU546 -s 10.244.1.4/32 -m comment --comment "default/web" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZNATZ23XMS7WU546 -p tcp -m comment --comment "default/web" -m tcp -j DNAT --to-destination 10.244.1.4:8082
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.105.77.88/32 -p tcp -m comment --comment "default/myweb cluster IP" -m tcp --dport 8080 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.105.77.88/32 -p tcp -m comment --comment "default/myweb cluster IP" -m tcp --dport 8080 -j KUBE-SVC-FCM76ICS4D7Y4C5Y
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.103.246.193/32 -p tcp -m comment --comment "default/web cluster IP" -m tcp --dport 8082 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.103.246.193/32 -p tcp -m comment --comment "default/web cluster IP" -m tcp --dport 8082 -j KUBE-SVC-LOLE4ISW44XBNF3G
-A KUBE-SVC-FCM76ICS4D7Y4C5Y -m comment --comment "default/myweb" -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-YQ4MLBG6JI5O2LTN
-A KUBE-SVC-FCM76ICS4D7Y4C5Y -m comment --comment "default/myweb" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-MOKUSSRWIVOFT5Y7
-A KUBE-SVC-FCM76ICS4D7Y4C5Y -m comment --comment "default/myweb" -j KUBE-SEP-KYOPKKRUSGN4EPOL
-A KUBE-SVC-LOLE4ISW44XBNF3G -m comment --comment "default/web" -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-ZNATZ23XMS7WU546
-A KUBE-SVC-LOLE4ISW44XBNF3G -m comment --comment "default/web" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-YCBVNDXW4SG5UDC3
-A KUBE-SVC-LOLE4ISW44XBNF3G -m comment --comment "default/web" -j KUBE-SEP-V6Q53FEPJ64J3EJW
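
Reading these dumps by hand: traffic to the ClusterIP matches a KUBE-SERVICES entry, jumps to the Service's KUBE-SVC-* chain, which picks one KUBE-SEP-* chain at random, and that chain DNATs to a single Pod. A sketch for walking just the web Service's chain, using the chain names from the dump above:

# List the load-balancing chain for default/web in the nat table
iptables -t nat -L KUBE-SVC-LOLE4ISW44XBNF3G -n
# Then list one of its endpoint chains to see the DNAT target
iptables -t nat -L KUBE-SEP-ZNATZ23XMS7WU546 -n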

If you have already switched the proxy mode to IPVS, inspect the rules this way instead. Healthy output looks like:

[root@k8s-node1 ~]# ipvsadm -ln
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
...
TCP 10.104.0.64:80 rr
  -> 10.244.169.135:80 Masq 1 0 0
  -> 10.244.36.73:80 Masq 1 0 0
  -> 10.244.169.136:80 Masq 1 0 0...

Use ipvsadm to view the IPVS rules; if the command is missing, it can be installed directly:

apt-get  install -y ipvsadm

The current state on k8s-master:

root@k8s-master:/data/k8s# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn

Normally you would get output like the example shown earlier; if no rules appear at all, kube-proxy is not doing its job, or rule generation failed due to an incompatibility with the current operating system.

Appendix: Service traffic flow diagram (illustrative; not actual IP addresses). [Figure omitted.]

Resolving problem 2: the Service cannot be accessed by IP

Nothing in the iptables-save output looks obviously wrong, though admittedly I am not that familiar with the iptables approach. Let's try switching kube-proxy from iptables to IPVS and see whether that helps.

Perform the following operations on both nodes, k8s-master and k8s-node2.

Load the kernel modules

Check whether the IPVS kernel modules are loaded:

# lsmod|grep ip_vs
ip_vs_sh               16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  0
ip_vs                 147456  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack          106496  7 ip_vs,nf_nat,nf_nat_ipv4,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_netlink,nf_conntrack_ipv4
libcrc32c              16384  2 raid456,ip_vs

If they are not loaded, load the IPVS modules with:

modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4

Change the kube-proxy configuration

# kubectl edit configmap kube-proxy -n kube-system

Locate the following section:

    ipvs:
      minSyncPeriod: 0s
      scheduler: ""
      syncPeriod: 30s
    kind: KubeProxyConfiguration
    metricsBindAddress: ""
    mode: "ipvs"
    nodePortAddresses: null

mode was previously empty, which defaults to iptables; change it to ipvs. scheduler is empty as well, which defaults to the round-robin (rr) scheduling algorithm. When done editing, save and exit.
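
If you would rather not edit interactively, the same change can be scripted; a rough sketch (assuming mode is still the empty string, as in a default kubeadm ConfigMap):

# Rewrite mode: "" to mode: "ipvs" and re-apply the ConfigMap
kubectl get configmap kube-proxy -n kube-system -o yaml \
  | sed 's/mode: ""/mode: "ipvs"/' \
  | kubectl apply -f -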

Delete all of kube-proxy's Pods

# kubectl get pods -n kube-system |grep kube-proxy
kube-proxy-hnp7c                     1/1     Running   0          30h
kube-proxy-njw8l                     1/1     Running   0          25h

root@k8s-node2:/data/k8s# kubectl delete pod   kube-proxy-hnp7c  -n kube-system
pod "kube-proxy-hnp7c" deleted
root@k8s-node2:/data/k8s# kubectl delete pod   kube-proxy-njw8l  -n kube-system 
pod "kube-proxy-njw8l" deleted

root@k8s-node2:/data/k8s#  kubectl get pods -n kube-system |grep kube-proxy
kube-proxy-4sv2c                     1/1     Running   0          36s
kube-proxy-w7kpm                     1/1     Running   0          16s

root@k8s-node2:/data/k8s# kubectl logs kube-proxy-4sv2c  -n kube-system
I0822 09:36:38.757662       1 node.go:172] Successfully retrieved node IP: 192.168.0.3
I0822 09:36:38.757707       1 server_others.go:142] kube-proxy node IP is an IPv4 address (192.168.0.3), assume IPv4 operation
I0822 09:36:38.772798       1 server_others.go:258] Using ipvs Proxier.
W0822 09:36:38.774131       1 proxier.go:445] IPVS scheduler not specified, use rr by default
I0822 09:36:38.774388       1 server.go:650] Version: v1.20.5
I0822 09:36:38.774742       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0822 09:36:38.775051       1 config.go:224] Starting endpoint slice config controller
I0822 09:36:38.775127       1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
I0822 09:36:38.775245       1 config.go:315] Starting service config controller
I0822 09:36:38.775290       1 shared_informer.go:240] Waiting for caches to sync for service config
I0822 09:36:38.875365       1 shared_informer.go:247] Caches are synced for endpoint slice config 
I0822 09:36:38.875616       1 shared_informer.go:247] Caches are synced for service config 

The line Using ipvs Proxier. in the log confirms the switch to IPVS took effect.
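
Incidentally, rather than deleting the Pods one by one as above, they can be removed in a single command by label; a sketch assuming the standard kubeadm label k8s-app=kube-proxy:

# The DaemonSet recreates each kube-proxy Pod with the new config
kubectl delete pods -n kube-system -l k8s-app=kube-proxy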

Run ipvsadm

Use ipvsadm to view the IPVS rules; as above, if the command is missing it can be installed with apt-get.

root@k8s-master:/data/k8s# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.17.0.1:31330 rr
  -> 10.244.0.6:8080              Masq    1      0          0         
  -> 10.244.0.7:8080              Masq    1      0          0         
  -> 10.244.0.8:8080              Masq    1      0          0         
TCP  192.168.0.3:31303 rr
  -> 10.244.1.4:8082              Masq    1      0          0         
  -> 10.244.1.5:8082              Masq    1      0          0         
  -> 10.244.1.6:8082              Masq    1      0          0         
TCP  192.168.0.3:31330 rr
  -> 10.244.0.6:8080              Masq    1      0          0         
  -> 10.244.0.7:8080              Masq    1      0          0         
  -> 10.244.0.8:8080              Masq    1      0          0         
TCP  10.96.0.1:443 rr
  -> 192.168.0.3:6443             Masq    1      0          0         
TCP  10.96.0.10:53 rr
  -> 10.244.0.2:53                Masq    1      0          0         
  -> 10.244.0.3:53                Masq    1      0          0         
TCP  10.96.0.10:9153 rr
  -> 10.244.0.2:9153              Masq    1      0          0         
  -> 10.244.0.3:9153              Masq    1      0          0         
TCP  10.99.230.190:3306 rr
  -> 10.244.0.5:3306              Masq    1      0          0         
  -> 10.244.1.2:3306              Masq    1      0          0         
  -> 10.244.1.3:3306              Masq    1      0          0         
TCP  10.103.246.193:8082 rr
  -> 10.244.1.4:8082              Masq    1      0          0         
  -> 10.244.1.5:8082              Masq    1      0          0         
  -> 10.244.1.6:8082              Masq    1      0          0         
TCP  10.105.77.88:8080 rr
  -> 10.244.0.6:8080              Masq    1      0          0         
  -> 10.244.0.7:8080              Masq    1      0          0         
  -> 10.244.0.8:8080              Masq    1      0          0         
TCP  10.244.0.0:31303 rr
  -> 10.244.1.4:8082              Masq    1      0          0         
  -> 10.244.1.5:8082              Masq    1      0          0         
  -> 10.244.1.6:8082              Masq    1      0          0         
TCP  10.244.0.0:31330 rr
  -> 10.244.0.6:8080              Masq    1      0          0         
  -> 10.244.0.7:8080              Masq    1      0          0         
  -> 10.244.0.8:8080              Masq    1      0          0         
TCP  10.244.0.1:31303 rr
  -> 10.244.1.4:8082              Masq    1      0          0         
  -> 10.244.1.5:8082              Masq    1      0          0         
  -> 10.244.1.6:8082              Masq    1      0          0         
TCP  10.244.0.1:31330 rr
  -> 10.244.0.6:8080              Masq    1      0          0         
  -> 10.244.0.7:8080              Masq    1      0          0         
  -> 10.244.0.8:8080              Masq    1      0          0         
TCP  127.0.0.1:31303 rr
  -> 10.244.1.4:8082              Masq    1      0          0         
  -> 10.244.1.5:8082              Masq    1      0          0         
  -> 10.244.1.6:8082              Masq    1      0          0         
TCP  127.0.0.1:31330 rr
  -> 10.244.0.6:8080              Masq    1      0          0         
  -> 10.244.0.7:8080              Masq    1      0          0         
  -> 10.244.0.8:8080              Masq    1      0          0         
TCP  172.17.0.1:31303 rr
  -> 10.244.1.4:8082              Masq    1      0          0         
  -> 10.244.1.5:8082              Masq    1      0          0         
  -> 10.244.1.6:8082              Masq    1      0          0         
UDP  10.96.0.10:53 rr
  -> 10.244.0.2:53                Masq    1      0          564       
  -> 10.244.0.3:53                Masq    1      0          563
root@k8s-master:/data/k8s# curl -I 10.103.246.193:8082
^C
root@k8s-master:/data/k8s# curl -I 114.67.107.240:8082
^C

Still not solved.

Low-level iptables settings

A web search turned up an article on fixing cross-host connectivity for k8s Pods and containers under flannel (解决flannel下k8s pod及容器无法跨主机互通问题); following it, apply the configuration below on k8s-master and k8s-node2.

# iptables -P INPUT ACCEPT
# iptables -P FORWARD ACCEPT
# iptables -F

# iptables -L -n

root@k8s-master:/data/k8s#  iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
JDCLOUDHIDS_IN_LIVE  all  --  0.0.0.0/0            0.0.0.0/0           
JDCLOUDHIDS_IN  all  --  0.0.0.0/0            0.0.0.0/0           

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
KUBE-FORWARD  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */
ACCEPT     all  --  10.244.0.0/16        0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            10.244.0.0/16       

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
JDCLOUDHIDS_OUT_LIVE  all  --  0.0.0.0/0            0.0.0.0/0           
JDCLOUDHIDS_OUT  all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (0 references)
target     prot opt source               destination         

Chain JDCLOUDHIDS_IN (1 references)
target     prot opt source               destination         

Chain JDCLOUDHIDS_IN_LIVE (1 references)
target     prot opt source               destination         

Chain JDCLOUDHIDS_OUT (1 references)
target     prot opt source               destination         

Chain JDCLOUDHIDS_OUT_LIVE (1 references)
target     prot opt source               destination         

Chain KUBE-EXTERNAL-SERVICES (0 references)
target     prot opt source               destination         

Chain KUBE-FIREWALL (0 references)
target     prot opt source               destination         

Chain KUBE-FORWARD (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-PROXY-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-SERVICES (0 references)
target     prot opt source               destination

Rerunning the tests after that, the node hosting the service can now be reached directly, but port 8082 is still inaccessible and cross-node pings still fail:

root@k8s-master:/data/k8s# curl -I 10.103.246.193:8082
^C
root@k8s-master:/data/k8s# curl -I 114.67.107.240:8082
^C

root@k8s-master:/data/k8s# ping 10.244.1.3
PING 10.244.1.3 (10.244.1.3) 56(84) bytes of data.
^C
--- 10.244.1.3 ping statistics ---
12 packets transmitted, 0 received, 100% packet loss, time 10999ms

root@k8s-master:/data/k8s# ping 10.244.0.5
PING 10.244.0.5 (10.244.0.5) 56(84) bytes of data.
64 bytes from 10.244.0.5: icmp_seq=1 ttl=64 time=0.089 ms
64 bytes from 10.244.0.5: icmp_seq=2 ttl=64 time=0.082 ms
^C
--- 10.244.0.5 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.082/0.085/0.089/0.009 ms


# curl -I 10.103.246.193
HTTP/1.1 200 OK
Server: Tengine
Date: Sun, 22 Aug 2021 13:10:02 GMT
Content-Type: text/html
Content-Length: 1326
Last-Modified: Wed, 26 Apr 2017 08:03:47 GMT
Connection: keep-alive
Vary: Accept-Encoding
ETag: "59005463-52e"
Accept-Ranges: bytes
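
Since Pods on different nodes still cannot reach each other while same-node traffic is fine, the flannel data path itself is the next thing to inspect. A minimal diagnostic sketch to run on each node (the interface name flannel.1 assumes flannel's default VXLAN backend):

# The Pod subnet flannel leased to this node
cat /run/flannel/subnet.env
# State and parameters of the VXLAN device
ip -d link show flannel.1
# Routes toward the other node's Pod subnet
ip route | grep 10.244.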

References

(1)K8s常见问题:Service 不能访问排查流程 https://mp.weixin.qq.com/s/oCRWkBquUnRLC36CPwoZ1Q

(2)kube-proxy开启ipvs代替iptables https://www.shangmayuan.com/a/8fae7d6c18764194a8adce91.html

