Flannel host-gw and vxlan

runzhliu
Published 2022-08-28 13:59:49
Included in the column: 容器计算 (Container Computing)

Overview

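Flannel is the default network plugin for LCK, and by default LCK runs it in vxlan mode. In private (on-premises) deployments, if the customer's hosts are known to be in the same subnet, host-gw mode can be used to improve network performance.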

Installation

Flannel is installed as follows. The installation YAML defines two initContainers dedicated to installing the CNI plugin and the Flannel configuration, which is why they are named install-cni-plugin and install-cni.

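How do these two containers perform the installation? It is simple: look at the args fields. They just use cp to copy the flannel binary, as well as cni-conf.json and 10-flannel.conflist, to the target directories.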

initContainers:
- name: install-cni-plugin
 #image: flannelcni/flannel-cni-plugin:v1.1.0 for ppc64le and mips64le (dockerhub limitations may apply)
  image: rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
  command:
  - cp
  args:
  - -f
  - /flannel
  - /opt/cni/bin/flannel
  volumeMounts:
  - name: cni-plugin
    mountPath: /opt/cni/bin
- name: install-cni
 #image: flannelcni/flannel:v0.18.1 for ppc64le and mips64le (dockerhub limitations may apply)
  image: rancher/mirrored-flannelcni-flannel:v0.18.1
  command:
  - cp
  args:
  - -f
  - /etc/kube-flannel/cni-conf.json
  - 
  volumeMounts:
  - name: cni
    mountPath: /etc/cni/net.d
  - name: flannel-cfg
    mountPath: /etc/kube-flannel/
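
The two copy steps can be imitated outside a cluster. The following is only a sketch, using temporary directories as stand-ins for the image contents and the host paths. The helper name `copy_force` is hypothetical, and the destination file name 10-flannel.conflist is taken from the prose above because the corresponding args entry is empty in the excerpt.

```python
import shutil
import tempfile
from pathlib import Path


def copy_force(src, dst):
    """Roughly what `cp -f src dst` does: overwrite dst with the contents of src."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return dst


def demo():
    """Run both copy steps in a scratch directory and list what lands on the 'host'."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Stand-ins for the binary shipped in the image and the mounted ConfigMap file.
        image_bin = root / "image" / "flannel"
        mounted_cfg = root / "image" / "etc" / "kube-flannel" / "cni-conf.json"
        mounted_cfg.parent.mkdir(parents=True)
        image_bin.write_bytes(b"fake flannel binary")
        mounted_cfg.write_text('{"name": "cbr0"}')

        host = root / "host"  # stand-in for the host directories behind the volumeMounts
        # install-cni-plugin: binary into /opt/cni/bin
        copy_force(image_bin, host / "opt/cni/bin/flannel")
        # install-cni: CNI config into /etc/cni/net.d (file name assumed from the prose)
        copy_force(mounted_cfg, host / "etc/cni/net.d/10-flannel.conflist")
        return sorted(p.relative_to(host).as_posix() for p in host.rglob("*") if p.is_file())


if __name__ == "__main__":
    for path in demo():
        print(path)
```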

Where do these configuration files come from? They come from a ConfigMap.

kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
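
The backend is selected by the Backend.Type field of net-conf.json. As a minimal sketch, assuming the only change needed to move between the two modes is that field, the JSON can be rewritten like this (the helper name `with_backend` is hypothetical):

```python
import json

# net-conf.json as it appears in the ConfigMap above.
NET_CONF = """
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "vxlan"
  }
}
"""


def with_backend(net_conf_text, backend_type):
    """Return net-conf.json text with Backend.Type replaced, other keys untouched."""
    conf = json.loads(net_conf_text)
    conf.setdefault("Backend", {})["Type"] = backend_type
    return json.dumps(conf, indent=2)


if __name__ == "__main__":
    print(with_backend(NET_CONF, "host-gw"))
```

The result would still have to be written back into the kube-flannel-cfg ConfigMap, and the Flannel pods most likely need to be recreated before the new backend shows up in their startup logs.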

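Unlike the files copied by the initContainers, these configuration files are not written to the host. They are provided through a volumeMount to the container that runs the Flannel binary, so you will not find them under /etc/kube-flannel/ on the host; they are visible only inside the Flannel container.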

# kiexec
Namespace: kube-system | Pod: ✔ kube-flannel-ds-82mww
/ # ls /etc/kube-flannel/
cni-conf.json  net-conf.json

Default configuration

vxlan is the mode Flannel uses by default. In this mode the node routes look like this:

# ip r
default via 172.22.0.1 dev eth0
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
10.244.3.0/24 via 10.244.3.0 dev flannel.1 onlink
10.244.4.0/24 via 10.244.4.0 dev flannel.1 onlink
10.244.5.0/24 via 10.244.5.0 dev flannel.1 onlink
169.254.0.0/16 dev eth0 scope link metric 1002

By changing the configuration, Flannel can also be switched to host-gw. In this mode the node routes become:

# ip r
default via 172.22.0.1 dev eth0
10.4.0.0/24 dev nerdctl0 proto kernel scope link src 10.4.0.1
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.1.0/24 via 172.22.1.176 dev eth0
10.244.2.0/24 via 172.22.0.117 dev eth0
10.244.3.0/24 via 172.22.0.76 dev eth0
10.244.4.0/24 via 172.22.0.212 dev eth0
10.244.5.0/24 via 172.22.0.64 dev eth0
169.254.0.0/16 dev eth0 scope link metric 1002
172.22.0.0/20 dev eth0 proto kernel scope link src 172.22.0.239
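
The difference between the two route tables can be checked mechanically. The sketch below parses trimmed copies of the two listings and labels each remote Pod-subnet route. The rule it uses (gateway inside the Pod network means vxlan, gateway outside means host-gw) is a heuristic assumed for this illustration, not something Flannel itself exposes.

```python
import ipaddress

# Trimmed copies of the two `ip r` listings above.
VXLAN_ROUTES = """\
default via 172.22.0.1 dev eth0
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
169.254.0.0/16 dev eth0 scope link metric 1002
"""

HOST_GW_ROUTES = """\
default via 172.22.0.1 dev eth0
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.1.0/24 via 172.22.1.176 dev eth0
10.244.2.0/24 via 172.22.0.117 dev eth0
172.22.0.0/20 dev eth0 proto kernel scope link src 172.22.0.239
"""

POD_NETWORK = ipaddress.ip_network("10.244.0.0/16")  # "Network" in net-conf.json


def classify_route(line):
    """Return (destination, gateway, mode) for a remote Pod-subnet route, else None."""
    fields = line.split()
    if len(fields) < 5 or fields[1] != "via" or fields[3] != "dev":
        return None  # directly connected routes have no gateway
    try:
        dst = ipaddress.ip_network(fields[0])
    except ValueError:
        return None  # e.g. the "default" route
    if not dst.subnet_of(POD_NETWORK):
        return None
    gateway = ipaddress.ip_address(fields[2])
    # Assumed heuristic: a gateway inside the Pod network is reached through
    # the overlay device (vxlan); a gateway outside it is a node address,
    # which is a plain host-gw route.
    mode = "vxlan" if gateway in POD_NETWORK else "host-gw"
    return str(dst), str(gateway), mode


def remote_pod_routes(ip_route_output):
    """Classify every line of an `ip r` listing and drop the non-Pod routes."""
    routes = [classify_route(line) for line in ip_route_output.splitlines()]
    return [r for r in routes if r is not None]


if __name__ == "__main__":
    for listing in (VXLAN_ROUTES, HOST_GW_ROUTES):
        for route in remote_pod_routes(listing):
            print(route)
```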

After the switch, the Flannel logs look like this:

I0826 03:22:37.551391       1 main.go:463] Found network config - Backend type: host-gw
I0826 03:22:37.551432       1 match.go:195] Determining IP address of default interface
I0826 03:22:37.551838       1 match.go:248] Using interface with name eth0 and address 172.22.1.176
I0826 03:22:37.551860       1 match.go:270] Defaulting external address to interface address (172.22.1.176)
I0826 03:22:37.569614       1 kube.go:351] Setting NodeNetworkUnavailable
I0826 03:22:37.579433       1 main.go:341] Setting up masking rules
I0826 03:22:37.758215       1 main.go:362] Changing default FORWARD chain policy to ACCEPT
I0826 03:22:37.758315       1 main.go:375] Wrote subnet file to /run/flannel/subnet.env
I0826 03:22:37.758326       1 main.go:379] Running backend.
I0826 03:22:37.758343       1 main.go:400] Waiting for all goroutines to exit
I0826 03:22:37.761081       1 route_network.go:55] Watching for new subnet leases
I0826 03:22:37.761153       1 route_network.go:92] Subnet added: 10.244.0.0/24 via 172.22.0.239
W0826 03:22:37.761524       1 route_network.go:151] Replacing existing route to {Ifindex: 5 Dst: 10.244.0.0/24 Src: <nil> Gw: 10.244.0.0 Flags: [onlink] Table: 254 Realm: 0} with {Ifindex: 2 Dst: 10.244.0.0/24 Src: <nil> Gw: 172.22.0.239 Flags: [] Table: 0 Realm: 0}
I0826 03:22:37.848961       1 route_network.go:92] Subnet added: 10.244.2.0/24 via 172.22.0.117
W0826 03:22:37.849059       1 route_network.go:151] Replacing existing route to {Ifindex: 5 Dst: 10.244.2.0/24 Src: <nil> Gw: 10.244.2.0 Flags: [onlink] Table: 254 Realm: 0} with {Ifindex: 2 Dst: 10.244.2.0/24 Src: <nil> Gw: 172.22.0.117 Flags: [] Table: 0 Realm: 0}
I0826 03:22:37.849360       1 route_network.go:92] Subnet added: 10.244.3.0/24 via 172.22.0.76
W0826 03:22:37.849454       1 route_network.go:151] Replacing existing route to {Ifindex: 5 Dst: 10.244.3.0/24 Src: <nil> Gw: 10.244.3.0 Flags: [onlink] Table: 254 Realm: 0} with {Ifindex: 2 Dst: 10.244.3.0/24 Src: <nil> Gw: 172.22.0.76 Flags: [] Table: 0 Realm: 0}
I0826 03:22:37.850273       1 route_network.go:92] Subnet added: 10.244.4.0/24 via 172.22.0.212
W0826 03:22:37.850377       1 route_network.go:151] Replacing existing route to {Ifindex: 5 Dst: 10.244.4.0/24 Src: <nil> Gw: 10.244.4.0 Flags: [onlink] Table: 254 Realm: 0} with {Ifindex: 2 Dst: 10.244.4.0/24 Src: <nil> Gw: 172.22.0.212 Flags: [] Table: 0 Realm: 0}
I0826 03:22:37.850675       1 route_network.go:92] Subnet added: 10.244.5.0/24 via 172.22.0.64
W0826 03:22:37.850758       1 route_network.go:151] Replacing existing route to {Ifindex: 5 Dst: 10.244.5.0/24 Src: <nil> Gw: 10.244.5.0 Flags: [onlink] Table: 254 Realm: 0} with {Ifindex: 2 Dst: 10.244.5.0/24 Src: <nil> Gw: 172.22.0.64 Flags: [] Table: 0 Realm: 0}
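
To see which node serves which Pod subnet, the "Subnet added" lines can be pulled out of such a log. This is a small sketch that assumes the message format stays exactly as shown in this log; the function name `leases_from_log` is hypothetical.

```python
import re

# A few lines copied from the log above.
LOG_LINES = """\
I0826 03:22:37.761081       1 route_network.go:55] Watching for new subnet leases
I0826 03:22:37.761153       1 route_network.go:92] Subnet added: 10.244.0.0/24 via 172.22.0.239
I0826 03:22:37.848961       1 route_network.go:92] Subnet added: 10.244.2.0/24 via 172.22.0.117
I0826 03:22:37.849360       1 route_network.go:92] Subnet added: 10.244.3.0/24 via 172.22.0.76
"""

SUBNET_ADDED = re.compile(r"Subnet added: (\S+) via (\S+)")


def leases_from_log(log_text):
    """Map each Pod subnet to the node IP that the log reports as its gateway."""
    return {m.group(1): m.group(2) for m in SUBNET_ADDED.finditer(log_text)}


if __name__ == "__main__":
    for subnet, node_ip in leases_from_log(LOG_LINES).items():
        print(f"{subnet} -> {node_ip}")
```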

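The log line Subnet added: 10.244.0.0/24 via 172.22.0.239 makes it very clear what happens: the adjusted route uses a node's IP as the gateway for a Pod subnet, so packets can be routed straight to that node without encapsulation. In addition, because host-gw needs no encapsulation or decapsulation, Flannel automatically sets the MTU to 1500.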

# cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=true
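
The two MTU values seen in this article (1450 in the vxlan benchmark output, 1500 here) follow from the size of the VXLAN encapsulation. The sketch below works through that arithmetic, assuming VXLAN over IPv4 on a link with a 1500-byte MTU, and also reads subnet.env into a dictionary. Both function names are hypothetical.

```python
# Bytes added around each inner packet by VXLAN over IPv4 (assumed transport).
OUTER_IPV4 = 20      # outer IPv4 header
OUTER_UDP = 8        # outer UDP header
VXLAN_HEADER = 8     # VXLAN header
INNER_ETHERNET = 14  # inner Ethernet header carried inside the tunnel

SUBNET_ENV = """\
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=true
"""


def vxlan_mtu(link_mtu):
    """MTU left for Pod traffic once the VXLAN overhead is subtracted."""
    return link_mtu - (OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET)


def parse_subnet_env(text):
    """Parse KEY=VALUE lines such as those in /run/flannel/subnet.env."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key] = value
    return env


if __name__ == "__main__":
    env = parse_subnet_env(SUBNET_ENV)
    print("host-gw MTU from subnet.env:", env["FLANNEL_MTU"])
    print("vxlan MTU on a 1500-byte link:", vxlan_mtu(1500))
```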

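Do other containers need to be restarted after the configuration change? Normally not, because a container's network stack only sends its packets to the cni0 device; whether they then travel over vxlan or host-gw depends entirely on the routing configuration. That said, some components may be sensitive to changes in routes or in the network scheme, so test carefully before making the change. Also, although host-gw performs better, it has prerequisites: at a minimum, the worker nodes must be in the same subnet, that is, able to reach each other at layer 2.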
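
A quick pre-check for the same-subnet requirement can be sketched with the standard ipaddress module. The node IPs and the 172.22.0.0/20 subnet are taken from the host-gw route listing above. Sharing an IP subnet is used here only as a proxy for layer-2 reachability, which is an assumption: it does not prove that the nodes really sit on one broadcast domain.

```python
import ipaddress

# Node addresses and node subnet as they appear in the host-gw route listing.
NODE_IPS = [
    "172.22.0.239",
    "172.22.1.176",
    "172.22.0.117",
    "172.22.0.76",
    "172.22.0.212",
    "172.22.0.64",
]
NODE_SUBNET = "172.22.0.0/20"


def nodes_outside_subnet(node_ips, subnet):
    """Return the node IPs that do not fall inside the given subnet."""
    net = ipaddress.ip_network(subnet)
    return [ip for ip in node_ips if ipaddress.ip_address(ip) not in net]


def host_gw_candidate(node_ips, subnet):
    """True when every node IP is inside the subnet (necessary, not sufficient)."""
    return not nodes_outside_subnet(node_ips, subnet)


if __name__ == "__main__":
    print("all nodes in", NODE_SUBNET, ":", host_gw_candidate(NODE_IPS, NODE_SUBNET))
```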

Performance comparison

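The benchmark tool is k8s-bench-suite, run as knb --verbose --client-node node2 --server-node node3 on the same machines for both modes. In these measurements, vxlan mode shows roughly 10% extra overhead compared with host-gw mode; in the results below, vxlan bandwidth is about 3% to 8% lower (the numbers depend on hardware and network quality).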

vxlan mode

=========================================================
 Benchmark Results
=========================================================
 Name            : knb-12885
 Date            : 2022-08-26 07:11:41 UTC
 Generator       : knb
 Version         : 1.5.0
 Server          : node2
 Client          : node3
 UDP Socket size : auto
=========================================================
  Discovered CPU         : Intel Xeon Processor (Skylake, IBRS)
  Discovered Kernel      : 5.4.127-1.el7.elrepo.x86_64
  Discovered k8s version : v1.21.7
  Discovered MTU         : 1450
  Idle :
    bandwidth = 0 Mbit/s
    client cpu = total 6.97% (user 2.53%, nice 0.05%, system 4.21%, iowait 0.03%, steal 0.15%)
    server cpu = total 8.09% (user 2.73%, nice 0.05%, system 5.18%, iowait 0.00%, steal 0.13%)
    client ram = 1233 MB
    server ram = 1198 MB
  Pod to pod :
    TCP :
      bandwidth = 845 Mbit/s
      client cpu = total 5.06% (user 1.35%, nice 0.05%, system 3.49%, iowait 0.07%, steal 0.10%)
      server cpu = total 10.78% (user 1.76%, nice 0.02%, system 8.98%, iowait 0.02%, steal 0.00%)
      client ram = 1235 MB
      server ram = 1197 MB
    UDP :
      bandwidth = 877 Mbit/s
      client cpu = total 26.54% (user 2.83%, nice 0.05%, system 23.57%, iowait 0.07%, steal 0.02%)
      server cpu = total 13.43% (user 3.74%, nice 0.03%, system 9.56%, iowait 0.00%, steal 0.10%)
      client ram = 1234 MB
      server ram = 1198 MB
  Pod to Service :
    TCP :
      bandwidth = 856 Mbit/s
      client cpu = total 5.25% (user 1.40%, nice 0.05%, system 3.68%, iowait 0.05%, steal 0.07%)
      server cpu = total 10.31% (user 1.92%, nice 0.02%, system 8.37%, iowait 0.00%, steal 0.00%)
      client ram = 1233 MB
      server ram = 1199 MB
    UDP :
      bandwidth = 835 Mbit/s
      client cpu = total 27.90% (user 2.94%, nice 0.02%, system 24.82%, iowait 0.07%, steal 0.05%)
      server cpu = total 13.29% (user 3.74%, nice 0.03%, system 9.49%, iowait 0.00%, steal 0.03%)
      client ram = 1236 MB
      server ram = 1203 MB
=========================================================

host-gw mode

=========================================================
 Benchmark Results
=========================================================
 Name            : knb-8657
 Date            : 2022-08-26 07:08:07 UTC
 Generator       : knb
 Version         : 1.5.0
 Server          : node2
 Client          : node3
 UDP Socket size : auto
=========================================================
  Discovered CPU         : Intel Xeon Processor (Skylake, IBRS)
  Discovered Kernel      : 5.4.127-1.el7.elrepo.x86_64
  Discovered k8s version : v1.21.7
  Discovered MTU         : 1500
  Idle :
    bandwidth = 0 Mbit/s
    client cpu = total 3.35% (user 1.56%, nice 0.02%, system 1.70%, iowait 0.07%, steal 0.00%)
    server cpu = total 2.45% (user 1.14%, nice 0.09%, system 1.22%, iowait 0.00%, steal 0.00%)
    client ram = 1258 MB
    server ram = 1194 MB
  Pod to pod :
    TCP :
      bandwidth = 875 Mbit/s
      client cpu = total 4.53% (user 1.37%, nice 0.00%, system 3.00%, iowait 0.09%, steal 0.07%)
      server cpu = total 7.61% (user 1.49%, nice 0.07%, system 5.98%, iowait 0.02%, steal 0.05%)
      client ram = 1250 MB
      server ram = 1197 MB
    UDP :
      bandwidth = 944 Mbit/s
      client cpu = total 34.08% (user 4.70%, nice 0.03%, system 28.94%, iowait 0.03%, steal 0.38%)
      server cpu = total 18.45% (user 4.81%, nice 0.02%, system 13.11%, iowait 0.02%, steal 0.49%)
      client ram = 1245 MB
      server ram = 1197 MB
  Pod to Service :
    TCP :
      bandwidth = 931 Mbit/s
      client cpu = total 4.01% (user 1.25%, nice 0.05%, system 2.62%, iowait 0.09%, steal 0.00%)
      server cpu = total 8.14% (user 1.59%, nice 0.02%, system 6.48%, iowait 0.00%, steal 0.05%)
      client ram = 1242 MB
      server ram = 1197 MB
    UDP :
      bandwidth = 896 Mbit/s
      client cpu = total 26.61% (user 2.79%, nice 0.02%, system 23.73%, iowait 0.07%, steal 0.00%)
      server cpu = total 11.16% (user 3.18%, nice 0.03%, system 7.89%, iowait 0.00%, steal 0.06%)
      client ram = 1236 MB
      server ram = 1197 MB
=========================================================
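
To put a number on the gap, the bandwidth figures from the two result blocks can be compared directly. The sketch below expresses the vxlan shortfall as a percentage of the host-gw bandwidth. That is one possible definition of overhead and is chosen here as an assumption; CPU usage is ignored.

```python
# Bandwidth in Mbit/s, copied from the two knb result blocks above.
VXLAN = {
    "pod-to-pod TCP": 845,
    "pod-to-pod UDP": 877,
    "pod-to-service TCP": 856,
    "pod-to-service UDP": 835,
}
HOST_GW = {
    "pod-to-pod TCP": 875,
    "pod-to-pod UDP": 944,
    "pod-to-service TCP": 931,
    "pod-to-service UDP": 896,
}


def bandwidth_drop_percent(vxlan_mbit, host_gw_mbit):
    """How much lower the vxlan bandwidth is, as a percentage of host-gw."""
    return (host_gw_mbit - vxlan_mbit) / host_gw_mbit * 100


def report():
    """Per-scenario bandwidth drop, rounded to one decimal place."""
    return {
        name: round(bandwidth_drop_percent(VXLAN[name], HOST_GW[name]), 1)
        for name in VXLAN
    }


if __name__ == "__main__":
    for name, drop in report().items():
        print(f"{name}: vxlan is {drop}% below host-gw")
```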

Originally published on the author's personal site/blog on 2022-08-26 and shared through the Tencent Cloud self-media sharing program.