Installing and configuring a Kubernetes cluster is a fairly tedious process. This article walks through setting up a Kubernetes cluster on CentOS 7 virtual machines, using VirtualBox on a Mac.
The goals and specifications of the cluster we are going to build are as follows:
The four nodes are planned as follows:
Hostname | NAT IP Address | Host-Only IP Address | Role |
---|---|---|---|
k8s-node1 | 192.168.56.11 | 192.168.7.11 | master |
k8s-node2 | 192.168.56.12 | 192.168.7.12 | worker |
k8s-node3 | 192.168.56.13 | 192.168.7.13 | worker |
k8s-node4 | 192.168.56.14 | 192.168.7.14 | worker |
Prepare the environment according to the following requirements.
This article uses VirtualBox 6 to set up the virtual machines; please install it yourself.
The NAT network is now configured.
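For reference, the same NAT network can also be created from the command line; a rough equivalent of the GUI steps, assuming the network is named NatNetwork:
VBoxManage natnetwork add --netname NatNetwork --network "192.168.56.0/24" --enable --dhcp on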
The lower bound of the DHCP range is set to 192.168.7.11 purely so that the last octet lines up with the corresponding NAT network addresses; it has no other significance.
The Host-Only network is now configured.
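Likewise, the Host-Only network and its DHCP range can be set up from the command line; a rough equivalent of the GUI steps, assuming the newly created interface is vboxnet0:
VBoxManage hostonlyif create
VBoxManage hostonlyif ipconfig vboxnet0 --ip 192.168.7.1 --netmask 255.255.255.0
VBoxManage dhcpserver add --ifname vboxnet0 --ip 192.168.7.2 --netmask 255.255.255.0 --lowerip 192.168.7.11 --upperip 192.168.7.254 --enable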
Download the CentOS 7.6 image from http://mirrors.163.com/centos/7.6.1810/isos/x86_64/CentOS-7-x86_64-Minimal-1810.iso .
At this point the virtual machine has been created. If the host wants to communicate with the VM, it has to go through the Host-Only network IP address.
You can check the Host-Only IP address with the following command:
ip addr
The output looks like this:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:14:21:b0 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s3
valid_lft forever preferred_lft forever
inet6 fe80::7734:1bd6:9da6:5d1f/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:1b:66:a7 brd ff:ff:ff:ff:ff:ff
inet 192.168.7.11/24 brd 192.168.7.255 scope global noprefixroute dynamic enp0s8
valid_lft 1153sec preferred_lft 1153sec
inet6 fe80::5f85:8418:37a4:f428/64 scope link noprefixroute
valid_lft forever preferred_lft forever
So the enp0s8 interface is the Host-Only interface, and its IP address is 192.168.7.11.
Some basic configuration is required here for the installation steps that follow.
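As a rough sketch, the usual preparation for kubeadm on CentOS 7 covers SELinux, the firewall, and the bridge sysctls (swap is left enabled here, which is why the init command later passes --ignore-preflight-errors=Swap); the exact commands may differ from the original setup:
# Put SELinux into permissive mode and stop the firewall
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
systemctl disable --now firewalld
# Make bridged traffic visible to iptables, as kubeadm expects
modprobe br_netfilter
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sysctl --system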
In VirtualBox, clone the k8s-node1 node to create the other nodes, naming them k8s-node2, k8s-node3, and k8s-node4. Then modify the following items on each node:
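Typically this amounts to regenerating the MAC addresses while cloning and then giving each clone a unique hostname and NAT address. For k8s-node2 a sketch might look like this (assuming the NAT interface is enp0s3 with a statically configured address, as on k8s-node1):
hostnamectl set-hostname k8s-node2
# Hypothetical: replace the cloned static address with this node's address from the planning table
sed -i 's/192.168.56.11/192.168.56.12/' /etc/sysconfig/network-scripts/ifcfg-enp0s3
systemctl restart network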
With that, the base environment is ready; the next step is installing Docker and Kubernetes.
This step must be performed on all four nodes.
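A minimal sketch of the usual Docker and Kubernetes package installation on CentOS 7, assuming Aliyun mirrors (consistent with the image repository used at init time below) and the 1.15.0 packages; the original commands may differ:
# Docker CE from the Aliyun mirror
yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y docker-ce
systemctl enable --now docker
# kubelet/kubeadm/kubectl from the Aliyun Kubernetes repo, pinned to 1.15.0
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF
yum install -y kubelet-1.15.0 kubeadm-1.15.0 kubectl-1.15.0
systemctl enable --now kubelet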
Run the following command to initialize the master node.
kubeadm init \
--apiserver-advertise-address=192.168.56.11 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.15.0 \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.2.0.0/16 \
--service-dns-domain=cluster.local \
--ignore-preflight-errors=Swap \
--ignore-preflight-errors=NumCPU
Let's first look at a few of the key parameters:
- --apiserver-advertise-address: the address the API Server advertises to the cluster, here the master node's NAT address 192.168.56.11.
- --image-repository: the registry to pull the control-plane images from; the Aliyun mirror is used because the default Google registry is usually unreachable.
- --service-cidr: the address range from which Service Cluster IPs are allocated.
- --pod-network-cidr: the address range from which Pod IPs are allocated.
- --ignore-preflight-errors: downgrade the listed preflight checks to warnings; Swap and NumCPU are ignored here because the VMs have swap enabled and only one CPU.
The whole process can take around 5 minutes; the full output is shown below:
[init] Using Kubernetes version: v1.15.0
[preflight] Running pre-flight checks
[WARNING NumCPU]: the number of available CPUs 1 is less than the required 2
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-node1 localhost] and IPs [192.168.56.11 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-node1 localhost] and IPs [192.168.56.11 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-node1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.1.0.1 192.168.56.11]
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 41.503341 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.15" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-node1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-node1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 5wf7mp.v61tv0s23ewbun1l
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.56.11:6443 --token 5wf7mp.v61tv0s23ewbun1l \
--discovery-token-ca-cert-hash sha256:ca524d88dbcc9a79c70c4cf21fba7252c0f12e5ab0fe9674e7f6998ab9fb5901
The last part of the output above gives us two pieces of information:
- the commands to run to set up the kubeconfig file in the user's home directory
- the command the other nodes should run to join the cluster
Following the instructions in that output, run the commands below.
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
The config file records the API Server's address, so from now on kubectl commands can connect to the API Server directly.
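Alternatively, when working as root you can point kubectl at the admin kubeconfig without copying it; the effect is equivalent:
export KUBECONFIG=/etc/kubernetes/admin.conf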
Check the status of the control-plane components with the following command:
kubectl get cs
The output is as follows:
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
Since this returns results normally, the API Server is up and running.
Get the node information:
kubectl get node
The output is:
NAME STATUS ROLES AGE VERSION
k8s-node1 NotReady master 6m48s v1.15.0
You can see that k8s-node1 is still NotReady; this is because no network plugin has been installed yet. Let's install the network plugin now.
The plugin is deployed by applying YAML manifests with kubectl. Run the following two commands in turn.
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/rbac.yaml
Output:
clusterrole.rbac.authorization.k8s.io/calico created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-calico created
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/canal.yaml
Output:
configmap/canal-config created
daemonset.extensions/canal created
serviceaccount/canal created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
Run the following command to see the Pods that have been started:
kubectl get pods --all-namespaces
The output is:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system canal-rj2fm 0/3 ContainerCreating 0 44s
kube-system coredns-bccdc95cf-rgtbx 0/1 Pending 0 11m
kube-system coredns-bccdc95cf-x6j8l 0/1 Pending 0 11m
kube-system etcd-k8s-node1 1/1 Running 0 11m
kube-system kube-apiserver-k8s-node1 1/1 Running 0 10m
kube-system kube-controller-manager-k8s-node1 1/1 Running 0 10m
kube-system kube-proxy-zcssq 1/1 Running 0 11m
kube-system kube-scheduler-k8s-node1 1/1 Running 0 10m
You can see that canal is still creating its containers and coredns is Pending. Pulling the canal images takes some time; once the images have been downloaded, the coredns Pods will change to Running.
Note that if errors such as ErrImagePull appear, it is probably because the canal images are hosted on Google servers that cannot be reached; in that case you need a VPN to download them.
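To see the exact reason a Pod is stuck, describe it and look at the events at the bottom (using the canal Pod name from the listing above):
kubectl describe pod canal-rj2fm -n kube-system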
After the images have finished downloading, run kubectl get pods --all-namespaces again; all statuses are now normal, as shown below:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system canal-rj2fm 3/3 Running 0 35m
kube-system coredns-bccdc95cf-rgtbx 1/1 Running 0 46m
kube-system coredns-bccdc95cf-x6j8l 1/1 Running 0 46m
kube-system etcd-k8s-node1 1/1 Running 1 46m
kube-system kube-apiserver-k8s-node1 1/1 Running 1 45m
kube-system kube-controller-manager-k8s-node1 1/1 Running 1 45m
kube-system kube-proxy-zcssq 1/1 Running 1 46m
kube-system kube-scheduler-k8s-node1 1/1 Running 1 45m
Now run kubectl get node again to check the master node; its status is already Ready:
NAME STATUS ROLES AGE VERSION
k8s-node1 Ready master 48m v1.15.0
First, run the following command on the master node to obtain the command for adding nodes to the cluster:
kubeadm token create --print-join-command
The output is:
kubeadm join 192.168.56.11:6443 --token eb0k80.qhqbjon1mh55w803 --discovery-token-ca-cert-hash sha256:ca524d88dbcc9a79c70c4cf21fba7252c0f12e5ab0fe9674e7f6998ab9fb5901
Then run this command on each worker node. Kubernetes will use the DaemonSets to deploy canal and kube-proxy on all of the nodes.
Note that if errors such as ErrImagePull appear, it is probably because the images are hosted on Google servers that cannot be reached; in that case you need a VPN to download them.
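To see which node is failing to pull images, add -o wide to the Pod listing (a quick check, not part of the original run):
kubectl get pod --all-namespaces -o wide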
Once everything has been deployed, run the following commands on the master node to inspect the cluster.
List all DaemonSets:
kubectl get daemonset --all-namespaces
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system canal 4 4 4 4 4 beta.kubernetes.io/os=linux 16h
kube-system kube-proxy 4 4 4 4 4 beta.kubernetes.io/os=linux 17h
You can see that READY and AVAILABLE are both 4, meaning all four nodes are available.
List all Pods:
kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system canal-6w2zb 3/3 Running 12 16h
kube-system canal-jgw4m 3/3 Running 47 16h
kube-system canal-klmfs 3/3 Running 33 16h
kube-system canal-rj2fm 3/3 Running 12 17h
kube-system coredns-bccdc95cf-rgtbx 1/1 Running 3 17h
kube-system coredns-bccdc95cf-x6j8l 1/1 Running 3 17h
kube-system etcd-k8s-node1 1/1 Running 4 17h
kube-system kube-apiserver-k8s-node1 1/1 Running 6 17h
kube-system kube-controller-manager-k8s-node1 1/1 Running 4 17h
kube-system kube-proxy-7bk98 1/1 Running 0 16h
kube-system kube-proxy-cd8xj 1/1 Running 0 16h
kube-system kube-proxy-xfzfp 1/1 Running 0 16h
kube-system kube-proxy-zcssq 1/1 Running 4 17h
kube-system kube-scheduler-k8s-node1 1/1 Running 4 17h
List all nodes:
kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-node1 Ready master 17h v1.15.0
k8s-node2 Ready <none> 16h v1.15.0
k8s-node3 Ready <none> 16h v1.15.0
k8s-node4 Ready <none> 16h v1.15.0
All the nodes are now Ready.
With the steps above, the Kubernetes cluster (one master node and three worker nodes) has been set up and all nodes are working normally. Now let's test the cluster by deploying an Nginx application.
Create a single-Pod Nginx Deployment:
kubectl create deployment nginx --image=nginx:alpine
deployment.apps/nginx created
View the Pod details:
kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-8f6959bd-6pth6 1/1 Running 0 73s 10.2.1.2 k8s-node2 <none> <none>
The Pod's IP address is allocated from the range given by the --pod-network-cidr=10.2.0.0/16 parameter used when initializing the master node.
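The same address can also be read directly from the API, using the Pod name from the listing above:
kubectl get pod nginx-8f6959bd-6pth6 -o jsonpath='{.status.podIP}'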
Access nginx
Access nginx through the Pod IP 10.2.1.2 obtained above:
curl -I http://10.2.1.2
HTTP/1.1 200 OK
Server: nginx/1.17.1
Date: Thu, 18 Jul 2019 07:53:22 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 25 Jun 2019 14:15:08 GMT
Connection: keep-alive
ETag: "5d122c6c-264"
Accept-Ranges: bytes
Scale out to two replicas:
kubectl scale deployment nginx --replicas=2
deployment.extensions/nginx scaled
Check the Pods again:
kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-8f6959bd-6pth6 1/1 Running 0 6m44s 10.2.1.2 k8s-node2 <none> <none>
nginx-8f6959bd-l56n9 1/1 Running 0 28s 10.2.3.2 k8s-node4 <none> <none>
You can see that the Pod now has two replicas, each with its own IP. Accessing the newly added replica by its IP works just as well:
curl -I http://10.2.3.2
HTTP/1.1 200 OK
Server: nginx/1.17.1
Date: Thu, 18 Jul 2019 07:58:27 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 25 Jun 2019 14:15:08 GMT
Connection: keep-alive
ETag: "5d122c6c-264"
Accept-Ranges: bytes
**Expose as a Service**
Multiple replicas need to be exposed as a single Service to provide a unified entry point. The Service gets a Cluster IP, allocated from the --service-cidr=10.1.0.0/16 range specified at master initialization, and automatically load-balances across the replicas.
Run the following command to expose the nginx Deployment as a Service of type NodePort, which maps a port on every node for external access.
kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed
Run the following command to list the Services:
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.1.0.1 <none> 443/TCP 19h
nginx NodePort 10.1.59.105 <none> 80:32502/TCP 80s
You can see that the nginx Service's virtual IP is 10.1.59.105, and port 32502 on every node is mapped to nginx's port 80.
Run the following command to access the Service through its virtual IP:
curl -I http://10.1.59.105
HTTP/1.1 200 OK
Server: nginx/1.17.1
Date: Thu, 18 Jul 2019 08:10:45 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 25 Jun 2019 14:15:08 GMT
Connection: keep-alive
ETag: "5d122c6c-264"
Accept-Ranges: bytes
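The Service balances requests across the two Pods because both Pod IPs are registered as its endpoints; you can confirm this with a quick check (not part of the original run):
kubectl get endpoints nginx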
On the host machine, run the following command to access the Service through a node's IP:
curl -I http://192.168.7.11:32502
HTTP/1.1 200 OK
Server: nginx/1.17.1
Date: Thu, 18 Jul 2019 08:14:31 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 25 Jun 2019 14:15:08 GMT
Connection: keep-alive
ETag: "5d122c6c-264"
Accept-Ranges: bytes
Because the host machine cannot reach the VirtualBox NAT network directly, the Host-Only network IP is used for access here.
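Since a NodePort Service opens the same port on every node, the service should be reachable through any node's Host-Only address, not just the master's, for example (not shown in the original run):
curl -I http://192.168.7.12:32502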