This article collects my notes on using Kubespray to deploy a Kubernetes cluster for local development and testing, and the pitfalls to watch out for along the way. I also recommend my colleague's article on Kubernetes cluster deployment practice in private-cloud PaaS scenarios (私有云 PaaS 场景下的 Kubernetes 集群部署实践), which covers the deployment process and its inner workings in detail.
Preparation
Although our dev/test datacenter has plenty of machines available, my domain k8s.li cannot obtain an ICP filing, so its DNS cannot be pointed at servers inside mainland China. I therefore resolve the domain to a server overseas and use an nginx rewrite to forward requests to Alibaba Cloud OSS; the Docker Registry also uses OSS as its backend storage. When a client pulls an image, only the image manifest is fetched through my domain, while the blob data is redirected to OSS. During cluster deployment the bulk of the file and image download traffic thus goes through OSS, which shortens deployment time and improves efficiency.
Creating a Domain SSL Certificate
The SSL certificate is mainly for the image registry. If the certificate is self-signed, or the registry serves plain HTTP, Docker and containerd will refuse to pull images unless every node in the cluster is configured with the insecure-registries option, which is tedious to manage. I therefore recommend putting a CA-signed certificate on the registry to spare yourself the hassle. If you already have a registry with a proper SSL certificate, skip this step.
There are many ways to issue a certificate; personally I recommend acme.sh. It implements every validation method the ACME protocol supports and works with dozens of DNS providers. My domain is hosted on Cloudflare, so issuing a certificate with acme.sh takes nothing more than two environment variables. The following issues a wildcard certificate for k8s.li:
curl https://get.acme.sh | sh
~/.acme.sh/acme.sh --help
export CF_Email="xxx@gmail.com" # the email of your Cloudflare account
export CF_Key="xxxxxx" # your Cloudflare API key
~/.acme.sh/acme.sh --issue --dns dns_cf -d k8s.li -d '*.k8s.li'
[Tue Apr 27 07:32:52 UTC 2021] Cert success.
[Tue Apr 27 07:32:52 UTC 2021] Your cert is in /root/.acme.sh/k8s.li/k8s.li.cer
[Tue Apr 27 07:32:52 UTC 2021] Your cert key is in /root/.acme.sh/k8s.li/k8s.li.key
[Tue Apr 27 07:32:52 UTC 2021] The intermediate CA cert is in /root/.acme.sh/k8s.li/ca.cer
[Tue Apr 27 07:32:52 UTC 2021] And the full chain certs is there: /root/.acme.sh/k8s.li/fullchain.cer
Once the certificate is issued, it needs to be copied to wherever it will actually be used. Note that the generated certificates are placed under the installation directory ~/.acme.sh/ by default. Do not use the files in that directory directly, e.g. do not point your nginx/apache configuration at them: they are for acme.sh's internal use and the directory layout may change. The correct approach is the --install-cert command, which copies the certificate files to the locations you specify:
acme.sh --install-cert -d k8s.li \
    --cert-file /etc/nginx/ssl/k8s.li.cer \
    --key-file /etc/nginx/ssl/k8s.li.key \
    --fullchain-file /etc/nginx/ssl/fullchain.cer
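acme.sh also installs a daily cron job during installation and renews certificates automatically; the targets given to --install-cert are remembered and the files are copied again after every renewal. It is therefore worth passing a --reloadcmd so nginx picks up the renewed certificate. A sketch (the reload command is only an example; adjust it to however your nginx is managed):
acme.sh --install-cert -d k8s.li \
    --cert-file /etc/nginx/ssl/k8s.li.cer \
    --key-file /etc/nginx/ssl/k8s.li.key \
    --fullchain-file /etc/nginx/ssl/fullchain.cer \
    --reloadcmd "service nginx force-reload"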
Setting Up Registry Mirrors
Three registry mirrors are needed, for k8s.gcr.io, quay.io, and docker.io. Using the registry's pull-through cache (proxy) feature, the upstream registries are cached in the locally hosted mirrors so that images can be pulled conveniently. You can also use my mirrors directly:
| origin | mirror |
| --- | --- |
| docker.io | hub.k8s.li |
| k8s.gcr.io | gcr.k8s.li |
| quay.io | quay.k8s.li |
The mirrors can be run locally as follows; the configuration files from github.com/muzi502/registry-mirrors can be used as-is.
# config.yml
version: 0.1
log:
  fields:
    service: registry
storage:
  cache:
    blobdescriptor: inmemory
  oss:
    accesskeyid: xxxx # accesskeyid of the Alibaba Cloud OSS account
    accesskeysecret: xxxx # accesskeysecret of the Alibaba Cloud OSS account
    region: oss-cn-beijing # region of the OSS bucket, e.g. oss-cn-beijing
    internal: false
    bucket: fileserver # name of the OSS bucket used for storage
    rootdirectory: /kubespray/registry # path prefix inside the bucket
  delete:
    enabled: true
http:
  headers:
    X-Content-Type-Options: [nosniff]
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
# docker-compose.yml
version: '3'
services:
  gcr-registry:
    image: registry:2
    container_name: gcr-registry
    restart: always
    volumes:
      - ./config.yml:/etc/docker/registry/config.yml
    ports:
      - 127.0.0.1:5001:5001
    environment:
      - REGISTRY_HTTP_ADDR=0.0.0.0:5001
      - REGISTRY_PROXY_REMOTEURL=https://k8s.gcr.io
  hub-registry:
    image: registry:2
    container_name: hub-registry
    restart: always
    volumes:
      - ./config.yml:/etc/docker/registry/config.yml
    ports:
      - 127.0.0.1:5002:5002
    environment:
      - REGISTRY_HTTP_ADDR=0.0.0.0:5002
      # Docker Hub's actual registry endpoint; the pull-through proxy cannot use https://docker.io
      - REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io
  quay-registry:
    image: registry:2
    container_name: quay-registry
    restart: always
    volumes:
      - ./config.yml:/etc/docker/registry/config.yml
    ports:
      - 127.0.0.1:5003:5003
    environment:
      - REGISTRY_HTTP_ADDR=0.0.0.0:5003
      - REGISTRY_PROXY_REMOTEURL=https://quay.io
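Before wiring nginx in front, a quick sanity check that the three caches are up; /v2/ is the Docker Registry HTTP API ping endpoint (a sketch; each registry listens on loopback only):
docker-compose up -d
# probe the registry v2 API of each mirror
curl -fsS http://127.0.0.1:5001/v2/ && echo "gcr mirror OK"
curl -fsS http://127.0.0.1:5002/v2/ && echo "hub mirror OK"
curl -fsS http://127.0.0.1:5003/v2/ && echo "quay mirror OK"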
Give each of the three mirrors its own domain name and reverse-proxy it to the corresponding container:
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name gcr.k8s.li;
    ssl_certificate domain.crt;
    ssl_certificate_key domain.key;
    gzip_static on;
    client_max_body_size 100000m;
    # read-only mirror: reject anything other than GET/HEAD (e.g. docker push)
    if ($request_method !~* GET|HEAD) {
        return 403;
    }
    location / {
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://localhost:5001;
    }
}
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name hub.k8s.li;
    ssl_certificate domain.crt;
    ssl_certificate_key domain.key;
    gzip_static on;
    client_max_body_size 100000m;
    if ($request_method !~* GET|HEAD) {
        return 403;
    }
    location / {
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://localhost:5002;
    }
}
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name quay.k8s.li;
    ssl_certificate domain.crt;
    ssl_certificate_key domain.key;
    gzip_static on;
    client_max_body_size 100000m;
    if ($request_method !~* GET|HEAD) {
        return 403;
    }
    location / {
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://localhost:5003;
    }
}
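Once DNS for the three mirror domains points at this nginx server, pulls through the mirrors should work and populate the OSS-backed cache, following the origin-to-mirror mapping in the table above; for example:
docker pull gcr.k8s.li/pause:3.2
docker pull hub.k8s.li/calico/node:v3.17.3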
File Server
The file server hosts binaries such as kubeadm, kubectl, and kubelet. Kubespray's default download URLs are painfully slow to reach from mainland China, so an HTTP/HTTPS server is needed to serve these binaries during cluster deployment.
Note that this nginx config uses rewrite rather than proxy_pass: when a client requests a file, nginx rewrites the request so that the client fetches it from the Alibaba Cloud OSS address instead.
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name dl.k8s.li;
    ssl_certificate /etc/nginx/ssl/fullchain.cer;
    ssl_certificate_key /etc/nginx/ssl/k8s.li.key;
    location / {
        # rewrite to an absolute URL: nginx replies with a 302 sending the client to OSS
        rewrite ^/(.*)$ https://fileserver.oss-cn-beijing.aliyuncs.com/kubespray/files/$1;
        proxy_hide_header Content-Disposition;
        proxy_hide_header x-oss-request-id;
        proxy_hide_header x-oss-object-type;
        proxy_hide_header x-oss-hash-crc64ecma;
        proxy_hide_header x-oss-storage-class;
        proxy_hide_header x-oss-force-download;
        proxy_hide_header x-oss-server-time;
    }
}
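Because the rewrite target is an absolute URL, nginx answers with a 302 redirect to OSS rather than proxying the bytes itself, which is easy to verify with curl (the kubectl path below matches the download URLs configured later):
# -I sends a HEAD request, -L follows the 302 to the OSS URL
curl -sIL https://dl.k8s.li/storage.googleapis.com/kubernetes-release/release/v1.20.6/bin/linux/amd64/kubectl | grep -iE "^(HTTP|location)"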
Fetching the Binaries Needed for Deployment
During deployment Kubespray downloads binaries from github.com or storage.googleapis.com, both of which are blocked in mainland China, so the files the deployment depends on have to be uploaded to your own file server first. I wrote a script to fetch all the binaries Kubespray needs; run it from the root of the Kubespray repo and the downloads land under temp/files by default. Once the download finishes, upload all the subdirectories to your file server; later, when configuring the download URLs, simply prefix them with your file server's URL.
root@debian:/root# git clone https://github.com/kubernetes-sigs/kubespray && cd kubespray
#!/bin/bash
set -eo pipefail
CURRENT_DIR=$(cd $(dirname $0); pwd)
TEMP_DIR="${CURRENT_DIR}/temp"
REPO_ROOT_DIR="${CURRENT_DIR}"
: ${IMAGE_ARCH:="amd64"}
: ${ANSIBLE_SYSTEM:="linux"}
: ${ANSIBLE_ARCHITECTURE:="x86_64"}
: ${DOWNLOAD_YML:="roles/download/defaults/main.yml"}
: ${KUBE_VERSION_YAML:="roles/kubespray-defaults/defaults/main.yaml"}
mkdir -p ${TEMP_DIR}
# ARCH used in convert {%- if image_arch != 'amd64' -%}-{{ image_arch }}{%- endif -%} to {{arch}}
if [ "${IMAGE_ARCH}" != "amd64" ]; then ARCH="${IMAGE_ARCH}"; fi
cat > ${TEMP_DIR}/generate.sh << EOF
arch=${ARCH}
image_arch=${IMAGE_ARCH}
ansible_system=${ANSIBLE_SYSTEM}
ansible_architecture=${ANSIBLE_ARCHITECTURE}
EOF
# generate all component version by $DOWNLOAD_YML
grep 'kube_version:' ${REPO_ROOT_DIR}/${KUBE_VERSION_YAML} \
| sed 's/: /=/g' >> ${TEMP_DIR}/generate.sh
grep '_version:' ${REPO_ROOT_DIR}/${DOWNLOAD_YML} \
| sed 's/: /=/g;s/{{/${/g;s/}}/}/g' | tr -d ' ' >> ${TEMP_DIR}/generate.sh
sed -i 's/kube_major_version=.*/kube_major_version=${kube_version%.*}/g' ${TEMP_DIR}/generate.sh
sed -i 's/crictl_version=.*/crictl_version=${kube_version%.*}.0/g' ${TEMP_DIR}/generate.sh
# generate all download files url
grep 'download_url:' ${REPO_ROOT_DIR}/${DOWNLOAD_YML} \
| sed 's/: /=/g;s/ //g;s/{{/${/g;s/}}/}/g;s/|lower//g;s/^.*_url=/echo /g' >> ${TEMP_DIR}/generate.sh
# print files.list and images.list
bash ${TEMP_DIR}/generate.sh | grep 'https' | sort > ${TEMP_DIR}/files.list
wget -x -P temp/files -i temp/files.list
The downloaded result looks like the following. The original URL paths are essentially preserved, which makes later updates and version bumps easy.
temp/files
├── get.helm.sh
│ └── helm-v3.5.4-linux-amd64.tar.gz
├── github.com
│ ├── containerd
│ │ └── nerdctl
│ │ └── releases
│ │ └── download
│ │ └── v0.8.0
│ │ └── nerdctl-0.8.0-linux-amd64.tar.gz
│ ├── containernetworking
│ │ └── plugins
│ │ └── releases
│ │ └── download
│ │ └── v0.9.1
│ │ └── cni-plugins-linux-amd64-v0.9.1.tgz
│ ├── containers
│ │ └── crun
│ │ └── releases
│ │ └── download
│ │ └── 0.19
│ │ └── crun-0.19-linux-amd64
│ ├── coreos
│ │ └── etcd
│ │ └── releases
│ │ └── download
│ │ └── v3.4.13
│ │ └── etcd-v3.4.13-linux-amd64.tar.gz
│ ├── kata-containers
│ │ └── runtime
│ │ └── releases
│ │ └── download
│ │ └── 1.12.1
│ │ └── kata-static-1.12.1-x86_64.tar.xz
│ ├── kubernetes-sigs
│ │ └── cri-tools
│ │ └── releases
│ │ └── download
│ │ └── v1.20.0
│ │ └── crictl-v1.20.0-linux-amd64.tar.gz
│ └── projectcalico
│ ├── calico
│ │ └── archive
│ │ └── v3.17.3.tar.gz
│ └── calicoctl
│ └── releases
│ └── download
│ └── v3.17.3
│ └── calicoctl-linux-amd64
└── storage.googleapis.com
└── kubernetes-release
└── release
└── v1.20.6
└── bin
└── linux
└── amd64
├── kubeadm
├── kubectl
└── kubelet
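These files then need to land in the OSS bucket under the prefix that the dl.k8s.li rewrite above points to. A sketch using Alibaba Cloud's ossutil CLI, assuming the bucket name (fileserver) and prefix (/kubespray/files) from the nginx rewrite rule:
# recursively upload, preserving the URL-shaped directory layout
ossutil cp -r temp/files/ oss://fileserver/kubespray/files/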
At this point the preparation is largely done; next comes configuring Kubespray's variables and the inventory file.
Configuration
As the Kubespray documentation suggests, copy the inventory/sample directory and control the deployment by editing the variables inside it.
root@debian:/root/kubespray git:(master*) # cp -rf inventory/sample deploy
inventory
Create the host inventory file in the following format:
[all:vars]
ansible_port=22
ansible_user=root
ansible_ssh_private_key_file=/kubespray/.ssh/id_rsa
[all]
kube-control-1 ansible_host=192.168.4.11
kube-control-2 ansible_host=192.168.4.12
kube-control-3 ansible_host=192.168.4.13
kube-node-1 ansible_host=192.168.4.4
[kube_control_plane]
kube-control-1
kube-control-2
kube-control-3
[etcd]
kube-control-1
kube-control-2
kube-control-3
[kube-node]
kube-control-1
kube-control-2
kube-control-3
kube-node-1
[calico-rr]
[k8s-cluster:children]
kube_control_plane
kube-node
calico-rr
Kubespray uses Ansible's synchronize module to distribute files. It is based on the rsync protocol, so SSH key authentication to the cluster nodes is mandatory. The inventory references paths inside the Kubespray container, so the SSH key pair has to be copied into the repo's .ssh directory. If passwordless SSH to the nodes is not set up yet, Ansible's authorized_key module can push the public key into each host's authorized_keys. The steps are as follows:
root@debian:/root/kubespray git:(master*) # mkdir -p .ssh
# generate an SSH key pair
root@debian:/root/kubespray git:(master*) # ssh-keygen -t rsa -f .ssh/id_rsa -P ""
# add the SSH public key to all hosts
root@debian:/root/kubespray git:(master*) # ansible -i deploy/inventory all -m authorized_key -a "user={{ ansible_user }} key='{{ lookup('file', '{{ ssh_cert_path }}') }}'" -e ssh_cert_path=./.ssh/id_rsa.pub -e ansible_ssh_pass=passwd
Someone once submitted PR #7146 to have the SSH public key added to the deployed nodes' authorized_keys automatically, but the maintainers rejected it without much ceremony.
vars
Create the following variables file and adjust it as needed (it is referenced below as deploy/env.yml):
---
# pin the versions of some components
kube_version: v1.20.6
calico_version: "v3.17.3"
pod_infra_version: "3.2"
nginx_image_version: "1.19"
coredns_version: "1.7.0"
image_arch: "amd64"
# file download server url
download_url: "https://dl.k8s.li"
# docker-ce-repo mirrors
docker_mirrors_url: "https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux"
container_manager: "containerd"
# with containerd as the CRI, etcd cannot yet be deployed in containers, so set this to host to run etcd under systemd
etcd_deployment_type: host
etcd_cluster_setup: true
etcd_events_cluster_setup: true
etcd_events_cluster_enabled: true
# CNI plugin used for the cluster network
kube_network_plugin: canal
## Container registry mirrors
kube_image_repo: "gcr.k8s.li"
docker_image_repo: "hub.k8s.li"
quay_image_repo: "quay.k8s.li"
# Download URLs
kubelet_download_url: "{{ download_url }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubelet"
kubectl_download_url: "{{ download_url }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubectl"
kubeadm_download_url: "{{ download_url }}/storage.googleapis.com/kubernetes-release/release/{{ kubeadm_version }}/bin/linux/{{ image_arch }}/kubeadm"
etcd_download_url: "{{ download_url }}/github.com/coreos/etcd/releases/download/{{ etcd_version }}/etcd-{{ etcd_version }}-linux-{{ image_arch }}.tar.gz"
cni_download_url: "{{ download_url }}/github.com/containernetworking/plugins/releases/download/{{ cni_version }}/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
calicoctl_download_url: "{{ download_url }}/github.com/projectcalico/calicoctl/releases/download/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
calico_crds_download_url: "{{ download_url }}/github.com/projectcalico/calico/archive/{{ calico_version }}.tar.gz"
crictl_download_url: "{{ download_url }}/github.com/kubernetes-sigs/cri-tools/releases/download/{{ crictl_version }}/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
helm_download_url: "{{ download_url }}/get.helm.sh/helm-{{ helm_version }}-linux-{{ image_arch }}.tar.gz"
crun_download_url: "{{ download_url }}/github.com/containers/crun/releases/download/{{ crun_version }}/crun-{{ crun_version }}-linux-{{ image_arch }}"
kata_containers_download_url: "{{ download_url }}/github.com/kata-containers/runtime/releases/download/{{ kata_containers_version }}/kata-static-{{ kata_containers_version }}-{{ ansible_architecture }}.tar.xz"
nerdctl_download_url: "{{ download_url }}/github.com/containerd/nerdctl/releases/download/v{{ nerdctl_version }}/nerdctl-{{ nerdctl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
docker-ce mirrors
When Kubespray installs Docker or containerd as the container runtime, it needs the docker-ce repository; in mainland China the Tsinghua (TUNA) mirror can be used. Depending on the Linux distribution, add the following variables to deploy/group_vars/all/offline.yml. The docker_mirrors_url variable is the one set in env.yml above.
## CentOS/Redhat
### For EL7, base and extras repo must be available, for EL8, baseos and appstream
### By default we enable those repo automatically
# rhel_enable_repos: false
### Docker / Containerd
docker_rh_repo_base_url: "{{ docker_mirrors_url }}/centos/{{ ansible_distribution_major_version }}/{{ ansible_architecture }}/stable"
docker_rh_repo_gpgkey: "{{ docker_mirrors_url }}/centos/gpg"
## Fedora
### Docker
docker_fedora_repo_base_url: "{{ docker_mirrors_url }}/fedora/{{ ansible_distribution_major_version }}/{{ ansible_architecture }}/stable"
docker_fedora_repo_gpgkey: "{{ docker_mirrors_url }}/fedora/gpg"
### Containerd
containerd_fedora_repo_base_url: "{{ docker_mirrors_url }}/fedora/{{ ansible_distribution_major_version }}/{{ ansible_architecture }}/stable"
containerd_fedora_repo_gpgkey: "{{ docker_mirrors_url }}/fedora/gpg"
## Debian
### Docker
docker_debian_repo_base_url: "{{ docker_mirrors_url }}/debian"
docker_debian_repo_gpgkey: "{{ docker_mirrors_url }}/debian/gpg"
### Containerd
containerd_debian_repo_base_url: "{{ docker_mirrors_url }}/debian"
containerd_debian_repo_gpgkey: "{{ docker_mirrors_url }}/debian/gpg"
# containerd_debian_repo_repokey: 'YOURREPOKEY'
## Ubuntu
### Docker
docker_ubuntu_repo_base_url: "{{ docker_mirrors_url }}/ubuntu"
docker_ubuntu_repo_gpgkey: "{{ docker_mirrors_url }}/ubuntu/gpg"
### Containerd
containerd_ubuntu_repo_base_url: "{{ docker_mirrors_url }}/ubuntu"
containerd_ubuntu_repo_gpgkey: "{{ docker_mirrors_url }}/ubuntu/gpg"
Deployment
With the preparation and configuration done, the actual deployment can begin. When running Ansible I prefer working inside the Kubespray container rather than installing Python 3 and friends on my local machine; for offline deployment in particular, building the image ahead of time and running from a Docker container is more convenient.
root@debian:/root/kubespray git:(master*) # docker build -t kubespray:v2.15.1-kube-v1.20.6 .
root@debian:/root/kubespray git:(master*) # docker run --rm -it --net=host -v $PWD:/kubespray kubespray:v2.15.1-kube-v1.20.6 bash
root@debian:/kubespray# ansible -i deploy/inventory all -m ping
kube-control-3 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
kube-control-1 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
kube-node-1 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
kube-control-2 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
root@debian:/kubespray# ansible-playbook -i deploy/inventory -e "@deploy/env.yml" cluster.yml
PLAY RECAP ******************************************************************
kube-control-1 : ok=526 changed=67 unreachable=0 failed=0 skipped=978 rescued=0 ignored=0
kube-control-2 : ok=524 changed=66 unreachable=0 failed=0 skipped=980 rescued=0 ignored=0
kube-control-3 : ok=593 changed=76 unreachable=0 failed=0 skipped=1125 rescued=0 ignored=1
kube-node-1 : ok=366 changed=34 unreachable=0 failed=0 skipped=628 rescued=0 ignored=0
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Wednesday 28 April 2021 10:57:57 +0000 (0:00:00.115) 0:15:21.190 *******
===============================================================================
kubernetes/control-plane : kubeadm | Initialize first master -------------- 65.88s
kubernetes/control-plane : Joining control plane node to the cluster. ----- 50.05s
kubernetes/kubeadm : Join to cluster -------------------------------------- 31.54s
download_container | Download image if required --------------------------- 24.38s
reload etcd --------------------------------------------------------------- 20.56s
Gen_certs | Write etcd member and admin certs to other etcd nodes --------- 19.32s
Gen_certs | Write node certs to other etcd nodes -------------------------- 19.14s
Gen_certs | Write etcd member and admin certs to other etcd nodes --------- 17.45s
network_plugin/canal : Canal | Create canal node manifests ---------------- 15.41s
kubernetes-apps/ansible : Kubernetes Apps | Lay Down CoreDNS Template ----- 13.27s
kubernetes/control-plane : Master | wait for kube-scheduler --------------- 11.97s
download_container | Download image if required --------------------------- 11.76s
Gen_certs | Write node certs to other etcd nodes -------------------------- 10.50s
kubernetes-apps/ansible : Kubernetes Apps | Start Resources ---------------- 8.28s
policy_controller/calico : Create calico-kube-controllers manifests -------- 7.61s
kubernetes/control-plane : set kubeadm certificate key --------------------- 6.32s
download : extract_file | Unpacking archive -------------------------------- 5.51s
Configure | Check if etcd cluster is healthy ------------------------------- 5.41s
Configure | Check if etcd-events cluster is healthy ------------------------ 5.41s
kubernetes-apps/network_plugin/canal : Canal | Start Resources ------------- 4.85s
[root@kube-control-1 ~]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kube-control-1 Ready control-plane,master 5m24s v1.20.6 192.168.4.11 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 containerd://1.4.4
kube-control-2 Ready control-plane,master 5m40s v1.20.6 192.168.4.12 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 containerd://1.4.4
kube-control-3 Ready control-plane,master 6m28s v1.20.6 192.168.4.13 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 containerd://1.4.4
kube-node-1 Ready <none> 3m53s v1.20.6 192.168.4.14 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 containerd://1.4.4
[root@kube-control-1 ~]# kubectl get all -n kube-system
NAME READY STATUS RESTARTS AGE
pod/calico-kube-controllers-67d6cdb559-cwf62 1/1 Running 0 4m10s
pod/canal-node-46x2b 2/2 Running 0 4m25s
pod/canal-node-5rkhq 2/2 Running 0 4m25s
pod/canal-node-fcsgn 2/2 Running 0 4m25s
pod/canal-node-nhkp8 2/2 Running 0 4m25s
pod/coredns-5d578c6f84-5nnp8 1/1 Running 0 3m33s
pod/coredns-5d578c6f84-w2kvf 1/1 Running 0 3m39s
pod/dns-autoscaler-6b675c8995-vp282 1/1 Running 0 3m34s
pod/kube-apiserver-kube-control-1 1/1 Running 0 6m51s
pod/kube-apiserver-kube-control-2 1/1 Running 0 7m7s
pod/kube-apiserver-kube-control-3 1/1 Running 0 7m41s
pod/kube-controller-manager-kube-control-1 1/1 Running 0 6m52s
pod/kube-controller-manager-kube-control-2 1/1 Running 0 7m7s
pod/kube-controller-manager-kube-control-3 1/1 Running 0 7m41s
pod/kube-proxy-5dfx8 1/1 Running 0 5m17s
pod/kube-proxy-fvrqk 1/1 Running 0 5m17s
pod/kube-proxy-jd84p 1/1 Running 0 5m17s
pod/kube-proxy-l2mjk 1/1 Running 0 5m17s
pod/kube-scheduler-kube-control-1 1/1 Running 0 6m51s
pod/kube-scheduler-kube-control-2 1/1 Running 0 7m7s
pod/kube-scheduler-kube-control-3 1/1 Running 0 7m41s
pod/nginx-proxy-kube-node-1 1/1 Running 0 5m20s
pod/nodelocaldns-77kq9 1/1 Running 0 3m32s
pod/nodelocaldns-fn5pd 1/1 Running 0 3m32s
pod/nodelocaldns-lfjzb 1/1 Running 0 3m32s
pod/nodelocaldns-xnc6n 1/1 Running 0 3m32s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/coredns ClusterIP 10.233.0.3 <none> 53/UDP,53/TCP,9153/TCP 3m38s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/canal-node 4 4 4 4 4 <none> 4m25s
daemonset.apps/kube-proxy 4 4 4 4 4 kubernetes.io/os=linux 7m53s
daemonset.apps/nodelocaldns 4 4 4 4 4 <none> 3m32s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/calico-kube-controllers 0/1 1 1 4m12s
deployment.apps/coredns 2/2 2 2 3m39s
deployment.apps/dns-autoscaler 1/1 1 1 3m34s
NAME DESIRED CURRENT READY AGE
replicaset.apps/calico-kube-controllers-67d6cdb559 1 1 1 4m12s
replicaset.apps/coredns-5d578c6f84 2 2 2 3m39s
replicaset.apps/dns-autoscaler-6b675c8995 1 1 1 3m34s
Common Issues
Reducing the Kubespray Image Size
The official Kubespray v2.15.1 image is 1.41GB. If you want something slimmer, the following Dockerfile builds a considerably smaller image:
FROM python:3.6-slim
ENV KUBE_VERSION v1.20.6
RUN apt update -y \
    && apt install -y \
       libssl-dev sshpass apt-transport-https jq moreutils vim iputils-ping \
       ca-certificates curl gnupg2 software-properties-common rsync wget tcpdump \
    && rm -rf /var/lib/apt/lists/* \
    && wget -q https://dl.k8s.io/$KUBE_VERSION/bin/linux/amd64/kubectl -O /usr/local/bin/kubectl \
    && chmod a+x /usr/local/bin/kubectl
WORKDIR /kubespray
COPY . .
RUN python3 -m pip install -r requirements.txt
The resulting image comes in at under 600MB, much smaller than before:
kubespray v2.15.1 73294562105a 1.41GB
kubespray v2.16-kube-v1.20.6-1.0 80b735995e48 579MB
Preventing Pushes to the Docker Registry
A Docker Registry deployed as-is, like my hub.k8s.li, has no access control, so any client that can reach it can push images. That is somewhat unsafe and worth hardening. Since clients pull images using only HTTP GET requests, nginx can simply reject methods like POST and PUT, which blocks pushes. Add the following to the nginx server block:
server {
    if ($request_method !~* GET|HEAD) {
        return 403;
    }
}
Pushing an image then fails with a 403 error:
root@debian:/root # docker pull hub.k8s.li/calico/node:v3.17.3
v3.17.3: Pulling from calico/node
282bf12aa8be: Pull complete
4ac1bb9354ad: Pull complete
Digest: sha256:3595a9a945a7ba346a12ee523fc7ae15ed35f1e6282b76bce7fec474d28d68bb
Status: Downloaded newer image for hub.k8s.li/calico/node:v3.17.3
root@debian:/root # docker push !$
root@debian:/root # docker push hub.k8s.li/calico/node:v3.17.3
The push refers to repository [hub.k8s.li/calico/node]
bc19ae092bb4: Preparing
94333d52d45d: Preparing
error parsing HTTP 403 response body: invalid character '<' looking for beginning of value: "<html>\r\n<head><title>403 Forbidden</title></head>\r\n<body bgcolor=\"white\">\r\n<center><h1>403 Forbidden</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n"
So what if you do need to push an image?
Since the Docker Registry binds to 127.0.0.1 rather than 0.0.0.0, you can push via localhost on the registry host itself.
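A minimal sketch of that workflow, assuming you are on the registry host and the target registry is a regular one rather than a pull-through proxy (a proxy-mode registry rejects pushes no matter where they come from); the port 5000 and image name here are hypothetical:
# Docker treats 127.0.0.0/8 registries as insecure by default, so no TLS is needed
docker tag myimage:v1.0 localhost:5000/myimage:v1.0
docker push localhost:5000/myimage:v1.0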
Self-Signed Registry Certificates
If the registry does use a self-signed certificate, the playbook below adds that certificate to the trusted CA directory on every node, so images can be pulled without configuring insecure-registries.
add-registry-ca.yml
---
- hosts: all
  gather_facts: False
  tasks:
    - name: Gen_certs | target ca-certificate store file
      set_fact:
        ca_cert_path: |-
          {% if ansible_os_family == "Debian" -%}
          /usr/local/share/ca-certificates/registry-ca.crt
          {%- elif ansible_os_family == "RedHat" -%}
          /etc/pki/ca-trust/source/anchors/registry-ca.crt
          {%- elif ansible_os_family in ["Flatcar Container Linux by Kinvolk"] -%}
          /etc/ssl/certs/registry-ca.pem
          {%- elif ansible_os_family == "Suse" -%}
          /etc/pki/trust/anchors/registry-ca.pem
          {%- elif ansible_os_family == "ClearLinux" -%}
          /usr/share/ca-certs/registry-ca.pem
          {%- endif %}
      tags:
        - facts
    - name: Gen_certs | add CA to trusted CA dir
      copy:
        src: "{{ registry_cert_path }}"
        dest: "{{ ca_cert_path }}"
      register: registry_ca_cert
    - name: Gen_certs | update ca-certificates (Debian/Ubuntu/SUSE/Flatcar) # noqa 503
      command: update-ca-certificates
      when: registry_ca_cert.changed and ansible_os_family in ["Debian", "Flatcar Container Linux by Kinvolk", "Suse"]
    - name: Gen_certs | update ca-certificates (RedHat) # noqa 503
      command: update-ca-trust extract
      when: registry_ca_cert.changed and ansible_os_family == "RedHat"
    - name: Gen_certs | update ca-certificates (ClearLinux) # noqa 503
      command: clrtrust add "{{ ca_cert_path }}"
      when: registry_ca_cert.changed and ansible_os_family == "ClearLinux"
root@debian:/kubespray# ansible-playbook -i deploy/inventory -e registry_cert_path=/kubespray/registry_ca.pem add-registry-ca.yml
Speeding Up Deployment
Kubespray has a dedicated task that pre-downloads the images needed for deployment. Because it runs against every node, it also pulls images a node will never use: for example kube-apiserver, kube-controller-manager, and kube-scheduler get pulled onto worker nodes as well, which makes the download task rather time-consuming.
TASK [download : set_container_facts | Display the name of the image being processed] ********************************************************************************************
ok: [kube-control-3] => {
    "msg": "gcr.k8s.li/kube-controller-manager"
}
ok: [kube-control-2] => {
    "msg": "gcr.k8s.li/kube-controller-manager"
}
ok: [kube-control-1] => {
    "msg": "gcr.k8s.li/kube-controller-manager"
}
ok: [kube-node-1] => {
    "msg": "gcr.k8s.li/kube-controller-manager"
}
ok: [kube-control-3] => {
    "msg": "gcr.k8s.li/kube-scheduler"
}
ok: [kube-control-2] => {
    "msg": "gcr.k8s.li/kube-scheduler"
}
ok: [kube-control-1] => {
    "msg": "gcr.k8s.li/kube-scheduler"
}
ok: [kube-node-1] => {
    "msg": "gcr.k8s.li/kube-scheduler"
}
You can disable this pre-download task by setting download_container: false; images are then pulled only when pods actually start on a node, which saves some deployment time.
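This is a single variable in the vars file; a sketch of the addition to deploy/env.yml:
# skip the pre-pull task; kubelet pulls only the images each node actually needs
download_container: false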