This chapter walks through installing and configuring a Kubernetes cluster with kubeadm, using containerd as the container runtime, and works common cluster components and their configuration into the examples wherever possible.
Original article: https://mp.weixin.qq.com/s/1KeX0Wua5icrAJ_1dBSY6A
Note: the operating system used here has been security-hardened and kernel-tuned, so your environment may differ slightly.
Host environment:
OS version: Ubuntu 20.04.2 LTS
Kernel version: 5.4.0-78-generic
Hostnames and IPs:
* k8s-master-1 10.10.107.220 2C 4G # control-plane node
* k8s-node-1 10.10.107.221 2C 4G # worker node
Software versions:
kubernetes -- v1.20.8
containerd -- 1.4.6
calico -- v3.18
Node environment
Tips: run the following commands on both k8s-master-1 and k8s-node-1.
# 1. Add hosts entries, and make sure the MAC address and product_uuid are unique on every node
tee -a /etc/hosts <<'EOF'
10.10.107.220 k8s-master-1
10.10.107.221 k8s-node-1
10.10.107.220 newcluster.k8s
EOF
# Use `ip link` or `ifconfig -a` to get the MAC addresses of the network interfaces
ifconfig -a
# Verify the product_uuid with the following command
sudo cat /sys/class/dmi/id/product_uuid
# k8s-master-1: d0154d56-ffdd-697d-09a6-34c851710f09
# k8s-node-1: f98c4d56-9fb2-bc92-98be-2648c83eb7b5
# 2. Sync the system time and set the time zone
date -R
sudo ntpdate ntp.aliyun.com
# chronyc sources
sudo timedatectl set-timezone Asia/Shanghai
sudo dpkg-reconfigure tzdata
sudo timedatectl set-local-rtc 0
timedatectl
# Local time: Tue 2021-07-06 11:28:54 CST
# Universal time: Tue 2021-07-06 03:28:54 UTC
# RTC time: Tue 2021-07-06 03:28:55
# Time zone: Asia/Shanghai (CST, +0800)
# System clock synchronized: yes
# NTP service: active
# RTC in local TZ: no
# 3. Disable the firewall and the swap partition (beginners: do not skip this step)
ufw disable && systemctl disable ufw
swapoff -a && sed -i 's|^/swap.img|#/swap.img|g' /etc/fstab
# 4. Kernel parameter tuning
egrep -q "^(#)?vm.swappiness.*" /etc/sysctl.conf && sed -ri "s|^(#)?vm.swappiness.*|vm.swappiness = 0|g" /etc/sysctl.conf || echo "vm.swappiness = 0" >> /etc/sysctl.conf
egrep -q "^(#)?net.ipv4.ip_forward.*" /etc/sysctl.conf && sed -ri "s|^(#)?net.ipv4.ip_forward.*|net.ipv4.ip_forward = 1|g" /etc/sysctl.conf || echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
# - Let iptables see bridged traffic
egrep -q "^(#)?net.bridge.bridge-nf-call-iptables.*" /etc/sysctl.conf && sed -ri "s|^(#)?net.bridge.bridge-nf-call-iptables.*|net.bridge.bridge-nf-call-iptables = 1|g" /etc/sysctl.conf || echo "net.bridge.bridge-nf-call-iptables = 1" >> /etc/sysctl.conf
egrep -q "^(#)?net.bridge.bridge-nf-call-ip6tables.*" /etc/sysctl.conf && sed -ri "s|^(#)?net.bridge.bridge-nf-call-ip6tables.*|net.bridge.bridge-nf-call-ip6tables = 1|g" /etc/sysctl.conf || echo "net.bridge.bridge-nf-call-ip6tables = 1" >> /etc/sysctl.conf
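The four one-liners above all repeat the same "update the existing entry, or append it" pattern. A small helper (hypothetical name `set_sysctl`, not from the original article) makes the intent clearer; here it is exercised against a scratch file rather than the real /etc/sysctl.conf:

```shell
# Hypothetical helper: update an existing (possibly commented-out) sysctl
# entry, or append it if absent.
set_sysctl() {
  local key="$1" value="$2" file="${3:-/etc/sysctl.conf}"
  # NB: the key is used verbatim in the regex, so "." matches any character;
  # that is fine for a sketch, escape the dots for strict matching.
  if grep -Eq "^(#)?${key}" "$file"; then
    sed -ri "s|^(#)?${key}.*|${key} = ${value}|" "$file"
  else
    echo "${key} = ${value}" >> "$file"
  fi
}

# Demonstrated on a scratch file instead of /etc/sysctl.conf:
tmp=$(mktemp)
echo "#vm.swappiness = 60" > "$tmp"
set_sysctl vm.swappiness 0 "$tmp"       # rewrites the commented line
set_sysctl net.ipv4.ip_forward 1 "$tmp" # appends a new line
cat "$tmp"
```

As in the original commands, `sysctl --system` must still be run afterwards for the values to take effect.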
# 5. Install the ipvs load-balancing management tools
apt install ipset ipvsadm -y
# 6. Load the required modules into the kernel and confirm they loaded correctly
# Persist across reboots
tee /etc/modules-load.d/k8s.conf <<'EOF'
# netfilter
br_netfilter
# containerd.
overlay
# ipvs
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
# Take effect immediately (without rebooting)
mkdir -vp /etc/modules.d/
cat > /etc/modules.d/k8s.modules <<EOF
#!/bin/bash
# let iptables see bridged traffic
modprobe -- br_netfilter
# containerd.
modprobe -- overlay
# ipvs
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
EOF
chmod 755 /etc/modules.d/k8s.modules && bash /etc/modules.d/k8s.modules && lsmod | grep -e ip_vs -e nf_conntrack
sysctl --system
reboot
# - Remove any existing Docker and containerd packages
sudo apt-get remove docker docker-engine docker.io containerd runc
# - Refresh the apt package index and install the packages that let apt use repositories over HTTPS
sudo apt-get update
sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg \
lsb-release
# - Add Docker's official GPG key:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# - Set up the stable repository with the command below. To add the nightly or test repository, append the word nightly or test (or both) after the word stable.
echo \
"deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/container.list > /dev/null
# - Refresh the apt package index, then install the latest containerd or, as in the next step, a specific version:
sudo apt-get update
# - List the available containerd.io versions before installing (as of 2021-07-06 the latest is 1.4.6-1)
apt-cache madison containerd.io && apt install -y containerd.io=1.4.6-1
# - Create and adjust the containerd configuration
mkdir -vp /etc/containerd/
containerd config default > /etc/containerd/config.toml
sed -i "s#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /etc/containerd/config.toml
sed -i '/containerd.runtimes.runc.options/a\ \ \ \ \ \ \ \ \ \ \ \ SystemdCgroup = true' /etc/containerd/config.toml
sed -i "s#https://registry-1.docker.io#https://xlx9erfu.mirror.aliyuncs.com#g" /etc/containerd/config.toml
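To see what the three sed edits actually do, here is a sketch run against a minimal stand-in file; on a real node the target is /etc/containerd/config.toml:

```shell
# Minimal stand-in for the parts of config.toml the three sed edits touch.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
    sandbox_image = "k8s.gcr.io/pause:3.2"
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      endpoint = ["https://registry-1.docker.io"]
EOF
# 1. Replace the unreachable k8s.gcr.io image repository with the Aliyun mirror.
sed -i "s#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" "$cfg"
# 2. Insert SystemdCgroup = true under the runc options section.
sed -i '/containerd.runtimes.runc.options/a\ \ \ \ \ \ \ \ \ \ \ \ SystemdCgroup = true' "$cfg"
# 3. Swap the docker.io mirror endpoint for an Aliyun accelerator endpoint.
sed -i "s#https://registry-1.docker.io#https://xlx9erfu.mirror.aliyuncs.com#g" "$cfg"
cat "$cfg"
```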
# - Start containerd
systemctl daemon-reload
systemctl enable containerd
systemctl restart containerd
# - Check the containerd service status and version
systemctl status containerd.service && ctr --version
# ctr containerd.io 1.4.6
# - Install from the Aliyun mirror to speed up downloads.
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
tee /etc/apt/sources.list.d/kubernetes.list <<'EOF'
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
# Refresh the apt package index and list the available kubernetes versions (pin one for stability)
apt-get update
apt-cache madison kubeadm | head -n 8
# kubeadm | 1.21.2-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
# kubeadm | 1.21.1-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
# kubeadm | 1.21.0-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
# kubeadm | 1.20.8-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
# kubeadm | 1.20.7-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
# kubeadm | 1.20.6-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
# kubeadm | 1.20.5-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
# kubeadm | 1.20.4-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
# - Install kubelet, kubeadm and kubectl and hold their versions (1.20.8-00 here)
sudo apt-get install -y kubelet=1.20.8-00 kubeadm=1.20.8-00 kubectl=1.20.8-00
sudo apt-mark hold kubelet kubeadm kubectl
# Point crictl at the containerd runtime endpoint
crictl config runtime-endpoint /run/containerd/containerd.sock
# Reload the systemd daemon and enable kubelet at boot
systemctl daemon-reload
systemctl enable kubelet && systemctl start kubelet
Tips: kubelet will now restart every few seconds, crash-looping while it waits for instructions from kubeadm.
systemctl status kubelet.service
# Active: activating (auto-restart) (Result: exit-code) since Tue 2021-07-06 21:38:26 CST; 9s ago
Run the following on k8s-master-1 only.

Cluster setup reference: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

Next, prepare the kubeadm initialization file on the master node. The default configuration can be exported with:

root@k8s-master-1:~/k8s# kubeadm config print init-defaults > kubeadm.yaml

Then adjust it to our needs: change the imageRepository value, set the kube-proxy mode to ipvs, and set criSocket to /run/containerd/containerd.sock. Note that because containerd is the runtime, cgroupDriver must be set to systemd when initializing the node.
cat > kubeadm.yaml <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 10.10.107.220
bindPort: 6443
nodeRegistration:
criSocket: /run/containerd/containerd.sock
name: k8s-master-1
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.20.8
controlPlaneEndpoint: "newcluster.k8s:6443"
networking:
dnsDomain: cluster.local
podSubnet: 172.16.0.0/16
serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF
# To use a custom APISERVER name, add a matching IP entry in /etc/hosts
tee -a /etc/hosts <<'EOF'
10.10.107.220 newcluster.k8s
EOF
Then initialize the master node with the configuration file above:
kubeadm init --config=kubeadm.yaml
# [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
# [certs] Using certificateDir folder "/etc/kubernetes/pki"
# [certs] Generating "ca" certificate and key
# [certs] Generating "apiserver" certificate and key
# [certs] apiserver serving cert is signed for DNS names [k8s-master-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local newcluster.k8s] and IPs [10.96.0.1 10.10.107.220]
......
# [addons] Applied essential addon: CoreDNS
# [addons] Applied essential addon: kube-proxy
# Your Kubernetes control-plane has initialized successfully!
# To start using your cluster, you need to run the following as a regular user:
# - Copy the kubeconfig into the current user's home directory; kubectl can then be used to inspect and manage the cluster.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
# You should now deploy a pod network to the cluster.
# Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
# You can now join any number of control-plane nodes by copying certificate authorities
# and service account keys on each node and then running the following as root:
# Run on any machine joining as an additional control-plane node
kubeadm join newcluster.k8s:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:d57743fa8657a959e6f96ea1b2d16ce32c315a2a6dc080a65a2b0fc8849bfbd4 \
--control-plane
# Then you can join any number of worker nodes by running the following on each as root:
# Run on worker nodes
kubeadm join newcluster.k8s:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:d57743fa8657a959e6f96ea1b2d16ce32c315a2a6dc080a65a2b0fc8849bfbd4
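For reference, the --discovery-token-ca-cert-hash value is the SHA-256 digest of the cluster CA certificate's DER-encoded public key. The sketch below reproduces the computation with openssl; a throwaway self-signed certificate stands in for the real input, which on a control plane would be /etc/kubernetes/pki/ca.crt:

```shell
# Generate a throwaway "CA" certificate to stand in for /etc/kubernetes/pki/ca.crt.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo-ca" \
  -keyout /tmp/demo-ca.key -out /tmp/demo-ca.crt -days 1 2>/dev/null

# Extract the public key, convert it to DER, and hash it with SHA-256.
hash=$(openssl x509 -pubkey -noout -in /tmp/demo-ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 | awk '{print $NF}')
echo "sha256:${hash}"
```

If the token or hash is ever lost, a fresh join command can be printed on the master with `kubeadm token create --print-join-command`.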
Tips: because kubeadm manages kubelet as a systemd service, the systemd cgroup driver is recommended for kubeadm-based installs, not cgroupfs. Reference: https://kubernetes.io/zh/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/
On k8s-node-1, run the join command printed at the end of the init step above:

kubeadm join newcluster.k8s:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:d57743fa8657a959e6f96ea1b2d16ce32c315a2a6dc080a65a2b0fc8849bfbd4

At this point `kubectl get node` shows both nodes as NotReady, because no network plugin is installed yet. We now deploy a Pod network; of the networking add-ons Kubernetes lists, we install Calico (a secure L3 network with network-policy support). Pick a Calico release at https://docs.projectcalico.org/releases
# - Before the Pod network is installed the nodes are NotReady
~/k8s# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master-1 NotReady control-plane,master 11m v1.20.8
k8s-node-1 NotReady <none> 4m19s v1.20.8
# - Download the calico deployment manifest from the official site.
~/k8s# wget https://docs.projectcalico.org/v3.18/manifests/calico.yaml
# - Customize calico's IP address pool.
# The default IPv4 pool to create on startup if none exists. Pod IPs will be chosen from this range. Changing this value after installation will have no effect. This should fall within `--cluster-cidr`.
vim calico.yaml
- name: CALICO_IPV4POOL_CIDR
  value: "172.16.0.0/16"   # must match podSubnet in kubeadm.yaml; the shipped default is 192.168.0.0/16
# - Deploy the calico network plugin
kubectl apply -f calico.yaml
# - Then watch the network-related pods in kube-system; they normally move through Pending -> Init -> ContainerCreating -> Running.
~/k8s# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-77dd468cdb-2lchv 0/1 Pending 0 18s
calico-node-pz9qx 0/1 Init:0/3 0 18s
calico-node-zvst7 0/1 Init:0/3 0 18s
coredns-54d67798b7-78n5j 0/1 Pending 0 15m
coredns-54d67798b7-z9c8f 0/1 Pending 0 15m
# - After a few minutes the calico pods are all running
~/k8s# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-77dd468cdb-2lchv 1/1 Running 0 2m18s
calico-node-pz9qx 1/1 Running 0 2m18s
calico-node-zvst7 1/1 Running 0 2m18s
coredns-54d67798b7-78n5j 1/1 Running 0 17m
coredns-54d67798b7-z9c8f 1/1 Running 0 17m
# - The nodes are now Ready; the calico network plugin is fully deployed.
~/k8s# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master-1 Ready control-plane,master 17m v1.20.8
k8s-node-1 Ready <none> 10m v1.20.8
# Optional: enable kubectl command completion
apt install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
Practice goal: run an Nginx container in the Kubernetes cluster, expose its nginx-status page, and exercise the common k8s objects and controllers along the way.
Practice steps:
Create a namespace named weiyigeek and deploy the Nginx web container into it.
$ kubectl create namespace weiyigeek
# namespace/weiyigeek created
# Create a ConfigMap from a YAML manifest
tee nginx-conf.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
name: nginx-conf
namespace: weiyigeek
data:
nginx.conf: |
user nginx;
worker_processes auto;
worker_cpu_affinity 00000001 00000010 00000100 00001000;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
worker_rlimit_nofile 65536;
events {
worker_connections 65535;
accept_mutex on;
multi_accept on;
}
http {
include mime.types;
default_type application/octet-stream;
log_format access_json '{"@timestamp":"$time_iso8601",'
'"host":"$server_addr",'
'"clientip":"$remote_addr",'
'"size":$body_bytes_sent,'
'"responsetime":$request_time,'
'"upstreamtime":"$upstream_response_time",'
'"upstreamhost":"$upstream_addr",'
'"http_host":"$host",'
'"url":"$uri",'
'"domain":"$host",'
'"xff":"$http_x_forwarded_for",'
'"referer":"$http_referer",'
'"status":"$status"}';
access_log /var/log/nginx/access.log access_json;
client_max_body_size 50M;
keepalive_timeout 300;
fastcgi_buffers 8 128k;
fastcgi_buffer_size 128k;
fastcgi_busy_buffers_size 256k;
fastcgi_temp_file_write_size 256k;
proxy_connect_timeout 90;
proxy_read_timeout 300;
proxy_send_timeout 300;
sendfile on;
server {
listen 80;
server_name localhost;
add_header Cache-Control no-cache;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
}
location /status
{
stub_status on;
access_log off;
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root html;
}
}
# include /etc/nginx/conf.d/*.conf;
}
EOF
# Option 1: create the ConfigMap from the YAML manifest
$ kubectl apply -f nginx-conf.yaml
# configmap/nginx-conf created
# Option 2: create the ConfigMap directly from the nginx.conf file
# kubectl create configmap nginx-conf --from-file=nginx.conf
# kubectl describe configmap nginx-conf
# List the created configmap
$ kubectl -n weiyigeek get configmap nginx-conf
# NAME DATA AGE
# nginx-conf 1 25m
# Inspect the stored configuration
$ kubectl describe -n weiyigeek configmap nginx-conf
# Name: nginx-conf
# Namespace: weiyigeek
# Labels: <none>
# Annotations: <none>
# Data
# ====
# nginx.conf:
# ----
# user nginx;
# worker_processes auto;
# ............
# events {
# ...................
# }
# http {
# ....................
# server {
# listen 80;
# server_name localhost;
# add_header Cache-Control no-cache;
# location / {
# root /usr/share/nginx/html;
# index index.html index.htm;
# }
# location /status
# {
# stub_status on;
# access_log off;
# }
# error_page 500 502 503 504 /50x.html;
# location = /50x.html {
# root html;
# }
# }
# }
# Events: <none>
tee nginx-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-deploy
namespace: weiyigeek
spec:
replicas: 2
selector:
matchLabels:
app: nginx-test
template:
metadata:
labels:
app: nginx-test
spec:
initContainers:
- name: init-html
image: busybox
imagePullPolicy: IfNotPresent
command: ['sh', '-c', "env;echo ConfigMap:${MSG}--HostName-${HOSTNAME} > /usr/share/nginx/html/index.html"]
volumeMounts:
- name: web
mountPath: "/usr/share/nginx/html"
securityContext:
privileged: true
containers:
- name: nginx
image: "nginx:latest"
imagePullPolicy: IfNotPresent
ports:
- containerPort: 80
volumeMounts:
- name: nginx-conf
mountPath: /etc/nginx/nginx.conf
subPath: nginx.conf
- name: web
mountPath: "/usr/share/nginx/html"
volumes:
- name: nginx-conf
configMap:
name: nginx-conf
items:
- key: nginx.conf
path: nginx.conf
- name: web
emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
name: nginx-service
namespace: weiyigeek
labels:
app: nginx-test
spec:
type: NodePort
ports:
- name: nginx
port: 80
targetPort: 80
nodePort: 30000
protocol: TCP
selector:
app: nginx-test
EOF
# Deploy the Nginx Deployment and Service with kubectl apply
kubectl apply -f nginx-deployment.yaml
# deployment.apps/web-deploy created
# service/nginx-service created
~/k8s/containerd# kubectl -n weiyigeek get pod
# NAME READY STATUS RESTARTS AGE
# web-deploy-99fbb677d-jbbwk 1/1 Running 0 48s
# web-deploy-99fbb677d-md5z9 1/1 Running 0 51s
~/k8s/containerd# kubectl -n weiyigeek get svc
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# nginx-service NodePort 10.105.172.104 <none> 80:30000/TCP 59m
# Requests are load-balanced round-robin across the nginx replicas
~/k8s/containerd# curl http://10.105.172.104
# ConfigMap:--HostName-web-deploy-99fbb677d-md5z9
~/k8s/containerd# curl http://10.105.172.104
# ConfigMap:--HostName-web-deploy-99fbb677d-jbbwk
# Check the nginx status page
~/k8s/containerd# curl http://10.105.172.104/status
# Active connections: 1
# server accepts handled requests
# 3 3 3
# Reading: 0 Writing: 1 Waiting: 0
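The stub_status output above is easy to post-process. A small awk sketch (hypothetical helper `parse_status`, not part of the article) turns it into key=value pairs; a sample of the output is inlined here, but normally you would pipe `curl -s http://10.105.172.104/status` into it:

```shell
# Hypothetical helper: turn nginx stub_status output into key=value pairs.
parse_status() {
  awk '
    /^Active connections:/         { print "active="  $3 }
    /^ *[0-9]+ +[0-9]+ +[0-9]+ *$/ { print "accepts=" $1; print "handled=" $2; print "requests=" $3 }
    /^Reading:/                    { print "reading=" $2; print "writing=" $4; print "waiting=" $6 }
  '
}

# Sample input inlined; normally: curl -s http://10.105.172.104/status | parse_status
parse_status <<'EOF'
Active connections: 1
server accepts handled requests
 3 3 3
Reading: 0 Writing: 1 Waiting: 0
EOF
```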
kubectl -n weiyigeek port-forward --address 0.0.0.0 web-deploy-99fbb677d-jbbwk 80:80
# Forwarding from 0.0.0.0:80 -> 80
# Handling connection for 80
# Handling connection for 80
Nginx status fields:
waiting = active - (reading + writing): keep-alive connections that have finished handling a request and are idle, waiting for the next one.

Error: ctr: failed to dial "/run/containerd/containerd.sock": connection error: desc = "transport: error while dialing: dial unix /run/containerd/containerd.sock: connect: permission denied"
# - Check the service status
systemctl status containerd.service
# - Switch to the root user (by default only root may access the socket)
su - root
ctr images ls
$ ctr container rm busybox
ERRO[0000] failed to delete container "busybox" error="container \"busybox\" in namespace \"default\": not found"
ctr: container "busybox" in namespace "default": not found
# ctr operates on the "default" namespace unless told otherwise, while Kubernetes resources live under k8s.io; list the namespaces first:
ctr namespace ls
Error: INFO[0001] trying next host error="failed to authorize: failed to fetch anonymous token: ..."

$ ctr -n k8s.io images pull docker.io/library/busybox:latest
# docker.io/library/busybox:latest: resolving |--------------------------------------|
# elapsed: 1.1 s total: 0.0 B (0.0 B/s)
# INFO[0001] trying next host error="failed to authorize: failed to fetch anonymous token: Get https://auth.docker.io/token?scope=repository%3Alibrary%2Fbusybox%3Apull&service=registry.docker.io: read tcp 10.10.107.220:62946->107.23.149.57:443: read: connection reset by peer" host=registry-1.docker.io
# ctr: failed to resolve reference "docker.io/library/busybox:latest": failed to authorize: failed to fetch anonymous token: Get https://auth.docker.io/token?scope=repository%3Alibrary%2Fbusybox%3Apull&service=registry.docker.io: read tcp 10.10.107.220:62946->107.23.149.57:443: read: connection reset by peer
# In ctr, `images`, `image` and `i` are aliases for the same subcommand; the failure above is a transient network error reaching auth.docker.io, and a retry succeeds:
ctr -n k8s.io image pull docker.io/library/busybox:latest
# docker.io/library/busybox:latest: resolved |++++++++++++++++++++++++++++++++++++++|
# index-sha256:930490f97e5b921535c153e0e7110d251134cc4b72bbb8133c6a5065cc68580d: done |++++++++++++++++++++++++++++++++++++++|
# manifest-sha256:dca71257cd2e72840a21f0323234bb2e33fea6d949fa0f21c5102146f583486b: done |++++++++++++++++++++++++++++++++++++++|
# layer-sha256:b71f96345d44b237decc0c2d6c2f9ad0d17fde83dad7579608f1f0764d9686f2: done |++++++++++++++++++++++++++++++++++++++|
# config-sha256:69593048aa3acfee0f75f20b77acb549de2472063053f6730c4091b53f2dfb02: done |++++++++++++++++++++++++++++++++++++++|
# elapsed: 2.5 s total: 0.0 B (0.0 B/s)
# - This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
# - On a systemd-based system, debug with:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
# - Additionally, control-plane components may have crashed or exited when the container runtime started; to troubleshoot, list all containers with your preferred container-runtime CLI.
- 'crictl --runtime-endpoint /run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
# - Once you find the failing container, inspect its logs with:
- 'crictl --runtime-endpoint /run/containerd/containerd.sock logs CONTAINERID'
# - Handy commands for troubleshooting cluster pods
kubectl get pod <pod-name> -o yaml # check whether the Pod spec is correct
kubectl describe pod <pod-name> # view the Pod's events
kubectl logs <pod-name> [-c <container-name>] # view container logs
journalctl -xeu kubelet
# Jul 06 22:15:29 k8s-master-1 kubelet[11333]: E0706 22:15:29.108499 11333 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
# Jul 06 22:15:29 k8s-master-1 kubelet[11333]: E0706 22:15:29.206203 11333 kubelet.go:2263] node "k8s-master-1" not found
# Jul 06 22:15:29 k8s-master-1 kubelet[11333]: E0706 22:15:29.208552 11333 remote_runtime.go:116] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to get sandbox image "k8s.gcr.io/pause:3.2": failed to pull image ">
# Jul 06 22:15:29 k8s-master-1 kubelet[11333]: E0706 22:15:29.208609 11333 kuberuntime_sandbox.go:70] CreatePodSandbox for pod "kube-controller-manager-k8s-master-1_kube-system(d1d11a3cb97124022c9d85b070508dfa)" failed: rpc error: code = Unknown de>
# Jul 06 22:15:29 k8s-master-1 kubelet[11333]: E0706 22:15:29.208621 11333 kuberuntime_manager.go:755] createPodSandbox for pod "kube-controller-manager-k8s-master-1_kube-system(d1d11a3cb97124022c9d85b070508dfa)" failed: rpc error: code = Unknown d>
# Jul 06 22:15:29 k8s-master-1 kubelet[11333]: E0706 22:15:29.208682 11333 pod_workers.go:191] Error syncing pod d1d11a3cb97124022c9d85b070508dfa ("kube-controller-manager-k8s-master-1_kube-system(d1d11a3cb97124022c9d85b070508dfa)"), skipping: fail>
# The sandbox image k8s.gcr.io/pause:3.2 cannot be pulled; point containerd at the Aliyun mirror and restart it:
sed -i "s#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /etc/containerd/config.toml
systemctl restart containerd
Error: FATA[0000] pulling image failed: rpc error: code = Unknown desc = failed to pull and unpack image

$ crictl pull docker.io/pollyduan/ingress-nginx-controller:v0.47.0
# FATA[0000] pulling image failed: rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/pollyduan/ingress-nginx-controller:v0.47.0": failed to resolve reference "docker.io/pollyduan/ingress-nginx-controller:v0.47.0": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
# Fix: point crictl at the containerd socket, and make sure a docker.io mirror is configured in containerd:
$ tee /etc/crictl.yaml <<'EOF'
runtime-endpoint: /run/containerd/containerd.sock
image-endpoint: "/run/containerd/containerd.sock"
timeout: 0
debug: false
EOF
$ grep -C 3 'registry.mirrors."docker.io" ' /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".registry]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://xlx9erfu.mirror.aliyuncs.com"]
Error: Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container ...]

Warning FailedCreatePodSandBox 89s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "1c97ad2710e2939c0591477f9d6dde8e0d7d31b3fbc138a7fa38aaa657566a9a" network for pod "coredns-7f89b7bc75-qg924": networkPlugin cni failed to set up pod "coredns-7f89b7bc75-qg924_kube-system" network: error getting ClusterInformation: Get "https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes"), failed to clean up sandbox container "1c97ad2710e2939c0591477f9d6dde8e0d7d31b3fbc138a7fa38aaa657566a9a" network for pod "coredns-7f89b7bc75-qg924": networkPlugin cni failed to teardown pod "coredns-7f89b7bc75-qg924_kube-system" network: error getting ClusterInformation: Get "https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")]
Fix: pin calico's IP autodetection to the host's network interface:

$ vim calico.yaml
# Cluster type to identify the deployment type
- name: CLUSTER_TYPE
  value: "k8s,bgp"
# Add the two lines below
- name: IP_AUTODETECTION_METHOD
  value: "interface=ens192"
# ens192 is the name of the local network interface
$ kubectl apply -f calico.yaml
$ kubectl get pods -n kube-system
Error: [ERROR SystemVerification]: unexpected kernel config: CONFIG_CGROUP_PIDS

error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR SystemVerification]: unexpected kernel config: CONFIG_CGROUP_PIDS
[ERROR SystemVerification]: missing required cgroups: pids

Fix: make sure the kernel is built with CONFIG_CGROUP_PIDS=y, then upgrade the kernel. Verify the cgroup options of the running kernel with:

$ cat /boot/config-`uname -r` | grep CGROUP
CONFIG_CGROUPS=y
CONFIG_BLK_CGROUP=y
CONFIG_CGROUP_WRITEBACK=y
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUP_PIDS=y
CONFIG_CGROUP_RDMA=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_BPF=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_SOCK_CGROUP_DATA=y
# CONFIG_BLK_CGROUP_IOLATENCY is not set
CONFIG_BLK_CGROUP_IOCOST=y
# CONFIG_BFQ_CGROUP_DEBUG is not set
CONFIG_NETFILTER_XT_MATCH_CGROUP=m
CONFIG_NET_CLS_CGROUP=m
CONFIG_CGROUP_NET_PRIO=y
CONFIG_CGROUP_NET_CLASSID=y
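The preflight requirement above boils down to a couple of grep checks. A sketch with a hypothetical helper `check_cgroups` (a sample kernel config is inlined so the snippet is self-contained; on a real host you would pass /boot/config-$(uname -r)):

```shell
# Hypothetical helper: verify the cgroup options kubeadm's preflight check
# complained about are enabled in a kernel config file.
check_cgroups() {
  local cfg="$1"
  for opt in CONFIG_CGROUPS CONFIG_CGROUP_PIDS; do
    grep -qx "${opt}=y" "$cfg" || { echo "missing: $opt"; return 1; }
  done
  echo "cgroup config OK"
}

# Sample config stands in for /boot/config-$(uname -r):
sample=$(mktemp)
printf '%s\n' CONFIG_CGROUPS=y CONFIG_CGROUP_PIDS=y > "$sample"
check_cgroups "$sample"
```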
Original-work statement: this article was published on the Tencent Cloud Developer Community with the author's authorization; do not reproduce it without permission.
For infringement concerns, contact cloudcommunity@tencent.com for removal.