k8s Basics - Health Check Mechanism

Probe types

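  • ExecAction
    • This probe runs an arbitrary command inside the container and checks its exit status. An exit status of 0 means the probe succeeded; any other value counts as a failure. For a liveness probe, the container is restarted once the failure threshold is reached.
  • TCPSocketAction
    • This probe tries to open a TCP connection to the specified container port. If the connection is established, the probe succeeds; otherwise it fails and, after enough consecutive failures, the container is restarted.
  • HTTPGetAction
    • This probe sends an HTTP GET request to the container's IP address. If a response arrives with a status code of at least 200 and below 400, the probe succeeds. Any other status code, or no response at all, counts as a failure, and the container is restarted after enough consecutive failures.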
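
The walkthrough below only exercises the HTTP GET probe, so here is a minimal sketch of what the other two probe types might look like in a manifest. The pod name, container names, the busybox image, and the /tmp/healthy file are all assumptions made up for illustration; they are not part of the original setup.

```yaml
# Hypothetical manifest for illustration only; names, images and the
# /tmp/healthy path are assumptions, not taken from the article's cluster.
apiVersion: v1
kind: Pod
metadata:
  name: probe-types-demo
spec:
  containers:
  # ExecAction: the probe passes while the command exits with status 0.
  - name: exec-demo
    image: busybox:latest
    command: ["sh", "-c", "touch /tmp/healthy && sleep 3600"]
    livenessProbe:
      exec:
        command: ["cat", "/tmp/healthy"]
  # TCPSocketAction: the probe passes if a TCP connection to port 80 opens.
  - name: tcp-demo
    image: nginx:latest
    ports:
    - containerPort: 80
      protocol: TCP
    livenessProbe:
      tcpSocket:
        port: 80
```

Deleting /tmp/healthy inside the exec-demo container should make `cat` exit non-zero, which is one way to watch the exec probe fail.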

Example

Create an HTTP GET probe

# cat nginx-health.yml 
apiVersion: v1
kind: Pod
metadata:
  name: nginx-health
spec:
  nodeSelector:
    server: 'backend'
  containers:
  - image: nginx:latest
    name: nginx-health
    ports:
    - containerPort: 80
      protocol: TCP
    livenessProbe:
      httpGet:
        path: /index.php
        port: 80

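Here the probe targets a path that does not exist, so the probe is bound to fail and the kubelet will restart the container. Create the pod and, once it has started, look at its logs and description.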

# kubectl logs -f nginx-health
2019/10/14 08:56:34 [error] 6#6: *1 open() "/usr/share/nginx/html/index.php" failed (2: No such file or directory), client: 192.168.152.168, server: localhost, request: "GET /index.php HTTP/1.1", host: "192.168.166.155:80"
192.168.152.168 - - [14/Oct/2019:08:56:34 +0000] "GET /index.php HTTP/1.1" 404 153 "-" "kube-probe/1.15" "-"

Now look at the pod description

# kubectl describe pod nginx-health
Name:         nginx-health
Namespace:    default
Priority:     0
Node:         node1/192.168.152.168
Start Time:   Mon, 14 Oct 2019 04:50:10 -0400
Labels:       <none>
Annotations:  cni.projectcalico.org/podIP: 192.168.166.155/32
              kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"nginx-health","namespace":"default"},"spec":{"containers":[{"image":"...
Status:       Running
IP:           192.168.166.155
Containers:
  nginx-health:
    Container ID:   docker://36e07faa8b8d0eb7f3e5465186cc2f23cf8198776d45c546f9ead3264e901c02
    Image:          nginx:latest
    Image ID:       docker-pullable://nginx@sha256:aeded0f2a861747f43a01cf1018cf9efe2bdd02afd57d2b11fcc7fcadc16ccd1
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 14 Oct 2019 05:00:09 -0400
      Finished:     Mon, 14 Oct 2019 05:00:34 -0400
    Ready:          False
    Restart Count:  7
    Liveness:       http-get http://:80/index.php delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-ps4lj (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-ps4lj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-ps4lj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  server=backend
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  11m                  default-scheduler  Successfully assigned default/nginx-health to node1
  Normal   Created    9m21s (x3 over 11m)  kubelet, node1     Created container nginx-health
  Normal   Started    9m21s (x3 over 11m)  kubelet, node1     Started container nginx-health
  Normal   Pulling    8m52s (x4 over 11m)  kubelet, node1     Pulling image "nginx:latest"
  Normal   Killing    8m52s (x3 over 10m)  kubelet, node1     Container nginx-health failed liveness probe, will be restarted
  Normal   Pulled     6m3s (x6 over 11m)   kubelet, node1     Successfully pulled image "nginx:latest"
  Warning  Unhealthy  82s (x22 over 10m)   kubelet, node1     Liveness probe failed: HTTP probe failed with statuscode: 404

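The events show the probe failing with a 404, and the Liveness line of the description shows the probe settings:

  • delay=0s: probing starts as soon as the container has started
  • timeout=1s: the container must respond within one second, otherwise the attempt counts as a failure
  • period=10s: the probe runs every 10 seconds
  • #failure=3: the container is restarted after three consecutive failed probes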
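
To make the mapping concrete, the probe from the first manifest could be written with every default spelled out, roughly as follows. This is a sketch of a container-spec fragment rather than a complete manifest; the values are the defaults reported in the describe output above, and omitting them gives the same behaviour.

```yaml
# Fragment of a container spec, not a complete manifest.
# Each field is annotated with the matching token from the
# "Liveness:" line of `kubectl describe pod`.
livenessProbe:
  httpGet:
    path: /index.php
    port: 80
  initialDelaySeconds: 0   # delay=0s
  timeoutSeconds: 1        # timeout=1s
  periodSeconds: 10        # period=10s
  successThreshold: 1      # #success=1
  failureThreshold: 3      # #failure=3
```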

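Now let's point the probe at a path that does exist and change some of the settings mentioned above, for example waiting ten seconds after the container starts before the first probe. How do we find out which fields are available? kubectl has built-in help for this: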

# kubectl explain pods.spec.containers.livenessProbe
KIND:     Pod
VERSION:  v1

RESOURCE: livenessProbe <Object>

DESCRIPTION:
     Periodic probe of container liveness. Container will be restarted if the
     probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

     Probe describes a health check to be performed against a container to
     determine whether it is alive or ready to receive traffic.

FIELDS:
   exec <Object>
     One and only one of the following should be specified. Exec specifies the
     action to take.

   failureThreshold     <integer>
     Minimum consecutive failures for the probe to be considered failed after
     having succeeded. Defaults to 3. Minimum value is 1.

   httpGet      <Object>
     HTTPGet specifies the http request to perform.

   initialDelaySeconds  <integer>
     Number of seconds after the container has started before liveness probes
     are initiated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

   periodSeconds        <integer>
     How often (in seconds) to perform the probe. Default to 10 seconds. Minimum
     value is 1.

   successThreshold     <integer>
     Minimum consecutive successes for the probe to be considered successful
     after having failed. Defaults to 1. Must be 1 for liveness. Minimum value
     is 1.

   tcpSocket    <Object>
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported

   timeoutSeconds       <integer>
     Number of seconds after which the probe times out. Defaults to 1 second.
     Minimum value is 1. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

The modified yml file looks like this

# cat nginx-health.yml 
apiVersion: v1
kind: Pod
metadata:
  name: nginx-health
spec:
  nodeSelector:
    server: 'backend'
  containers:
  - image: nginx:latest
    name: nginx-health
    ports:
    - containerPort: 80
      protocol: TCP
    livenessProbe:
      httpGet:
        path: /index.html
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 10

After starting the pod, check the logs again

# kubectl logs -f nginx-health
192.168.152.168 - - [14/Oct/2019:09:03:39 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "kube-probe/1.15" "-"
192.168.152.168 - - [14/Oct/2019:09:03:49 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "kube-probe/1.15" "-"

Check whether the modified settings have taken effect

# kubectl describe pod nginx-health
Name:         nginx-health
Namespace:    default
Priority:     0
Node:         node1/192.168.152.168
Start Time:   Mon, 14 Oct 2019 05:17:49 -0400
Labels:       <none>
Annotations:  cni.projectcalico.org/podIP: 192.168.166.157/32
              kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"nginx-health","namespace":"default"},"spec":{"containers":[{"image":"...
Status:       Running
IP:           192.168.166.157
Containers:
  nginx-health:
    Container ID:   docker://011be58ccbe6fbc6e490588ec5a1f60028e1593a1f28a59022fda72ff544cffc
    Image:          nginx:latest
    Image ID:       docker-pullable://nginx@sha256:aeded0f2a861747f43a01cf1018cf9efe2bdd02afd57d2b11fcc7fcadc16ccd1
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Mon, 14 Oct 2019 05:18:18 -0400
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:80/index.html delay=10s timeout=10s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-ps4lj (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-ps4lj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-ps4lj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  server=backend
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  67s   default-scheduler  Successfully assigned default/nginx-health to node1
  Normal  Pulling    66s   kubelet, node1     Pulling image "nginx:latest"
  Normal  Pulled     38s   kubelet, node1     Successfully pulled image "nginx:latest"
  Normal  Created    38s   kubelet, node1     Created container nginx-health
  Normal  Started    38s   kubelet, node1     Started container nginx-health