在Kubernetes集群上运行多个服务和应用程序时,统一的日志收集不可或缺,Elasticsearch、Filebeat和Kibana(EFK)堆栈是目前较受欢迎的日志收集解决方案。在本教程中,我们将为部署在集群中的应用和集群本身设置生产级Kubernetes日志记录。将使用Elasticsearch作为日志后端,同时Elasticsearch的设置将具有极高的可扩展性和容错性。本教程一共有两篇,这是第一篇。
部署架构
在部署过程中有几个重要的配置需要特别注意:
那么接下来我们将在GKE集群上部署这些服务(你使用其他的云服务也可以)。
Master节点Deployment和headless service
部署以下manifest以创建master节点和headless service
apiVersion: v1
kind: Namespace
metadata:
name: elasticsearch
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: es-master
namespace: elasticsearch
labels:
component: elasticsearch
role: master
spec:
replicas: 3
template:
metadata:
labels:
component: elasticsearch
role: master
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: role
operator: In
values:
- master
topologyKey: kubernetes.io/hostname
initContainers:
- name: init-sysctl
image: busybox:1.27.2
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
containers:
- name: es-master
image: quay.io/pires/docker-elasticsearch-kubernetes:6.2.4
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: my-es
- name: NUMBER_OF_MASTERS
value: "2"
- name: NODE_MASTER
value: "true"
- name: NODE_INGEST
value: "false"
- name: NODE_DATA
value: "false"
- name: HTTP_ENABLE
value: "false"
- name: ES_JAVA_OPTS
value: -Xms256m -Xmx256m
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
limits:
cpu: 2
ports:
- containerPort: 9300
name: transport
volumeMounts:
- name: storage
mountPath: /data
volumes:
- emptyDir:
medium: ""
name: "storage"
---
apiVersion: v1
kind: Service
metadata:
name: elasticsearch-discovery
namespace: elasticsearch
labels:
component: elasticsearch
role: master
spec:
selector:
component: elasticsearch
role: master
ports:
- name: transport
port: 9300
protocol: TCP
clusterIP: None
如果你关注任何一个master节点 pod的日志,你会看到它们之间的master节点的选举。这时,master节点pod会选择哪一个是该组的leader。当跟踪master节点日志时,你还会看到新的数据和客户端节点何时被添加。
root$ kubectl -n elasticsearch logs -f po/es-master-594b58b86c-9jkj2 | grep ClusterApplierService
[2018-10-21T07:41:54,958][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-9jkj2] detected_master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300}, added {{es-master-594b58b86c-lfpps}{wZQmXr5fSfWisCpOHBhaMg}{50jGPeKLSpO9RU_HhnVJCA}{10.9.124.81}{10.9.124.81:9300},{es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [3]])
上面可以看到,名为es-master-594b58b86c-bj7g7的es-master pod被选为领导者,其他2个pod被添加到集群中。名为elasticsearch-discovery的headless service默认设置为docker镜像中的环境变量,用于节点间的发现。当然,这个可以被覆盖。
部署数据节点
我们将使用以下manifest来部署数据节点的有状态集和Headless service:
apiVersion: v1
kind: Namespace
metadata:
name: elasticsearch
---
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
name: fast
provisioner: kubernetes.io/gce-pd
parameters:
type: pd-ssd
fsType: xfs
allowVolumeExpansion: true
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: es-data
namespace: elasticsearch
labels:
component: elasticsearch
role: data
spec:
serviceName: elasticsearch-data
replicas: 3
template:
metadata:
labels:
component: elasticsearch
role: data
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: role
operator: In
values:
- data
topologyKey: kubernetes.io/hostname
initContainers:
- name: init-sysctl
image: busybox:1.27.2
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
containers:
- name: es-data
image: quay.io/pires/docker-elasticsearch-kubernetes:6.2.4
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: my-es
- name: NODE_MASTER
value: "false"
- name: NODE_INGEST
value: "false"
- name: HTTP_ENABLE
value: "false"
- name: ES_JAVA_OPTS
value: -Xms256m -Xmx256m
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
limits:
cpu: 2
ports:
- containerPort: 9300
name: transport
volumeMounts:
- name: storage
mountPath: /data
volumeClaimTemplates:
- metadata:
name: storage
annotations:
volume.beta.kubernetes.io/storage-class: "fast"
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: fast
resources:
requests:
storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
name: elasticsearch-data
namespace: elasticsearch
labels:
component: elasticsearch
role: data
spec:
ports:
- port: 9300
name: transport
clusterIP: None
selector:
component: elasticsearch
role: data
在数据节点中,headless service为节点提供了稳定的Network Identity,也有助于节点之间的数据传输。在将持久卷附加到pod之前,格式化持久卷十分重要。这可以通过在创建存储类时指定卷类型来完成。我们也可以设置一个flag来允许卷即时扩展。
...
parameters:
type: pd-ssd
fsType: xfs
allowVolumeExpansion: true
...
客户端节点部署
我们将使用以下manifest来创建客户端节点Deployment和内部服务:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: es-client
namespace: elasticsearch
labels:
component: elasticsearch
role: client
spec:
replicas: 2
template:
metadata:
labels:
component: elasticsearch
role: client
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: role
operator: In
values:
- client
topologyKey: kubernetes.io/hostname
initContainers:
- name: init-sysctl
image: busybox:1.27.2
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
containers:
- name: es-client
image: quay.io/pires/docker-elasticsearch-kubernetes:6.2.4
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: my-es
- name: NODE_MASTER
value: "false"
- name: NODE_DATA
value: "false"
- name: HTTP_ENABLE
value: "true"
- name: ES_JAVA_OPTS
value: -Xms256m -Xmx256m
- name: NETWORK_HOST
value: _site_,_lo_
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
limits:
cpu: 1
ports:
- containerPort: 9200
name: http
- containerPort: 9300
name: transport
volumeMounts:
- name: storage
mountPath: /data
volumes:
- emptyDir:
medium: ""
name: storage
---
apiVersion: v1
kind: Service
metadata:
name: elasticsearch
namespace: elasticsearch
labels:
component: elasticsearch
role: client
spec:
selector:
component: elasticsearch
role: client
ports:
- name: http
port: 9200
type: LoadBalancer
在这里部署服务的目的是为了从Kubernetes集群外部访问ES集群,但服务仍在我们子网内部。注释 "cloud.google.com/load-balancer-type: Internal "确保了这一点。
然而,如果对我们的ES集群进行读/写的应用部署在集群内部,那么可以通过http://elasticsearch.elasticsearch:9200访问Elasticsearch服务。
一旦所有组件都部署完毕,我们应该验证以下内容:
1、 使用Ubuntu容器从Kubernetes集群内部部署Elasticsearch
root$ kubectl run my-shell --rm -i --tty --image ubuntu -- bash
root@my-shell-68974bb7f7-pj9x6:/# curl http://elasticsearch.elasticsearch:9200/_cluster/health?pretty
{
"cluster_name" : "my-es",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 7,
"number_of_data_nodes" : 2,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
2、 从集群外部使用GCP内部负载均衡器IP(本例为10.9.120.8)部署Elasticsearch。当我们使用curl http://10.9.120.8:9200/_cluster/health?pretty 检查健康状况时,输出结果应该和上面一样。
3、 我们的ES-Pods反亲和规则
root$ kubectl -n elasticsearch get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
es-client-69b84b46d8-kr7j4 1/1 Running 0 10m 10.8.14.52 gke-cluster1-pool1-d2ef2b34-t6h9
es-client-69b84b46d8-v5pj2 1/1 Running 0 10m 10.8.15.53 gke-cluster1-pool1-42b4fbc4-cncn
es-data-0 1/1 Running 0 12m 10.8.16.58 gke-cluster1-pool1-4cfd808c-kpx1
es-data-1 1/1 Running 0 12m 10.8.15.52 gke-cluster1-pool1-42b4fbc4-cncn
es-master-594b58b86c-9jkj2 1/1 Running 0 18m 10.8.15.51 gke-cluster1-pool1-42b4fbc4-cncn
es-master-594b58b86c-bj7g7 1/1 Running 0 18m 10.8.16.57 gke-cluster1-pool1-4cfd808c-kpx1
es-master-594b58b86c-lfpps 1/1 Running 0 18m 10.8.14.51 gke-cluster1-pool1-d2ef2b34-t6h9
考虑弹性伸缩
我们可以根据CPU阈值为客户端节点部署自动弹性伸缩器(autoscaler)。客户端节点的HPA示例如下:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: es-client
namespace: elasticsearch
spec:
maxReplicas: 5
minReplicas: 2
scaleTargetRef:
apiVersion: extensions/v1beta1
kind: Deployment
name: es-client
targetCPUUtilizationPercentage: 80
每当autoscaler启动时,我们可以通过观察任何一个master节点pods的日志来观察新的客户端节点pod被添加到集群中。
在数据节点Pod的情况下,我们要做的就是使用K8s Dashboard或GKE控制台增加副本的数量。新创建的数据节点将被自动添加到集群中,并开始复制其他节点的数据。master节点Pod不需要自动扩展,因为它们只存储集群状态信息。如果你想添加更多的数据节点,请确保集群中的master节点数量不是偶数。另外,确保环境变量NUMBER_OF_MASTERS也保持一致地更新。
#Check logs of es-master leader pod
root$ kubectl -n elasticsearch logs po/es-master-594b58b86c-bj7g7 | grep ClusterApplierService
[2018-10-21T07:41:53,731][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] new_master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300}, added {{es-master-594b58b86c-lfpps}{wZQmXr5fSfWisCpOHBhaMg}{50jGPeKLSpO9RU_HhnVJCA}{10.9.124.81}{10.9.124.81:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [1] source [zen-disco-elected-as-master ([1] nodes joined)[{es-master-594b58b86c-lfpps}{wZQmXr5fSfWisCpOHBhaMg}{50jGPeKLSpO9RU_HhnVJCA}{10.9.124.81}{10.9.124.81:9300}]]])
[2018-10-21T07:41:55,162][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] added {{es-master-594b58b86c-9jkj2}{x9Prp1VbTq6_kALQVNwIWg}{7NHUSVpuS0mFDTXzAeKRcg}{10.9.125.81}{10.9.125.81:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [3] source [zen-disco-node-join[{es-master-594b58b86c-9jkj2}{x9Prp1VbTq6_kALQVNwIWg}{7NHUSVpuS0mFDTXzAeKRcg}{10.9.125.81}{10.9.125.81:9300}]]])
[2018-10-21T07:48:02,485][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] added {{es-data-0}{SAOhUiLiRkazskZ_TC6EBQ}{qirmfVJBTjSBQtHZnz-QZw}{10.9.126.88}{10.9.126.88:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [4] source [zen-disco-node-join[{es-data-0}{SAOhUiLiRkazskZ_TC6EBQ}{qirmfVJBTjSBQtHZnz-QZw}{10.9.126.88}{10.9.126.88:9300}]]])
[2018-10-21T07:48:21,984][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] added {{es-data-1}{fiv5Wh29TRWGPumm5ypJfA}{EXqKGSzIQquRyWRzxIOWhQ}{10.9.125.82}{10.9.125.82:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [5] source [zen-disco-node-join[{es-data-1}{fiv5Wh29TRWGPumm5ypJfA}{EXqKGSzIQquRyWRzxIOWhQ}{10.9.125.82}{10.9.125.82:9300}]]])
[2018-10-21T07:50:51,245][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] added {{es-client-69b84b46d8-v5pj2}{MMjA_tlTS7ux-UW44i0osg}{rOE4nB_jSmaIQVDZCjP8Rg}{10.9.125.83}{10.9.125.83:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [6] source [zen-disco-node-join[{es-client-69b84b46d8-v5pj2}{MMjA_tlTS7ux-UW44i0osg}{rOE4nB_jSmaIQVDZCjP8Rg}{10.9.125.83}{10.9.125.83:9300}]]])
领导者master pod的日志清楚地描述了每个节点何时被添加到集群中。这在调试问题时非常有用。
部署Kibana和ES-HQ
Kibana是一个简单的可视化ES数据的工具,而ES-HQ则有助于Elasticsearch集群的管理和监控。对于Kibana和ES-HQ的部署,我们需要牢记以下几点:
Kibana部署
我们将使用以下manifest来创建Kibana Deployment和服务:
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: logging
name: kibana
labels:
component: kibana
spec:
replicas: 1
selector:
matchLabels:
component: kibana
template:
metadata:
labels:
component: kibana
spec:
containers:
- name: kibana
image: docker.elastic.co/kibana/kibana-oss:6.2.2
env:
- name: CLUSTER_NAME
value: my-es
- name: ELASTICSEARCH_URL
value: http://elasticsearch.elasticsearch:9200
resources:
limits:
cpu: 200m
requests:
cpu: 100m
ports:
- containerPort: 5601
name: http
---
apiVersion: v1
kind: Service
metadata:
namespace: logging
name: kibana
annotations:
cloud.google.com/load-balancer-type: "Internal"
labels:
component: kibana
spec:
selector:
component: kibana
ports:
- name: http
port: 5601
type: LoadBalancer
ES-HQ部署
我们将使用以下manifest来创建ES-HQ Deployment和服务:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: es-hq
namespace: elasticsearch
labels:
component: elasticsearch
role: hq
spec:
replicas: 1
template:
metadata:
labels:
component: elasticsearch
role: hq
spec:
containers:
- name: es-hq
image: elastichq/elasticsearch-hq:release-v3.4.0
env:
- name: HQ_DEFAULT_URL
value: http://elasticsearch:9200
resources:
limits:
cpu: 0.5
ports:
- containerPort: 5000
name: http
---
apiVersion: v1
kind: Service
metadata:
name: hq
namespace: elasticsearch
labels:
component: elasticsearch
role: hq
spec:
selector:
component: elasticsearch
role: hq
ports:
- name: http
port: 5000
type: LoadBalancer
我们可以使用新创建的内部LoadBalancers访问这两个服务。
访问http://<External-Ip-Kibana-Service>/app/kibana#/home?_g=()
Kibana Dashboard
访问http://<External-Ip-ES-Hq-Service>/#!/clusters/my-es
用于集群监控和管理的ElasticHQ Dashboard
总 结
至此,部署ES后端进行日志记录的工作就结束了。我们部署的Elasticsearch也可以被其他应用使用。客户端节点应该在高负载下自动扩展,数据节点可以通过增加statefulset中的副本数来增加。我们还需要调整一些环境变量,但这是相当简单的。在下一篇文章中,我们将学习部署Filebeat DaemonSet来发送日志到Elasticsearch后端。请保持关注哟~
原文链接:
https://appfleet.com/blog/orchestrating-elasticsearch-on-kubernetes/
About Rancher Labs
Rancher Labs由CloudStack之父梁胜创建。旗舰产品Rancher是一个开源的企业级Kubernetes管理平台,实现了Kubernetes集群在混合云+本地数据中心的集中部署与管理。Rancher一向因操作体验的直观、极简备受用户青睐,被Forrester评为2018年全球容器管理平台领导厂商,被Gartner评为2017年全球最酷的云基础设施供应商。
目前Rancher在全球拥有超过三亿的核心镜像下载量,并拥有包括中国联通、中国平安、中国人寿、上汽集团、三星、施耐德电气、西门子、育碧游戏、LINE、WWK保险集团、澳电讯公司、德国铁路、厦门航空、新东方等全球著名企业在内的共40000家企业客户。
文章转载自RancherLabs。