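etcd background:
In production, many applications may run only one instance at a time. A job that updates a database is a typical case: several instances updating concurrently not only degrade performance but can also leave the data inconsistent. Single-instance deployment, however, weakens fault tolerance, for example when the process exits abnormally.
Process supervisors such as supervisor and systemd can keep a process alive, but what if the host itself goes down? No process-level supervision helps in that case. Instead, we can build on etcd's built-in leader election mechanism to make the service highly available with little effort.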
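Q: What is etcd, and what is it for?
A: etcd (pronounced "et-cee-dee") is an open-source, distributed, highly available, strongly consistent key-value store. It is written in Go and therefore runs on multiple platforms. It was developed by the CoreOS team and is now governed by the Cloud Native Computing Foundation. It provides configuration sharing, service discovery, and many related features, and serves as the backbone of many distributed systems by giving them a reliable way to store data across a cluster of servers.
etcd uses the Raft algorithm to manage communication and data consistency among the cluster's nodes. The nodes are peers: if the leader fails, a new leader is elected quickly so that the system keeps running.
It is widely used in projects such as Kubernetes, Rook, CoreDNS, M3, and OpenStack.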
Official site and documentation: https://etcd.io/docs/
Update (April 23, 2020):
(1) Available etcd versions
* v3.4.x
* v3.3.x
* v3.2.x
* v3.1.x
* v2
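etcd glossary:
As the figure below shows, the etcd architecture consists of four main parts: HTTP Server, Store, Raft, and WAL.
Figure (WeiyiGeek): etcd architecture
Component notes:
data indexing, node state changes, monitoring and feedback, event handling and execution, and so on.
guarantees strong data consistency between nodes.
An etcd cluster is a distributed system. Every etcd node maintains a state machine and stores a full copy of the data. At any moment there is at most one valid leader; all client writes go through the leader, and linearizable reads are confirmed with it as well.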
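To understand how etcd works, three concepts are needed (this is what the figure below is meant to convey):
A cluster can have only one leader at any given time.
Additional notes:
A node marks itself as a candidate by starting a new term and asks the other nodes to vote, which begins a new election (each node votes for the first candidate that requests its vote).
Q: What is described above is the leader election mechanism. What does that mean?
A: Take a military exercise as an example. An early-warning aircraft is surrounded by several fighters that all follow its dispatch and carry out their mission in an orderly way. In such a cluster the early-warning aircraft plays the role of the master: a cluster has exactly one master, which distributes tasks while the other nodes cooperate. When the master fails, a new master must be elected immediately.
In short: in a Raft-based system, the cluster uses elections to choose a leader for a given term.
This means that every change must be agreed on by a quorum of the cluster's nodes before it is committed.
With these basics in place, let us look at the replicated state machine.
State transition rules: every etcd node is in one of the states (Follower, Candidate, Leader).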
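The three states can be sketched as a small transition function. This is a simplified illustration of Raft's role changes, not etcd's actual implementation; the names `State`, `Event`, and `next` are invented for the example.

```go
package main

import "fmt"

// State is the role a Raft node currently plays.
type State int

const (
	Follower State = iota
	Candidate
	Leader
)

func (s State) String() string {
	return [...]string{"Follower", "Candidate", "Leader"}[s]
}

// Event is a hypothetical trigger for a role change.
type Event int

const (
	ElectionTimeout Event = iota // no heartbeat arrived in time
	WonMajority                  // votes received from more than half of the nodes
	HigherTermSeen               // a message with a newer term was observed
)

// next returns the role after the event; combinations not listed keep the role.
func next(s State, e Event) State {
	switch {
	case e == HigherTermSeen:
		return Follower // any node steps down when it sees a newer term
	case s == Follower && e == ElectionTimeout:
		return Candidate // start a new term and ask for votes
	case s == Candidate && e == ElectionTimeout:
		return Candidate // no winner in this round: start another round
	case s == Candidate && e == WonMajority:
		return Leader
	}
	return s
}

func main() {
	s := Follower
	for _, e := range []Event{ElectionTimeout, WonMajority, HigherTermSeen} {
		s = next(s, e)
		fmt.Println(s)
	}
}
```

The sketch leaves out terms, logs, and vote counting; it only shows which role follows which event.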
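Flow:
(election succeeds); if the candidate receives fewer than half of the votes, the election fails or times out. Note that if no leader is elected in a round, another round of election follows.
(1) Hardware recommendations
vCPU, 16GB RAM, 50GB SSD GCE instances running etcd (in production, size the resources according to the official hardware recommendations), because the data is kept in anonymous memory rather than in memory mapped from a file.
Notes:
Use SSDs.
Use an odd number of cluster members, because updating the cluster state requires a quorum.
Use no more than 7 nodes.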
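The reason for preferring an odd member count can be checked with a little arithmetic. The sketch below assumes the usual majority rule (a quorum is more than half of the members); the helper names `quorum` and `faultTolerance` are made up for this example.

```go
package main

import "fmt"

// quorum returns how many members must agree in a cluster of n members.
func quorum(n int) int { return n/2 + 1 }

// faultTolerance returns how many members may fail while a quorum survives.
func faultTolerance(n int) int { return n - quorum(n) }

func main() {
	for n := 1; n <= 7; n++ {
		fmt.Printf("members=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), faultTolerance(n))
	}
}
```

Going from 3 to 4 members raises the quorum from 2 to 3 but still tolerates only one failure, so the extra even member adds cost without adding resilience.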
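Single-instance installation:
Here we install etcd by building it from source. The steps are as follows.
(2) Installation steps
Step1. Install Go by unpacking the binary tarball from https://golang.org/dl/ (a proxy may be required to reach it). The Go version must be 1.13 or later.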
wget https://studygolang.com/dl/golang/go1.14.2.linux-amd64.tar.gz -O /opt/go1.14.2.linux-amd64.tar.gz
tar -zxf /opt/go1.14.2.linux-amd64.tar.gz -C /usr/local/
# Append the Go settings to /etc/profile
cat >> /etc/profile<<END
# Go environment
export GOROOT=/usr/local/go
# Install location for third-party packages
export GOBIN=\$GOROOT/bin
export GOPATH=\$GOROOT/path
export PATH=\$PATH:\$GOBIN:\$GOPATH/bin
END
source /etc/profile
mkdir -vp /usr/local/go/path
ln -s /usr/local/go/bin/* /usr/local/bin/
go version
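Step2. To build etcd from the main branch with the official build script, clone the repository and run the build (if you install this way, Step3 is not needed).
# Optional: route git through a proxy (the address below is specific to the author's environment)
git config --global http.proxy 'socks5://10.20.172.135:2083'
git clone https://github.com/etcd-io/etcd.git
cd etcd
./build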
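Step3. Alternatively, use go get to build a vendored etcd from the main branch (it fetches, compiles, and installs in one step). After running the commands below, the downloaded files appear under /usr/local/go/path, the Go third-party package directory.
$ echo $GOPATH # /usr/local/go/path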
go get -v go.etcd.io/etcd
go get -v go.etcd.io/etcd/etcdctl
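Step4. Test the installation: start etcd and set a key to check that the etcd binaries were built correctly.
# With GOBIN set as above, go get installs the binaries into /usr/local/go/bin (use ./bin/etcd in the source tree if you built with ./build)
/usr/local/go/bin/etcd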
{"level":"warn","ts":"2020-04-23T15:12:17.368+0800","caller":"etcdmain/etcd.go:89","msg":"'data-dir' was empty; using default","data-dir":"default.etcd"}
{"level":"info","ts":"2020-04-23T15:12:17.368+0800","caller":"embed/etcd.go:113","msg":"configuring peer listeners","listen-peer-urls":["http://localhost:2380"]}
{"level":"info","ts":"2020-04-23T15:12:17.990+0800","caller":"membership/cluster.go:524","msg":"set initial cluster version","cluster-id":"cdf818194e3a8c32","local-member-id":"8e9e05c52164694d","cluster-version":"3.5"}
{"level":"info","ts":"2020-04-23T15:12:17.990+0800","caller":"etcdserver/server.go:1850","msg":"published local member to cluster through raft","local-member-id":"8e9e05c52164694d","local-member-attributes":"{Name:default ClientURLs:[http://localhost:2379]}","request-path":"/0/members/8e9e05c52164694d/attributes","cluster-id":"cdf818194e3a8c32","publish-timeout":"7s"}
{"level":"info","ts":"2020-04-23T15:12:17.991+0800","caller":"embed/serve.go:139","msg":"serving client traffic insecurely; this is strongly discouraged!","address":"127.0.0.1:2379"}
Step5. Put a key-value pair as a test
# If OK is printed, etcd is working
[root@initiator bin]# /usr/local/go/bin/etcdctl put name WeiyiGeek
OK
[root@initiator bin]# /usr/local/go/bin/etcdctl get name
name
WeiyiGeek
[root@node3 ~]# etcdctl --endpoints=$ENDPOINTS --write-out="json" get name
{"header":{"cluster_id":2819294416482393232,"member_id":17704130064291257467,"revision":7300,"raft_term":301},"kvs":[{"key":"bmFtZQ==","create_revision":7300,"mod_revision":7300,"version":1,"value":"V2VpeWlHZWVr"}],"count":1}
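Step6. The single-instance etcd installation is now complete.
# Note: on startup etcd opens two ports (by default reachable only from the local machine)
tcp 0 0 127.0.0.1:2379 0.0.0.0:* LISTEN 108333/./etcd # client traffic
tcp 0 0 127.0.0.1:2380 0.0.0.0:* LISTEN 108333/./etcd # etcd peer traffic (cluster)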
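Besides etcdctl, the data in etcd can also be read and modified with curl over HTTP. Note that the v2 and v3 APIs differ: v2 uses GET/PUT/DELETE requests on /v2/keys, while v3 uses POST requests with JSON bodies.
Additional notes [2020-04-26 10:20:57]
Atomic CAS (Compare And Swap): its basic use is to build a distributed lock service, that is, leader election. A key's value is modified only if the condition supplied by the client matches the current state in etcd.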
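The compare-and-swap idea can be illustrated with a toy in-memory store. This is only a sketch of the semantics, not how etcd implements them; the `store` type and `compareAndSwap` method are hypothetical names, and the condition shown corresponds to a value comparison.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrCompareFailed is returned when the supplied condition does not hold.
var ErrCompareFailed = errors.New("compare failed")

// store is a toy in-memory key-value map guarded by a mutex.
type store struct {
	mu   sync.Mutex
	data map[string]string
}

// compareAndSwap sets key to next only if its current value equals prev.
// The check and the write happen under one lock, so no other writer can
// slip in between them.
func (s *store) compareAndSwap(key, prev, next string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if cur, ok := s.data[key]; !ok || cur != prev {
		return ErrCompareFailed
	}
	s.data[key] = next
	return nil
}

func main() {
	s := &store{data: map[string]string{"foo": "one"}}
	fmt.Println(s.compareAndSwap("foo", "three", "two")) // condition does not hold
	fmt.Println(s.compareAndSwap("foo", "one", "two"))   // condition holds
	fmt.Println(s.data["foo"])
}
```

A lock or leader election built on this primitive has every contender attempt the same swap; exactly one of them can succeed.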
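The conditions that can currently be compared are prevExist (whether the key exists), prevValue (the key's previous value), and prevIndex (the key's previous modifiedIndex).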
## [Common]
# Show the version
curl -X GET http://192.168.10.243:2379/version
# Health status
curl -L http://192.168.10.243:2379/health
# Metrics
curl -sL http://localhost:22379/metrics
[version 2]
# List all keys and pretty-print the JSON
curl -LsS http://192.168.10.243:2379/v2/keys | python -mjson.tool
# put: create key keyname with value "WeiyiGeek"
curl -X PUT -L http://192.168.10.243:2379/v2/keys/keyname -d value="WeiyiGeek"
# get: read the key
curl -X GET -L http://192.168.10.243:2379/v2/keys/keyname
# delete: remove the key
curl -X DELETE -L http://127.0.0.1:2379/v2/keys/keyname
# Create a key with a TTL
curl -X PUT http://127.0.0.1:2379/v2/keys/message -d value="Hello world" -d ttl=30
curl http://127.0.0.1:2379/v2/keys/message
{"action":"get","node":{"key":"/message","value":"Hello world","expiration":"2019-09-29T08:08:10.674930705Z","ttl":2,"modifiedIndex":20,"createdIndex":20}}
# Remove the TTL from a key
curl -X PUT http://127.0.0.1:2379/v2/keys/message -d value="Hello world" -d ttl= -d prevExist=true
{"action":"update","node":{"key":"/message","value":"Hello world","modifiedIndex":23,"createdIndex":22},"prevNode":{"key":"/message","value":"Hello world","expiration":"2019-09-29T08:10:23.220573683Z","ttl":16,"modifiedIndex":22,"createdIndex":22}}
# Refresh the TTL of a key
curl -X PUT http://127.0.0.1:2379/v2/keys/message -d ttl=30 -d refresh=true -d prevExist=true
{"action":"update","node":{"key":"/message","value":"Hello world","expiration":"2019-09-29T08:15:29.569276199Z","ttl":30,"modifiedIndex":26,"createdIndex":25},"prevNode":{"key":"/message","value":"Hello world","expiration":"2019-09-29T08:15:01.34698273Z","ttl":2,"modifiedIndex":25,"createdIndex":25}}
# Create a directory with a TTL
curl -X PUT http://127.0.0.1:2379/v2/keys/dir -d ttl=30 -d dir=true
# Update the directory's TTL before it expires
curl -X PUT http://127.0.0.1:2379/v2/keys/dir -d ttl=60 -d dir=true -d prevExist=true
# Insert data into the directory
curl -X PUT http://127.0.0.1:2379/v2/keys/dir/message -d value="Hello world"
# Read the data in the directory; once the directory expires, its data is deleted automatically
curl http://127.0.0.1:2379/v2/keys/dir/message
{"action":"get","node":{"key":"/dir/message","value":"Hello world","modifiedIndex":51,"createdIndex":51}}
curl http://127.0.0.1:2379/v2/keys/dir/message
{"errorCode":100,"message":"Key not found","cause":"/dir","index":52}
# Create in-order keys automatically (POST to a directory)
curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job1
{"action":"create","node":{"key":"/queue/00000000000000000042","value":"Job1","modifiedIndex":42,"createdIndex":42}}
curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job2
{"action":"create","node":{"key":"/queue/00000000000000000043","value":"Job2","modifiedIndex":43,"createdIndex":43}}
curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job3
{"action":"create","node":{"key":"/queue/00000000000000000044","value":"Job3","modifiedIndex":44,"createdIndex":44}}
curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job4
{"action":"create","node":{"key":"/queue/00000000000000000045","value":"Job4","modifiedIndex":45,"createdIndex":45}}
curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job5
{"action":"create","node":{"key":"/queue/00000000000000000046","value":"Job5","modifiedIndex":46,"createdIndex":46}}
curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job6
{"action":"create","node":{"key":"/queue/00000000000000000047","value":"Job6","modifiedIndex":47,"createdIndex":47}}
# List the in-order keys
curl 'http://127.0.0.1:2379/v2/keys/queue?recursive=true&sorted=true'
{"action":"get","node":{"key":"/queue","dir":true,"nodes":[{"key":"/queue/00000000000000000042","value":"Job1","modifiedIndex":42,"createdIndex":42},{"key":"/queue/00000000000000000043","value":"Job2","modifiedIndex":43,"createdIndex":43},{"key":"/queue/00000000000000000044","value":"Job3","modifiedIndex":44,"createdIndex":44},{"key":"/queue/00000000000000000045","value":"Job4","modifiedIndex":45,"createdIndex":45},{"key":"/queue/00000000000000000046","value":"Job5","modifiedIndex":46,"createdIndex":46},{"key":"/queue/00000000000000000047","value":"Job6","modifiedIndex":47,"createdIndex":47}],"modifiedIndex":42,"createdIndex":42}}
# Atomic operations
# Create a key that already exists with prevExist=false; the request fails because the key is already there
curl -XPUT http://127.0.0.1:2379/v2/keys/foo?prevExist=false -d value=two
{"errorCode":105,"message":"Key already exists","cause":"/foo","index":56}
# Use prevValue as the condition instead: the value is replaced only when the key's current value matches; otherwise a compare failure is returned
curl -XPUT http://127.0.0.1:2379/v2/keys/foo?prevValue=three -d value=two
{"errorCode":101,"message":"Compare failed","cause":"[three != one]","index":56} # value does not match
curl http://127.0.0.1:2379/v2/keys/foo?prevValue=one -XPUT -d value=two # value matches, so it is replaced
{"action":"compareAndSwap","node":{"key":"/foo","value":"two","modifiedIndex":57,"createdIndex":56},"prevNode":{"key":"/foo","value":"one","modifiedIndex":56,"createdIndex":56}}
# Watch the key (the request blocks until the key changes)
curl http://127.0.0.1:2379/v2/keys/message?wait=true
[version 3]
# Note: in 3.x keys and values must be base64-encoded, and requests use POST rather than PUT
# https://www.base64encode.org/
# WeiyiGeek is 'V2VpeWlHZWVr' in Base64
# etcddemo is 'ZXRjZGRlbW8='
# btoa("Weiyi")
# "V2VpeWk="
# btoa("123456")
# "MTIzNDU2"
# [Put and get keys ] : /v3/kv/range and /v3/kv/put
[root@node3 ~]# curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "V2VpeWlHZWVr", "value": "ZXRjZGRlbW8="}'
[root@node3 ~]# etcdctl get WeiyiGeek
WeiyiGeek
etcddemo
[root@node3 ~]# curl -L http://localhost:2379/v3/kv/range -X POST -d '{"key": "V2VpeWlHZWVr"}'
{"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7303","raft_term":"301"},"kvs":[{"key":"V2VpeWlHZWVr","create_revision":"7303","mod_revision":"7303","version":"1","value":"ZXRjZGRlbW8="}],"count":"1"}
# Get all keys prefixed with "foo"
#curl -L http://localhost:2379/v3/kv/range -X POST -d '{"key": "Zm9v", "range_end": "Zm9w"}'
# [Watch keys]: /v3/watch
curl -N http://localhost:2379/v3/watch -X POST -d '{"create_request": {"key":"Zm9v"} }'
# {"result":{"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7303","raft_term":"301"},"created":true}}
# {"result":{"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7304","raft_term":"314"},"events":[{"kv":{"key":"Zm9v","create_revision":"7301","mod_revision":"7304","version":"2","value":"YmFy"}}]}}
# [Transactions] : /v3/kv/txn
# Compare on the CREATE target (create revision)
curl -L http://localhost:2379/v3/kv/txn \
-X POST \
-d '{"compare":[{"target":"CREATE","key":"Zm9v","createRevision":"2"}],"success":[{"requestPut":{"key":"Zm9v","value":"YmFy"}}]}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"3","raft_term":"2"},"succeeded":true,"responses":[{"response_put":{"header":{"revision":"3"}}}]}
# Compare on the VERSION target
curl -L http://localhost:2379/v3/kv/txn \
-X POST \
-d '{"compare":[{"version":"4","result":"EQUAL","target":"VERSION","key":"Zm9v"}],"success":[{"requestRange":{"key":"Zm9v"}}]}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"6","raft_term":"3"},"succeeded":true,"responses":[{"response_range":{"header":{"revision":"6"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"6","version":"4","value":"YmF6"}],"count":"1"}}]}
# [Authentication] : /v3/auth
# create root user
curl -L http://127.0.0.1:2379/v3/auth/user/add -X POST -d '{"name": "root", "password": "pass"}'
# create root role
curl -L http://localhost:2379/v3/auth/role/add \
-X POST -d '{"name": "root"}'
# grant root role
curl -L http://localhost:2379/v3/auth/user/grant \
-X POST -d '{"user": "root", "role": "root"}'
# enable auth
curl -L http://localhost:2379/v3/auth/enable -X POST -d '{}'
# Authenticate with etcd through /v3/auth/authenticate to obtain an authentication token
# Get an authentication token for the root user
curl -L http://localhost:2379/v3/auth/authenticate -X POST -d '{"name": "root", "password": "pass"}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"1","raft_term":"2"},"token":"sssvIpwfnLAcWAQH.9"}
# Then put the token in the Authorization request header to authenticate the following operations
curl -L http://localhost:2379/v3/kv/range \
-H 'Authorization: ExmKVoSbXOhIonIj.7329' \
-X POST -d '{"key": "V2VpeWlHZWVr"}'
{"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7307","raft_term":"314"},"kvs":[{"key":"V2VpeWlHZWVr","create_revision":"7303","mod_revision":"7303","version":"1","value":"ZXRjZGRlbW8="}],"count":"1"}
# disable auth
curl -L http://localhost:2379/v3/auth/disable -X POST -d '{}' -H 'Authorization: ExmKVoSbXOhIonIj.7329'
{"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7307","raft_term":"314"}}
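The v3 requests above carry base64-encoded keys and values, and a prefix query needs a range_end that is one step past the prefix. A minimal Go sketch of both conversions follows; the helper names `b64` and `prefixEnd` are invented here, and the increment rule mirrors the "Zm9v" to "Zm9w" (foo to fop) pair used in the listing.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// b64 encodes a raw key or value for use in a v3 JSON request body.
func b64(s string) string {
	return base64.StdEncoding.EncodeToString([]byte(s))
}

// prefixEnd returns the smallest key that is greater than every key starting
// with prefix: increment the last byte that is not 0xff and drop what follows.
func prefixEnd(prefix string) string {
	end := []byte(prefix)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			return string(end[:i+1])
		}
	}
	// Empty prefix or all bytes 0xff: "\x00" stands for "to the end of the keyspace".
	return "\x00"
}

func main() {
	// Body for /v3/kv/put (errors ignored: marshalling a string map cannot fail).
	put, _ := json.Marshal(map[string]string{
		"key":   b64("WeiyiGeek"),
		"value": b64("etcddemo"),
	})
	fmt.Println(string(put))

	// Body for /v3/kv/range covering every key prefixed with "foo".
	rng, _ := json.Marshal(map[string]string{
		"key":       b64("foo"),
		"range_end": b64(prefixEnd("foo")),
	})
	fmt.Println(string(rng))
}
```

The printed bodies can be passed to curl with -d in place of the hand-encoded strings.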
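Description: this is only a simple cluster example without certificate authentication or other security checks; both are mandatory in a real production environment.
Environment:
192.168.107.241 node1.weiyigeek.top #CentOS8
192.168.107.242 node2.weiyigeek.top #CentOS7.7
192.168.107.243 node3.weiyigeek.top #CentOS7.7
Step1. Install etcd (node1, the first host above, is used as the example). The other hosts are installed the same way; only ETCD_NAME differs, since every machine needs a distinct name.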
#!/bin/bash
## Desc: one-step etcd deployment
ETCD_VER=v3.4.7
# Choose the download source
GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GITHUB_URL}
rm -rf /tmp/etcd && mkdir -p /tmp/etcd
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /tmp/etcd --strip-components=1
ln -s /tmp/etcd/etcd /usr/bin
ln -s /tmp/etcd/etcdctl /usr/bin
etcd --version && etcdctl version
Step2. Configure etcd and its systemd unit on this machine.
etcd configuration
Create the etcd configuration file /etc/etcd/etcd.conf
vim /etc/etcd/etcd.conf
#[member]
# Member name; it must match this member's entry in ETCD_INITIAL_CLUSTER
ETCD_NAME=instance01
# Data directory (create it beforehand)
ETCD_DATA_DIR="/var/lib/etcd/data"
# URLs to listen on for client connections (etcdctl or curl)
ETCD_LISTEN_CLIENT_URLS="http://192.168.107.241:2379,http://127.0.0.1:2379"
# URLs to listen on for connections from other members
ETCD_LISTEN_PEER_URLS="http://192.168.107.241:2380"
#[cluster]
# Client URLs advertised to clients; clients use them to reach the cluster
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.107.241:2379"
# Peer URLs advertised to the other members for member-to-member traffic
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.107.241:2380"
# All members of the cluster; this member uses the list to contact the others
ETCD_INITIAL_CLUSTER="instance01=http://192.168.107.241:2380,instance02=http://192.168.107.242:2380,instance03=http://192.168.107.243:2380"
# Use new when bootstrapping a new cluster, existing when joining a cluster that already exists
ETCD_INITIAL_CLUSTER_STATE=new
systemd unit
Description: create the systemd unit file for etcd at /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
Documentation=https://github.com/etcd-io/etcd
After=network.target
[Service]
Type=simple
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/bin/etcd
KillMode=process
Restart=always
RestartSec=5
LimitNOFILE=655350
LimitNPROC=655350
PrivateTmp=false
SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
Step3. Start the etcd service and check the cluster status
# Start the service
systemctl daemon-reload
systemctl enable etcd.service
systemctl start etcd.service
Step4. Read, write, and delete operations
HOST_1=192.168.10.241
HOST_2=192.168.10.242
HOST_3=192.168.10.243
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
# Write or replace a key
etcdctl --endpoints=$ENDPOINTS put name "WeiyiGeek"
# Read a key
etcdctl --endpoints=$ENDPOINTS get name
# Delete a key
etcdctl --endpoints=$ENDPOINTS del name
# Debug output
$ etcdctl --endpoints=$ENDPOINTS --debug get --from-key '\0'
ETCDCTL_CACERT=
ETCDCTL_CERT=
ETCDCTL_COMMAND_TIMEOUT=5s
ETCDCTL_DEBUG=true
ETCDCTL_DIAL_TIMEOUT=2s
ETCDCTL_DISCOVERY_SRV=
ETCDCTL_DISCOVERY_SRV_NAME=
ETCDCTL_ENDPOINTS=[192.168.10.241:2379,192.168.10.242:2379,192.168.10.243:2379]
ETCDCTL_HEX=false
ETCDCTL_INSECURE_DISCOVERY=true
ETCDCTL_INSECURE_SKIP_TLS_VERIFY=false
ETCDCTL_INSECURE_TRANSPORT=true
ETCDCTL_KEEPALIVE_TIME=2s
ETCDCTL_KEEPALIVE_TIMEOUT=6s
ETCDCTL_KEY=
ETCDCTL_PASSWORD=
ETCDCTL_USER=
ETCDCTL_WRITE_OUT=simple
WARNING: 2020/04/26 00:18:51 Adjusting keepalive ping interval to minimum period of 10s
WARNING: 2020/04/26 00:18:51 Adjusting keepalive ping interval to minimum period of 10s
INFO: 2020/04/26 00:18:51 parsed scheme: "endpoint"
INFO: 2020/04/26 00:18:51 ccResolverWrapper: sending new addresses to cc: [{192.168.10.241:2379 0 <nil>} {192.168.10.242:2379 0 <nil>} {192.168.10.243:2379 0 <nil>}]
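Step5. Use watch to observe operations on the etcd cluster
# put: triggers a watch event
# get: does not trigger a watch event
# del: triggers a watch event
etcdctl --endpoints=$ENDPOINTS watch [key]
Figure (WeiyiGeek): result of running watch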
Step6. Check the cluster status
etcdctl --endpoints=$ENDPOINTS --write-out=table member list
etcdctl --endpoints=$ENDPOINTS --write-out=table endpoint status
etcdctl --endpoints=$ENDPOINTS --write-out=table endpoint health
etcdctl --endpoints=$ENDPOINTS -w table endpoint status # -w is short for --write-out
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|      ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.10.241:2379 | ef8199869f22c2b7 |  3.4.7  | 20 kB   | true      | false      |       301 |          8 |                  8 |        |
| 192.168.10.242:2379 | cbd80ba26fce8c16 |  3.4.7  | 20 kB   | false     | false      |       301 |          8 |                  8 |        |
| 192.168.10.243:2379 | f5b1b47e3364dc7b |  3.4.7  | 20 kB   | false     | false      |       301 |          9 |                  9 |        |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Step7. Roles, users, and authentication
export ETCDCTL_API=3
# Create a role and grant it permissions
etcdctl --endpoints=${ENDPOINTS} role add root
etcdctl --endpoints=${ENDPOINTS} role grant-permission root readwrite foo
etcdctl --endpoints=${ENDPOINTS} role get root
# Create a user and grant it the role
etcdctl --endpoints=${ENDPOINTS} user add root
etcdctl --endpoints=${ENDPOINTS} user grant-role root root
etcdctl --endpoints=${ENDPOINTS} user get root
etcdctl --endpoints=${ENDPOINTS} auth enable
# From now on all client requests go through authentication
etcdctl --endpoints=${ENDPOINTS} --user=root:123 put foo bar
# Without credentials the request is expected to be rejected
etcdctl --endpoints=${ENDPOINTS} get foo
etcdctl --endpoints=${ENDPOINTS} --user=root:123 get foo
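Description: to secure communication, traffic between clients (such as etcdctl) and the etcd cluster, and between the members of the cluster, needs to be encrypted with TLS.
We use cfssl, CloudFlare's PKI toolkit, to set up the PKI, create a Certificate Authority (CA), and issue TLS certificates for etcd.
Step0. Install the certificate tooling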
mkdir -p /k8s/kubernetes/ssl/ && cd $_
# Latest cfssl releases: https://github.com/cloudflare/cfssl/releases
# Fetch the cfssl tools (if the download is slow, fetch the files with a download manager and upload them to the server)
curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssl_1.6.1_linux_amd64 -o /usr/local/bin/cfssl
curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssljson_1.6.1_linux_amd64 -o /usr/local/bin/cfssljson
curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssl-certinfo_1.6.1_linux_amd64 -o /usr/local/bin/cfssl-certinfo
chmod +x /usr/local/bin/cfssl /usr/local/bin/cfssl-certinfo /usr/local/bin/cfssljson
# Check that the cfssl version is 1.6.0 or later
cfssl version
Step1. Create the etcd CA and the certificate signing requests
# etcd CA signing configuration
# cat ca-config.json
cat > ca-config.json <<EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"www": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
# cat ca-csr.json (CA certificate signing request)
cat > ca-csr.json <<EOF
{
"CN": "etcd CA",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing"
}
]
}
EOF
# cat etcd-csr.json (etcd certificate signing request)
cat > etcd-csr.json <<EOF
{
"CN": "etcd",
"hosts": [
"192.168.10.241",
"192.168.10.242",
"192.168.10.243"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
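Step2. Generate the etcd certificate and private key
# Generate the CA certificate and private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
# Generate the etcd certificate; -profile must name a profile defined in ca-config.json (www above)
cfssl gencert -ca=/k8s/kubernetes/ssl/ca.pem -ca-key=/k8s/kubernetes/ssl/ca-key.pem -config=/k8s/kubernetes/ssl/ca-config.json -profile=www etcd-csr.json | cfssljson -bare etcd
# ls etcd*
etcd.csr etcd-csr.json etcd-key.pem etcd.pem # CSR, CSR config, private key, and certificate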
Step3. Add the following flags to the etcd start command
--cert-file=/k8s/etcd/ssl/etcd.pem \
--key-file=/k8s/etcd/ssl/etcd-key.pem \
--peer-cert-file=/k8s/etcd/ssl/etcd.pem \
--peer-key-file=/k8s/etcd/ssl/etcd-key.pem \
--trusted-ca-file=/k8s/kubernetes/ssl/ca.pem \
--peer-trusted-ca-file=/k8s/kubernetes/ssl/ca.pem
# The flannel network configuration below is written through the v2 API, so etcd must also be started with --enable-v2
set /flannel/network/config '{"Network":"10.244.0.0/16", "SubnetMin": "10.244.1.0", "SubnetMax": "10.244.254.0", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'
Step4. Connect with the certificates and check the health of the nodes
ENDPOINTS=192.168.10.241:2379,192.168.10.242:2379,192.168.10.243:2379
etcdctl --cacert=/k8s/kubernetes/ssl/ca.pem --cert=/k8s/etcd/ssl/etcd.pem --key=/k8s/etcd/ssl/etcd-key.pem --endpoints=$ENDPOINTS endpoint health
# 192.168.10.243:2379 is healthy: successfully committed proposal: took = 6.152483ms
# 192.168.10.241:2379 is healthy: successfully committed proposal: took = 6.461149ms
# 192.168.10.242:2379 is healthy: successfully committed proposal: took = 5.701604ms
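Data in a distributed system falls into two kinds: control data and application data. etcd is designed for control data; for application data it is recommended only when the volume is small and updates or reads are frequent. etcd is often used together with a microservice architecture, for example in the scenarios below.
Scenario 1: Service discovery
Q: What is service discovery?
A: It answers the question of how processes or services in the same distributed cluster find each other and establish connections. Put simply, service discovery means finding out whether some process in the cluster is listening on a UDP or TCP port, and being able to look it up and connect to it by name.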
Concrete service discovery scenarios:
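Scenario 2: Publish and subscribe
Publish/subscribe is a well-suited way for the components of a distributed system to communicate. A shared configuration center is built: data providers publish messages to it, consumers subscribe to the topics they care about, and subscribers are notified in real time as soon as a topic receives a message. In this way etcd provides centralized management and dynamic updates of configuration for distributed systems.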
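Scenario 3: Load balancing
This refers to soft load balancing. To keep services highly available and data consistent, data and services are usually deployed as multiple replicas, so that the peers remain usable even if one of them fails. The cost is lower write performance; the benefit is that reads can be load-balanced across the replicas.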
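Scenario 4: Distributed notification and coordination
This is similar to publish/subscribe. It uses etcd's watcher mechanism: through registration and asynchronous notification, different systems in a distributed environment notify and coordinate with each other, so data changes are handled in real time. A typical implementation: the systems all register under the same directory in etcd and set a watcher on it (recursively, if changes in subdirectories also matter). When one system updates the directory, every system that set a watcher is notified and reacts accordingly.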
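The register-and-notify pattern can be sketched with channels in a single process. This is an in-memory illustration only, not the etcd watch API; the `hub`, `watch`, and `put` names are invented, and a real deployment would use etcd's watcher instead.

```go
package main

import (
	"fmt"
	"strings"
)

// event describes one change to a key.
type event struct{ key, value string }

// hub maps a watched prefix (the "directory") to its subscribers.
type hub struct {
	watchers map[string][]chan event
}

func newHub() *hub { return &hub{watchers: make(map[string][]chan event)} }

// watch registers interest in every key under prefix and returns the
// channel on which changes are delivered.
func (h *hub) watch(prefix string) <-chan event {
	ch := make(chan event, 16) // buffered so put does not block in this small demo
	h.watchers[prefix] = append(h.watchers[prefix], ch)
	return ch
}

// put delivers the change to every watcher whose prefix matches the key.
func (h *hub) put(key, value string) {
	for prefix, chans := range h.watchers {
		if strings.HasPrefix(key, prefix) {
			for _, ch := range chans {
				ch <- event{key, value}
			}
		}
	}
}

func main() {
	h := newHub()
	ch := h.watch("/config/")
	h.put("/config/db", "10.0.0.1") // under the watched directory
	h.put("/other", "ignored")      // outside of it
	fmt.Println(<-ch)
	fmt.Println(len(ch)) // no further events are pending
}
```

Matching on a key prefix plays the role of the recursive directory watch described above.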
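Scenario 5: Distributed locks
Because etcd uses Raft to keep data strongly consistent, a value stored in the cluster by an operation is guaranteed to be globally consistent, which makes distributed locks easy to implement.
A lock service can be used in two ways: to hold exclusive ownership, or to control ordering.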
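Scenario 6: Distributed queues
The usual form resembles the ordering use of the distributed lock in Scenario 5: create a first-in, first-out queue that guarantees order. A more interesting variant executes the queued items in order only once a condition is met; it can be implemented by adding a /queue/condition node under the /queue directory.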
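Scenario 7: Cluster monitoring and leader election
Monitoring through etcd is simple to implement and reacts in real time. With a watcher, the disappearance or change of a node is noticed immediately and reported to the user; nodes can also set keys with a TTL as a liveness check, which together meet the monitoring needs of a cluster.
Distributed locks can also be used for leader election. This is typical for machines that run long CPU-bound computations or I/O-heavy work: only the elected leader computes or processes once and then copies the result to the followers, which avoids duplicated work and saves computing resources.
The classic case is building a full index in a search system: the machines all try to create the same node through etcd's CAS mechanism, the one that succeeds becomes the leader and computes the index, and the result is then distributed to the other nodes.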
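Q: Why use etcd rather than ZooKeeper?
By comparison, ZooKeeper has the following drawbacks:
As a newer project, etcd has clear advantages of its own.
Q: How can a project use etcd's leader election to make an application highly available?
A: Read on; the answer follows.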
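Environment: this assumes Go has been installed as described earlier.
Step1. Install clientv3 with go get
go get "github.com/coreos/etcd/clientv3"
Step2. Add the imports, the constants, and the leader flag
import (
	"context"
	"log"
	"sync/atomic"
	"time"

	"github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/clientv3/concurrency"
	// g is the project's own configuration package (its import path is not shown in the original)
)

const prefix = "/nanoPing"
const prop = "local"

// leaderFlag is 1 while this instance is the leader. It is shared between
// goroutines, so it is accessed only through sync/atomic.
var leaderFlag int32
Step3. Write the campaign function with which a client node takes part in the election
func campaign(c *clientv3.Client, election string, prop string) {
	for {
		// Create a session backed by a lease with a 15-second TTL.
		s, err := concurrency.NewSession(c, concurrency.WithTTL(15))
		if err != nil {
			log.Println(err)
			time.Sleep(time.Second) // back off instead of retrying in a tight loop
			continue
		}
		// Create an election on the given key prefix.
		e := concurrency.NewElection(s, election)
		// Campaign puts a value as eligible for the election on the prefix key.
		// Multiple sessions can participate in the election for the same prefix,
		// but only one can be the leader at a time. The call blocks until this
		// session wins.
		if err = e.Campaign(context.TODO(), prop); err != nil {
			log.Println(err)
			s.Close() // release the lease before retrying
			continue
		}
		log.Println("elect: success")
		atomic.StoreInt32(&leaderFlag, 1)
		// Leadership lasts until the session ends (for example when the lease expires).
		<-s.Done()
		atomic.StoreInt32(&leaderFlag, 0)
		log.Println("elect: expired")
	}
}
Step4. Add the action run that is executed after winning the election
func run() {
	log.Println("[info] Service master")
	log.Println("[info] Task start.")
}
Step5. Write the entry function: create the client, campaign for master, and run the task once elected.
func Start() {
	donec := make(chan struct{})
	// Create the etcd client.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints: g.Config().Etcd.Addr,
		Username:  g.Config().Etcd.User,
		Password:  g.Config().Etcd.Password,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()
	go campaign(cli, prefix, prop)
	go func() {
		// Check every 10 seconds whether this instance has become the leader.
		ticker := time.NewTicker(10 * time.Second)
		defer ticker.Stop()
		for range ticker.C {
			if atomic.LoadInt32(&leaderFlag) == 1 {
				run()
				return
			}
			log.Println("[info] Service is not master")
		}
	}()
	<-donec
}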
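Result:
Figure (WeiyiGeek): leader election
Summary: with etcd's leader election we made the service highly available, and with systemd we kept etcd itself alive, so as long as the machines hosting etcd stay up, the process can survive failures. An etcd cluster is not limited to a single service: several services can register with the same cluster and also use etcd's shared configuration and service discovery. etcd has many more topics worth studying in depth, such as the Raft consensus algorithm.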
Configuration flags
## [Member flags]
# Member name
--name
# Data directory
--data-dir
--wal-dir
--snapshot-count
--heartbeat-interval
--election-timeout
--listen-peer-urls
--listen-client-urls
--max-snapshots
--max-wals
--cors
--quota-backend-bytes
--backend-batch-limit
--backend-bbolt-freelist-type
--backend-batch-interval
--max-txn-ops
--max-request-bytes
--grpc-keepalive-min-time
--grpc-keepalive-interval
--grpc-keepalive-timeout
## [Clustering flags]
--initial-advertise-peer-urls
--initial-cluster
--initial-cluster-state
--initial-cluster-token: gives each cluster its own token so that every cluster and its members get unique IDs.
--advertise-client-urls
--discovery
--discovery-srv
--discovery-srv-name
--discovery-fallback
--discovery-proxy
--strict-reconfig-check
--auto-compaction-retention
--auto-compaction-mode
--enable-v2 # Flannel talks to etcd through the v2 API while Kubernetes uses the v3 API; enable this for compatibility with Flannel
## [Proxy flags]
--proxy
--proxy-failure-wait
--proxy-refresh-interval
--proxy-dial-timeout
--proxy-write-timeout
--proxy-read-timeout
## [Security flags]
--ca-file # SSL CA root certificate file
--cert-file # SSL certificate file
--key-file # SSL certificate key file
--client-cert-auth
--client-crl-file
--client-cert-allowed-hostname
--trusted-ca-file # trusted SSL CA root certificate file
--auto-tls
--peer-ca-file # SSL CA root certificate file for peer endpoints
--peer-cert-file # SSL certificate file for peer endpoints
--peer-key-file # SSL certificate key file for peer endpoints
--peer-client-cert-auth
--peer-crl-file
--peer-trusted-ca-file # trusted SSL CA root certificate file for peer endpoints
--peer-auto-tls
--peer-cert-allowed-cn
--peer-cert-allowed-hostname
--cipher-suites
## [Logging flags]
--logger
--log-outputs
--log-level
--debug
--log-package-levels
## [Unsafe flags]
--force-new-cluster
## [Miscellaneous flags]
--version
--config-file
## [Profiling flags]
--enable-pprof
--metrics
--listen-metrics-urls
## [Auth flags]
--auth-token
--bcrypt-cost
## [Experimental flags]
--experimental-corrupt-check-time
--experimental-compaction-batch-limit
--experimental-peer-skip-client-san-verification
Common commands:
# Create, read, update, delete
etcdctl put [key] [value]
etcdctl get [key]
etcdctl del [key]
Basic examples:
# (1) Inspect and manage members
$ etcdctl member list
member add #Adds a member into the cluster
member list #Lists all members in the cluster
member promote #Promotes a non-voting member in the cluster
member remove #Removes a member from the cluster
member update #Updates a member in the cluster
# (2) Check the status and performance of the etcd cluster
$ etcdctl --endpoints=$ENDPOINTS check perf
#60 / 60 Booooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00% 1m0s
etcdctl --endpoints=$ENDPOINTS endpoint health
etcdctl --endpoints=$ENDPOINTS endpoint status
# (3) Authentication
auth disable # Disables authentication
auth enable # Enables authentication
# (4) Users
user add #Adds a new user
user delete #Deletes a user
user get #Gets detailed information of a user
user grant-role #Grants a role to a user
user list #Lists all users
user passwd #Changes password of user
user revoke-role #Revokes a role from a user
# (5) Roles and permissions
role add #Adds a new role
role delete #Deletes a role
role get #Gets detailed information of a role
role grant-permission #Grants a key to a role
role list #Lists all roles
role revoke-permission #Revokes a key from a role
#!/bin/bash
# Desc: one-step etcd cluster installation
# Author: WeiyiGeek
# Create: 2020-04-24 09:48:48
# [Global variables]
export ETCD_VER=v3.4.7
export ETCD_SRCNAME=etcd-${ETCD_VER}-linux-amd64.tar.gz
export ETCD_URL=https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/${ETCD_SRCNAME}
export ETCD_DIR=/usr/local/etcd
export ETCD_CONF=/etc/etcd
export ETCD_DATA=/apps/etcd/data
export ETCD_NODE1=192.168.10.241
export ETCD_NODE2=192.168.10.242
export ETCD_NODE3=192.168.10.243
# [Usage]
function Usage(){
echo -e "\e[33m#Usage:$0 node[1~3].weiyigeek.top\n#Example: NODENAME = node[1~3] \n\n\e[0m"
}
# Print the usage and stop when the node argument is missing
if [[ $# -lt 1 ]]; then
Usage
exit 1
fi
set -xue
export CURRENT_NODE=$1
export NODE_NAME=${CURRENT_NODE%%.*}
export NODE_DOMAIN=${CURRENT_NODE#*.}
# [Pre-install settings]
function BeforeSetting(){
#1. Set this machine's hostname
hostnamectl set-hostname ${NODE_NAME}
#2. Rewrite /etc/hosts
rm -rf /etc/hosts
cat <<END >/etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
$ETCD_NODE1 node1.${NODE_DOMAIN}
$ETCD_NODE2 node2.${NODE_DOMAIN}
$ETCD_NODE3 node3.${NODE_DOMAIN}
END
#3. Firewall settings
firewall-cmd --permanent --zone=public --add-port=2379-2380/tcp
firewall-cmd --reload
}
# [Currently unused]
# function ChangeDownUrl(){
# ETCD_URL=$(echo $1 | sed 's/raw.githubusercontent.com/cdn.jsdelivr.net\/gh/' \
# | sed 's/github.com/cdn.jsdelivr.net\/gh/' \
# | sed 's/\/master//' | sed 's/\/blob//' )
# }
function Download(){
# Download the release tarball unless it is already present (ETCD_URL is defined in the global variables)
if [[ ! -f ./${ETCD_SRCNAME} ]];then
curl -L ${ETCD_URL} -o ${ETCD_SRCNAME}
else
echo -e "#${ETCD_SRCNAME} already exists!"
fi
}
function Install(){
# Prepare a clean install directory
if [[ ! -d ${ETCD_DIR} ]];then
mkdir -p ${ETCD_DIR}
else
rm -rf ${ETCD_DIR}
mkdir -p ${ETCD_DIR}
fi
# Data directory
mkdir -vp ${ETCD_DATA}
# Unpack
tar xzvf ${ETCD_SRCNAME} -C ${ETCD_DIR} --strip-components=1
# Create the symlinks if they are missing
if [[ ! -f /usr/bin/etcd ]];then
echo -e "#Link files do not exist, creating them"
ln -s ${ETCD_DIR}/etcd /usr/bin
ln -s ${ETCD_DIR}/etcdctl /usr/bin
fi
# Verify the installation; abort if either binary fails to run
if ! /usr/bin/etcd --version || ! /usr/bin/etcdctl version; then
echo -e "\e[31m#Install Error!\e[0m"
exit 1
fi
}
function AfterSetting(){
# etcd configuration
rm -rf ${ETCD_CONF}/etcd.conf
mkdir -vp ${ETCD_CONF} /var/lib/etcd/
export CURRENT_HOST=$(cat /etc/hosts | grep ${NODE_NAME} | awk -F " " '{print $1}')
cat > ${ETCD_CONF}/etcd.conf <<END
#[member]
ETCD_NAME=${NODE_NAME}
ETCD_DATA_DIR=${ETCD_DATA}
ETCD_LISTEN_CLIENT_URLS="http://${CURRENT_HOST}:2379,http://127.0.0.1:2379"
ETCD_LISTEN_PEER_URLS="http://${CURRENT_HOST}:2380"
#[cluster]
ETCD_ADVERTISE_CLIENT_URLS="http://${CURRENT_NODE}:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://${CURRENT_NODE}:2380"
ETCD_INITIAL_CLUSTER="node1=http://node1.${NODE_DOMAIN}:2380,node2=http://node2.${NODE_DOMAIN}:2380,node3=http://node3.${NODE_DOMAIN}:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE=new
# [version]
# ETCD_ENABLE_V2=true
#[Security]
# ETCD_CERT_FILE="/opt/etcd/ssl/server.pem"
# ETCD_KEY_FILE="/opt/etcd/ssl/server-key.pem"
# ETCD_TRUSTED_CA_FILE="/opt/etcd/ssl/ca.pem"
# ETCD_CLIENT_CERT_AUTH="true"
# ETCD_PEER_CERT_FILE="/opt/etcd/ssl/server.pem"
# ETCD_PEER_KEY_FILE="/opt/etcd/ssl/server-key.pem"
# ETCD_PEER_TRUSTED_CA_FILE="/opt/etcd/ssl/ca.pem"
# ETCD_PEER_CLIENT_CERT_AUTH="true"
END
# systemd unit
cat > /usr/lib/systemd/system/etcd.service <<END
[Unit]
Description=Etcd Server
Documentation=https://github.com/etcd-io/etcd
After=network.target
[Service]
Type=simple
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/bin/etcd
KillMode=process
Restart=always
RestartSec=3
LimitNOFILE=655350
LimitNPROC=655350
PrivateTmp=false
SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
END
#Reload systemd manager configuration
systemctl daemon-reload
}
function ServiceRun(){
echo -e "# Starting the etcd service"
systemctl restart etcd.service
echo -e "# After all nodes are up, run the commands below (also saved in etcd.txt)"
echo -e "ENDPOINTS=$ETCD_NODE1:2379,$ETCD_NODE2:2379,$ETCD_NODE3:2379 \netcdctl -w table --endpoints=\$ENDPOINTS endpoint status"
echo -e "ENDPOINTS=$ETCD_NODE1:2379,$ETCD_NODE2:2379,$ETCD_NODE3:2379 \netcdctl -w table --endpoints=\$ENDPOINTS endpoint status" > etcd.txt
}
BeforeSetting
Download
Install
AfterSetting
ServiceRun
Figure (WeiyiGeek): result of running the script
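Issue 1. go get for etcd fails with "unrecognized import path" (install error)
Error message: import path does not begin with hostname
Solution: set the GOROOT environment variable correctly, or remove it with unset GOROOT
Issue 2. Managing etcd with systemctl triggers an error like "etcd: conflicting environment variable "ETCD_NAME" is shadowed by corresponding command-line flag (either unset environment variable or disable flag)"
Cause: etcd 3.4 reads its settings from environment variables automatically, so a parameter already present in the EnvironmentFile must not be passed again as a flag in ExecStart. Keeping only one of the two resolves the error (also check the official documentation for flags that have been replaced).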
Issue 3. The health status of one node cannot be obtained
Problem description:
$/opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/ca.pem --cert-file=/opt/etcd/ssl/server.pem --key-file=/opt/etcd/ssl/server-key.pem --endpoints=$ENDPOINTS cluster-health
member 11babd38de9e1f0f is healthy: got healthy result from https://10.0.52.13:2379
failed to check the health of member 22436a037c5adb3b on https://10.0.52.14:2379: Get https://10.0.52.14:2379/health: dial tcp 10.0.52.14:2379: i/o timeout
member 22436a037c5adb3b is unreachable: [https://10.0.52.14:2379] are all unreachable
member a5e80429e983b681 is healthy: got healthy result from https://10.0.52.6:2379
Solution: stop the firewalld service on every host, or open TCP ports 2379-2380 in it.
Issue 4. request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
Problem description:
$journalctl -u etcd -f
-- Logs begin at Thu 2019-05-23 14:29:05 CST. --
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
Solution: delete the stale data directory and restart etcd (this discards the member's local data)
rm -rf /var/lib/etcd/default.etcd
systemctl restart etcd
Issue 5. Deploying an etcd cluster with HTTPS access fails with remote error: tls: bad certificate
Error message:
Apr 27 18:39:27 master-225 etcd[3747038]: {"level":"warn","ts":"2022-04-27T18:39:27.998+0800","caller":"embed/mbed/config_logging.go:169","msg":"rejected connection","remote-addr":"10.10.107.224:47920","server-name":"","error":"remote error: tls: bad certificate"}
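Cause: the hosts list in the certificate does not contain the name or address in use. In my case the certificate's SAN (Subject Alternative Name) entries did not include the node's address or domain name.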
openssl x509 -in etcd.pem -text -noout | grep "X509v3 Subject Alternative Name" -A 1
# X509v3 Subject Alternative Name:
# DNS:etcd1, DNS:etcd2, DNS:etcd3, IP Address:127.0.0.1, IP Address:10.10.107.223, IP Address:10.10.107.224, IP Address:10.10.107.225
Solution: reissue the certificate
# etcd certificate signing request
tee etcd-csr.json <<'EOF'
{
"CN": "etcd",
"hosts": [
"127.0.0.1",
"10.10.107.223",
"10.10.107.224",
"10.10.107.225"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "ChongQing",
"ST": "ChongQing",
"O": "etcd",
"OU": "System"
}
]
}
EOF
# Sign the etcd certificate with the CA
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd etcd-csr.json | cfssljson -bare etcd
Issue 6. Listing the cluster members with etcdctl fails with authentication handshake failed: x509: certificate signed by unknown authority
Error message:
$ etcdctl --endpoints=https://10.10.107.225:2379,https://10.10.107.224:2379,https://10.10.107.223:2379 --write-out=table member list
{"level":"warn","ts":"2022-04-27T19:38:10.489+0800","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0004a68c0/10.10.107.225:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: authentication handshake failed: x509: certificate signed by unknown authority\""}
Error: context deadline exceeded
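Solution: the cluster uses certificate authentication, so the CA certificate and the certificate and key it issued must be passed to etcdctl.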
export ETCDCTL_API=3
etcdctl --endpoints=https://10.10.107.225:2379,https://10.10.107.224:2379,https://10.10.107.223:2379 \
--cacert="/etc/etcd/pki/ca.pem" --cert="/etc/etcd/pki/etcd.pem" --key="/etc/etcd/pki/etcd-key.pem" \
--write-out=table member list