ZooKeeper Online Migration Experiment

From 莫韵's column, original article

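Caveats

First, when expanding from 3 servers to 5, the cluster must keep serving requests the whole time.

A cluster of X servers needs at least X/2+1 of them running (integer division) to keep serving: 2 for a 3-server cluster, and 3 for a 5-server cluster.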
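The majority rule above is simple arithmetic, and it can be written down as a small Python sketch. The function names `quorum_size` and `tolerated_failures` are chosen here for illustration; they are not part of ZooKeeper.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be positive")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may be down while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# prints:
# 3 2 1
# 5 3 2
```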

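  • We should therefore make sure that at least 3 ZooKeeper servers in the cluster are running at all times.
  • When restarting, restart the server with the smallest myid first and work upward from small to large.
  • Whatever its myid, the Leader is restarted last.

The reason is that in ZooKeeper's mechanism a server with a larger myid initiates connections to servers with smaller myids, while a smaller one does not initiate connections to a larger one. So if the server with the smallest myid is restarted last, it may be unable to join the cluster.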
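The ordering rule from the list above (ascending myid, Leader last) can be sketched as a small helper. This is only an illustration of the rule as stated in this article; `rolling_restart_order` is a hypothetical name, not a ZooKeeper tool.

```python
def rolling_restart_order(myids, leader_id):
    """Ascending myid, with the current Leader moved to the end."""
    if leader_id not in myids:
        raise ValueError("leader_id must be one of myids")
    followers = sorted(i for i in myids if i != leader_id)
    return followers + [leader_id]


# In this experiment the original Leader is myid 2 (idc02-kafka-ds-01).
print(rolling_restart_order([1, 2, 3], leader_id=2))  # [1, 3, 2]
```

That matches the order used later in the experiment: idc02-kafka-ds-00, then idc02-kafka-ds-02, then the Leader idc02-kafka-ds-01.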

Environment

Five machines:

    IP             Hostname
    10.1.24.110    idc02-kafka-ds-00
    10.1.24.111    idc02-kafka-ds-01
    10.1.24.112    idc02-kafka-ds-02
    10.1.24.113    idc02-kafka-ds-03
    10.1.24.114    idc02-kafka-ds-04

    JDK            jdk1.7.0_67
    ZooKeeper      zookeeper-3.4.6
    myid           1-5, assigned in ascending IP order

Configuration file

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888
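Since myid follows the IP order, the `server.N` lines can be generated instead of typed by hand. The sketch below assumes the same two ports on every host (2888 for quorum traffic, 3888 for leader election, as in the configuration above); `server_lines` is a hypothetical helper name.

```python
def server_lines(ips, quorum_port=2888, election_port=3888, first_id=1):
    """Build server.N=ip:quorum_port:election_port lines, numbering from first_id."""
    return [
        f"server.{first_id + offset}={ip}:{quorum_port}:{election_port}"
        for offset, ip in enumerate(ips)
    ]


ips = ["10.1.24.110", "10.1.24.111", "10.1.24.112"]
print("\n".join(server_lines(ips)))
```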

Experiment

Set up a 3-node ZooKeeper cluster

idc02-kafka-ds-00

    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

idc02-kafka-ds-01

    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: leader

idc02-kafka-ds-02

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower
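Because the Leader has to be restarted last, it helps to pick it out of the status output programmatically. A minimal sketch follows; it assumes the output has already been collected per host as text (how it is collected is left open), and `parse_mode` and `find_leader` are hypothetical helper names.

```python
import re


def parse_mode(status_output: str):
    """Return the value after 'Mode:' in zkServer.sh status output, or None."""
    match = re.search(r"^Mode:\s*(\S+)", status_output, flags=re.MULTILINE)
    return match.group(1) if match else None


def find_leader(status_by_host: dict):
    """Return the hostnames whose status output reports Mode: leader."""
    return [h for h, out in status_by_host.items() if parse_mode(out) == "leader"]


sample = """JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
"""
print(parse_mode(sample))  # leader
```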

Expand it to a 5-node ZooKeeper cluster

First, check the state of the original ZooKeeper cluster

echo mntr|nc localhost 2181

This four-letter command shows the state of the cluster. The follower-related metrics can only be seen on the Leader.

Checked on idc02-kafka-ds-01:

    [hadoop@idc02-kafka-ds-01 bin]$ echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 0
    zk_max_latency 0
    zk_min_latency 0
    zk_packets_received 3
    zk_packets_sent 2
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 27
    zk_max_file_descriptor_count 65535
    zk_followers 2
    zk_synced_followers 2
    zk_pending_syncs 0
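The same check can be scripted. The sketch below sends `mntr` over a plain TCP socket, which is what the `nc` pipeline does, and parses the key/value reply into a dictionary. The helper names `parse_mntr` and `fetch_mntr` and the default timeout are assumptions made for this example; it relies on the server closing the connection after answering.

```python
import socket


def parse_mntr(text: str) -> dict:
    """Turn 'key<whitespace>value' lines into a dict of strings."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            metrics[parts[0]] = parts[1].strip()
    return metrics


def fetch_mntr(host="localhost", port=2181, timeout=5.0) -> dict:
    """Send the four-letter command 'mntr' and parse the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return parse_mntr(b"".join(chunks).decode("utf-8", errors="replace"))


sample = "zk_server_state\tleader\nzk_followers\t2\nzk_synced_followers\t2\n"
print(parse_mntr(sample)["zk_followers"])  # 2
```

Calling `fetch_mntr()` on the Leader would return the same fields as the listing above, with every value kept as a string.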

Start ZooKeeper on the other two machines

Configuration file on the two new machines

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

idc02-kafka-ds-03

    [hadoop@idc02-kafka-ds-03 bin]# ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

idc02-kafka-ds-04

    [hadoop@idc02-kafka-ds-04 bin]# ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Check the cluster again

Still checking on idc02-kafka-ds-01:

    [hadoop@idc02-kafka-ds-01 bin]$ echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 0
    zk_max_latency 0
    zk_min_latency 0
    zk_packets_received 4
    zk_packets_sent 3
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 31
    zk_max_file_descriptor_count 65535
    zk_followers 4
    zk_synced_followers 4
    zk_pending_syncs 0

zk_followers is now 4: the number of connected followers went from 2 to 4.

zk_synced_followers is also 4, which means the 2 newly added servers have finished syncing as well.
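The two observations above (follower count and synced count) can be turned into a single check that gates the next step. This is a sketch over an already-parsed metrics dictionary with string values; `followers_in_sync` is a hypothetical name, and treating `zk_pending_syncs` as required to be 0 is an extra assumption of this example.

```python
def followers_in_sync(metrics: dict, expected_followers: int) -> bool:
    """True when the Leader reports the expected follower count, all synced."""
    followers = int(metrics.get("zk_followers", -1))
    synced = int(metrics.get("zk_synced_followers", -1))
    pending = int(metrics.get("zk_pending_syncs", -1))
    return followers == expected_followers and synced == followers and pending == 0


after_expansion = {
    "zk_followers": "4",
    "zk_synced_followers": "4",
    "zk_pending_syncs": "0",
}
print(followers_in_sync(after_expansion, expected_followers=4))  # True
```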

Next, do a rolling restart of the first three machines (myid 1-3)

Start with idc02-kafka-ds-00

Stop

If you want extra reassurance, watch the logs on the Leader or on the two newly added machines while this server is down.

    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Modify its configuration file

From the original

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888

to the new

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Then skip idc02-kafka-ds-01, since it is the Leader, and handle idc02-kafka-ds-02 first

Stop

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Modify the configuration file

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Finally, handle the original Leader, idc02-kafka-ds-01

Stop

    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Check the new Leader

When the candidates are equally up to date, ZooKeeper tends to elect the server with the largest myid, so idc02-kafka-ds-04 (myid 5) became the new Leader.

    [hadoop@idc02-kafka-ds-04 bin]# ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: leader
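Why the largest myid wins here can be illustrated with a simplified comparison: votes are ordered by epoch first, then by last transaction id (zxid), and only then by myid. The sketch below shows just that ordering. The real election also exchanges notifications until a quorum agrees, which is not modeled, and the epoch and zxid values are made up, assuming the four remaining servers were equally up to date.

```python
from typing import NamedTuple


class Vote(NamedTuple):
    epoch: int
    zxid: int
    myid: int


def pick_leader(votes):
    """Prefer higher epoch, then higher zxid, then higher myid."""
    return max(votes, key=lambda v: (v.epoch, v.zxid, v.myid))


# Hypothetical values: the servers left after myid 2 was stopped, all equally up to date.
candidates = [Vote(1, 0x100000004, i) for i in (1, 3, 4, 5)]
print(pick_leader(candidates).myid)  # 5
```

Note that a server with a smaller myid but a newer zxid would win under this ordering, so "largest myid becomes Leader" only holds when the data is equally fresh.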

Modify the configuration file on idc02-kafka-ds-01

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Check the cluster on the new Leader

    [hadoop@idc02-kafka-ds-04 bin]# echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 1
    zk_max_latency 4
    zk_min_latency 0
    zk_packets_received 12
    zk_packets_sent 11
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 33
    zk_max_file_descriptor_count 65535
    zk_followers 4
    zk_synced_followers 4
    zk_pending_syncs 0

Everything is normal. At this point we have expanded the original 3 servers to 5, so we are halfway there. All that remains is to shrink the current 5 servers back to 3, leaving out the original machines with myid 1-2, and the migration is complete.

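Shrink from 5 servers back to 3

Reconfigure idc02-kafka-ds-02

Per the caveats above, no fewer than 3 servers of the 5-server cluster may be running at this point. So we first change the configuration files on servers 3-5 to the 3-server configuration, and only then shut down servers 1-2.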
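The constraint above can be checked before each stop. The sketch below is a guard in the spirit of that rule: it only counts running servers against the majority of the configured size and does not inspect each server's actual configuration. `can_stop` is a hypothetical helper, not a ZooKeeper command.

```python
def can_stop(running: set, target: int, ensemble_size: int) -> bool:
    """True if stopping `target` still leaves a majority of ensemble_size running."""
    if target not in running:
        raise ValueError("target is not running")
    needed = ensemble_size // 2 + 1
    return len(running) - 1 >= needed


print(can_stop({1, 2, 3, 4, 5}, target=3, ensemble_size=5))  # True
print(can_stop({1, 2, 3}, target=3, ensemble_size=5))        # False
```

With all five servers up, taking one down at a time for reconfiguration keeps four running, which is why the steps below restart servers 3-5 one by one rather than together.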

Stop

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Change the configuration file to

    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888
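When trimming the server list, every ID that is kept should still point at the same address as before, and a copy-paste slip is easy to make. A small sketch that compares two configurations and reports any kept ID whose host changed; it assumes plain `server.N=host:port:port` lines without extra suffixes, and the helper names are hypothetical.

```python
import re

SERVER_LINE = re.compile(r"^server\.(\d+)=([^:\s]+):(\d+):(\d+)\s*$")


def parse_servers(cfg_text: str) -> dict:
    """Map myid -> host for every server.N=host:port:port line."""
    servers = {}
    for line in cfg_text.splitlines():
        m = SERVER_LINE.match(line.strip())
        if m:
            servers[int(m.group(1))] = m.group(2)
    return servers


def changed_addresses(old_cfg: str, new_cfg: str) -> dict:
    """Return {myid: (old_host, new_host)} for ids kept in new_cfg whose host differs."""
    old, new = parse_servers(old_cfg), parse_servers(new_cfg)
    return {i: (old[i], new[i]) for i in new if i in old and old[i] != new[i]}


five = "\n".join(f"server.{i}=10.1.24.{109 + i}:2888:3888" for i in range(1, 6))
three = "\n".join(f"server.{i}=10.1.24.{109 + i}:2888:3888" for i in range(3, 6))
print(changed_addresses(five, three))  # {}
```

An empty result means servers 3-5 keep their addresses; any entry in the dictionary points at a line worth rechecking before restarting.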

Start

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Then reconfigure idc02-kafka-ds-03

Stop

    [hadoop@idc02-kafka-ds-03 bin]# ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Change the configuration file to

    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-03 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-03 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Finally, reconfigure idc02-kafka-ds-04

Stop

    [hadoop@idc02-kafka-ds-04 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

After it was stopped, the Leader role moved to idc02-kafka-ds-03, the server with the second-largest myid.

Change the configuration file to

    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-04 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-04 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Check on the Leader

    [hadoop@idc02-kafka-ds-03 bin]$ echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 0
    zk_max_latency 0
    zk_min_latency 0
    zk_packets_received 4
    zk_packets_sent 3
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 27
    zk_max_file_descriptor_count 65535
    zk_followers 2
    zk_synced_followers 2
    zk_pending_syncs 0

zk_followers is now 2, which shows that the Leader no longer recognizes servers 1-2.

Shut down servers 1-2

Stop idc02-kafka-ds-00

    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Stop idc02-kafka-ds-01

    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Check again

    [hadoop@idc02-kafka-ds-03 bin]$ echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 0
    zk_max_latency 0
    zk_min_latency 0
    zk_packets_received 5
    zk_packets_sent 4
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 27
    zk_max_file_descriptor_count 65535
    zk_followers 2
    zk_synced_followers 2
    zk_pending_syncs 0

No impact at all. The experiment succeeded.

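Original-content notice: this article is published on 云+社区 (Cloud+ Community) with the author's authorization and may not be reproduced without permission.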
