redis-trib.rb工具也实现了为现有集群添加新节点的命令,同时也实现了直接添加为slave的支持:
# 新节点加入集群 redis-trib.rb add-node new_host:new_port old_host:old_port # 新节点加入集群并作为指定master的slave redis-trib.rb add-node new_host:new_port old_host:old_port --slave --master-id <master-id>
建议使用redis-trib.rb add-node
将新节点添加到集群中,该命令会检查新节点的状态,如果新节点已经加入了其他集群或者已经包含数据,则会报错,而使用cluster meet
命令则不会做这样的检查,假如新节点已经存在数据,则会合并到集群中,造成数据不一致
无需要求每个master的slot编号是连续的,只要每个master管理的slot的数量均衡就可以。
我们在上面添加了10.0.0.103:6379和10.0.0.103:6380两个节点,现在把这两个节点下线
10.0.0.103:6380> cluster nodes
...
# 10.0.0.103:6380是slave
# 10.0.0.103:6379是master
724a8a15f4efe5a01454cb971d7471d6e84279f3 10.0.0.103:6379@16379 master - 0 1585055101000 8 connected 0-1364 5461-6826 10923-12287
ed9b72fffd04b8a7e5ad20afdaf1f53e0eb95011 10.0.0.103:6380@16380 slave 724a8a15f4efe5a01454cb971d7471d6e84279f3 0 1585055099000 0 connected
[root@node01 redis]# redis-trib.rb reshard 10.0.0.100:6379
......
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1364
What is the receiving node ID? 1be5d1aaaa9e9542224554f461694da9cba7c2b8
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1:724a8a15f4efe5a01454cb971d7471d6e84279f3
Source node #2:done
Ready to move 1364 slots.
Source nodes:
M: 724a8a15f4efe5a01454cb971d7471d6e84279f3 10.0.0.103:6379
slots:0-1364,5461-6826,10923-12287 (4096 slots) master
1 additional replica(s)
Destination node:
M: 1be5d1aaaa9e9542224554f461694da9cba7c2b8 10.0.0.101:6379
slots:6827-10922 (4096 slots) master
1 additional replica(s)
Resharding plan:
.......
[root@node01 redis]# redis-trib.rb reshard 10.0.0.100:6379
......
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1364
What is the receiving node ID? 4fb4c538d5f29255f6212f2eae8a761fbe364a89
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1:724a8a15f4efe5a01454cb971d7471d6e84279f3
Source node #2:done
.......
[root@node01 redis]# redis-trib.rb reshard 10.0.0.100:6379
......
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1365
What is the receiving node ID? 690b2e1f604a0227068388d3e5b1f1940524c565
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1:724a8a15f4efe5a01454cb971d7471d6e84279f3
Source node #2:done
.......
10.0.0.103:6380> cluster nodes
1be5d1aaaa9e9542224554f461694da9cba7c2b8 10.0.0.101:6379@16379 master - 0 1585056902000 9 connected 0-1363 6827-10922
4fb4c538d5f29255f6212f2eae8a761fbe364a89 10.0.0.101:6380@16380 master - 0 1585056903544 12 connected 2729-6826 10923-12287
690b2e1f604a0227068388d3e5b1f1940524c565 10.0.0.102:6379@16379 master - 0 1585056903000 11 connected 1364-2728 12288-16383
# 10.0.0.103:6379的slot已经迁移完成
724a8a15f4efe5a01454cb971d7471d6e84279f3 10.0.0.103:6379@16379 master - 0 1585056903000 10 connected
ed9b72fffd04b8a7e5ad20afdaf1f53e0eb95011 10.0.0.103:6380@16380 myself,slave 4fb4c538d5f29255f6212f2eae8a761fbe364a89 0 1585056898000 0 connected
89f52bfbb8803db19ab0c5a90adc4099df8287f7 10.0.0.100:6379@16379 slave 4fb4c538d5f29255f6212f2eae8a761fbe364a89 0 1585056901000 12 connected
8c13a2afa76194ef9582bb06675695bfef76b11d 10.0.0.100:6380@16380 slave 690b2e1f604a0227068388d3e5b1f1940524c565 0 1585056900000 11 connected
86e1881611440012c87fbf3fa98b7b6d79915e25 10.0.0.102:6380@16380 slave 1be5d1aaaa9e9542224554f461694da9cba7c2b8 0 1585056904551 9 connected
Redis提供了cluster forget{downNodeId}
命令来通知其他节点忘记下线节点,当节点接收到cluster forget {down NodeId}
命令后,会把nodeId
指定的节点加入到禁用列表中,在禁用列表内的节点不再与其他节点发送消息,禁用列表有效期是60秒,超过60秒节点会再次参与消息交换。也就是说当第一次forget命令发出后,我们有60秒的时间让集群内的所有节点忘记下线节点
线上操作不建议直接使用cluster forget
命令下线节点,这需要跟大量节点进行命令交互,建议使用redis- trib.rb del-node {host:port} {downNodeId}
命令
另外,先下线slave,再下线master可以防止不必要的数据复制
# 先下线slave 10.0.0.103:6380
[root@node01 redis]# redis-trib.rb del-node 10.0.0.100:6379 ed9b72fffd04b8a7e5ad20afdaf1f53e0eb95011
>>> Removing node ed9b72fffd04b8a7e5ad20afdaf1f53e0eb95011 from cluster 10.0.0.100:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
# 再下线slave 10.0.0.103:6379
[root@node01 redis]# redis-trib.rb del-node 10.0.0.100:6379 724a8a15f4efe5a01454cb971d7471d6e84279f3
>>> Removing node 724a8a15f4efe5a01454cb971d7471d6e84279f3 from cluster 10.0.0.100:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
10.0.0.100:6379> cluster nodes
8c13a2afa76194ef9582bb06675695bfef76b11d 10.0.0.100:6380@16380 slave 690b2e1f604a0227068388d3e5b1f1940524c565 0 1585057049247 11 connected
690b2e1f604a0227068388d3e5b1f1940524c565 10.0.0.102:6379@16379 master - 0 1585057048239 11 connected 1364-2728 12288-16383
4fb4c538d5f29255f6212f2eae8a761fbe364a89 10.0.0.101:6380@16380 master - 0 1585057048000 12 connected 2729-6826 10923-12287
89f52bfbb8803db19ab0c5a90adc4099df8287f7 10.0.0.100:6379@16379 myself,slave 4fb4c538d5f29255f6212f2eae8a761fbe364a89 0 1585057047000 1 connected
86e1881611440012c87fbf3fa98b7b6d79915e25 10.0.0.102:6380@16380 slave 1be5d1aaaa9e9542224554f461694da9cba7c2b8 0 1585057048000 9 connected
1be5d1aaaa9e9542224554f461694da9cba7c2b8 10.0.0.101:6379@16379 master - 0 1585057048000 9 connected 0-1363 6827-10922
redis-trib.rb del-node
还可以自动停止下线节点的服务。