Redis Sentinel Mode Failover Drill

This article walks through a failover drill for Redis running in Sentinel mode.

Startup

docker-compose up

The demo uses the docker-compose file from redis-cluster (listed in the doc section below).

  • master log
1:M 12 Sep 06:42:02.159 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 12 Sep 06:42:02.159 # Server started, Redis version 3.2.8
1:M 12 Sep 06:42:02.159 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 12 Sep 06:42:02.159 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 12 Sep 06:42:02.159 * The server is now ready to accept connections on port 6379
1:M 12 Sep 06:42:02.849 * Slave 172.17.0.3:6379 asks for synchronization
1:M 12 Sep 06:42:02.849 * Full resync requested by slave 172.17.0.3:6379
1:M 12 Sep 06:42:02.849 * Starting BGSAVE for SYNC with target: disk
1:M 12 Sep 06:42:02.851 * Background saving started by pid 16
16:C 12 Sep 06:42:02.861 * DB saved on disk
16:C 12 Sep 06:42:02.862 * RDB: 6 MB of memory used by copy-on-write
1:M 12 Sep 06:42:02.865 * Background saving terminated with success
1:M 12 Sep 06:42:02.866 * Synchronization with slave 172.17.0.3:6379 succeeded
1:M 12 Sep 06:42:13.649 # Connection with slave 172.17.0.3:6379 lost.
1:M 12 Sep 06:42:14.072 * Slave 172.17.0.3:6379 asks for synchronization
1:M 12 Sep 06:42:14.073 * Full resync requested by slave 172.17.0.3:6379
1:M 12 Sep 06:42:14.073 * Starting BGSAVE for SYNC with target: disk
1:M 12 Sep 06:42:14.075 * Background saving started by pid 17
17:C 12 Sep 06:42:14.085 * DB saved on disk
17:C 12 Sep 06:42:14.085 * RDB: 8 MB of memory used by copy-on-write
1:M 12 Sep 06:42:14.185 * Background saving terminated with success
1:M 12 Sep 06:42:14.186 * Synchronization with slave 172.17.0.3:6379 succeeded
  • slave log
1:S 12 Sep 06:42:02.847 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:S 12 Sep 06:42:02.847 # Server started, Redis version 3.2.8
1:S 12 Sep 06:42:02.847 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:S 12 Sep 06:42:02.847 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:S 12 Sep 06:42:02.847 * The server is now ready to accept connections on port 6379
1:S 12 Sep 06:42:02.847 * Connecting to MASTER redis-master:6379
1:S 12 Sep 06:42:02.848 * MASTER <-> SLAVE sync started
1:S 12 Sep 06:42:02.848 * Non blocking connect for SYNC fired the event.
1:S 12 Sep 06:42:02.849 * Master replied to PING, replication can continue...
1:S 12 Sep 06:42:02.849 * Partial resynchronization not possible (no cached master)
1:S 12 Sep 06:42:02.851 * Full resync from master: 32f526697a22fef7945974d2b4dfc599401e2525:1
1:S 12 Sep 06:42:02.866 * MASTER <-> SLAVE sync: receiving 76 bytes from master
1:S 12 Sep 06:42:02.866 * MASTER <-> SLAVE sync: Flushing old data
1:S 12 Sep 06:42:02.866 * MASTER <-> SLAVE sync: Loading DB in memory
1:S 12 Sep 06:42:02.867 * MASTER <-> SLAVE sync: Finished with success
1:S 12 Sep 06:42:02.869 * Background append only file rewriting started by pid 15
1:S 12 Sep 06:42:02.903 * AOF rewrite child asks to stop sending diffs.
15:C 12 Sep 06:42:02.904 * Parent agreed to stop sending diffs. Finalizing AOF...
15:C 12 Sep 06:42:02.904 * Concatenating 0.00 MB of AOF diff received from parent.
15:C 12 Sep 06:42:02.906 * SYNC append only file rewrite performed
15:C 12 Sep 06:42:02.907 * AOF rewrite: 6 MB of memory used by copy-on-write
1:S 12 Sep 06:42:02.948 * Background AOF rewrite terminated with success
1:S 12 Sep 06:42:02.948 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 12 Sep 06:42:02.948 * Background AOF rewrite finished successfully
1:S 12 Sep 06:42:13.649 # Connection with master lost.
1:S 12 Sep 06:42:13.649 * Caching the disconnected master state.
1:S 12 Sep 06:42:13.650 * Discarding previously cached master state.
1:S 12 Sep 06:42:13.650 * SLAVE OF 172.17.0.2:6379 enabled (user request from 'id=3 addr=172.17.0.4:57270 fd=6 name=sentinel-927320a2-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1:S 12 Sep 06:42:13.650 # CONFIG REWRITE executed with success.
1:S 12 Sep 06:42:14.071 * Connecting to MASTER 172.17.0.2:6379
1:S 12 Sep 06:42:14.072 * MASTER <-> SLAVE sync started
1:S 12 Sep 06:42:14.072 * Non blocking connect for SYNC fired the event.
1:S 12 Sep 06:42:14.072 * Master replied to PING, replication can continue...
1:S 12 Sep 06:42:14.072 * Partial resynchronization not possible (no cached master)
1:S 12 Sep 06:42:14.076 * Full resync from master: 32f526697a22fef7945974d2b4dfc599401e2525:733
1:S 12 Sep 06:42:14.185 * MASTER <-> SLAVE sync: receiving 76 bytes from master
1:S 12 Sep 06:42:14.186 * MASTER <-> SLAVE sync: Flushing old data
1:S 12 Sep 06:42:14.186 * MASTER <-> SLAVE sync: Loading DB in memory
1:S 12 Sep 06:42:14.186 * MASTER <-> SLAVE sync: Finished with success
1:S 12 Sep 06:42:14.189 * Background append only file rewriting started by pid 16
1:S 12 Sep 06:42:14.221 * AOF rewrite child asks to stop sending diffs.
16:C 12 Sep 06:42:14.221 * Parent agreed to stop sending diffs. Finalizing AOF...
16:C 12 Sep 06:42:14.221 * Concatenating 0.00 MB of AOF diff received from parent.
16:C 12 Sep 06:42:14.223 * SYNC append only file rewrite performed
16:C 12 Sep 06:42:14.224 * AOF rewrite: 6 MB of memory used by copy-on-write
1:S 12 Sep 06:42:14.274 * Background AOF rewrite terminated with success
1:S 12 Sep 06:42:14.274 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 12 Sep 06:42:14.274 * Background AOF rewrite finished successfully
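
The startup logs above can also be checked programmatically. The following is a minimal Python sketch, not part of the original setup: it assumes the Redis 3.2 log line format shown above, and the function name `count_full_resyncs` and the embedded sample lines (copied from the master log) are purely illustrative. It counts how many full resyncs each slave requested, which makes the second resync at 06:42:14, right after the sentinel-issued SLAVE OF at 06:42:13, easy to spot.

```python
import re
from collections import Counter

# Matches master-side lines such as:
# "1:M 12 Sep 06:42:02.849 * Full resync requested by slave 172.17.0.3:6379"
FULL_RESYNC = re.compile(r"Full resync requested by slave (\S+)$")


def count_full_resyncs(log_text):
    """Return a Counter mapping slave address -> number of full resyncs."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FULL_RESYNC.search(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the master log above (hypothetical variable name).
MASTER_LOG = """\
1:M 12 Sep 06:42:02.849 * Slave 172.17.0.3:6379 asks for synchronization
1:M 12 Sep 06:42:02.849 * Full resync requested by slave 172.17.0.3:6379
1:M 12 Sep 06:42:13.649 # Connection with slave 172.17.0.3:6379 lost.
1:M 12 Sep 06:42:14.072 * Slave 172.17.0.3:6379 asks for synchronization
1:M 12 Sep 06:42:14.073 * Full resync requested by slave 172.17.0.3:6379
"""

resyncs = count_full_resyncs(MASTER_LOG)
print(dict(resyncs))
```

In practice the text would come from something like the output of `docker logs` for the master container instead of an embedded string.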

Master-slave failover

  • docker-compose ps
       Name                      Command               State           Ports
-------------------------------------------------------------------------------------
sentinel_master_1     docker-entrypoint.sh redis ...   Up      0.0.0.0:6379->6379/tcp
sentinel_sentinel_1   sh /data/sentinel-entrypoi ...   Up      26379/tcp, 6379/tcp
sentinel_sentinel_2   sh /data/sentinel-entrypoi ...   Up      26379/tcp, 6379/tcp
sentinel_sentinel_3   sh /data/sentinel-entrypoi ...   Up      26379/tcp, 6379/tcp
sentinel_slave_1      docker-entrypoint.sh redis ...   Up      6379/tcp
sentinel_slave_2      docker-entrypoint.sh redis ...   Up      6379/tcp
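  • Stop the master node (docker pause freezes the container's processes rather than shutting it down, so the master simply stops responding)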
docker pause sentinel_master_1
  • Check the sentinel log
1:X 12 Sep 06:46:42.611 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:X 12 Sep 06:46:42.615 # Sentinel ID is 9e1da269ca7f134ed7bae15ad8efa3f5dd22f72d
1:X 12 Sep 06:46:42.615 # +monitor master redis-master 172.17.0.2 6379 quorum 2
1:X 12 Sep 06:46:42.617 * +slave slave 172.17.0.3:6379 172.17.0.3 6379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:46:43.467 * +sentinel sentinel 927320a2afbfd70eae1716e8a024c726e71f2b51 172.17.0.4 26379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:46:44.554 * +sentinel sentinel 8fc2f95bc671dc8a3df30046a29fdc41743a774d 172.17.0.5 26379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:47:02.679 * +slave slave 172.17.0.7:6379 172.17.0.7 6379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:48:32.777 # +new-epoch 1
1:X 12 Sep 06:48:32.784 # +vote-for-leader 927320a2afbfd70eae1716e8a024c726e71f2b51 1
1:X 12 Sep 06:48:32.843 # +sdown master redis-master 172.17.0.2 6379
1:X 12 Sep 06:48:32.944 # +odown master redis-master 172.17.0.2 6379 #quorum 3/2
1:X 12 Sep 06:48:32.944 # Next failover delay: I will not start a failover before Wed Sep 12 06:48:43 2018
1:X 12 Sep 06:48:33.857 # +config-update-from sentinel 927320a2afbfd70eae1716e8a024c726e71f2b51 172.17.0.4 26379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:48:33.861 # +switch-master redis-master 172.17.0.2 6379 172.17.0.3 6379
1:X 12 Sep 06:48:33.863 * +slave slave 172.17.0.7:6379 172.17.0.7 6379 @ redis-master 172.17.0.3 6379
1:X 12 Sep 06:48:33.864 * +slave slave 172.17.0.2:6379 172.17.0.2 6379 @ redis-master 172.17.0.3 6379
1:X 12 Sep 06:48:38.902 # +sdown slave 172.17.0.2:6379 172.17.0.2 6379 @ redis-master 172.17.0.3 6379
  • Check the log of the new master
1:M 12 Sep 06:48:32.996 # Connection with master lost.
1:M 12 Sep 06:48:32.997 * Caching the disconnected master state.
1:M 12 Sep 06:48:32.997 * Discarding previously cached master state.
1:M 12 Sep 06:48:32.997 * MASTER MODE enabled (user request from 'id=3 addr=172.17.0.4:57270 fd=6 name=sentinel-927320a2-cmd age=389 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1:M 12 Sep 06:48:32.998 # CONFIG REWRITE executed with success.
1:M 12 Sep 06:48:33.983 * Slave 172.17.0.7:6379 asks for synchronization
1:M 12 Sep 06:48:33.983 * Full resync requested by slave 172.17.0.7:6379
1:M 12 Sep 06:48:33.983 * Starting BGSAVE for SYNC with target: disk
1:M 12 Sep 06:48:33.984 * Background saving started by pid 28
28:C 12 Sep 06:48:33.992 * DB saved on disk
28:C 12 Sep 06:48:33.992 * RDB: 6 MB of memory used by copy-on-write
1:M 12 Sep 06:48:34.076 * Background saving terminated with success
1:M 12 Sep 06:48:34.076 * Synchronization with slave 172.17.0.7:6379 succeeded
  • The log shows MASTER MODE enabled: at the sentinel's request, the former slave 172.17.0.3 has been promoted to master
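
The failover sequence in the sentinel log above (+sdown, +odown, +switch-master) can be summarized with a small script. This is a minimal Python sketch under stated assumptions: it relies on the Sentinel 3.2 log format shown above, takes the year 2018 from the "Next failover delay" line because the log timestamps carry no year, and uses hypothetical names (`parse_events`, `failover_summary`, `SENTINEL_LOG`).

```python
import re
from datetime import datetime

# Sentinel event lines look like:
# "1:X 12 Sep 06:48:32.843 # +sdown master redis-master 172.17.0.2 6379"
EVENT = re.compile(
    r"^\d+:X (?P<ts>\d+ \w+ \d+:\d+:\d+\.\d+) [#*] "
    r"(?P<event>[+-][\w-]+) ?(?P<detail>.*)$"
)


def parse_events(log_text, year=2018):
    """Yield (timestamp, event, detail) for every sentinel event line."""
    for line in log_text.splitlines():
        match = EVENT.match(line.strip())
        if match:
            # The log omits the year, so it is supplied by the caller.
            ts = datetime.strptime(
                f"{year} {match.group('ts')}", "%Y %d %b %H:%M:%S.%f"
            )
            yield ts, match.group("event"), match.group("detail")


def failover_summary(log_text, year=2018):
    """Return old/new master and seconds from +sdown to +switch-master."""
    first_sdown = None
    for ts, event, detail in parse_events(log_text, year):
        if event == "+sdown" and detail.startswith("master") and first_sdown is None:
            first_sdown = ts
        elif event == "+switch-master":
            name, old_ip, old_port, new_ip, new_port = detail.split()
            elapsed = (ts - first_sdown).total_seconds() if first_sdown else None
            return {
                "master_name": name,
                "old": f"{old_ip}:{old_port}",
                "new": f"{new_ip}:{new_port}",
                "seconds_from_sdown": elapsed,
            }
    return None  # no failover found in this log


# Sample lines copied from the sentinel log above.
SENTINEL_LOG = """\
1:X 12 Sep 06:48:32.843 # +sdown master redis-master 172.17.0.2 6379
1:X 12 Sep 06:48:32.944 # +odown master redis-master 172.17.0.2 6379 #quorum 3/2
1:X 12 Sep 06:48:32.944 # Next failover delay: I will not start a failover before Wed Sep 12 06:48:43 2018
1:X 12 Sep 06:48:33.861 # +switch-master redis-master 172.17.0.2 6379 172.17.0.3 6379
"""

summary = failover_summary(SENTINEL_LOG)
print(summary)
```

The measured interval only covers what this one sentinel logged between +sdown and +switch-master; it does not include the detection delay before the sentinel marked the master as down.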

Restore the node

docker unpause sentinel_master_1

Check the log of that node

1:M 12 Sep 06:56:05.592 # Connection with slave client id #12 lost.
1:M 12 Sep 06:56:05.592 # Connection with slave client id #5 lost.
1:S 12 Sep 06:56:17.140 * SLAVE OF 172.17.0.3:6379 enabled (user request from 'id=144 addr=172.17.0.5:41876 fd=7 name=sentinel-8fc2f95b-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1:S 12 Sep 06:56:17.141 # CONFIG REWRITE executed with success.
1:S 12 Sep 06:56:17.206 * Connecting to MASTER 172.17.0.3:6379
1:S 12 Sep 06:56:17.206 * MASTER <-> SLAVE sync started
1:S 12 Sep 06:56:17.206 * Non blocking connect for SYNC fired the event.
1:S 12 Sep 06:56:17.207 * Master replied to PING, replication can continue...
1:S 12 Sep 06:56:17.208 * Partial resynchronization not possible (no cached master)
1:S 12 Sep 06:56:17.211 * Full resync from master: b2e78c2c21c3a4caa7a37fe86da9b3a2cda0dce4:134615
1:S 12 Sep 06:56:17.288 * MASTER <-> SLAVE sync: receiving 94 bytes from master
1:S 12 Sep 06:56:17.289 * MASTER <-> SLAVE sync: Flushing old data
1:S 12 Sep 06:56:17.289 * MASTER <-> SLAVE sync: Loading DB in memory
1:S 12 Sep 06:56:17.289 * MASTER <-> SLAVE sync: Finished with success
1:S 12 Sep 06:56:17.292 * Background append only file rewriting started by pid 32
1:S 12 Sep 06:56:17.339 * AOF rewrite child asks to stop sending diffs.
32:C 12 Sep 06:56:17.339 * Parent agreed to stop sending diffs. Finalizing AOF...
32:C 12 Sep 06:56:17.339 * Concatenating 0.00 MB of AOF diff received from parent.
32:C 12 Sep 06:56:17.342 * SYNC append only file rewrite performed
32:C 12 Sep 06:56:17.342 * AOF rewrite: 4 MB of memory used by copy-on-write
1:S 12 Sep 06:56:17.407 * Background AOF rewrite terminated with success
1:S 12 Sep 06:56:17.407 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 12 Sep 06:56:17.407 * Background AOF rewrite finished successfully
  • The restored node switches itself to slave (the log prefix changes from 1:M to 1:S) and syncs with the new master
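
The role switch is visible in the log prefix itself: Redis 3.2 tags each line with `pid:role`, where the role is M (master), S (slave), C (child process) or X (sentinel). The sketch below, in Python with the hypothetical names `role_changes` and `NODE_LOG`, detects such M/S transitions in a node's log. It is an illustration under the assumption that the log format matches the one above, not a tool used in the original drill.

```python
import re

# Prefix of a Redis 3.2 log line: "pid:role day month time ".
PREFIX = re.compile(
    r"^(?P<pid>\d+):(?P<role>[MSCX]) (?P<ts>\d+ \w+ \d+:\d+:\d+\.\d+) "
)


def role_changes(log_text):
    """Return a list of (timestamp, old_role, new_role) for M/S changes."""
    changes = []
    previous = None
    for line in log_text.splitlines():
        match = PREFIX.match(line.strip())
        if not match or match.group("role") not in "MS":
            continue  # skip child (C), sentinel (X) and unparsable lines
        role = match.group("role")
        if previous is not None and role != previous:
            changes.append((match.group("ts"), previous, role))
        previous = role
    return changes


# Sample lines copied from the restored node's log above.
NODE_LOG = """\
1:M 12 Sep 06:56:05.592 # Connection with slave client id #5 lost.
1:S 12 Sep 06:56:17.141 # CONFIG REWRITE executed with success.
32:C 12 Sep 06:56:17.339 * Parent agreed to stop sending diffs. Finalizing AOF...
1:S 12 Sep 06:56:17.407 * Background AOF rewrite finished successfully
"""

changes = role_changes(NODE_LOG)
print(changes)
```

Child-process lines are skipped on purpose, so the AOF rewrite child (pid 32) does not register as a role change.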

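Summary

Compared with Redis Cluster, Sentinel mode is relatively simple. The drawback is that some resources have to be spent on running the sentinel nodes, but for workloads with small to medium data volumes it is a fairly cost-effective choice.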

doc

  • redis-cluster
  • 高可用Redis服务架构分析与搭建 (Analysis and Setup of a Highly Available Redis Service Architecture)

Originally published on the WeChat official account 码匠的流水账 (geek_luandun)

Original publication date: 2018-09-12
