网易DG Broker系列:FSFO配置篇

上一篇文章我们介绍了 DG Broker 的手动切换,一条命令搞定切库。有同学就说了那不还得用手么?有木有办法在主库挂掉时自动切换?请看下文……

1

FSFO简介

Fast-Start Failover是建立在broker基础上的一个快速故障转换的机制,通过fast-start failover可以自动检测primary的故障,然后自动的failover到预先指定的standby上面,这样可以最大化的减少故障时间,提高数据库的可用性。

2

FSFO测试环境

OS:linux Redhat 7

oracle:11.2.0.4

10.110.110.22主库 DB_UNIQUE_NAME test_pri

10.110.110.21备库 DB_UNIQUE_NAME test_std

10.110.110.30 observer主机

DG broker 配置名称为test

sys密码:123456

3

FSFO前提条件

1)DG broker configuration的配置必须在最大可用或者最大性能模式下2)保护模式、standby redo logs和传输模式

最大可用 sync

最大性用 async

主库的LogXptMode和备库一致。

主备的standby redo要保持一致。

3)主备库都打开数据库级别闪回4)在观察者主机上安装DGMGRL5)观察者主机上配置 TNSNAMES.ORA 文件6)主备库配置静态服务

修改主库监听

SID_LIST_LISTENER=

加入:

(SID_LIST=

(SID_DESC=

(GLOBAL_DBNAME=test_pri_dgmgrl)

(ORACLE_HOME=/home/oracle/app/product/11.2.0/db_1)

(SID_NAME=test))

)

修改备库监听

SID_LIST_LISTENER=

加入:

(SID_LIST=

(SID_DESC=

(GLOBAL_DBNAME=test_std_DGMGRL

(ORACLE_HOME=/home/oracle/app/product/11.2.0/db_1)

(SID_NAME=test))

)

4

FSFO配置

1)根据DG环境确定目标备库,一般一主一备。

2)指定目标备库 FastStartFailoverTarget

DGMGRL> edit database 'TEST_PRI' set property FastStartFailoverTarget='test_std';

Property "faststartfailovertarget" updated

DGMGRL> show database 'TEST_PRI' FastStartFailoverTarget;

FastStartFailoverTarget = 'test_std'

如果系统只有一主一备,这个步骤可以省略,在FSFO启动的时候,会自动指定。

3)配置保护模式。

DGMGRL> EDIT DATABASE ' TEST_PRI ' SET PROPERTY LogXptMode=SYNC;

DGMGRL> EDIT DATABASE ' test_std ' SET PROPERTY LogXptMode=SYNC;

DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;

4)设置 FastStartFailoverThreshold 参数

DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = 15;

默认值30 秒,最小值 6.

5)可选的数据库配置

FastStartFailoverPmyShutdown 默认 TRUE

FastStartFailoverLagLimit 默认30 seconds

FastStartFailoverAutoReinstate 默认 TRUE

6)启用额外可选的fast-start failover条件。

Datafile Offline Yes

Corrupted Dictionary Yes

Corrupted Controlfile Yes

Inaccessible Logfile No

Stuck Archiver No

还可以指定ORA类型错误作为FSFO的触发条件。

7)启用 fast-start failover.

DGMGRL> ENABLE FAST_START FAILOVER;

Enabled.

8)启动观察者

主库必须是启动状态。

DGMGRL> START OBSERVER;

9)检查 fast-start failover 环境。

DGMGRL> SHOW FAST_START FAILOVER;

Fast-Start Failover: ENABLED

Threshold: 15 seconds

Target: test_std

Observer: 10-110-110-30

Lag Limit: 30 seconds (not in use)

Shutdown Primary: TRUE

Auto-reinstate: TRUE

Observer Reconnect: (none)

Observer Override: FALSE

Configurable Failover Conditions

Health Conditions:

Corrupted Controlfile YES

Corrupted Dictionary YES

Inaccessible Logfile NO

Stuck Archiver NO

Datafile Offline YES

Oracle Error Conditions:

(none)

5

客户端TAF切换配置(主库)

1)在主库上配置一个TAF的service

此服务在数据库出现故障时会发送通知给客户端,允许查询语句在故障转移发生后继续运行。

在主库端运行:

begin

DBMS_SERVICE.CREATE_SERVICE(

service_name => 'SERVICE_PRI',

network_name => 'SERVICE_PRI',

aq_ha_notifications => TRUE,

failover_method => 'BASIC',

failover_type => 'SELECT',

failover_retries => 15,

failover_delay => 5);

end;

/

2)创建在主库启动服务的存储过程

创建一个存储过程来实现此目的,如果当前数据库是主库它就启动此服务,如果是备库就停止。

主库执行:

create or replace procedure SERVICE_PRI is

v_role VARCHAR(30);

begin

select DATABASE_ROLE into v_role from V$DATABASE;

if v_role = 'PRIMARY' then

DBMS_SERVICE.START_SERVICE('SERVICE_PRI');

else

DBMS_SERVICE.STOP_SERVICE('SERVICE_PRI');

end if;

end;

/

3)创建一个触发器来确保服务运行

创建触发器,让数据库在启动或角色转换时运行此存储过程。用于当数据库open时,不需要重启数据库,如果是主库则执行存储过程。当数据库切换后,如果是主库则执行存储过程。

主库执行:

create or replace TRIGGER FSFO_TRIGGER

after startup or db_role_change on database

begin

SERVICE_PRI;

end;

/

4)启动新创建的service

在主库上执行该存储过程(或者重启数据库,在启动数据库时会触发执行触发器),在主库做日志的切换,将变化应用到备库。

主库执行:

SQL> exec SERVICE_PRI;

PL/SQL 过程已成功完成。

已用时间: 00: 00: 00.01

查看监听状态:lsnrctl status

22主库上,多出来一个服务:

Services Summary...

Service "TEST_PRI" has 1 instance(s).

Instance "test", status READY, has 1 handler(s) for this service...

Service "TEST_PRI_DGB" has 1 instance(s).

Instance "test", status READY, has 1 handler(s) for this service...

Service "SERVICE_PRI" has 1 instance(s).

Instance "test", status READY, has 1 handler(s) for this service...

Service "test" has 1 instance(s).

Instance "test", status READY, has 1 handler(s) for this service...

Service "testXDB" has 1 instance(s).

Instance "test", status READY, has 1 handler(s) for this service...

Service "test_pri_dgmgrl" has 1 instance(s).

Instance "test", status UNKNOWN, has 1 handler(s) for this service...

Service "test_std" has 1 instance(s).

Instance "test", status READY, has 1 handler(s) for this service...

The command completed successfully

21备库不会出现该服务:

Services Summary...

Service "test" has 1 instance(s).

Instance "test", status READY, has 2 handler(s) for this service...

Service "test_pri" has 1 instance(s).

Instance "test", status READY, has 2 handler(s) for this service...

Service "test_std" has 1 instance(s).

Instance "test", status READY, has 2 handler(s) for this service...

Service "test_std_DGB" has 1 instance(s).

Instance "test", status READY, has 2 handler(s) for this service...

Service "test_std_DGMGRL" has 1 instance(s).

Instance "test", status UNKNOWN, has 1 handler(s) for this service...

The command completed successfully

5)在备库查询,确认触发器和存过已经应用

sys@test> select trigger_name, TRIGGERING_EVENT from dba_triggers where trigger_name = 'FSFO_TRIGGER';

TRIGGER_NAME TRIGGERING_EVENT

--------------------------------------------

FSFO_TRIGGER STARTUP

6)客户端tnsnames 配置

此时在dgmgrl中手动shutdown abort掉主库,那么连接的cmd中会停顿一会,等待fast-start failover切换完成后,则继续返回结果。主备切换不影响用户的select操作,但是如果是dml操作,则所有事务回滚:

FSFO =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = tcp)(HOST = 10.110.110.22)(PORT = 9539))

(ADDRESS = (PROTOCOL = tcp)(HOST = 10.110.110.21)(PORT = 9539))

(LOAD_BALANCE = off)

(FAILOVER = on)

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = SERVICE_PRI)

(FAILOVER_MODE =

(TYPE = select)

(METHOD = basic)

(RETRIES = 180)

(DELAY = 5)

)

)

)

6

JDBC中连接串的配置

在数据库中建立服务、存储过程和触发器后,需要如下配置:

注意:在druid连接池中连接串的配置必须在一行中。

好啦,FSFO的配置本文先介绍到这里,关于主库挂掉后的自动切换过程,咱们下回讲解。

网易MySQL开源中间件Cetus

__________________________

github地址

https://github.com/Lede-Inc/cetus/blob/master/doc/cetus-quick-try.md

欢迎加star关注

社群

技术专家在线及时反馈

  • 发表于:
  • 原文链接https://kuaibao.qq.com/s/20180831G0PHEV00?refer=cp_1026
  • 腾讯「云+社区」是腾讯内容开放平台帐号(企鹅号)传播渠道之一,根据《腾讯内容开放平台服务协议》转载发布内容。

扫码关注云+社区

领取腾讯云代金券