Currently, TencentDB for MariaDB supports the intra-city 2-DC active/active scheme, which has the following main features:
Intra-city 2-DC deployment
2-DC writability: If your servers are deployed in different subnets of two DCs, you can connect to the database and write data to it from any server in either DC.
Automatic failover and recovery
A single, unique access IP shared by both DCs
However, the intra-city 2-DC active/active scheme alone cannot implement disaster recovery at the business system level. Switching a single system or module over to an intra-city disaster recovery DC is straightforward, but the complex dependencies and configurations among enterprise-level business systems pose real challenges for the 2-DC scheme.
To build a dual-active business system, it is essential to base the design, usage, management, and system upgrade processes on the 2-DC architecture, with real-time usage and configuration intercommunication. This ensures that the business can quickly resume operation with little or no modification after a failure. The goal of TencentDB for MariaDB's intra-city 2-DC active/active design is to enable both business systems in the two DCs to read and write to the database system through their local networks while maintaining strong data consistency.
Design standards
The active-active feature of TencentDB for MariaDB is designed based on "GB/T 20988-2007 Information Security Technology - Disaster Recovery Specifications for Information System". For a single database module:
RTO ≤ 60 seconds
RPO ≤ 5 seconds
Failover time ≤ 5 seconds
Failure detection time ≤ 30 seconds
This means that it takes about 40 seconds to complete failover after a failure occurs (including failure detection time).
Risk warning: When performing tests in a production environment, make sure that the business system has an automatic database reconnection mechanism. A business system usually has multiple modules, and each module may be associated with multiple data sources; therefore, the more complex the system, the longer the recovery time.
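The reconnection mechanism mentioned above can be sketched as a retry wrapper around your database calls. This is a minimal illustration, not TencentDB-specific code: the `connect` and `operation` callables and the use of the built-in `ConnectionError` are assumptions; a real application would catch its driver's own exception class (for example, `pymysql.err.OperationalError`).

```python
import time

def with_reconnect(connect, operation, retries=5, delay=2.0):
    """Run `operation` on a fresh connection, reconnecting on failure.

    `connect` returns a new connection object; `operation` takes that
    connection and returns a result. The retry budget should cover the
    failover window described above (detection plus switch).
    """
    last_err = None
    for attempt in range(retries):
        try:
            conn = connect()
            return operation(conn)
        except ConnectionError as err:  # substitute your driver's error class
            last_err = err
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise last_err
```

With a default of 5 retries and a growing delay, the total retry budget comfortably exceeds the failover window, so a transient primary/replica switch surfaces as a slow request rather than a hard failure.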
Support Status
Supported items
Instance Version:
Standard Edition: One primary and one replica (two nodes) or one primary and two replicas (three nodes);
Finance Edition: One primary and one replica (two nodes) or one primary and two replicas (three nodes);
Network requirement: VPC only
Supported regions:
Beijing (Beijing Zone 1, Beijing Zone 3)
Shanghai Finance (Finance Zone 1, Finance Zone 2)
Shenzhen Finance (Finance Zone 1, Finance Zone 2)
Pricing
The pricing for dual availability zones is the same as that for a single availability zone. For more information, see Pricing Details.
Purchase and use
If the primary and replica AZs are the same, the single-AZ deployment scheme is used.
If the primary and replica AZs are different, the intra-city 2-DC deployment scheme is used.
Note
The primary AZ is the availability zone where your main business server is located. Ideally, the database should be allocated in the same VPC subnet as that server to minimize access latency. The replica AZ is where the database replica nodes are located. In a three-node configuration (one primary, two replicas), two nodes are deployed in the primary AZ; in a two-node configuration (one primary, one replica), one node is deployed in the primary AZ.
If intra-city 2-DC policy is required for the finance cloud cage solution, an intra-city 2-DC cage solution needs to be built first. For more information, contact your sales rep and architect.
Viewing details of instance availability zones
You can visit the MariaDB console and click on the instance ID or Manage in the Operation column to access the instance details page.
Primary/replica switch
To switch the primary node from one availability zone (AZ) to another, click Primary/Replica Switch on the instance details page. This is a high-risk operation that requires verification of the logged-in account. The switch may cause a brief database disconnection (≤ 1 second), so make sure your business has a database reconnection mechanism. Frequent switching may lead to system anomalies or even data inconsistency.
How It Works
By combining the highly available primary/replica architecture of TencentDB for MariaDB with virtual IP drift across VPC availability zones, simultaneous reads from and writes to two DCs can be implemented. This architecture has the following features:
Proxy modules are deployed in a hybrid manner on the frontend of each TencentDB for MariaDB database node; they are responsible for routing data requests to the corresponding database nodes.
A cross-AZ VPC gateway that supports virtual IP drift is deployed in front of the proxy modules.
As illustrated above, take data writing as an example: if the business server is deployed in availability zone A, the VPC gateway forwards data requests to the proxy in AZ A, which transparently forwards them to the primary node. If the business server is deployed in availability zone B, the VPC gateway forwards data requests to the proxy in AZ B, which transparently forwards them (over the Tencent Cloud BGP private network) to the primary node.
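The routing behavior described above can be modeled in a few lines. This is a conceptual sketch, not the actual proxy implementation: the `route` function, node dictionaries, and AZ labels are all hypothetical. It captures the two rules from the text: writes always reach the primary (possibly crossing AZs over the private network), while reads prefer a node in the caller's own AZ.

```python
def route(request_type, caller_az, nodes):
    """Pick a database node for a request.

    `nodes` is a list of dicts like {"az": "A", "role": "primary"}.
    Writes always go to the primary node, wherever it lives; reads
    prefer a node in the caller's own AZ to avoid cross-DC latency.
    """
    if request_type == "write":
        return next(n for n in nodes if n["role"] == "primary")
    local = [n for n in nodes if n["az"] == caller_az]
    return local[0] if local else nodes[0]
```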
The entire process is transparent to the business, whether it is a read or write request. In case of a database exception, the database cluster handles it according to the following principles:
If both the primary and proxy fail, the cluster will automatically promote the optimal replica to the new primary. The system will notify the VPC to modify the association between the virtual and physical IPs. The business will only perceive that some write requests are disconnected.
If the primary fails but the proxy is normal, the cluster will automatically promote the optimal replica to the new primary. The proxy will block requests until primary/replica switch is completed. In this case, the business will only perceive that some requests time out.
If a replica fails (regardless of whether its proxy is also faulty), during read/write separation the read-only policy preconfigured for the read-only account (one of three types) is applied.
If AZ A fails completely, the VPC and database in AZ B remain operational. At that point, replica 2 is automatically promoted to primary, and its read/write policy is adjusted according to the strong sync replication strategy. The VPC virtual IP drifts to AZ B. The cluster then attempts to restore the nodes in AZ A; if they cannot be restored within 30 minutes, at least one replica node is automatically rebuilt in AZ B. Because of the IP drift policy, no database configuration changes are needed on the business side.
If AZ B fails completely, this is equivalent to the failure of a replica node in the TencentDB for MariaDB cluster, and the failure is handled in the same way as described in item 3 above.
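The promotion step in the principles above ("promote the optimal replica") can be sketched as follows. The actual election logic is internal to TencentDB for MariaDB; this sketch only assumes that "optimal" means the healthy replica with the most data applied (largest replicated log position), which is the natural choice under strong sync replication. The `pick_new_primary` function and the replica dictionaries are illustrative only.

```python
def pick_new_primary(replicas):
    """Choose which replica to promote after a primary failure.

    Assumes 'optimal' means the healthy replica with the largest
    applied log position, so promotion loses as little data as
    possible (RPO <= 5 seconds per the design standard above).
    """
    alive = [r for r in replicas if r["alive"]]
    if not alive:
        raise RuntimeError("no healthy replica to promote")
    return max(alive, key=lambda r: r["log_pos"])
```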
FAQs
Compared with intra-city 1-DC, will the intra-city 2-DC scheme cause a decrease in performance?
Because the scheme uses strong sync replication and cross-DC latency is slightly higher than latency between devices in the same DC, SQL response time theoretically increases by about 5%.
Is it possible for a primary node to switch from the primary AZ to the replica AZ?
Yes. If this doesn't affect your business, you can ignore it; if you're concerned about the impact, switch back during off-peak hours using the primary/replica switch feature in the console.
How do I know that primary/replica switch is performed in the database cluster?
Please go to the Tencent Cloud Observability Platform Console > Alarm Policy > TencentDB for MariaDB > Configure Primary-Replica Switch Alarm.
If part of the read or write requests are handled by the replica AZ, the network delay will cause a decrease in performance, but I need the intra-city 2-DC feature. What should I do?
You can submit a ticket specifying the instance ID, your server's AZ deployment scheme, and the read/write request ratio. Tencent Cloud DBAs can help you adjust the dual-AZ load balancing mechanism to minimize the read/write requests handled by the replica AZ.
What should I do if I want to change from the 1-DC architecture to intra-city 2-DC architecture?
First, check whether the intra-city 2-DC scheme is supported in your region; it is currently available in the Beijing, Shanghai Finance, and Shenzhen Finance regions. Then submit a ticket with the account information to be adjusted, the instance ID, the two AZs to be used, and your preferred Ops time window. Tencent Cloud staff will review the request; if it is eligible, the operation will be performed, otherwise it will be rejected.