This document describes how to build a high availability primary/secondary cluster in a Tencent Cloud VPC by using Keepalived together with High Availability Virtual IP (HAVIP).
Reminder
The HAVIP feature is in beta now, with a switchover latency of around 10 seconds. To try it out, please submit a ticket.
Principles
Typically, a high availability primary/secondary cluster consists of two servers: an active primary server and a standby secondary server. The two servers share the same VIP (virtual IP), which is active only on the primary server. When the primary server fails, the secondary server takes over the VIP to continue providing services. This mode is widely used for MySQL source/replica switchover and Nginx web access.
Keepalived is a VRRP-based high availability software that can be used to build a high availability primary/secondary cluster among VPC-based CVMs. To use Keepalived, first complete its configuration in the keepalived.conf file.
In traditional physical networks, the primary/secondary status can be negotiated with Keepalived’s VRRP protocol. The primary device periodically sends gratuitous ARP messages to refresh the MAC table of the uplink switch or the ARP table of terminals, which triggers the VIP migration to the primary device.
In a Tencent Cloud VPC, a high availability primary/secondary cluster can also be implemented by deploying Keepalived on CVMs, with the following differences:
The VIP used must be a High Availability Virtual IP (HAVIP) applied for from Tencent Cloud.
HAVIP is subnet-based and can only be bound to a server under the same subnet.
Supports and Limits
The Unicast mode is recommended for VRRP communications.
Reminder
This article demonstrates the Unicast configuration. To use multicast for VRRP communication, you need to join the multicast beta test. Then you can enable VPC multicast as instructed in Enabling and Disabling Multicast. With multicast, you do not need to configure the IP of the peer device in the Keepalived configuration file, that is, the "unicast_peer" parameter can be omitted.
Keepalived 1.2.24 and later versions are recommended.
Ensure that the garp parameters have been configured. Because Keepalived relies on gratuitous ARP (GARP) messages to update the IP address, these configurations ensure that the primary device keeps sending them.
garp_master_delay 1
garp_master_refresh 5
Configure a unique VRRP router ID for each primary/secondary cluster in the VPC.
Do not use strict mode. Ensure that the “vrrp_strict” configuration has been deleted.
Bind no more than 5 HAVIPs to a single ENI. If you need to use multiple VIPs, add or modify vrrp_garp_master_repeat 1 in the “global_defs” section of the Keepalived configuration file.
Adjust the advert_int parameter to balance network jitter resistance against disaster recovery speed. If advert_int is too small, network jitter can cause frequent failovers and temporary split-brain (dual primary) situations until the network recovers. If advert_int is too large, the primary/secondary failover is slow (that is, service downtime is longer) when the primary server fails. Please assess the impact of split-brain (dual primary) on your business beforehand.
For each check item referenced by track_script (such as checkhaproxy), set the interval parameter to a larger value to avoid the FAULT status caused by script execution timeouts.
Optional: be aware of increased disk usage due to log printing. This can be solved using logrotate or other tools.
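To make the advert_int trade-off above more concrete, the following sketch estimates how long a secondary device waits before it declares the primary down. It assumes Keepalived is running VRRPv2, where RFC 3768 defines this wait as 3 × advert_int plus a skew time of (256 - priority) / 256 seconds. The function name master_down_interval is illustrative and not part of Keepalived.

```shell
#!/bin/sh
# Estimate the VRRPv2 master-down interval (RFC 3768), in seconds:
#   master_down_interval = 3 * advert_int + (256 - priority) / 256
# Assumption: Keepalived uses VRRPv2 timers; the function name is illustrative only.
master_down_interval() {
    advert_int="$1"
    priority="$2"
    awk -v a="$advert_int" -v p="$priority" \
        'BEGIN { printf "%.2f\n", 3 * a + (256 - p) / 256 }'
}

# Values used in this document: advert_int 5, priority 100
master_down_interval 5 100
# A smaller interval fails over faster but is more sensitive to jitter
master_down_interval 1 100
```

With the values used in this document (advert_int 5, priority 100), the estimate is roughly 15.6 seconds for an unannounced primary failure. A clean Keepalived shutdown is usually detected sooner, because the primary announces that it is giving up the role.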
Instructions
Reminder
This document uses the following environments as an example. Please replace with your actual configurations.
2. In the left sidebar, select IP & Network Adapters > High Availability Virtual IP.
3. Select the target region on the HAVIP management page and click Apply.
4. In the Apply for HAVIP window, enter a name, select the VPC and subnet, and click OK.
Reminder
The IP address of the HAVIP can be automatically assigned or manually specified. If you choose to enter an IP address, make sure that the entered private IP address is within the subnet IP range and is not a reserved IP address of the system. For example, if the subnet IP range is 10.0.0.0/24, the entered private IP address should be within 10.0.0.2 - 10.0.0.254.
Then you can see the HAVIP you applied for.
Step 2: install Keepalived (version 1.2.24 or later) on primary and secondary CVMs
This document uses CentOS 7.6 as an example to install Keepalived.
1. Run the following command to verify whether the Keepalived version meets the requirements.
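The version requirement (1.2.24 or later) can be checked with a small script such as the sketch below. It assumes that `keepalived -v` prints a first line of the form "Keepalived v1.3.5 (...)" and that `sort -V` (GNU coreutils) is available. The helper name version_ge is hypothetical.

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

REQUIRED="1.2.24"
if command -v keepalived >/dev/null 2>&1; then
    # Assumption: the version banner looks like "Keepalived v1.3.5 (...)".
    installed=$(keepalived -v 2>&1 | sed -n 's/^Keepalived v\([0-9.]*\).*/\1/p' | head -n 1)
    if version_ge "$installed" "$REQUIRED"; then
        echo "Keepalived $installed meets the requirement (>= $REQUIRED)"
    else
        echo "Keepalived $installed is older than $REQUIRED"
    fi
else
    echo "keepalived is not installed"
fi
```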
Step 3: configure Keepalived, and bind HAVIP to the primary and secondary CVMs.
1. Log in to the primary CVM HAVIP-01 and run vim /etc/keepalived/keepalived.conf to modify its configurations.
Reminder
In this example, HAVIP-01 and HAVIP-02 are configured with the same weight. Both are in the BACKUP status, with a priority of 100. This will reduce the number of switchovers caused by network jitter.
script "/etc/keepalived/do_sth.sh" # Check whether the service process runs normally. Replace “do_sth.sh” with your actual script name. Run it as needed.
interval 5
}
vrrp_instance VI_1 {
# Select proper parameters for the primary and secondary CVMs.
state BACKUP #Set the initial status to Backup
interface eth0 # The ENI such as eth0 used to bind a VIP
virtual_router_id 51 # The virtual_router_id value for the cluster
nopreempt # Non-preempt mode
# preempt_delay 10 #Effective only when “state MASTER”
priority 100 # Configure the same weight for the two devices
advert_int 5
authentication {
auth_type PASS
auth_pass 1111
}
unicast_src_ip 172.16.16.5 # Set the local private IP address
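The check script referenced in the configuration above ("do_sth.sh") is a placeholder for your own health check. As a minimal sketch of what such a script might contain, the following checks whether the process recorded in a PID file is still running. The PID file path and the helper name service_alive are assumptions for illustration; Keepalived treats exit status 0 as healthy and a non-zero status as a failed check.

```shell
#!/bin/sh
# Hypothetical health check for use as a vrrp_script (for example /etc/keepalived/do_sth.sh).
# Assumption: the monitored service writes its PID to PIDFILE; adjust to your service.
PIDFILE="${PIDFILE:-/var/run/haproxy.pid}"

# service_alive FILE: succeed when FILE is readable and names a running process.
# `kill -0` only tests for existence; run the check as root or as the service owner.
service_alive() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

rc=0
service_alive "$PIDFILE" || rc=$?
echo "health check exit status: $rc"
# When saved as the real track script, finish with: exit "$rc"
```

Keep the check fast: it should finish well within the configured interval so that a slow check is not mistaken for a failure.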
2. In the Bind EIP window, select the EIP to be bound and click OK. If there are no available EIPs, apply for one in the Elastic Public IP console first.
Step 5: use notify_action.sh for simple logging (optional)
Keepalived’s main logs are still recorded in “/var/log/messages”, and you can add the “notify” script for simple logging.
1. Log in to the CVM and run the vim /etc/keepalived/notify_action.sh command to add the following “notify_action.sh” script.
2. Run the chmod a+x /etc/keepalived/notify_action.sh command to modify the script permission.
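The script body is not reproduced here. As a hedged sketch of what a simple logging script of this kind might look like, the following appends one line per state transition to a log file. It assumes the script is registered with the `notify` option, in which case Keepalived passes the object type (GROUP or INSTANCE), its name, and the target state as arguments. The log file path and the function name log_transition are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of /etc/keepalived/notify_action.sh (not the original script).
# Assumption: invoked through `notify`, with $1 = GROUP|INSTANCE, $2 = name, $3 = new state.
LOG_FILE="${LOG_FILE:-/var/log/keepalived_notify.log}"  # assumed log location

# log_transition TYPE NAME STATE: append a timestamped line to LOG_FILE
log_transition() {
    printf '%s %s %s -> %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" "$3" >> "$LOG_FILE"
}

# Only log when Keepalived supplied the expected arguments
if [ "$#" -ge 3 ]; then
    log_transition "$1" "$2" "$3"
fi
```

A line such as `2024-01-01 12:00:00 INSTANCE VI_1 -> MASTER` then marks each switchover, which makes it easier to correlate failovers with the entries in the system log.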
Step 6: verify whether VIP and public IP are switched normally during primary/secondary switch
Simulate the CVM failure by restarting the Keepalived process or restarting the CVM to check whether the VIP can be migrated.
If the primary/secondary switch succeeds, the console shows the secondary CVM as the server bound to the HAVIP.
You can also ping the VIP from within the VPC to measure the time from network interruption to recovery. Each switch may cause an interruption of about 4 seconds. If you ping the EIP bound to the HAVIP over a public network, the result will be the same.
Run the ip addr show command to check whether the HAVIP is bound to the primary ENI.
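The check above can also be scripted. The sketch below defines a small filter that reads `ip -o -4 addr show` output on standard input and succeeds when a given VIP appears in it. The address 172.16.16.12 and the helper name vip_bound are placeholders for illustration; replace the address with your own HAVIP.

```shell
#!/bin/sh
# vip_bound VIP: read `ip -o -4 addr show` output on stdin and
# succeed when VIP appears as an address (matched as " VIP/").
vip_bound() {
    grep -qF " $1/"
}

# Sample output for demonstration; 172.16.16.12 is a hypothetical HAVIP.
sample='2: eth0    inet 172.16.16.5/20 brd 172.16.31.255 scope global eth0
2: eth0    inet 172.16.16.12/32 scope global eth0'

if printf '%s\n' "$sample" | vip_bound 172.16.16.12; then
    echo "VIP is bound"
else
    echo "VIP is not bound"
fi
```

On a CVM, the same filter can be fed live data, for example `ip -o -4 addr show dev eth0 | vip_bound 172.16.16.12`. Running it on both CVMs after a switchover shows which one currently holds the VIP.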