distribute_HA_keepalived
Introduction
Load balancing is a method of distributing IP traffic across a cluster of real servers, providing one or more highly available virtual services. When designing load balanced topologies, it is important to account for the availability of the load balancer itself as well as the real servers behind it.
Keepalived provides frameworks for both load balancing and high availability. The load balancing framework relies on the well-known and widely used Linux Virtual Server (IPVS) kernel module, which provides Layer 4 load balancing. Keepalived implements a set of health checkers to dynamically and adaptively maintain and manage load balanced server pools according to their health.
High availability is achieved by the Virtual Router Redundancy Protocol (VRRP). VRRP is a fundamental building block for router failover. In addition, Keepalived implements a set of hooks to the VRRP finite state machine, providing low-level and high-speed protocol interactions. To offer the fastest network failure detection, Keepalived implements the Bidirectional Forwarding Detection (BFD) protocol. VRRP state transitions can take BFD hints into account to drive fast state transitions. The Keepalived frameworks can be used independently or together to provide resilient infrastructures.
In short, Keepalived provides two main functions:
- Health checking for LVS systems
- Implementation of the VRRPv2 stack to handle load balancer failover
This article covers only high availability, that is, load balancer failover.
Inside keepalived
VRRP
The Virtual Router Redundancy Protocol (VRRP) is a computer networking protocol that provides for automatic assignment of available Internet Protocol (IP) routers to participating hosts. This increases the availability and reliability of routing paths via automatic default gateway selections on an IP subnetwork.
The protocol achieves this by creating virtual routers, which are an abstract representation of multiple routers, i.e. Primary/Active and Secondary/Standby routers, acting as a group. The virtual router, instead of a physical router, is assigned to act as the default gateway of participating hosts. If the physical router that is routing packets on behalf of the virtual router fails, another physical router is selected to automatically replace it. The physical router that is forwarding packets at any given time is called the Primary/Active router.
VRRP provides information on the state of a router, not the routes processed and exchanged by that router.
Physical routers within the virtual router communicate among themselves using packets sent to the multicast IP address 224.0.0.18 with IP protocol number 112. Newer implementations, including Keepalived, also support unicast advertisements to peers.
Routers have a priority between 1 and 254, and the router with the highest priority becomes the Primary/Active. The default priority is 100.
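To make the packet format concrete, here is a minimal sketch that packs and unpacks a VRRPv2 advertisement with Python's `struct` module. The field layout follows RFC 3768. The function names and sample values (VRID 51, address 192.168.1.100) are assumptions for illustration only, and the checksum is left at zero, so a real VRRP implementation would reject this packet.

```python
import socket
import struct

VRRP_PROTOCOL = 112             # IP protocol number used by VRRP
VRRP_MULTICAST = "224.0.0.18"   # multicast group for advertisements
HEADER_FORMAT = "!BBBBBBH"      # 8-byte fixed VRRPv2 header


def build_advert(vrid, priority, advert_int, addresses):
    """Pack a VRRPv2 advertisement (checksum left at 0 for brevity)."""
    header = struct.pack(
        HEADER_FORMAT,
        (2 << 4) | 1,     # version 2 in the high nibble, type 1 (ADVERTISEMENT)
        vrid,             # virtual router ID
        priority,         # advertised priority
        len(addresses),   # number of virtual IP addresses that follow
        0,                # auth type 0: no authentication
        advert_int,       # advertisement interval in seconds
        0,                # checksum placeholder (not computed in this sketch)
    )
    body = b"".join(socket.inet_aton(a) for a in addresses)
    return header + body + b"\x00" * 8   # 8 bytes of authentication data


def parse_advert(packet):
    """Unpack the fields written by build_advert."""
    ver_type, vrid, priority, count, auth_type, advert_int, checksum = (
        struct.unpack(HEADER_FORMAT, packet[:8])
    )
    addresses = [
        socket.inet_ntoa(packet[8 + 4 * i: 12 + 4 * i]) for i in range(count)
    ]
    return {
        "version": ver_type >> 4,
        "type": ver_type & 0x0F,
        "vrid": vrid,
        "priority": priority,
        "advert_int": advert_int,
        "addresses": addresses,
    }


packet = build_advert(vrid=51, priority=100, advert_int=1,
                      addresses=["192.168.1.100"])
print(parse_advert(packet))
```

The sketch only builds bytes in memory. Actually sending them would require a raw socket opened for IP protocol 112, which needs root privileges.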
Elections of Primary/Active routers
A failure to receive a multicast packet from the Primary/Active router for a period longer than three times the advertisement timer causes the Secondary/Standby routers to assume that the Primary/Active router is dead. The virtual router then transitions into an unsteady state and an election process is initiated to select the next Primary/Active router from the Secondary/Standby routers. This is fulfilled through the use of multicast packets.
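The timeout can be worked out numerically. In RFC 3768 (VRRPv2) the "three times the advertisement timer" rule also includes a small priority-dependent skew, so that higher-priority backups time out slightly earlier. The sketch below computes it; the function names are illustrative and not part of Keepalived.

```python
def skew_time(priority):
    """Skew in seconds; higher priority gives a smaller skew (RFC 3768)."""
    return (256 - priority) / 256


def master_down_interval(advert_int, priority):
    """Seconds a backup waits without advertisements before taking over."""
    return 3 * advert_int + skew_time(priority)


# With a 1-second advertisement interval, a backup at priority 100 waits
# a little longer than a backup at priority 200.
print(master_down_interval(1, 100))
print(master_down_interval(1, 200))
```

Because the skew shrinks as priority grows, the highest-priority backup normally declares the master dead first and takes over before the others do.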
Secondary/Standby routers are only supposed to send multicast packets during an election process. One exception to this rule is when a physical router is configured with a higher priority than the current Primary/Active, which means that on connection to the network it will preempt the Primary/Active status. This allows a system administrator to force a physical router to the Primary/Active state immediately after booting, for example when that particular router is more powerful than the others within the virtual router. The Secondary/Standby router with the highest priority becomes the Primary/Active router by raising its priority above that of the current Primary/Active. It then takes responsibility for routing packets sent to the virtual gateway's MAC address. When all Secondary/Standby routers have the same priority, the one with the highest IP address becomes the Primary/Active router.
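The selection rule described above (highest priority wins, ties broken by the highest IP address) can be sketched as a small function. This is a simplified model of the outcome only, not of the protocol exchange; the `Router` type and the sample routers are hypothetical.

```python
import ipaddress
from typing import NamedTuple


class Router(NamedTuple):
    name: str
    priority: int
    address: str


def elect_master(routers):
    """Pick the router with the highest priority, then the highest IP."""
    return max(
        routers,
        # Compare addresses numerically; as plain strings,
        # "192.168.1.9" would sort above "192.168.1.10".
        key=lambda r: (r.priority, ipaddress.ip_address(r.address)),
    )


candidates = [
    Router("node-a", 100, "192.168.1.9"),
    Router("node-b", 100, "192.168.1.10"),
    Router("node-c", 90, "192.168.1.20"),
]
print(elect_master(candidates).name)
```

Here node-a and node-b tie on priority, so the address decides, and node-c loses despite having the numerically highest address because its priority is lower.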
All physical routers acting as a virtual router must be in the same local area network (LAN) segment, unless the implementation supports unicast advertisements. Communication within the virtual router takes place periodically. This period can be adjusted by changing the advertisement interval timers. The shorter the advertisement interval, the shorter the black hole period, though at the expense of more traffic in the subnet.
Once the new master has been elected, it sends out a gratuitous ARP. Every host has an ARP table that ties IP addresses to Ethernet addresses. A gratuitous ARP is an unsolicited message carrying an IP address to Ethernet address mapping. All hosts receiving the gratuitous ARP update their tables, which effectively means that the virtual IP address is now owned by a new device on the network.
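As an illustration of what such a message looks like on the wire, the sketch below assembles a gratuitous ARP request as raw bytes. The MAC and IP values are made up, and the target hardware address is set to zero in this sketch; other implementations may fill that field differently. Nothing is sent on the network.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806
BROADCAST_MAC = b"\xff" * 6


def build_gratuitous_arp(mac, ip):
    """Return an Ethernet frame announcing that `ip` lives at `mac`."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC as source, ARP type.
    ethernet = BROADCAST_MAC + mac_bytes + struct.pack("!H", ETHERTYPE_ARP)

    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # address lengths 6 and 4, operation 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac_bytes + ip_bytes      # sender MAC and sender IP
    arp += b"\x00" * 6 + ip_bytes    # target MAC (zero) and target IP

    return ethernet + arp


frame = build_gratuitous_arp("52:54:00:12:34:56", "192.168.1.100")
print(len(frame), frame.hex())
```

What makes the ARP "gratuitous" is that the sender IP and the target IP are the same address: the node is not asking who owns the IP, it is announcing that it does.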
Note that whether we use VRRP in multicast or unicast mode, we are not using UDP/IP or TCP/IP. VRRP is its own protocol on top of IP, independent of both.
keepalived cases
Although Keepalived supports nodes located in different subnets, the best choice is to run them in the same subnet.
Nodes in different subnets
# VRRP advertisements ordinarily go out over multicast. This
Nodes in the same subnet, with no router in between
Two nodes run Keepalived
# Ubuntu
- vrrp_instance defines an individual instance of the VRRP protocol running on an interface.
- state defines the initial state that the instance starts in; the final state may differ, depending on the outcome of the master election.
- interface defines the interface that VRRP runs on.
- virtual_router_id is the unique identifier of the virtual router; it must be the same on all nodes of the instance.
- priority is the advertised priority used for master/slave election.
- advert_int specifies the interval at which advertisements are sent (1 second, in this case).
- authentication specifies the information necessary for servers participating in VRRP to authenticate with each other. In this case, a simple password is defined.
- virtual_ipaddress defines the IP addresses (there can be multiple) that VRRP is responsible for.
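To show how these parameters fit together, here is a small sketch that renders a vrrp_instance stanza from Python. The helper `render_vrrp_instance` is hypothetical, and every value (instance name VI_1, interface eth0, router ID 51, password, virtual IP) is an assumption chosen for illustration, not taken from a real deployment.

```python
def render_vrrp_instance(name, state, interface, vrid, priority,
                         advert_int, password, vips):
    """Return a keepalived.conf vrrp_instance stanza as text."""
    lines = [
        f"vrrp_instance {name} {{",
        f"    state {state}",
        f"    interface {interface}",
        f"    virtual_router_id {vrid}",
        f"    priority {priority}",
        f"    advert_int {advert_int}",
        "    authentication {",
        "        auth_type PASS",
        f"        auth_pass {password}",
        "    }",
        "    virtual_ipaddress {",
    ]
    lines += [f"        {vip}" for vip in vips]
    lines += ["    }", "}"]
    return "\n".join(lines)


# The two nodes share everything except the initial state and the priority.
shared = dict(name="VI_1", interface="eth0", vrid=51, advert_int=1,
              password="changeme", vips=["192.168.1.100"])
master_conf = render_vrrp_instance(state="MASTER", priority=100, **shared)
backup_conf = render_vrrp_instance(state="BACKUP", priority=90, **shared)
print(master_conf)
```

Generating both stanzas from one set of shared values keeps virtual_router_id and the authentication settings identical on both nodes, and a mismatched virtual_router_id is one of the causes of split-brain.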
If you’re using a host-based firewall, such as firewalld or iptables, then you need to add the necessary rules to permit IP protocol 112 traffic.
Debug keepalived
# check virtual ip configured or not on master
split-brain
In a highly available (HA) system, when the "heartbeat" link between the two nodes is disconnected, the system, which originally acted as one coordinated whole, splits into two independent nodes. Having lost contact with each other, each node assumes that the other one has failed. The HA software on the two nodes then behaves like a split brain: both nodes compete for the "shared resources" and the "application services", with serious consequences. Either the shared resources are divided up and the services on both sides fail to start, or both services start and read and write the shared storage at the same time, resulting in data corruption (a common example is errors in the online logs that a database cycles through).
With Keepalived, the symptom is two active nodes with the same virtual IP configured on both of them.
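A check for this symptom can be sketched as follows. It assumes the output of `ip -4 -o addr show` has already been collected from each node by some other means (for example over SSH); the node names, the sample output, and the function names are all hypothetical.

```python
import re


def addresses_from_ip_output(text):
    """Extract IPv4 addresses from `ip -4 -o addr show` style output."""
    return set(re.findall(r"\binet (\d+\.\d+\.\d+\.\d+)/", text))


def vip_holders(vip, outputs_by_node):
    """Return the sorted names of all nodes that have `vip` configured."""
    return sorted(
        node for node, text in outputs_by_node.items()
        if vip in addresses_from_ip_output(text)
    )


def is_split_brain(vip, outputs_by_node):
    """True when more than one node holds the virtual IP."""
    return len(vip_holders(vip, outputs_by_node)) > 1


# Hypothetical captured output: both nodes carry 192.168.1.100.
outputs = {
    "node-a": "2: eth0    inet 192.168.1.11/24 brd 192.168.1.255 scope global eth0\n"
              "2: eth0    inet 192.168.1.100/32 scope global eth0\n",
    "node-b": "2: eth0    inet 192.168.1.12/24 brd 192.168.1.255 scope global eth0\n"
              "2: eth0    inet 192.168.1.100/32 scope global eth0\n",
}
print(vip_holders("192.168.1.100", outputs))
print(is_split_brain("192.168.1.100", outputs))
```

In the healthy case exactly one node holds the virtual IP. Zero holders means the service is down; more than one means split-brain.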
Why it happens
- The heartbeat link between the pair of highly available servers fails and prevents normal communication, for example because the heartbeat cable is broken or worn out.
- The network card or its driver is faulty, or there are IP configuration or address conflict problems (when the network cards are directly connected).
- The equipment along the heartbeat link (network card, switch) fails.
- The arbitration machine has a problem (when an arbitration scheme is used).
- The iptables firewall is enabled on a high availability server and blocks the heartbeat messages.
- In the same VRRP instance in the Keepalived configuration, the virtual_router_id settings on the two ends are inconsistent.
- The vrrp_instance names are inconsistent and the priorities are the same.
How to avoid it
- Add redundant heartbeat links, for example dual cables (so that the heartbeat link is itself HA), to minimize the chance of split-brain.
- Enable disk locking. The serving node locks the shared disk so that, when split-brain occurs, the other node cannot seize the shared disk resources. Locked disks have their own problem, though: if the node holding the shared disk does not actively unlock it, the other node can never get the disk. In practice, if the serving node suddenly hangs or crashes, it cannot execute the unlock command, and the backup node cannot take over the shared resources and application services. For this reason some HA software implements a "smart" lock: the serving node enables the disk lock only when it finds that all heartbeat links are down (the peer is not detected). Normally the disk is not locked.
- Set up an arbitration mechanism. For example, configure a reference IP (such as the gateway IP). When the heartbeat link is completely disconnected, both nodes ping the reference IP. A node that cannot reach it knows that the break is on its own side: not only the heartbeat but also its external service link is down, so starting (or continuing) the application service would be useless. That node gives up the competition and lets the node that can ping the reference IP start the service. To be safer, the node that cannot ping the reference IP can simply restart itself to completely release any shared resources it may still hold.
- Use script-based detection and alerting.
The last two are commonly used in production environments.
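The arbitration logic can be sketched as a small decision function plus a ping helper. This is a simplified model under stated assumptions: the function names and the returned action labels are invented for illustration, and `can_ping` relies on the Linux `ping` command with its `-c` (count) and `-W` (timeout) options.

```python
import subprocess


def can_ping(address, timeout_s=1):
    """Return True if a single ICMP echo to `address` succeeds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def arbitration_decision(peer_heartbeat_ok, reference_reachable):
    """Decide what this node should do, following the arbitration rule."""
    if peer_heartbeat_ok:
        # Heartbeat is healthy: nothing to arbitrate.
        return "keep_current_role"
    if reference_reachable:
        # Heartbeat lost but the network is fine on this side:
        # this node is entitled to run the service.
        return "take_over"
    # Heartbeat lost and the reference IP is unreachable: the fault is
    # local, so give up the service and release shared resources.
    return "release_resources"


# Typical use on a node (the gateway address is a placeholder):
#   action = arbitration_decision(peer_heartbeat_ok=False,
#                                 reference_reachable=can_ping("192.168.1.1"))
print(arbitration_decision(False, True))
print(arbitration_decision(False, False))
```

Keeping the decision separate from the ping call makes the rule easy to test without network access, and lets the reachability probe be swapped for another check.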
troubleshooting
Check why
- First, make sure the configuration is correct: check /etc/keepalived/keepalived.conf.
- Check that routing between the nodes is OK.
- Check that iptables allows VRRP traffic.