Local Area Network design/Redundancy and load balancing at layer 3 in LANs
At the boundary between the corporate LAN and the network layer, the router providing connectivity to the outside (typically the Internet) is a single point of failure for the hosts that use it as their default gateway, unless the router is made properly redundant.
Simple router duplication is not enough: hosts cannot automatically switch to the other router if their default gateway fails, because they do not learn the network topology through network-layer routing protocols.
Therefore some protocols for automatic management of redundant routers have been defined:
Hot Standby Router Protocol (HSRP) automatically guarantees that every host keeps connectivity to the outside of the LAN through its default gateway even if one of the redundant routers fails.
The interfaces of all redundant routers that belong to the corporate LAN are assigned a single virtual IP address and a single virtual MAC address, in addition to their actual IP and MAC addresses. A router can be:
- active: it is the router which has the right to serve the LAN, that is to answer for the virtual IP address and the virtual MAC address;
- stand-by: it is the router which has the right to replace the active router in case the latter fails;
- listen: it is any other router, neither active nor stand-by; one of the listen routers will become the stand-by router in case the active router fails.
The virtual IP address has to be set explicitly by the network administrator during the HSRP configuration, while the virtual MAC address has Cisco's well-known prefix '00:00:0C:07:AC':
|OUI (00:00:0C)||HSRP string (07:AC)||group ID|
where the fields are:
- Organizationally Unique Identifier (OUI) field (3 bytes): the value '00:00:0C' is the OUI assigned to Cisco so that MAC addresses of network cards sold by Cisco are globally unique;
- HSRP string field (2 bytes): the value '07:AC' identifies an HSRP virtual MAC address and cannot appear in any physical MAC address → the virtual MAC address is guaranteed to be unique within the LAN: no host can have a MAC address equal to the HSRP virtual MAC address;
- group ID field (1 byte): it identifies the group the current HSRP instance refers to (see HSRP groups).
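The address layout above can be illustrated with a minimal Python sketch. The function name `hsrp_virtual_mac` is a hypothetical helper introduced only for this example; it simply appends the one-byte group ID to the fixed prefix described in the text.

```python
# Fixed prefix from the text: Cisco OUI (00:00:0C) followed by the HSRP string (07:AC).
HSRP_MAC_PREFIX = "00:00:0C:07:AC"


def hsrp_virtual_mac(group_id: int) -> str:
    """Return the HSRP virtual MAC address for a one-byte group ID (0..255)."""
    if not 0 <= group_id <= 255:
        raise ValueError("group ID must fit in one byte (0..255)")
    # The group ID becomes the last byte of the address, written in hexadecimal.
    return f"{HSRP_MAC_PREFIX}:{group_id:02X}"


print(hsrp_virtual_mac(1))   # 00:00:0C:07:AC:01
print(hsrp_virtual_mac(10))  # 00:00:0C:07:AC:0A
```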
The virtual IP address is configured on all hosts as their default gateway address, that is as the IP address to which hosts send IP packets heading outside the LAN.
Traffic asymmetric routing
The goal of HSRP is to 'deceive' the host into believing that it communicates with the outside through a single router, identified by an IP address equal to the default gateway address and by the MAC address learned via ARP. In reality, when a fault occurs HSRP moves the active role to another router without the host noticing:
- ARP Request: when a host connects to the network, it sends an ARP Request for the IP address set as its default gateway, that is the virtual IP address;
- ARP Reply: the active router sends back an ARP Reply carrying the virtual MAC address;
- outgoing traffic: the host sends every following packet to the virtual MAC address, and only the active router processes it, while stand-by and listen routers discard it.
Then, the active router forwards the packet according to external routing protocols (OSPF, BGP, etc.) which are independent of HSRP → the packet may also cross stand-by and listen routers if routing protocols consider this the best path;
- incoming traffic: every packet coming from outside and heading to the host can enter the LAN through any of the redundant routers, according to external routing protocols independent of HSRP, and the host receives it with the actual MAC address of that router as its source MAC address.
External routing protocols are also able to detect router failures, including the default gateway, for incoming traffic → protection is achieved even if the LAN lacks HSRP.
HSRP can be used to achieve redundancy of a single machine to improve its fault tolerance: a server can have two network interfaces, one primary and one secondary, to which HSRP assigns a virtual IP address and a virtual MAC address → the server will keep being reachable at the same IP address even if the link connecting the primary interface to the network fails.
Hello packets are messages generated by redundant routers to:
- elect the active router: in the negotiation stage, routers exchange Hello packets proposing themselves as active routers → the active router is the one with the highest priority (configurable by the network administrator), or if there is a tie the one with the highest IP address;
- detect failures of the active router: the active router periodically sends Hello packets as 'keep-alive' messages → if the active router fails, the stand-by router no longer receives the 'keep-alive' messages and elects itself as the active router;
- update filtering databases: when the active router changes, the new active router starts sending Hello messages, which notify bridges within the corporate LAN of the new location of the virtual MAC address → all bridges update their filtering databases accordingly.
When a router becomes active, it also sends a gratuitous ARP Reply in broadcast (normal ARP Replies are unicast) with the virtual MAC address as its source MAC address.
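The election rule (highest priority wins, ties broken by the highest IP address) can be sketched in Python as follows. This is only an illustration: the `Router` class, the `elect` function and the sample addresses are hypothetical, and the sketch assumes that the stand-by router is simply the second-ranked candidate.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Router:
    name: str
    priority: int      # configurable by the network administrator
    ip: IPv4Address    # actual IP address of the LAN interface


def elect(routers):
    """Return (active, standby): highest priority first, highest IP address on a tie."""
    ranked = sorted(routers, key=lambda r: (r.priority, r.ip), reverse=True)
    active = ranked[0]
    # Assumption of this sketch: the runner-up becomes the stand-by router.
    standby = ranked[1] if len(ranked) > 1 else None
    return active, standby


routers = [
    Router("R1", 100, IPv4Address("10.0.0.1")),
    Router("R2", 100, IPv4Address("10.0.0.2")),
    Router("R3", 90, IPv4Address("10.0.0.3")),
]
active, standby = elect(routers)
print(active.name, standby.name)  # R2 R1
```

R1 and R2 tie on priority, so R2 wins because of its higher IP address; R3 stays in the listen state despite having the highest address, because its priority is lower.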
In the Hello packet the HSRP header is encapsulated in the following format:
Hello packet sent by the active router:
|14 bytes||20 bytes||8 bytes||20 bytes|
|MAC header||IP header||UDP header||HSRP|
|src: virtual MAC address||src: actual IP address||src: port 1985|
|dst: 01:00:5E:00:00:02||dst: 224.0.0.2||dst: port 1985|
Hello packet sent by the stand-by router:
|14 bytes||20 bytes||8 bytes||20 bytes|
|MAC header||IP header||UDP header||HSRP|
|src: actual MAC address||src: actual IP address||src: port 1985|
|dst: 01:00:5E:00:00:02||dst: 224.0.0.2||dst: port 1985|
- source MAC address: it is the virtual MAC address for the active router, it is the actual MAC address for the stand-by router;
- destination IP address: '224.0.0.2' is the IP address of the 'all-routers' multicast group; it is one of the multicast addresses that IGMP snooping does not filter, and therefore bridges always flood it (see IGMP snooping);
- destination MAC address: '01:00:5E:00:00:02' is the multicast MAC address derived from the multicast IP address;
- Time To Live (TTL): it is equal to 1 so that packets are immediately discarded by routers which they reach, because they can be propagated just within the LAN;
- the HSRP header in the Hello packet is encapsulated into UDP and not into TCP because losing a Hello packet does not require transmitting it again;
- listen routers do not generate Hello packets, unless they detect that the stand-by router has become the active router, in which case they propose themselves as candidates so that one of them becomes the new stand-by router.
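The derivation of the destination MAC address from the multicast IP address can be sketched as follows: the standard IPv4 multicast mapping copies the low-order 23 bits of the IP address into the fixed prefix '01:00:5E'. The function name `multicast_mac` is a hypothetical helper used only for illustration.

```python
from ipaddress import IPv4Address


def multicast_mac(ip: str) -> str:
    """Map an IPv4 multicast address to its Ethernet multicast MAC address."""
    addr = IPv4Address(ip)
    if not addr.is_multicast:
        raise ValueError(f"{ip} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF        # keep only the low-order 23 bits
    mac = (0x01005E << 24) | low23      # fixed prefix 01:00:5E, then those 23 bits
    return ":".join(f"{(mac >> shift) & 0xFF:02X}" for shift in range(40, -8, -8))


print(multicast_mac("224.0.0.2"))  # 01:00:5E:00:00:02
```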
HSRP header format
The HSRP header has the following format:
|Version||Op Code||State||Hello Time|
|Hold Time||Priority||Group||Reserved|
|Authentication Data (8 bytes)|
|Virtual IP Address|
where the most significant fields are:
- Op Code field (1 byte): it describes the type of message included in the Hello packet:
- 0 = Hello: the router is running and is able to become the active or stand-by router;
- 1 = Coup: the router wants to become the active router;
- 2 = Resign: the router no longer wants to be the active router;
- State field (1 byte): it describes the current state of the router sending the message:
- 8 = Standby: the HSRP packet has been sent by the stand-by router;
- 16 = Active: the HSRP packet has been sent by the active router;
- Hello Time field (1 byte): it is the time between Hello messages sent by routers (default: 3 s);
- Hold Time field (1 byte): it is the time of validity for the current Hello message, at the expiry of which the stand-by router proposes itself as the active router (default: 10 s);
- Priority field (1 byte): it is the priority of the router used for the election process for the active/stand-by router (default: 100);
- Group field (1 byte): it identifies the group which the current HSRP instance refers to (see HSRP groups);
- Authentication Data field (8 bytes): it includes a clear-text 8-character-long password (default: 'cisco');
- Virtual IP Address field (4 bytes): it is the virtual IP address used by the group, that is the IP address used as the default gateway address by hosts in the corporate LAN.
With default values for Hello Time and Hold Time parameters, convergence time is equal to about 10 seconds.
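As an illustration of the header fields, the following Python sketch decodes a 20-byte HSRP header with the standard `struct` module. It assumes the version 1 field order of RFC 2281 (Version, Op Code, State, Hello Time, Hold Time, Priority, Group, one reserved byte, Authentication Data, Virtual IP Address); the function name `parse_hsrp` and the sample values are hypothetical, and only the Op Code and State values listed above are translated into names.

```python
import struct
from ipaddress import IPv4Address

# Eight 1-byte fields, 8 bytes of authentication data, 4 bytes of virtual IP address.
HSRP_FORMAT = "!8B8s4s"
OP_CODES = {0: "Hello", 1: "Coup", 2: "Resign"}
STATES = {8: "Standby", 16: "Active"}  # only the states described in the text


def parse_hsrp(payload: bytes) -> dict:
    """Decode the HSRP header carried in the UDP payload of a Hello packet."""
    size = struct.calcsize(HSRP_FORMAT)  # 20 bytes
    if len(payload) < size:
        raise ValueError("payload shorter than an HSRP header")
    (version, op, state, hello, hold, priority,
     group, _reserved, auth, vip) = struct.unpack(HSRP_FORMAT, payload[:size])
    return {
        "version": version,
        "op_code": OP_CODES.get(op, op),
        "state": STATES.get(state, state),
        "hello_time": hello,
        "hold_time": hold,
        "priority": priority,
        "group": group,
        "auth": auth.rstrip(b"\x00").decode("ascii", errors="replace"),
        "virtual_ip": str(IPv4Address(vip)),
    }


# Hypothetical Hello from an active router in group 1 with the default timers.
sample = struct.pack(HSRP_FORMAT, 0, 0, 16, 3, 10, 100, 1, 0,
                     b"cisco", bytes([192, 168, 1, 254]))
hello = parse_hsrp(sample)
print(hello["op_code"], hello["state"], hello["virtual_ip"])  # Hello Active 192.168.1.254
```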
HSRP groups make it possible to distinguish multiple logical IP networks over the same physical LAN: each IP network corresponds to an HSRP group, with its own pair of virtual MAC address and virtual IP address. Hosts in one IP network have one of the virtual IP addresses set as their default gateway address, hosts in another IP network have another virtual IP address set as their default gateway address, and so on.
Each redundant router knows multiple pairs of virtual MAC address and virtual IP address, one for each group → every router (except listen routers) generates a Hello packet for each group, and answers to one virtual MAC address when receiving traffic from hosts in one IP network, to another one when receiving traffic from hosts in another IP network, and so on.
The last 8 bits in the virtual MAC address identify the group which the address refers to → HSRP is able to manage up to 256 different groups over the same LAN.
Defining multiple HSRP groups is mandatory if there are VLANs: every VLAN is in fact a separate LAN with its own default gateway → each VLAN is assigned an HSRP group. Every one-arm router has multiple virtual interfaces, one for every VLAN → HSRP groups are configured on the same physical interface, but each one on a different logical interface.
Multi-group HSRP (mHSRP)
By a proper priority configuration, traffic from IP networks can be distributed over redundant routers (load sharing):
- network with multiple HSRP groups: traffic from IP network 1 crosses router R2, while traffic from IP network 2 crosses router R1;
- network with multiple HSRP groups where there are VLANs: traffic from VLAN 1 crosses router R2, while traffic from VLAN 2 crosses router R1.
- mHSRP is more convenient when incoming traffic for the LAN is symmetrical: a one-arm router for VLAN interconnection can be made redundant so that one router carries traffic incoming from a first VLAN and outgoing to a second VLAN, while the other router carries traffic incoming from the second VLAN and outgoing to the first VLAN;
- better resource utilization: in a network with a single HSRP group the bandwidth of the stand-by router is entirely unused → mHSRP makes it possible to use the bandwidth of both routers.
- mHSRP is not so convenient when incoming traffic for the LAN is asymmetrical: load sharing in fact affects only outgoing traffic (incoming traffic is independent of HSRP), and outgoing (upload) traffic is generally lower than incoming (download) traffic;
- load sharing does not necessarily imply traffic balancing: traffic coming from one LAN may be much higher than traffic coming from another LAN;
- configuration troubles: hosts in every IP network must have a different default gateway address with respect to hosts in other IP networks, but the DHCP server usually returns a single default gateway address for all hosts.
HSRP offers protection from failures of the link connecting the default gateway router to the LAN and from failures of the default gateway router itself, but not from failures of the link connecting the default gateway router to the Internet: after a failure on the WAN link, packets are still sent to the active router, which in turn sends them all to the stand-by router, instead of going directly to the stand-by router → this does not imply a real loss of Internet connectivity, but it adds overhead to the packet forwarding process.
The track feature makes it possible to detect failures on WAN links and to trigger the stand-by router to take the place of the active router, by automatically decreasing the priority of the active router (default: −10).
The track feature works only if the preemption capability is on: if the priority of the active router is decreased so that it falls below the priority of the stand-by router, the latter can 'preempt' the active state from the active router by sending a Hello message of Coup type.
However, failures are detected exclusively at the physical layer: the track feature is not able to detect a failure that occurred on a farther link beyond a bridge.
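The interaction between tracking and preemption can be sketched in a few lines of Python. The function names and the configured priorities (105 and 100) are hypothetical; only the default decrement of 10 comes from the text.

```python
def effective_priority(configured: int, wan_link_up: bool, decrement: int = 10) -> int:
    """Priority advertised in Hello packets, lowered while the tracked WAN link is down."""
    return configured if wan_link_up else configured - decrement


def should_preempt(standby_priority: int, active_priority: int, preempt_enabled: bool) -> bool:
    """The stand-by router sends a Coup only if preemption is on and it now ranks higher."""
    return preempt_enabled and standby_priority > active_priority


# Hypothetical configuration: active router at 105, stand-by router at 100.
active_now = effective_priority(105, wan_link_up=False)
print(active_now)                                             # 95
print(should_preempt(100, active_now, preempt_enabled=True))   # True
print(should_preempt(100, active_now, preempt_enabled=False))  # False
```

With these values the decrement is large enough to drop the active router below the stand-by router; if the active router had been configured at 115, the tracked failure would lower it only to 105 and no takeover would occur.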
HSRP does not protect against all faults in the data-link-layer network. For example, the fault in the side figure partitions the corporate network into two parts, and since it happens between two bridges it cannot be detected by routers at the physical layer. The stand-by router no longer receives Hello messages from the active router and promotes itself to active → outgoing traffic is not affected at all by the fault: each of the two routers serves the outgoing traffic from one of the two network portions.
The fault has instead an impact on incoming traffic, because some frames cannot reach the destination hosts. Network-layer routing protocols in fact work exclusively from router to router: they only detect that the path between two routers is broken somewhere, but they are not able to detect path breaks between a router and a host, because their task is to forward the packet so that it arrives at any of the edge routers, which is then in charge of the direct delivery of the frame to the final destination. As seen from outside, both routers appear to have connectivity to the same IP network, therefore network-layer routing protocols assume that all hosts belonging to that IP network can be reached through both interfaces and choose either of them based on the shortest path criterion:
- if the router serving the network portion which the destination belongs to is chosen, the frame is seamlessly delivered to the destination;
- if the router serving the other network portion is chosen, the router sends an ARP Request which no host will answer, and so the destination will appear not to exist in the network.
Therefore it is important to make all links inside the data-link-layer network redundant, by putting multiple links in parallel managed by the spanning tree protocol or configured in link aggregation.
In some network topologies, asymmetrical traffic routing may lead to a situation where in some periods of time the incoming traffic from outside that is flooded increases considerably, while in other periods the incoming traffic is forwarded properly by bridges. This is due to the fact that router ARP caches generally last longer than entries in bridge filtering databases.
For example, in the side figure mappings in the ARP cache on ingress router R2 expire in 5 minutes, while entries in the filtering database on bridge B2 expire in just 2 minutes:
- the outgoing unicast packet just updates filtering database on bridge B1, because it does not cross bridge B2;
- the incoming packet triggers router R2 to send an ARP Request to host H1 (in broadcast);
- the ARP Reply host H1 sends back updates both the ARP cache on router R2 and the filtering database on bridge B2;
- in the first 2 minutes, incoming packets addressed to host H1 are forwarded seamlessly;
- after 2 minutes since the ARP Reply, the entry related to host H1 in the filtering database on bridge B2 expires;
- in the following 3 minutes, router R2, which still has a valid mapping for host H1 in its ARP cache, sends incoming packets toward bridge B2, which floods them all because it receives no frames having host H1 as their source;
- 5 minutes after the ARP Reply, the mapping in the ARP cache on router R2 expires too;
- the next ARP Reply solicited by router R2 at last updates the filtering database on bridge B2 as well.
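The timeline above reduces to simple arithmetic, sketched here in Python. The function name `flooding_window` is hypothetical, and the sketch assumes that no frame sourced by host H1 crosses bridge B2 between two ARP exchanges and that a new ARP exchange happens as soon as the ARP cache entry expires.

```python
def flooding_window(arp_timeout_s: int, fdb_ageing_s: int) -> int:
    """Seconds per ARP cycle during which the bridge floods frames addressed to the host."""
    # Flooding lasts from the expiry of the filtering database entry
    # until the ARP cache entry expires and a new ARP Reply refreshes both.
    return max(0, arp_timeout_s - fdb_ageing_s)


arp_timeout = 5 * 60   # ARP cache lifetime on router R2
fdb_ageing = 2 * 60    # ageing time on bridge B2
window = flooding_window(arp_timeout, fdb_ageing)
print(window)                # 180
print(window / arp_timeout)  # 0.6
```

Under these assumptions, incoming traffic for host H1 is flooded for 3 minutes out of every 5, that is 60% of the time; the window shrinks to zero once the ageing time is at least as long as the ARP cache lifetime.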
- Possible solutions
This problem in the network is not easy to be identified because it manifests itself intermittently; once the problem is identified, it is possible to:
- force stations to send gratuitous broadcast frames more often, at intervals shorter than the ageing time of entries in bridge filtering databases;
- increase the ageing time value on bridges along the ingress path to at least the duration time of router ARP caches.
HSRP does not contemplate fault management on unidirectional links: for example, in the side figure a fault occurs in the direction toward the stand-by router, which no longer receives Hello messages from the active router and elects itself as active, starting to send Hello packets with the virtual MAC address as their source address → the bridge alternately receives Hello packets from both active routers having the same MAC address as their source address, and the entry related to that MAC address keeps oscillating periodically → if a host sends a frame to its default gateway while the entry in the bridge filtering database is associated with the former stand-by router, the frame, being unable to go through the unidirectional link, is lost.
Parts of this page are based on materials from Wikipedia: the free encyclopedia.
Gateway Load Balancing Protocol (GLBP) adds to the default gateway redundancy the capability of automatically distributing outgoing traffic over all redundant routers.
GLBP elects one Active Virtual Gateway (AVG) for each group; other group members, called Active Virtual Forwarders (AVF), act as backup in case of AVG failure. The elected AVG then assigns a virtual MAC address to each member of the GLBP group, including itself; each AVF assumes responsibility for forwarding packets sent to its virtual MAC address.
In case of an AVF failure, the AVG notifies one of the AVFs still active, entrusting it with the task of also handling traffic addressed to the virtual MAC address of the failed AVF.
The AVG answers ARP Requests sent by hosts with MAC addresses pointing to different routers, based on one of the following load balancing algorithms:
- none: the AVG is the only forwarder (as in HSRP);
- weighted: every router is assigned a weight, determining the percentage of ARP Requests answered with the virtual MAC address of that router, and therefore the percentage of hosts which will use that router as their forwarder → useful when exit links have different capacities;
- round robin: virtual MAC addresses are selected sequentially in a circular queue;
- host dependent: it guarantees that a host always keeps being associated to the same forwarder, that is if the host performs two ARP Requests it will receive two ARP Replies with the same virtual MAC address → this avoids problems with NAT address translation mechanisms.
- Please remember that the ARP Request is a data-link-layer frame with broadcast destination MAC address and with IP address in its payload.
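The four load balancing algorithms can be compared with a toy Python model of the AVG. This is a sketch under several assumptions, not a description of the actual GLBP implementation: the class `VirtualGateway`, the labels `vmac-R1` and `vmac-R2` and the host MAC addresses are hypothetical, the weighted choice is approximated by a weighted random draw, and host dependency is approximated by hashing the MAC address of the host.

```python
import itertools
import random
import zlib


class VirtualGateway:
    """Toy AVG choosing which forwarder's virtual MAC address goes into an ARP Reply."""

    def __init__(self, forwarders, algorithm="round robin", seed=0):
        # forwarders: dict mapping a virtual MAC label to its weight.
        # Assumption: the first entry belongs to the AVG itself.
        self.macs = list(forwarders)
        self.weights = list(forwarders.values())
        self.algorithm = algorithm
        self._cycle = itertools.cycle(self.macs)
        self._rng = random.Random(seed)

    def arp_reply(self, host_mac: str) -> str:
        if self.algorithm == "none":
            return self.macs[0]               # the AVG is the only forwarder
        if self.algorithm == "round robin":
            return next(self._cycle)          # circular queue of virtual MAC addresses
        if self.algorithm == "weighted":
            # Approximation: answers are drawn in proportion to the weights.
            return self._rng.choices(self.macs, weights=self.weights, k=1)[0]
        if self.algorithm == "host dependent":
            # Approximation: the same host MAC always hashes to the same forwarder.
            return self.macs[zlib.crc32(host_mac.encode()) % len(self.macs)]
        raise ValueError(f"unknown algorithm: {self.algorithm}")


forwarders = {"vmac-R1": 3, "vmac-R2": 1}
rr = VirtualGateway(forwarders, "round robin")
print([rr.arp_reply("aa:aa:aa:aa:aa:01") for _ in range(3)])
# ['vmac-R1', 'vmac-R2', 'vmac-R1']

hd = VirtualGateway(forwarders, "host dependent")
first = hd.arp_reply("aa:aa:aa:aa:aa:01")
print(first == hd.arp_reply("aa:aa:aa:aa:aa:01"))  # True
```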