Fail-safe is the ability of a BIG-IP® system to monitor certain aspects of the system or network, detect interruptions, and consequently take some action. More specifically:
System fail-safe
Monitors the switch board component and a set of key system services.
To configure and manage fail-safe, log in to the BIG-IP Configuration utility, and on the Main tab, expand System, and click High Availability.
When you configure system fail-safe, the BIG-IP system monitors various
hardware components, as well as the heartbeat of various system services, and can take action if the system detects a heartbeat failure.
You can configure the BIG-IP system to monitor the switch board
component and then take some action if the BIG-IP system detects a failure.
Using the Configuration utility, you can specify the action that you want the
BIG-IP system to take when the component fails. Possible actions that the BIG-IP system can take are:
You can specify the particular action that you want the BIG-IP system to
take when the heartbeat of a system service fails. The following table lists each system service, and shows the possible actions that the BIG-IP system can take in the event of a heartbeat failure.
For maximum reliability, the BIG-IP system supports failure detection on all
VLANs. When you configure the fail-safe option on a VLAN, the BIG-IP system monitors network traffic going through that VLAN. If the BIG-IP system detects a loss of traffic on the VLAN and the fail-safe timeout period has elapsed, the BIG-IP system attempts to generate traffic by issuing ARP requests to nodes accessible through the VLAN. The BIG-IP system also generates an ARP request for the default route, if the default router is accessible from the VLAN. Failover is averted if the BIG-IP system is able to send and receive any traffic on the VLAN, including a response to its ARP request.
For a redundant system configuration, if the BIG-IP system does not receive traffic on the VLAN before the timeout period expires, the system can initiate failover and switch control to the standby device, reboot, or restart all system services. The default action is Reboot.
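The VLAN fail-safe behavior described above can be sketched as a small decision function. This is a simplified model, not the BIG-IP implementation: the function name, its parameters, and the action labels ("failover", "reboot", "restart-all") are hypothetical names chosen for illustration and are not configuration keywords.

```python
# Simplified model of the VLAN fail-safe decision described in the text.
# All names here are illustrative assumptions, not BIG-IP identifiers.
ALLOWED_ACTIONS = {"failover", "reboot", "restart-all"}


def vlan_failsafe_decision(idle_seconds, timeout_seconds,
                           probe_traffic_seen, action="reboot"):
    """Return the action to take for one VLAN, or "none".

    idle_seconds       -- time since traffic was last seen on the VLAN
    timeout_seconds    -- the configured fail-safe timeout
    probe_traffic_seen -- True if any traffic (including an ARP reply)
                          arrived after the system issued ARP requests
    action             -- the configured fail-safe action
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if idle_seconds < timeout_seconds:
        return "none"      # timeout has not elapsed; nothing to do
    if probe_traffic_seen:
        return "none"      # any traffic on the VLAN averts the action
    return action          # no traffic at all: take the configured action
```

For instance, `vlan_failsafe_decision(120, 90, False)` yields the default action, while the same call with `probe_traffic_seen=True` yields "none".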
Each interface card installed on the BIG-IP system is typically mapped to a
different VLAN. Thus, when you set the fail-safe option on a particular VLAN, you need to know the interface to which the VLAN is mapped. You can use the Configuration utility to view VLAN names and their associated interfaces.
The BIG-IP system includes a feature known as fast failover, which is based on the concept of an HA group. An HA group is a set of trunks, pools, or clusters (or any combination of these) that you want the BIG-IP system to use to calculate an overall health score for a device in a redundant system configuration. A health score is based on the number of members that are currently available for any trunks, pools, and clusters in the HA group, combined with a weight that you assign to each trunk, pool, and cluster. The device that has the best overall score at any given time becomes or remains the active device.
To configure and manage fast failover, log in to the BIG-IP Configuration utility, and on the Main tab, expand System, and click High Availability.
Note: Only VIPRION® systems can have a cluster as an object in an HA group. For all other platforms, HA group members consist of pools and trunks only.
An HA group is typically configured to fail over based on trunk health in
particular. Trunk configurations are not synchronized between units, which means that the number of trunk members on the two units often differs whenever a trunk loses or gains members. The HA group feature allows failover to occur based on changes to trunk health instead of on system or VLAN failure.
To summarize, when you configure the HA group, the process of one
BIG-IP device failing over to the other based on HA scores is noticeably faster than if failover occurs due to a hardware or daemon failure.
A weight is a health value that you assign to each object in the HA group (that is, pool, trunk, and cluster). The weight that you assign to each object must fall within a range that begins at 10. The maximum overall score that the BIG-IP system can potentially calculate for a device is the sum of the individual weights for the HA group objects, plus the active bonus value. (For information on the Active Bonus setting, see Specifying an active bonus.)
Consider an example of how the system calculates a score for the device, based solely on the weight of objects in the HA group. In this example, the HA group contains two pools (one of them named my_http_pool) and one trunk (my_trunk1). A user has assigned a weight to each object.
On each device, the system uses each weight, along with a percentage that the system derives for each object (the percentage of the object's members that are available), to calculate a score for each object.
The system then adds the scores to determine a total score for the device.
The device with the highest score becomes or remains the active device in the redundant system configuration.
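The calculation described above can be sketched as follows. The weights, the member counts, and the pool name example_pool_2 are hypothetical values chosen for illustration; only my_http_pool and my_trunk1 come from the text. The sketch leaves scores unrounded, which may differ from how the system displays them.

```python
# Minimal sketch of HA group scoring: each object contributes
# weight * (available members / total members), and the device
# score is the sum over all objects. Values below are made up.
def object_score(weight, available, total):
    """Score contributed by one HA group object (pool, trunk, or cluster)."""
    if total == 0:
        return 0.0
    return weight * available / total


def device_score(objects):
    """Total score for a device: the sum of its object scores."""
    return sum(object_score(o["weight"], o["available"], o["total"])
               for o in objects)


# Hypothetical HA group on one device: two pools and one trunk.
device_1 = [
    {"name": "my_http_pool",   "weight": 50, "available": 4, "total": 8},
    {"name": "example_pool_2", "weight": 20, "available": 5, "total": 5},
    {"name": "my_trunk1",      "weight": 30, "available": 3, "total": 4},
]

print(device_score(device_1))  # 25.0 + 20.0 + 22.5 = 67.5
```

Running the same function on the peer's member counts and comparing the two totals indicates which device would hold the higher score.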
Note that if you have configured VLAN fail-safe, and the VLAN fails on an
active device, the device goes offline regardless of its score, and its peer becomes active.
For each object in an HA group, you can specify an optional setting known as a threshold. A threshold is a value that specifies the number of object members that must be available to prevent failover. If the number of available members dips below the threshold, the BIG-IP system assigns a score of 0 to the object, so that the score of that object no longer contributes to the overall score of the device.
For example, if a trunk in the HA group has four members and you specify a threshold value of 3, and the number of available trunk members falls to 2, then the trunk contributes a score of 0 to the total device score.
If the number of available object members equals or exceeds the threshold
value, or you do not specify a threshold, the BIG-IP system calculates the score as described previously, by multiplying the percentage of available object members by the weight for each object and then adding the scores to determine the overall device score.
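A threshold can be modeled as a guard in front of the weighted calculation. The sketch below is an illustration under assumptions: the function name and the weight of 30 are hypothetical, and scores are left unrounded.

```python
# Sketch of per-object scoring with an optional threshold.
# If fewer members than the threshold are available, the object
# contributes 0; otherwise it contributes weight * availability.
def object_score(weight, available, total, threshold=None):
    if threshold is not None and available < threshold:
        return 0.0                      # below threshold: no contribution
    if total == 0:
        return 0.0
    return weight * available / total   # normal weighted score


# Trunk with four members and a threshold of 3 (weight 30 is assumed).
print(object_score(30, 3, 4, threshold=3))  # 22.5: threshold is met
print(object_score(30, 2, 4, threshold=3))  # 0.0: below the threshold
print(object_score(30, 2, 4))               # 15.0: no threshold set
```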
Tip: Do not configure the tmsh
on any pool that you intend to include in the HA group.
An active bonus is an amount that the BIG-IP system automatically adds to the overall score of the active device. An active bonus ensures that the active device remains active when the device's score would otherwise temporarily fall below the score of the standby device.
A common reason to specify an active bonus is to prevent failover due to flapping, the condition where failover occurs frequently as a trunk member toggles between availability and unavailability. In this case, you might want to prevent the HA scoring feature from triggering failover each time a trunk member is lost. You might also want to prevent the HA scoring feature from triggering failover when you make minor changes to the BIG-IP system configuration, such as adding or removing a trunk member.
Suppose that the HA group on each device contains a trunk with four members, and you assign a weight of 30 to each trunk. Without an active bonus defined, if the trunk on device 1 loses some number of members, failover occurs because the overall calculated score for device 1 becomes lower than that of device 2. Table 8.3 shows the scores that could result if the trunk on device 1 loses one trunk member and no active bonus is specified.
You can prevent this failover from occurring by specifying an active bonus value. In our example, if we specify an active bonus of 10 (the default value), the score of the active device changes from 23 to 33, thereby ensuring that the score of the active device remains equal to or higher than that of the standby device (30).
Although you specify an active bonus value on each device, the BIG-IP
system uses the active bonus specified on the active device only, to contribute to the score of the active device. The BIG-IP system never uses the active bonus on the standby device to contribute to the score of the standby device.
To decide on an active bonus value, calculate the trunk score for some
number of failed members (such as one of four members), and then specify an active bonus that results in a trunk score that is greater than or equal to the weight that you assigned to the trunk.
For example, if you assigned a weight of 30 to the trunk, and one of the four trunk members fails, the trunk score becomes 23 (75% of 30, rounded up from 22.5), putting the device at risk for failover. However, if you specified an active bonus of 7 or higher, failover would not actually occur, because a score of 7 or higher, when added to the score of 23, is greater than or equal to 30.
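The sizing rule above can be worked through in a few lines. This is a sketch under assumptions: the helper names are hypothetical, and the round-half-up step is inferred from the text's figure of 23 for 75% of 30, not from a documented rounding rule.

```python
import math


def trunk_score(weight, available, total):
    """Weighted trunk score, rounded half up (assumed from 22.5 -> 23)."""
    return math.floor(weight * available / total + 0.5)


def minimum_active_bonus(weight, total, failed):
    """Smallest bonus that keeps score + bonus >= weight
    after `failed` trunk members are lost."""
    return weight - trunk_score(weight, total - failed, total)


print(trunk_score(30, 3, 4))           # 23
print(minimum_active_bonus(30, 4, 1))  # 7
print(minimum_active_bonus(30, 4, 2))  # 15
```

With these inputs, a bonus of 10 tolerates the loss of one trunk member out of four but not two.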
You can prevent failover from occurring by specifying an active bonus value. Table 8.4, and the list that follows, show how configuring an active bonus for the active device can affect failover.
To help understand Table 8.4, the row numbers in the left column of the table correspond to the explanations below:
1 Device 1 is active (initial state)
With all trunk members available on both units, and the active bonus configured, the active device (device 1) retains the higher device score and therefore remains active.
2 Device 1 loses a trunk member
The device score for device 1 is still higher than the score for device 2, due to an active bonus value of 10.
3 Device 1 loses another trunk member
With an active bonus of 10, failover occurs when 50% of the members are lost.
4 Device 1 switches to standby mode and device 2 becomes active
Once the active device (device 1) has failed over to device 2, the active bonus on device 1 no longer applies, thus reducing its score from 25 to 15. The active bonus on device 2 is then applied, increasing device 2's score from 30 to 40.
5 Device 2 loses a trunk member
If the active device (device 2) loses a trunk member, the score on device 2 is still higher than device 1 (with two unavailable members), due to the active bonus.
6 Device 1 regains two trunk members
Device 2 remains the active device even when one trunk member is unavailable, due to the active bonus.
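The sequence above can be replayed with a short simulation. It is a sketch under stated assumptions: each device has one four-member trunk with a weight of 30 and an active bonus of 10, as in the example; scores are rounded half up; and the standby takes over only when its score is strictly higher than the active device's score. The function and variable names are hypothetical.

```python
import math

WEIGHT = 30         # trunk weight from the example
TRUNK_SIZE = 4      # trunk members per device
ACTIVE_BONUS = 10   # bonus applied to the active device only


def trunk_score(available):
    """Weighted trunk score, rounded half up (an assumption)."""
    return math.floor(WEIGHT * available / TRUNK_SIZE + 0.5)


def device_scores(available, active):
    """Score per device; only the active device receives the bonus."""
    return {dev: trunk_score(n) + (ACTIVE_BONUS if dev == active else 0)
            for dev, n in available.items()}


def elect(available, active):
    """Return the device that is active after scores are compared."""
    scores = device_scores(available, active)
    standby = next(dev for dev in available if dev != active)
    return standby if scores[standby] > scores[active] else active


# Available trunk members per device after each event in the list.
EVENTS = [
    ("initial state",                     {"device1": 4, "device2": 4}),
    ("device 1 loses a trunk member",     {"device1": 3, "device2": 4}),
    ("device 1 loses another member",     {"device1": 2, "device2": 4}),
    ("device 2 loses a trunk member",     {"device1": 2, "device2": 3}),
    ("device 1 regains two members",      {"device1": 4, "device2": 3}),
]


def walk(events, active="device1"):
    """Replay the events and record which device is active after each."""
    history = []
    for label, available in events:
        active = elect(available, active)
        history.append(active)
        print(f"{label}: {device_scores(available, active)} -> {active}")
    return history


walk(EVENTS)
```

In this model device 1 stays active through the first lost member, device 2 takes over at the second, and device 2 then keeps the active role for the remaining events because the bonus moves with the active role.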