Heartbeat Configuration
Heartbeat configuration requires three files located in /etc/ha.d
. The ha.cf
contains the main heartbeat configuration, including the list of the nodes and times for identifying failures. haresources
contains the list of resources to be managed within the cluster. The authkeys
file contains the security information for the cluster.
The contents of these files should be identical on each host within the Heartbeat cluster. It is important that you keep these files in sync across all the hosts. Any changes in the information on one host should be copied to the all the others.
For these examples n example of the ha.cf
file is shown below:
logfacility local0 keepalive 500ms deadtime 10 warntime 5 initdead 30 mcast bond0 225.0.0.1 694 2 0 mcast bond1 225.0.0.2 694 1 0 auto_failback off node drbd1 node drbd2
The individual lines in the file can be identified as follows:
logfacility
: Sets the logging, in this case setting the logging to use syslog.keepalive
: Defines how frequently the heartbeat signal is sent to the other hosts.deadtime
- the delay in seconds before other hosts in the cluster are considered 'dead' (failed).warntime
: The delay in seconds before a warning is written to the log that a node cannot be contacted.initdead
: The period in seconds to wait during system startup before the other host is considered to be down.mcast
: Defines a method for sending a heartbeat signal. In the above example, a multicast network address is being used over a bonded network device. If you have multiple clusters then the multicast address for each cluster should be unique on your network. Other choices for the heartbeat exchange exist, including a serial connection.If you are using multiple network interfaces (for example, one interface for your server connectivity and a secondary or bonded interface for your DRBD data exchange), use both interfaces for your heartbeat connection. This decreases the chance of a transient failure causing a invalid failure event.
auto_failback
: Sets whether the original (preferred) server should be enabled again if it becomes available. Switching this toon
may cause problems if the preferred went offline and then comes back on line again. If the DRBD device has not been synced properly, or if the problem with the original server happens again you may end up with two different datasets on the two servers, or with a continually changing environment where the two servers flip-flop as the preferred server reboots and then starts again.node
: Sets the nodes within the Heartbeat cluster group. There should be onenode
for each server.
An optional additional set of information provides the configuration for a ping test that checks the connectivity to another host. Use this to ensure that you have connectivity on the public interface for your servers, so the ping test should be to a reliable host such as a router or switch. The additional lines specify the destination machine for the ping
, which should be specified as an IP address, rather than a host name; the command to run when a failure occurs, the authority for the failure and the timeout before an nonresponse triggers a failure. A sample configure is shown below:
ping 10.0.0.1 respawn hacluster /usr/lib64/heartbeat/ipfail apiauth ipfail gid=haclient uid=hacluster deadping 5
In the above example, the ipfail command, which is part of the Heartbeat solution, is called on a failure and 'fakes' a fault on the currently active server. Configure the user and group ID under which the command is executed (using the apiauth
). The failure is triggered after 5 seconds.Note
The deadping
value must be less than the deadtime
value.
The authkeys
file holds the authorization information for the Heartbeat cluster. The authorization relies on a single unique 'key' that is used to verify the two machines in the Heartbeat cluster. The file is used only to confirm that the two machines are in the same cluster and is used to ensure that the multiple clusters can co-exist within the same network.