On parameter settings of network keep-alive protocol for failure detection

The keep-alive protocol (or Hello protocol), which relies on the periodic exchange of keep-alive or Hello messages, is used by many network protocols to detect link, node, or other network-related failures. The effectiveness of the Hello protocol depends on the proper setting of its two parameters. Most existing studies of these parameter settings address a specific environment and a specific network protocol. Independently of any associated network protocol, this paper studies, through both analysis and simulation, how network overload, and in particular the message loss probability, affects the choice of these two parameters. The results show that in lightly loaded networks with a small message loss probability, both parameters can be set to small values, yielding fast failure detection and few false alarms. In heavily loaded networks with a large message loss probability, however, the Hello protocol is not effective for any parameter values. Under normal conditions, adapting the parameters of the Hello protocol is useful.
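
The trade-off described above can be illustrated with a minimal sketch. It is not the paper's model. It assumes that the two parameters are the Hello interval and the number of consecutive missed Hellos that triggers a failure declaration (the abstract does not name them), and that message losses are independent. All function and parameter names are hypothetical.

```python
def false_alarm_probability(loss_prob: float, miss_threshold: int) -> float:
    """Probability that a given run of `miss_threshold` consecutive Hellos
    is lost entirely while the neighbor is alive (assumes independent losses)."""
    if not 0.0 <= loss_prob <= 1.0:
        raise ValueError("loss_prob must be in [0, 1]")
    if miss_threshold < 1:
        raise ValueError("miss_threshold must be at least 1")
    return loss_prob ** miss_threshold


def detection_time_upper_bound(hello_interval: float, miss_threshold: int) -> float:
    """Approximate upper bound on the time to detect a real failure:
    the receiver waits for `miss_threshold` silent intervals (assumed model)."""
    if hello_interval <= 0:
        raise ValueError("hello_interval must be positive")
    if miss_threshold < 1:
        raise ValueError("miss_threshold must be at least 1")
    return hello_interval * miss_threshold


if __name__ == "__main__":
    # Compare a lightly loaded and a heavily loaded network (illustrative values).
    for loss_prob in (0.01, 0.5):
        for miss_threshold in (2, 3, 5):
            p_false = false_alarm_probability(loss_prob, miss_threshold)
            t_detect = detection_time_upper_bound(1.0, miss_threshold)
            print(f"loss={loss_prob} threshold={miss_threshold} "
                  f"false-alarm prob={p_false:.2e} detection<= {t_detect:.1f}s")
```

Under these assumptions, a small loss probability keeps the false-alarm probability low even for a small threshold, so detection can be fast. With a large loss probability, lowering the false-alarm probability requires a larger threshold, which lengthens detection time.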