Modeling of Failure Detector Based on Message Delay Prediction Mechanism

Failure detection is a key technology in fault-tolerant systems. Because network conditions vary in real distributed systems, failure detectors without an adaptive mechanism cannot meet the quality-of-service (QoS) requirements of applications. An adaptive failure detector should dynamically adjust its detection quality according to the real-time state of the network. Assuming that message delay and loss are random, this paper proposes a failure detection model based on predicted message delay. Based on this model, an adaptive failure detection algorithm, PAC-AFD, is implemented; it predicts delay from historical message delays and incorporates a checking mechanism. Experimental results show that the algorithm mitigates the effect of message delay and loss on failure detection while preserving the accuracy and completeness of detection.
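
To illustrate the general idea of setting a detection timeout from predicted message delay, the following is a minimal sketch and not the PAC-AFD algorithm itself, whose details are not given in this abstract. It assumes a heartbeat-style detector, roughly synchronized clocks, and a simple predictor (sliding-window mean plus a multiple of the standard deviation). The class name `DelayPredictingDetector` and the parameters `interval`, `window`, and `margin_factor` are hypothetical, and the checking mechanism mentioned above is not modeled.

```python
from collections import deque
from statistics import mean, pstdev


class DelayPredictingDetector:
    """Heartbeat failure detector whose timeout tracks recent message delays.

    Illustrative only: the predictor and all parameter names are assumptions,
    not taken from the paper.
    """

    def __init__(self, interval, window=100, margin_factor=2.0):
        self.interval = interval            # sender's heartbeat period, in seconds
        self.delays = deque(maxlen=window)  # sliding window of observed delays
        self.margin_factor = margin_factor  # hypothetical safety multiplier
        self.deadline = None                # latest acceptable arrival of next heartbeat

    def predicted_delay(self):
        # Predict the next delay as the window mean plus a multiple of its spread.
        if not self.delays:
            return self.interval  # no history yet: fall back to one period
        return mean(self.delays) + self.margin_factor * pstdev(self.delays)

    def on_heartbeat(self, sent_at, received_at):
        # Record the observed delay (assumes roughly synchronized clocks).
        self.delays.append(received_at - sent_at)
        # The next heartbeat is sent one period later and should arrive
        # within the predicted delay after that.
        self.deadline = sent_at + self.interval + self.predicted_delay()

    def suspects(self, now):
        # Suspect the monitored process once the adaptive deadline has passed.
        return self.deadline is not None and now > self.deadline


detector = DelayPredictingDetector(interval=1.0)
detector.on_heartbeat(sent_at=0.0, received_at=0.1)
print(detector.suspects(now=1.05))  # still within the adaptive deadline
print(detector.suspects(now=1.2))   # deadline exceeded, process is suspected
```

Because the deadline grows when recent delays are large or variable, a detector of this kind trades slower detection for fewer false suspicions under poor network conditions, which is the QoS trade-off an adaptive detector is meant to manage.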
