Hierarchical index system for crash-stop service failure detection

Service availability is an important means of meeting the adaptive and general requirements of service failure detection, and it is strongly affected by the failure detection indexes in use. Selecting appropriate indicators for service failure detection is therefore an important issue. However, existing service availability mechanisms rely on only a small number of indicators, which cannot accurately describe the complex state of a distributed system and thus make it difficult to meet the QoS requirements of applications. After studying existing service failure detection algorithms, application failures, and network survivability, this paper proposes a hierarchical index system for crash-stop service failure detection that considers the ability to provide services, network performance, and the load introduced by failure detection.
