An efficient adaptive failure detection mechanism for cloud platforms based on Volterra series

The failure detection module is an important component of fault-tolerant distributed systems, especially cloud platforms. However, fast and accurate failure detection becomes increasingly difficult when the status of the network and other resources keeps changing. This study presents an efficient adaptive failure detection mechanism based on Volterra series, which can make predictions from a small amount of data. The mechanism uses a Volterra filter for time series prediction and a decision tree for decision making. The major contributions are the application of a Volterra filter to cloud failure prediction and the introduction of a user factor that accommodates the different QoS requirements of different modules and levels of IaaS. A detailed implementation is proposed, and an evaluation is performed in an experimental environment in Beijing and Guangzhou.
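As a rough illustration of the prediction step, the following is a minimal sketch of a second-order Volterra filter that predicts the next heartbeat inter-arrival time from a short window of past intervals. It is not the implementation proposed in this study: the class and function names, the memory length, the normalized LMS update rule, and the threshold test in `is_suspected` (a simple stand-in for the decision tree, with `user_factor` used as a plain multiplier) are all assumptions made for illustration.

```python
import numpy as np


def volterra_features(window):
    """Build a second-order Volterra regressor from one input window.

    The regressor holds a constant term, the linear terms x[i], and the
    unique quadratic terms x[i] * x[j] with i <= j.
    """
    x = np.asarray(window, dtype=float)
    i, j = np.triu_indices(len(x))
    return np.concatenate(([1.0], x, x[i] * x[j]))


class VolterraPredictor:
    """Hypothetical adaptive predictor; NLMS is an assumed update rule."""

    def __init__(self, memory=3, step=0.5, eps=1e-8):
        self.memory = memory
        self.step = step  # NLMS step size, stable for 0 < step < 2
        self.eps = eps    # guards against division by zero
        n_terms = 1 + memory + memory * (memory + 1) // 2
        self.weights = np.zeros(n_terms)

    def predict(self, window):
        """Predict the next interval from the last `memory` intervals."""
        return float(self.weights @ volterra_features(window))

    def update(self, window, target):
        """Adapt the kernel weights once the true interval is observed."""
        phi = volterra_features(window)
        error = target - self.weights @ phi
        self.weights += self.step * error * phi / (self.eps + phi @ phi)
        return float(error)


def is_suspected(elapsed, predicted, user_factor):
    """Assumed rule: suspect a node once the time since its last heartbeat
    exceeds the predicted interval scaled by a QoS-dependent factor."""
    return elapsed > user_factor * predicted


# Usage: adapt on a short history of observed intervals (in seconds).
intervals = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 1.00, 0.97]
predictor = VolterraPredictor(memory=3)
for k in range(3, len(intervals)):
    predictor.update(intervals[k - 3:k], intervals[k])
next_interval = predictor.predict(intervals[-3:])
```

In this sketch a larger `user_factor` trades detection speed for fewer false suspicions, which is one plausible way a per-module QoS setting could enter the decision; the study's actual decision procedure is the decision tree described above.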