Hidden Markov Model for hard-drive failure detection

This paper illustrates the use of a Hidden Markov Model (HMM) to model hard-disk failure. We use HMMs because they provide a formal foundation for building probabilistic models of linear sequence "labeling" problems. We use the dataset provided by the University of California, San Diego for hard-drive failure detection; with 24 selected attributes, we obtain an accuracy of about 90%. We compare machine-learning methods applied to a difficult real-world problem: predicting computer hard-drive failure using attributes monitored internally by individual drives. The problem is one of detecting rare events in a time series of noisy and non-parametrically distributed data. We develop a new HMM-based algorithm that is specifically designed for the low false-alarm case and is shown to have promising performance. Other methods compared are support vector machines (SVMs), unsupervised clustering, and non-parametric statistical tests (rank-sum and reverse arrangements). The failure-prediction performance of the SVM, rank-sum, and mi-NB algorithms is considerably better than that of the threshold method currently implemented in drives, while maintaining low false-alarm rates [13]. Our results suggest that non-parametric statistical tests should be considered for learning problems involving the detection of rare events.
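To make the sequence-labeling idea concrete, the following is a minimal sketch of how two HMMs could be used to flag a drive: one model stands for healthy drives, one for failing drives, and a sequence of readings is assigned to whichever model explains it better. It is an illustration under stated assumptions, not the algorithm evaluated in this paper. The two-state models, all probabilities, the three-level discretisation of a single attribute, and the names `forward_log_likelihood`, `GOOD`, `FAILING`, and `classify` are hypothetical; in practice the parameters would be estimated from training data (for example with Baum-Welch) and the observations would cover the 24 selected attributes.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the forward algorithm in log space.

    obs   : list of observation symbols (integers)
    start : start[i]    = P(first hidden state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][o]  = P(symbol o | hidden state i)
    """
    n = len(start)
    alpha = [_log(start[i]) + _log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            _logsumexp([alpha[i] + _log(trans[i][j]) for i in range(n)])
            + _log(emit[j][o])
            for j in range(n)
        ]
    return _logsumexp(alpha)


# Hypothetical models over one discretised attribute:
# symbol 0 = normal, 1 = elevated, 2 = high. All numbers are made up.
GOOD = dict(
    start=[0.9, 0.1],
    trans=[[0.95, 0.05], [0.5, 0.5]],
    emit=[[0.9, 0.09, 0.01], [0.6, 0.3, 0.1]],
)
FAILING = dict(
    start=[0.5, 0.5],
    trans=[[0.7, 0.3], [0.1, 0.9]],
    emit=[[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]],
)


def classify(obs, threshold=0.0):
    """Label a sequence by its log-likelihood ratio between the two models.

    A larger threshold demands stronger evidence before flagging a drive,
    which trades missed failures for fewer false alarms.
    """
    llr = forward_log_likelihood(obs, **FAILING) - forward_log_likelihood(obs, **GOOD)
    return "failing" if llr > threshold else "good"
```

Calling `classify([0, 0, 0, 0, 0])` scores a run of normal readings against both models, and `classify([1, 2, 2, 2, 2])` does the same for a run of high readings. Raising `threshold` is one simple way to target the low false-alarm regime described above.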