Discovering Rules from Disk Events for Predicting Hard Drive Failures

Detecting the impending failure of hard disks is an important prediction task that can help computer systems prevent data loss and performance degradation. Most hard drive vendors currently support Self-Monitoring, Analysis and Reporting Technology (SMART), which is often considered unreliable for this task. Finding alternatives to SMART for predicting disk failure is an area of active research. In this paper, we consider events recorded from live disks and show that it is possible to construct decision support systems that detect such failures. Any such prediction methodology should be both highly accurate and easy to interpret. Black-box models can deliver highly accurate solutions but do not provide an understanding of the events that explain their decisions. To this end, we explore rule-based classifiers for predicting hard disk failures from various disk events. We show that it is possible to learn easy-to-understand rules from disk events that have extremely low false alarm rates on real-world data.