PAKDD 2020 Alibaba AIOps Competition - Large-Scale Disk Failure Prediction: Third Place Team

This paper describes our submission to the PAKDD 2020 Alibaba AIOps Competition: Large-Scale Disk Failure Prediction. Our approach is based on a LightGBM classifier trained with a focal loss objective function. The method ranked third in the final competition season with an F1-score of 0.4047; the winning F1-score was 0.4903.
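To make the idea of a focal loss objective for gradient boosting concrete, the following is a minimal sketch and not the team's actual implementation. It computes the per-sample gradient and Hessian of the binary focal loss with respect to the raw (pre-sigmoid) score, which is the pair of quantities a custom boosting objective has to supply. The function names and the values `alpha=0.25` and `gamma=2.0` are assumptions chosen for illustration; the abstract does not state the hyperparameters used.

```python
import numpy as np


def _signed_prob(y_true, raw_score):
    """Return the label sign (+1/-1) and p_t, the predicted probability of the true class."""
    y_sign = 2.0 * y_true - 1.0
    p_t = 1.0 / (1.0 + np.exp(-y_sign * raw_score))
    # Clip so that log(p_t) stays finite for extreme scores.
    return y_sign, np.clip(p_t, 1e-12, 1.0 - 1e-12)


def focal_loss(y_true, raw_score, alpha=0.25, gamma=2.0):
    """Per-sample binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t)."""
    y_true = np.asarray(y_true, dtype=float)
    raw_score = np.asarray(raw_score, dtype=float)
    _, p_t = _signed_prob(y_true, raw_score)
    alpha_t = np.where(y_true == 1.0, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)


def focal_loss_objective(y_true, raw_score, alpha=0.25, gamma=2.0):
    """Gradient and Hessian of the focal loss with respect to the raw score.

    Hypothetical helper: alpha and gamma are illustrative defaults,
    not values taken from the paper.
    """
    y_true = np.asarray(y_true, dtype=float)
    raw_score = np.asarray(raw_score, dtype=float)
    y_sign, p_t = _signed_prob(y_true, raw_score)
    alpha_t = np.where(y_true == 1.0, alpha, 1.0 - alpha)
    log_p = np.log(p_t)
    focus = (1.0 - p_t) ** gamma          # down-weights easy samples
    v = gamma * p_t * log_p + p_t - 1.0   # common factor of the gradient
    grad = y_sign * alpha_t * focus * v
    hess = alpha_t * p_t * focus * (
        -gamma * v + (1.0 - p_t) * (gamma * log_p + gamma + 1.0)
    )
    return grad, hess
```

With `gamma=0` and `alpha=0.5` the gradient reduces to half the usual log-loss gradient, `0.5 * (p - y)`, which is a quick sanity check. A function of this shape, with `alpha` and `gamma` fixed, could be passed as a callable objective to LightGBM's scikit-learn interface, which expects `(y_true, y_pred) -> grad, hess`. Note that the focal loss is not convex in the raw score, so the Hessian can turn negative for badly misclassified samples and may need clipping in practice.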
