PAKDD2020 Alibaba AI Ops Competition: An SPE-LightGBM Approach

This paper describes our submission to the PAKDD2020 Alibaba AI Ops Competition. We treat hard disk drive failure prediction as a binary classification problem. Our approach is based on the self-paced ensemble (SPE) [9] and the light gradient boosting machine (LightGBM) [8]. Using three types of features (raw features, window-based features, and combined raw features) together with our proposed training sample selection strategy, our approach ranked 14th in the final standings with an F-score (defined in [1]) of 0.37. The code for our approach is available at https://github.com/fengyang95/Alibaba_AI_Ops_Competition_Rank14.
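The abstract does not say how the window-based features are computed. As a rough illustration only, the sketch below derives trailing-window statistics from the daily readings of one attribute on a single disk. The function name `window_features`, the window length, the chosen statistics, and the sample readings are all assumptions made for this example, not details taken from the submission.

```python
from collections import deque
from statistics import mean


def window_features(values, window=3):
    """Build one feature row per day from the current and preceding readings.

    Only past and present values enter each row, so no future data leaks in.
    The statistics chosen here (mean, max, first-to-last difference) are
    illustrative assumptions.
    """
    recent = deque(maxlen=window)  # holds at most `window` latest readings
    rows = []
    for value in values:
        recent.append(value)
        rows.append({
            "raw": value,
            "win_mean": mean(recent),
            "win_max": max(recent),
            "win_diff": recent[-1] - recent[0],
        })
    return rows


# Hypothetical daily readings of one SMART counter for a single disk.
smart_raw = [0, 0, 2, 2, 8]
features = window_features(smart_raw, window=3)
for row in features:
    print(row)
```

Rows of this kind could sit alongside the raw attributes as input to a binary classifier; early days simply use a shorter window until enough history has accumulated.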

[1]  Joseph F. Murray, et al. Improved disk-drive failure warnings, 2002, IEEE Trans. Reliab.

[2]  Tie-Yan Liu, et al. Self-paced Ensemble for Highly Imbalanced Massive Data Classification, 2019, 2020 IEEE 36th International Conference on Data Engineering (ICDE).

[3]  Yoav Freund, et al. A Short Introduction to Boosting, 1999.

[4]  Joseph F. Murray, et al. Machine Learning Methods for Predicting Failures in Hard Drives: A Multiple-Instance Application, 2005, J. Mach. Learn. Res.

[5]  Greg Hamerly, et al. Bayesian approaches to failure prediction for disk drives, 2001, ICML.

[6]  Gang Wang, et al. Proactive drive failure prediction for large scale storage systems, 2013, 2013 IEEE 29th Symposium on Mass Storage Systems and Technologies (MSST).

[7]  S. Menard. Applied Logistic Regression Analysis, 1996.

[8]  Jing Shen, et al. Random-forest-based failure prediction for hard disk drives, 2018, Int. J. Distributed Sens. Networks.

[9]  Tie-Yan Liu, et al. Health Status Assessment and Failure Prediction for Hard Drives with Recurrent Neural Networks, 2016, IEEE Transactions on Computers.

[10]  Joaquin Quiñonero Candela, et al. Practical Lessons from Predicting Clicks on Ads at Facebook, 2014, ADKDD'14.

[11]  Fred Douglis, et al. RAIDShield: Characterizing, Monitoring, and Proactively Protecting Against Disk Failures, 2015, FAST.

[12]  Hong Jiang, et al. P3: Priority based proactive prediction for soon-to-fail disks, 2015, 2015 IEEE International Conference on Networking, Architecture and Storage (NAS).

[13]  Qiang Miao, et al. Health monitoring of hard disk drive based on Mahalanobis distance, 2011, 2011 Prognostics and System Health Management Conference.

[14]  Leo Breiman. Random Forests, 2001, Machine Learning.