A Novel Linear Classifier for Class Imbalance Data Arising in Failure-Prone Air Pressure Systems

An Air Pressure System (APS) is one of the crucial components of an automobile. Its failure leads to financial losses and may lead to loss of life. Thus, predicting such failures is a critical problem that requires a rigorous solution. Recently, many researchers have presented machine learning techniques for APS failure detection. One of the major challenges in dealing with APS failure data is severe class imbalance, which conventional classification criteria may not handle effectively. In this paper, a new machine learning method for APS failure detection is proposed. It is designed specifically to deal with class imbalance. The method learns a linear decision boundary by maximizing the Area Under the Curve (maxAUC) criterion. The proposed method was experimentally validated on an industrial dataset of APS failures. Its results are thoroughly compared with those of existing linear as well as non-linear classifiers.
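The abstract does not spell out the maxAUC formulation, so the following is only a minimal sketch of the general idea of fitting a linear scorer by maximizing a smooth surrogate of the AUC. The pairwise logistic surrogate, the plain gradient descent loop, the synthetic data, and all names (`auc_score`, `fit_linear_max_auc`, `lr`, `epochs`) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def auc_score(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


def fit_linear_max_auc(X, y, lr=0.1, epochs=200):
    """Fit weights w of a linear scorer s(x) = w . x (hypothetical helper).

    Minimizes the pairwise logistic surrogate
        mean over pairs (p, n) of log(1 + exp(-(w . x_p - w . x_n))),
    a smooth stand-in for the non-differentiable AUC. This is an assumed
    surrogate, not necessarily the criterion used in the paper.
    """
    pos = X[y == 1]
    neg = X[y == 0]
    d = X.shape[1]
    # All positive-minus-negative feature differences, shape (n_pos * n_neg, d).
    D = (pos[:, None, :] - neg[None, :, :]).reshape(-1, d)
    w = np.zeros(d)
    for _ in range(epochs):
        margins = D @ w
        # sigmoid(-m), computed stably as exp(-log(1 + exp(m))).
        sig = np.exp(-np.logaddexp(0.0, margins))
        grad = -(sig[:, None] * D).mean(axis=0)
        w -= lr * grad
    return w


# Synthetic, heavily imbalanced data (illustrative only, not the APS dataset):
# 500 negatives and 10 positives, with positives shifted in two features.
rng = np.random.default_rng(0)
X_neg = rng.normal(size=(500, 5))
X_pos = rng.normal(size=(10, 5))
X_pos[:, :2] += 1.5
X = np.vstack([X_neg, X_pos])
y = np.concatenate([np.zeros(500, dtype=int), np.ones(10, dtype=int)])

w = fit_linear_max_auc(X, y)
train_auc = auc_score(X @ w, y)
print("weights:", np.round(w, 3))
print("training AUC:", round(train_auc, 3))
```

Because the objective is built from positive-negative pairs, each minority example takes part in as many pairs as there are majority examples, so the minority class is not swamped the way it can be under a plain accuracy criterion. A decision threshold on the resulting scores would still have to be chosen separately, for example according to the misclassification costs of the application.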
