Predicting the need for vehicle compressor repairs using maintenance records and logged vehicle data

Methods and results are presented for applying supervised machine learning techniques to the task of predicting the need for repairs of air compressors in commercial trucks and buses. Prediction models are derived from logged on-board data that are downloaded during workshop visits and have been collected over three years on a large number of vehicles. A number of issues are identified with the data sources, many of which originate from the fact that the data sources were not designed for data mining. Nevertheless, exploiting this available data is very important for the automotive industry as a means to quickly introduce predictive maintenance solutions. It is shown, on a large data set from heavy-duty trucks in normal operation, how this can be done and how it can generate a profit. Random forest is used as the classifier algorithm, together with two methods for feature selection whose results are compared to features selected by a human expert. The machine-learning-based features outperform the human expert features, which supports the idea of using data mining to improve maintenance operations in this domain.
