Fault detection and explanation through big data analysis on sensor streams

Abstract. Fault prediction is an important topic for industry because effective methods for predictive maintenance allow companies to achieve significant time and cost savings. In this paper we describe an application developed to predict and explain door failures on metro trains. The aim was twofold: first, to devise prediction techniques capable of detecting door failures early from diagnostic data; second, to describe failures in terms of the properties that distinguish them from normal behavior. Data pre-processing was a complex task aimed at overcoming a number of issues with the dataset, such as size, sparsity, bias, burst effect and trust. Since failure premonitory signals did not share common patterns, but were only characterized as non-normal device signals, fault prediction was performed using outlier detection. Fault explanation was then achieved by exhibiting the device features showing abnormal values. An experimental evaluation was performed to assess the quality of the proposed approach. Results show that high-degree outliers are effective indicators of incipient failures. Moreover, explanation in terms of abnormal feature values (those responsible for outlierness) appears to be quite expressive.

Some aspects of the proposed approach deserve particular attention. We introduce a general framework for the failure detection problem based on an abstract model of diagnostic data, along with a formal problem statement. Together they provide the basis for an effective data pre-processing technique in which the behavior of a device, in a given time frame, is summarized through a number of suitable statistics. This approach strongly mitigates the issues related to data errors and noise, thus enabling effective outlier detection. All this, in our view, provides the grounds for a general methodology for advanced prognostic systems.
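To make the pipeline outlined in the abstract more concrete, the following is a minimal sketch of its three steps: summarizing each device's readings per time frame, scoring each summary for outlierness, and explaining an outlier by its abnormal features. It is an illustration under stated assumptions, not the method used in the paper. The function names, the chosen statistics (count, mean, standard deviation, maximum), the median/MAD robust z-score, the threshold of 3.5 and the toy readings are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def summarize(readings, window):
    """Group (device, timestamp, value) readings into fixed time windows and
    summarize each (device, window) pair with a few statistics.
    The choice of statistics is an assumption made for illustration."""
    buckets = defaultdict(list)
    for device, ts, value in readings:
        buckets[(device, ts // window)].append(value)
    return {
        key: {"count": len(v), "mean": mean(v), "std": pstdev(v), "max": max(v)}
        for key, v in buckets.items()
    }


def robust_scores(summaries):
    """Score every feature of every summary with a robust z-score,
    |x - median| / (1.4826 * MAD), computed across all summaries."""
    features = sorted(next(iter(summaries.values())))
    scores = {key: {} for key in summaries}
    for f in features:
        column = [s[f] for s in summaries.values()]
        med = median(column)
        mad = median(abs(x - med) for x in column)
        scale = 1.4826 * mad or 1.0  # fall back to 1.0 when the MAD is zero
        for key, s in summaries.items():
            scores[key][f] = abs(s[f] - med) / scale
    return scores


def detect(scores, threshold=3.5):
    """Return the outlying (device, window) pairs, each mapped to the features
    whose score exceeds the threshold, most abnormal first (the explanation)."""
    outliers = {}
    for key, feature_scores in scores.items():
        abnormal = {f: z for f, z in feature_scores.items() if z > threshold}
        if abnormal:
            outliers[key] = sorted(abnormal, key=abnormal.get, reverse=True)
    return outliers


# Toy readings (hypothetical): four doors behave alike, door5 does not.
values = {
    "door1": [10, 11, 12, 11],
    "door2": [10, 12, 11, 13],
    "door3": [11, 12, 12, 10],
    "door4": [9, 12, 11, 12],
    "door5": [10, 30, 45, 11],
}
readings = [(d, ts, v) for d, vs in values.items() for ts, v in enumerate(vs)]

summaries = summarize(readings, window=60)
for key, explanation in detect(robust_scores(summaries)).items():
    print(key, "abnormal features:", explanation)
```

Scoring summaries rather than raw readings reflects the point made above: aggregating over a time frame dampens isolated errors and noise before outlier detection is applied. The median and MAD are used here only because they are less sensitive to the outliers being sought than the mean and standard deviation would be.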
