Feature Extraction and Ensemble Decision Tree Classifier in Plant Failure Detection

This paper describes a set of algorithms for the plant prognostics problem posed in the IEEE 2015 PHM Data Challenge. The task is to detect failure events, without prior knowledge, by analyzing a dataset of sensor measurements and control reference signals from multiple plants. The challenge poses two main difficulties: identifying which fault will occur, and determining when it will occur. In this study, the authors transform the task into a classification problem in three key steps: 1) data cleansing and event time alignment; 2) feature extraction; and 3) application of ensemble decision tree classifiers. Results show that the proposed data-driven methods can effectively detect several types of failure events, which suggests they may be useful in real-world plant prognostic applications.
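To make the feature extraction and ensemble classification steps concrete, the following is a minimal sketch and not the authors' implementation. It assumes synthetic data (Gaussian noise around a level of 0.0 for normal operation and 1.0 before a fault), three hypothetical per-window features (mean, standard deviation, end-to-end change), and bagged depth-1 decision trees (stumps) as a simple stand-in for the ensemble decision tree classifiers. Data cleansing and event time alignment are not shown. All function names and parameters are illustrative.

```python
import random
import statistics


def window_features(signal, width):
    """Split a 1-D signal into non-overlapping windows and summarize each one."""
    feats = []
    for start in range(0, len(signal) - width + 1, width):
        w = signal[start:start + width]
        # Hypothetical features: mean, population std, change across the window.
        feats.append([statistics.fmean(w), statistics.pstdev(w), w[-1] - w[0]])
    return feats


def fit_stump(X, y):
    """Fit a depth-1 tree: the one-feature threshold with the fewest errors."""
    best = None
    for j in range(len(X[0])):
        vals = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between neighbouring values.
        cuts = [(a + b) / 2 for a, b in zip(vals, vals[1:])] or vals
        for t in cuts:
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[j] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    return best[1:]


def predict_stump(stump, row):
    j, t, left, right = stump
    return left if row[j] <= t else right


def fit_ensemble(X, y, n_trees=25, seed=0):
    """Bagging: train each stump on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_ensemble(stumps, row):
    """Majority vote over the stumps (labels are 0 = normal, 1 = fault)."""
    votes = sum(predict_stump(s, row) for s in stumps)
    return 1 if 2 * votes > len(stumps) else 0


# Synthetic stand-in for sensor data (an assumption, not the challenge dataset).
rng = random.Random(42)


def synthetic_signal(level, n_windows, width=20):
    return [rng.gauss(level, 0.1) for _ in range(n_windows * width)]


X = window_features(synthetic_signal(0.0, 40), 20) \
    + window_features(synthetic_signal(1.0, 40), 20)
y = [0] * 40 + [1] * 40

# Hold out every fourth window for evaluation.
train = [i for i in range(len(X)) if i % 4 != 0]
test = [i for i in range(len(X)) if i % 4 == 0]

model = fit_ensemble([X[i] for i in train], [y[i] for i in train])
accuracy = sum(predict_ensemble(model, X[i]) == y[i] for i in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice, a library implementation of random forests or gradient boosting would replace the hand-written stumps. The pure-Python version is used here only to keep the sketch self-contained.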
