Failure modeling in a gas turbine system: Combining classification with anomaly detection models for two data selection strategies

In this work, sensor data from a gas turbine system are analyzed with the objective of failure modeling and prediction. Several maintenance incidents were recorded by the sensor system in two separate vehicles. Two approaches to selecting training data were used in the analysis. The first followed the traditional method of randomly selecting a fixed percentage of data points for training. The second selected whole incidents for training, leaving the remaining incidents unseen for testing. Using classification and anomaly detection techniques, models of the system with 76 predictor variables were trained to distinguish between healthy and failed system states in a two-class problem. Performance differed significantly depending on which data were included in training. A rule-based classifier was then applied to combine the predictions of the classifier and anomaly detection models, yielding promising results. Constructing this ensemble model was an effective way to mitigate the challenges posed by the two training strategies, where no single model would succeed in both scenarios. Reducing the system to two states could be regarded as restrictive when the ‘healthiness’ of a system is nuanced; however, despite this simplification, good performance and accurate predictions could still be achieved.
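
The two data selection strategies and the idea of combining model outputs can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the record layout, the function names (`random_split`, `incident_split`, `combine`), and the toy data are all hypothetical, and the simple OR rule stands in for the rule-based classifier, whose actual rules are not given here.

```python
import random


def random_split(records, train_fraction, seed=0):
    """Strategy 1: randomly assign individual data points to train/test."""
    rng = random.Random(seed)
    shuffled = records[:]  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def incident_split(records, train_incidents):
    """Strategy 2: whole incidents go to training; the rest stay unseen."""
    train = [r for r in records if r["incident"] in train_incidents]
    test = [r for r in records if r["incident"] not in train_incidents]
    return train, test


def combine(classifier_says_failed, detector_says_anomalous):
    """Hypothetical OR rule: report a failure if either model raises an alarm."""
    if classifier_says_failed or detector_says_anomalous:
        return "failed"
    return "healthy"


# Toy data (assumed): 4 incidents with 5 sensor readings each.
records = [{"incident": i // 5, "reading": float(i)} for i in range(20)]

rand_train, rand_test = random_split(records, train_fraction=0.75)
inc_train, inc_test = incident_split(records, train_incidents={0, 1, 2})

# With the incident-based split, no test incident was seen during training.
seen = {r["incident"] for r in inc_train}
unseen = {r["incident"] for r in inc_test}
print(len(rand_train), len(rand_test), sorted(seen), sorted(unseen))
```

Under the random split, readings from the same incident can land on both sides of the split, so the test set is not truly unseen. The incident-based split avoids this overlap, which is one plausible reason the two strategies yield different performance.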
