Unsupervised feature selection using swarm intelligence and consensus clustering for automatic fault detection and diagnosis in Heating Ventilation and Air Conditioning systems

Highlights: Our algorithm aims to improve feature quality in a general fault diagnosis system. The algorithm filters out redundant features using consensus evolutionary clustering. The algorithm was tested on the ASHRAE-1312-RP experimental fault data. Sensitivity and specificity were >95%, with considerably fewer false positives, as low as 1.6%.

The various sensory and control signals in a Heating Ventilation and Air Conditioning (HVAC) system are closely interrelated, which gives rise to severe redundancies between the original signals. These redundancies may cripple the generalization capability of an automatic fault detection and diagnosis (AFDD) algorithm. This paper proposes an unsupervised feature selection approach and applies it to AFDD in an HVAC system. Using Ensemble Rapid Centroid Estimation (ERCE), the important features are automatically selected from the original measurements based on the relative entropy between the low- and high-frequency features. The data used are the experimental HVAC fault data from the ASHRAE-1312-RP datasets, which contain a total of 49 days of various types of faults and corresponding severities. The features selected using ERCE (median normalized mutual information (NMI) = 0.019) showed the least redundancy compared with those selected using manual selection (median NMI = 0.0199), Complete Linkage (median NMI = 0.1305), Evidence Accumulation K-means (median NMI = 0.04) and Weighted Evidence Accumulation K-means (median NMI = 0.048). The effectiveness of the feature selection method is further investigated using two well-established time-sequence classification algorithms: (a) a Nonlinear Auto-Regressive Neural Network with eXogenous inputs and distributed time delays (NARX-TDNN); and (b) Hidden Markov Models (HMM). Weighted average sensitivity and specificity were higher than 99% and 96%, respectively, for NARX-TDNN, and higher than 98% and 86%, respectively, for HMM.
The proposed feature selection algorithm could potentially be applied to other model-based systems to improve the fault detection performance.
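
The redundancy figures above are reported as the median NMI between selected features. The following is a minimal sketch of how such a score could be computed, not the authors' implementation: it assumes equal-width binning of each signal and the geometric-mean normalization NMI = I(X;Y) / sqrt(H(X) H(Y)), neither of which is stated in the abstract. The function names (`discretize`, `nmi`, `median_pairwise_nmi`) and the `bins` parameter are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import median


def discretize(values, bins=8):
    """Map continuous readings to equal-width bin indices (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant signal: a single bin
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(x, y):
    """Normalized mutual information with geometric-mean normalization.

    Returns 0.0 when either variable is constant (zero entropy).
    """
    hx, hy = entropy(x), entropy(y)
    if hx == 0 or hy == 0:
        return 0.0
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mi = max(hx + hy - hxy, 0.0)    # clamp tiny negative rounding errors
    return mi / math.sqrt(hx * hy)


def median_pairwise_nmi(features, bins=8):
    """Median NMI over all pairs of feature time series (needs >= 2 features)."""
    binned = [discretize(f, bins) for f in features]
    return median(nmi(a, b) for a, b in combinations(binned, 2))
```

Under these assumptions, a lower median means the selected features share less information with each other, which is the sense in which the abstract compares ERCE with the other selection methods.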
