Optimal variable selection for effective statistical process monitoring

Abstract: In a typical large-scale chemical process, hundreds of variables are measured. Because statistical process monitoring techniques usually involve dimensionality reduction, all measured variables are often supplied as input, with no attempt to screen out uninformative ones. Here, we demonstrate that including measured variables that provide no additional information about faults degrades monitoring performance. We propose a stochastic optimization-based method to identify an optimal subset of measured variables for process monitoring. The benefits of the reduced monitoring model, in terms of improved false alarm rate, missed detection rate, and detection delay, are demonstrated through PCA-based monitoring of the benchmark Tennessee Eastman Challenge problem.
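As a rough illustration of the quantities named in the abstract, the following Python sketch fits a PCA monitoring model with a Hotelling T² statistic and computes the false alarm rate and missed detection rate for a candidate variable subset. It is not the authors' method: the synthetic data, the 90% variance rule for choosing the number of components, the empirical 99% control limit, and all function names (`fit_pca_monitor`, `alarm_rates`, `evaluate_subset`, `make_data`) are assumptions made for this example. A stochastic search over subsets would call something like `evaluate_subset` as its objective.

```python
import numpy as np

def t2_statistic(model, X):
    """Hotelling T^2 of each row of X under a fitted PCA model."""
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["P"]
    return np.sum(scores**2 / model["lam"], axis=1)

def fit_pca_monitor(X_train, var_explained=0.90, quantile=0.99):
    """Fit PCA on normal data and set an empirical T^2 control limit."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s**2 / (len(Z) - 1)
    # Assumed rule: keep the fewest components reaching the variance target.
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cum, var_explained)) + 1, len(eigvals))
    model = {"mean": mean, "std": std, "P": Vt[:k].T, "lam": eigvals[:k]}
    # Assumed limit: empirical quantile of the training T^2 values.
    model["limit"] = np.quantile(t2_statistic(model, X_train), quantile)
    return model

def alarm_rates(model, X_normal, X_fault):
    """Return (false alarm rate, missed detection rate)."""
    limit = model["limit"]
    far = float(np.mean(t2_statistic(model, X_normal) > limit))
    mdr = float(np.mean(t2_statistic(model, X_fault) <= limit))
    return far, mdr

def evaluate_subset(subset, X_train, X_normal, X_fault):
    """Build a monitoring model on the chosen columns and score it."""
    cols = list(subset)
    model = fit_pca_monitor(X_train[:, cols])
    return alarm_rates(model, X_normal[:, cols], X_fault[:, cols])

def make_data(rng, n, shift=0.0):
    """Toy process: 3 variables driven by one latent factor, 7 pure-noise variables."""
    t = rng.normal(size=(n, 1)) + shift          # a fault shifts the latent factor
    informative = t + 0.1 * rng.normal(size=(n, 3))
    noise = rng.normal(size=(n, 7))              # carries no fault information
    return np.hstack([informative, noise])

rng = np.random.default_rng(0)
X_train = make_data(rng, 2000)
X_normal = make_data(rng, 2000)
X_fault = make_data(rng, 2000, shift=3.0)

far_all, mdr_all = evaluate_subset(range(10), X_train, X_normal, X_fault)
far_sub, mdr_sub = evaluate_subset([0, 1, 2], X_train, X_normal, X_fault)
print(f"all 10 variables: FAR={far_all:.3f}, MDR={mdr_all:.3f}")
print(f"variables 0-2:    FAR={far_sub:.3f}, MDR={mdr_sub:.3f}")
```

On this toy data the three-variable model should miss fewer faulty samples than the ten-variable model, because the noise variables add retained components that raise the control limit without adding fault information. Detection delay and the residual (SPE) statistic are omitted to keep the sketch short.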
