Developing a robust classifier for fault detection in production environment

Abstract Machine learning algorithms are now widely applied in production, for tasks such as failure identification, finished-product inspection, and process monitoring. Applying these algorithms to fault detection can eliminate additional tests or experiments, which are usually expensive and risky. However, when machine learning methods are applied to real-world data, the class imbalance problem is often ignored. This problem arises from imbalanced data, in which almost all examples are labeled as one class while far fewer are labeled as the other. A classifier induced from such a data set has high classification accuracy for the majority class but an unacceptable error rate for the minority class. To address this problem, this work proposes a novel methodology based on SOM (Self-Organizing Maps). A process monitoring data set is used to demonstrate the effectiveness of the proposed method. Experimental results indicate that the proposed method outperforms two traditional techniques, under-sampling and cluster-based sampling.
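The class imbalance effect described in the abstract can be illustrated with a small sketch. This is not the paper's SOM-based method; it only shows, on made-up labels, why overall accuracy hides minority-class errors, and what plain random under-sampling (one of the baselines named above) does to the class counts. The data, the 990/10 split, and the function names `per_class_recall` and `random_under_sample` are all assumptions chosen for illustration.

```python
import random


def per_class_recall(y_true, y_pred):
    """Return {label: fraction of that label's examples predicted correctly}."""
    recall = {}
    for label in set(y_true):
        idx = [i for i, y in enumerate(y_true) if y == label]
        recall[label] = sum(y_pred[i] == label for i in idx) / len(idx)
    return recall


def random_under_sample(labels, seed=0):
    """Return sorted indices of a balanced subset.

    Every class is cut down at random to the size of the rarest class.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


# Hypothetical labels: 990 normal runs (0) and 10 faults (1).
y_true = [0] * 990 + [1] * 10

# A degenerate classifier that always predicts the majority class.
y_pred = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)

# Balanced training subset: 10 normal runs plus all 10 faults.
balanced_idx = random_under_sample(y_true)
balanced_labels = [y_true[i] for i in balanced_idx]

print("accuracy:", accuracy)
print("per-class recall:", recall)
print("balanced subset size:", len(balanced_idx))
```

In this toy setting the classifier looks accurate overall yet detects no faults at all, which is why per-class error rates are the more informative measure for fault detection. Random under-sampling balances the classes but discards most of the majority examples.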
