Machine learning-based anomaly detection via integration of manufacturing, inspection and after-sales service data

Purpose: Product quality management is an important part of the manufacturing process. One way to manage and assure product quality is to apply machine learning algorithms that exploit the relationships among the various process steps. The purpose of this paper is to integrate manufacturing, inspection and after-sales service data so that machine learning algorithms can be used to estimate product quality in a supervised fashion. The proposed framework and methods are applied to actual data from heavy machinery engines.

Design/methodology/approach: Manufacturing, inspection and after-sales service data from various sources are integrated by following Lenzerini’s formula. The after-sales service data are used to label each engine as normal or abnormal. One-class classification algorithms are used because of the class imbalance problem. To address the multi-dimensionality of the time series data, the symbolic aggregate approximation (SAX) algorithm is used for data segmentation. A binary genetic algorithm-based wrapper approach is then applied to the segmented data to find the optimal feature subset.

Findings: Machine learning-based anomaly detection models are used to calculate an anomaly score for each engine. Experimental results show that the proposed method can detect defective engines with high probability before they are shipped.

Originality/value: Through data integration, the actual customer-perceived quality observed in after-sales service is linked to data from the manufacturing and inspection processes. In terms of business application, data integration and machine learning-based anomaly detection can help manufacturers predict defective engines and thereby establish quality management policies that reflect the actual customer-perceived quality.
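
As an illustration of the segmentation step, the following is a minimal sketch of the standard symbolic aggregate approximation procedure: z-normalize the series, average it over equal-length segments (piecewise aggregate approximation), and map each segment mean to a symbol using breakpoints that split the standard normal distribution into equiprobable regions. It is not the authors' implementation; the function name `sax`, its parameters and the example values are assumptions made for illustration only.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Return the SAX word of length n_segments for a numeric series.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")

    # Step 1: z-normalize so that the Gaussian breakpoints apply.
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        z = [0.0] * n  # a constant series carries no shape information
    else:
        z = [(x - mu) / sigma for x in series]

    # Step 2: piecewise aggregate approximation (mean of each segment).
    paa = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        paa.append(mean(z[start:end]))

    # Step 3: breakpoints that cut N(0, 1) into len(alphabet) equiprobable bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Step 4: map each segment mean to the symbol of the bin it falls in.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


# Made-up sensor trace with three levels, reduced to a three-symbol word.
print(sax([1, 1, 1, 1, 5, 5, 5, 5, 9, 9, 9, 9], 3, "abc"))  # abc
```

In a pipeline like the one described, each sensor's time series would be reduced to such a word (or to its segment means), and the resulting segments would serve as candidate features for the wrapper-based feature selection and the one-class classifiers. The number of segments and the alphabet size are tuning choices that the abstract does not specify.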
