Prediction of Manufacturing Processes Errors: Gradient Boosted Trees Versus Deep Neural Networks

In this paper we investigate the use of machine learning techniques for optimizing the operation of manufacturing processes. More precisely, we propose and compare two approaches for predicting errors in manufacturing processes. The first approach is based on machine learning algorithms, while the second uses deep learning techniques. Both approaches are validated on the SECOM dataset from the literature, which is representative of manufacturing processes. For the machine learning approach, features are selected using the Multivariate Adaptive Regression Splines (MARS) algorithm and the data is classified using the Gradient Boosted Trees (GBT) algorithm. For the deep learning approach, features are selected using a Support Vector Machine (SVM) algorithm and errors are predicted using a Neural Network (NN). The evaluation shows that the deep learning approach obtains the best results.
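
To make the GBT classification step concrete, the following is a minimal sketch of gradient boosting with depth-1 trees (stumps) and logistic loss, written in plain Python. It is not the authors' implementation: the data is synthetic rather than SECOM, the MARS feature selection step is omitted, and all names (make_data, fit_stump, fit_gbt, the 0.7 failure threshold, the number of rounds and the learning rate) are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n, rng):
    # Hypothetical sensor data: 5 readings per item; an item "fails" (label 1)
    # when the first reading exceeds 0.7. This is NOT the SECOM dataset.
    X = [[rng.random() for _ in range(5)] for _ in range(n)]
    y = [1 if row[0] > 0.7 else 0 for row in X]
    return X, y


def fit_stump(X, residuals):
    # Depth-1 regression tree: choose the (feature, threshold) split that
    # minimizes the squared error of the residuals on both sides.
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]


def stump_predict(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm


def fit_gbt(X, y, n_rounds=15, lr=0.5):
    # Start from the log-odds of the failure rate, then repeatedly fit a stump
    # to the negative gradient of the logistic loss (label minus probability).
    pos = sum(y) / len(y)
    f0 = math.log(pos / (1.0 - pos))
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + lr * stump_predict(stump, row) for s, row in zip(scores, X)]
    return f0, lr, stumps


def predict_proba(model, row):
    f0, lr, stumps = model
    return sigmoid(f0 + lr * sum(stump_predict(s, row) for s in stumps))


def accuracy(model, X, y):
    hits = sum((predict_proba(model, row) > 0.5) == (label == 1)
               for row, label in zip(X, y))
    return hits / len(X)


rng = random.Random(0)
X_train, y_train = make_data(120, rng)
X_test, y_test = make_data(100, rng)

model = fit_gbt(X_train, y_train)
train_acc = accuracy(model, X_train, y_train)
test_acc = accuracy(model, X_test, y_test)
print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
print("feature used by first stump:", model[2][0][0])
```

In practice a library implementation with deeper trees, subsampling and regularization would be used; the sketch only shows the core loop of fitting each new tree to the residuals of the current ensemble.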
