Time Series Features Extraction Versus LSTM for Manufacturing Processes Performance Prediction

This article addresses the complexity of predicting the performance of manufacturing processes in cyber-physical systems when products pass through hundreds of operations and the data recorded during those processes is insufficient to make accurate predictions, and thus to identify the operations that might lead to low performance. The challenge is approached by comparing two methods: one based on highly scalable hypothesis tests and machine learning predictors, and one based on a Long Short-Term Memory Recurrent Neural Network (LSTM RNN). In addition to a critical comparison of the two approaches in terms of performance, this work addresses three further challenges: determining the best threshold for distinguishing between performant and underperforming processes, identifying the most frequent patterns in underperforming processes, and evaluating several techniques for replacing missing data, given the complexity of manufacturing processes.
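The abstract does not name the missing-data replacement techniques that were considered. As a minimal sketch only, the following Python fragment illustrates three common candidates (mean, median, and forward fill) on a single sensor series. The function names `fill_constant` and `fill_forward` and the sample `readings` are hypothetical and are not taken from the article.

```python
from statistics import mean, median
from typing import Callable, List, Optional, Sequence

# A measurement series in which None marks a missing reading (assumed encoding).
Series = Sequence[Optional[float]]


def fill_constant(series: Series,
                  stat: Callable[[List[float]], float] = mean) -> List[float]:
    """Replace every missing value with one statistic of the observed values."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("series has no observed values to impute from")
    fill = stat(observed)
    return [fill if v is None else v for v in series]


def fill_forward(series: Series, default: float = 0.0) -> List[float]:
    """Replace each missing value with the most recent observed value.

    `default` is used for gaps that occur before the first observation.
    """
    filled: List[float] = []
    last = default
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical readings from one operation; not data from the article.
readings = [1.0, None, 3.0, None, None, 8.0]

mean_filled = fill_constant(readings, mean)
median_filled = fill_constant(readings, median)
forward_filled = fill_forward(readings)
```

Which strategy works best depends on the data: a constant fill ignores temporal order, while forward fill assumes that a reading persists until the next measurement.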
