Process-Monitoring-for-Quality — Big Models

Abstract

Process Monitoring for Quality (PMQ) is a big data-driven quality philosophy aimed at defect detection (through binary classification) and empirical knowledge discovery. It was originally developed to solve a complex manufacturing quality problem. It is founded on Big Models, a predictive modeling paradigm based on machine learning, statistics, and optimization, whose learning scheme requires many candidate models to be developed and compared before the final model is selected. When dealing with big data, the data structure is not known in advance; therefore, there is no a priori distinction between learning algorithms, and there is a plethora of options to choose from. The learning scheme of Big Models, which is based on several well-known learning algorithms capable of effectively solving a wide spectrum of binary classification problems, is described. The main challenges of manufacturing pattern recognition problems are discussed and addressed to provide a strong foundation for the Big Models learning paradigm. Finally, two defect detection case studies with highly unbalanced data derived from real manufacturing systems are presented to validate the proposal.
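
To make the "many models, one final model" idea concrete, the following is a minimal sketch of that kind of learning scheme, not the authors' exact Big Models procedure: several well-known binary classifiers are trained on a synthetic, highly unbalanced data set and the final model is chosen by a cross-validated, imbalance-aware metric (here the Matthews correlation coefficient). The candidate list, data set, and metric are illustrative assumptions using scikit-learn.

```python
# Sketch of a model-selection loop over several well-known classifiers
# for an imbalanced binary classification problem. Illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, highly unbalanced defect-detection-like data (about 1% positives).
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.99, 0.01], random_state=0)

# A small pool of candidate learners; a real scheme would explore many more.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000)),
    "svm": make_pipeline(StandardScaler(), SVC(class_weight="balanced")),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "random_forest": RandomForestClassifier(class_weight="balanced",
                                            random_state=0),
}

# Score every candidate with 5-fold cross-validation and keep the best.
scorer = make_scorer(matthews_corrcoef)
scores = {name: cross_val_score(model, X, y, cv=5, scoring=scorer).mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(f"selected model: {best} (mean MCC = {scores[best]:.3f})")
```

The choice of an imbalance-aware metric matters here: with roughly 1% positives, plain accuracy would rate a classifier that never flags a defect at about 99%, which is why a cross-validated MCC (or a similar measure) is used to compare candidates in this sketch.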
