Machine Learning Algorithms in Heavy Process Manufacturing

In a global economy, manufacturers compete mainly on the cost efficiency of production, as the prices of raw materials are similar worldwide. Heavy industry faces two major issues. On the one hand, there is a large amount of data that needs to be analyzed effectively; on the other hand, making big improvements through investments in corporate structure or new machinery is neither economically nor physically viable. Machine learning offers a promising way for manufacturers to address both problems, as their massive resource of historical production data puts them in an excellent position to employ learning techniques. However, choosing a modelling strategy in this setting is far from trivial, and helping with that choice is the objective of this article. The article investigates the characteristics of the most popular classifiers used in industry today. The main focus is on Support Vector Machines, Multilayer Perceptrons, Decision Trees, Random Forests, and the meta-algorithms Bagging and Boosting. Lessons from real-world implementations of these learners are also provided, together with guidance on the settings in which different learners are expected to perform well. The importance of feature selection, and the selection methods relevant in an industrial setting, are investigated further. Performance metrics are also discussed for the sake of completeness.
