Performance assessment and uncertainty quantification of predictive models for smart manufacturing systems

In this paper we review several methods from Statistical Learning Theory (SLT) for the performance assessment and uncertainty quantification of predictive models. Computational issues are addressed so as to allow scaling to large datasets and the application of SLT to Big Data analytics. The effectiveness of SLT in manufacturing systems is illustrated by deriving a predictive model for forecasting the quality of products on an assembly line.
