Evaluating machine learning models for engineering problems

The use of machine learning (ML), and in particular artificial neural networks (ANNs), in engineering applications has increased dramatically in recent years. However, by and large, the development of such applications, or the reporting of them, lacks proper evaluation. Deficient evaluation practice was observed in the general neural networks community and again in engineering applications, through a survey we conducted of articles published in AI in Engineering and elsewhere. This state of affairs hinders understanding and prevents progress. The goal of this article is to remedy this situation. First, several evaluation methods are discussed along with their relative qualities. Second, these qualities are illustrated by using the methods to evaluate ANN performance in two engineering problems. Third, a systematic evaluation procedure for ML is discussed. This procedure will lead to better evaluation of studies, and consequently to improved research and practice in the area of ML in engineering applications.
