Model selection in neural networks

In this article, we examine how statistical procedures such as hypothesis tests, information criteria, and cross-validation can guide model selection in neural networks. We discuss how these methods apply to neural network models, paying particular attention to the identification problems that arise. We then propose five specification strategies based on different statistical procedures and compare them in a simulation study. Because the results of the study are promising, we suggest that statistical analysis become an integral part of neural network modeling.
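As a rough illustration of selection by information criteria, the sketch below ranks single-hidden-layer networks of different sizes by AIC and BIC. It is not one of the article's five strategies. It assumes Gaussian errors, so that each criterion can be written in terms of the residual sum of squares (up to an additive constant), and it assumes a network with bias terms, one linear output, and no direct input-to-output connections. The function names and the residual sums of squares are hypothetical and serve only to show the mechanics.

```python
import math


def n_params(n_inputs, n_hidden):
    """Weight count for an assumed architecture: one hidden layer with
    biases, a single linear output with bias, no shortcut connections."""
    hidden_weights = n_hidden * (n_inputs + 1)  # input weights + hidden biases
    output_weights = n_hidden + 1               # output weights + output bias
    return hidden_weights + output_weights


def aic(rss, n, k):
    """AIC under Gaussian errors, up to an additive constant."""
    return n * math.log(rss / n) + 2 * k


def bic(rss, n, k):
    """BIC (Schwarz criterion) under Gaussian errors, up to a constant."""
    return n * math.log(rss / n) + k * math.log(n)


def select_hidden_units(rss_by_hidden, n, n_inputs, criterion):
    """Return the hidden-unit count that minimizes the given criterion."""
    return min(
        rss_by_hidden,
        key=lambda h: criterion(rss_by_hidden[h], n, n_params(n_inputs, h)),
    )


# Hypothetical in-sample residual sums of squares for networks with
# 1 to 4 hidden units, fitted to n = 200 observations of 2 inputs.
rss_by_hidden = {1: 80.0, 2: 50.0, 3: 47.0, 4: 46.5}

best_aic = select_hidden_units(rss_by_hidden, n=200, n_inputs=2, criterion=aic)
best_bic = select_hidden_units(rss_by_hidden, n=200, n_inputs=2, criterion=bic)
print("AIC selects", best_aic, "hidden units")
print("BIC selects", best_bic, "hidden units")
```

With these made-up numbers the two criteria disagree: BIC penalizes each extra weight by log(n) rather than 2, so it settles on a smaller network than AIC. Note that a plain parameter count is itself a simplification for neural networks, since redundant hidden units leave some weights unidentified.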
