Model selection in Neural Networks: Some difficulties

This paper considers two related issues regarding feedforward Neural Networks (NNs). The first is whether the network weights corresponding to the best-fitting network are unique. Our empirical tests suggest that they are not, whether we use the standard backpropagation algorithm or our preferred direct (non-gradient-based) search procedure. We also offer a theoretical analysis which suggests that there will almost inevitably be functional relationships between network weights. The second issue concerns the use of standard statistical approaches to test the significance of weights or groups of weights. If feedforward NNs are treated as a way of carrying out nonlinear regression, it is natural to employ such tests. According to our results, however, statistical tests can in practice be indeterminate. It is therefore difficult to choose either the number of hidden layers or the number of nodes on this basis.
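The non-uniqueness question can be illustrated with a minimal sketch. It does not reproduce the paper's empirical tests or its theoretical analysis; it only shows two well-known symmetries of a single-hidden-layer network that give different weight sets the same outputs: reordering the hidden nodes, and flipping the sign of every weight into and out of one tanh node. The network size, the tanh activation, the random weights, and all names (`forward`, `W1`, `w2`, and so on) are assumptions made for illustration.

```python
import math
import random

def forward(x, W1, b1, w2, b2):
    """Single-hidden-layer network: tanh hidden units, linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2

# Hypothetical network: 2 inputs, 3 hidden nodes, arbitrary random weights.
random.seed(0)
n_in, n_hidden = 2, 3
W1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [random.uniform(-1, 1) for _ in range(n_hidden)]
w2 = [random.uniform(-1, 1) for _ in range(n_hidden)]
b2 = random.uniform(-1, 1)

# Symmetry 1: reorder the hidden nodes (weights in and out move together).
perm = [2, 0, 1]
W1_p = [W1[i] for i in perm]
b1_p = [b1[i] for i in perm]
w2_p = [w2[i] for i in perm]

# Symmetry 2: negate all weights into and out of hidden node 0.
# Because tanh is an odd function, the node's contribution is unchanged.
W1_s = [[-w for w in W1[0]]] + W1[1:]
b1_s = [-b1[0]] + b1[1:]
w2_s = [-w2[0]] + w2[1:]

# Compare the three weight sets on a handful of random inputs.
inputs = [[random.uniform(-2, 2) for _ in range(n_in)] for _ in range(5)]
max_gap = 0.0
for x in inputs:
    y = forward(x, W1, b1, w2, b2)
    y_p = forward(x, W1_p, b1_p, w2_p, b2)
    y_s = forward(x, W1_s, b1_s, w2_s, b2)
    max_gap = max(max_gap, abs(y - y_p), abs(y - y_s))

print("largest output difference:", max_gap)
```

Under these assumptions the three weight sets differ but fit any data set equally well, so a fitting procedure has no basis for preferring one over another. The functional relationships between weights discussed in the abstract may go beyond these two simple symmetries.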
