Neural-network construction and selection in nonlinear modeling

We study how statistical tools that are commonly used independently can be exploited together to improve neural-network estimation and selection in nonlinear static modeling. The tools we consider are the analysis of the numerical conditioning of the candidate networks, statistical hypothesis tests, and cross-validation. We present and analyze each of these tools in order to establish at which stage of a construction and selection procedure it is most useful. On the basis of this analysis, we then propose a novel and systematic construction and selection procedure for neural modeling. We finally illustrate its efficiency through large-scale simulation experiments and real-world modeling problems.
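For concreteness, the minimal sketch below (ours, not the authors' code) illustrates the first of these tools: assessing the numerical conditioning of a candidate network by computing the condition number of the Jacobian of its outputs with respect to its parameters. The toy one-hidden-unit tanh model, the finite-difference Jacobian, and the synthetic data are illustrative assumptions; only the conditioning check itself reflects the idea described above.

import numpy as np

def model(params, x):
    # Toy one-hidden-unit tanh network: y = w2 * tanh(w1 * x + b1) + b2.
    w1, b1, w2, b2 = params
    return w2 * np.tanh(w1 * x + b1) + b2

def jacobian(params, x, eps=1e-6):
    # Finite-difference Jacobian of the model outputs with respect to the
    # parameters: one row per training example, one column per parameter.
    base = model(params, x)
    J = np.empty((x.size, params.size))
    for j in range(params.size):
        p = params.copy()
        p[j] += eps
        J[:, j] = (model(p, x) - base) / eps
    return J

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)     # synthetic inputs (assumption)
params = rng.normal(size=4)             # stands in for trained weights

J = jacobian(params, x)
s = np.linalg.svd(J, compute_uv=False)  # singular values, descending
cond = s[0] / s[-1]

# A very large condition number signals a (near-)rank-deficient Jacobian,
# i.e. redundant parameters; such a candidate would be discarded before
# applying hypothesis tests or cross-validation.
print(f"condition number kappa(J) = {cond:.3e}")

In a full construction and selection procedure, this check would be run on every trained candidate, with ill-conditioned candidates eliminated before the statistical comparison of the survivors.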
