An efficient MDL-based construction of RBF networks

We propose a method for optimizing the complexity of radial basis function (RBF) networks. The method combines two procedures: adaptation (training) and selection. The adaptation procedure adjusts the locations and widths of the basis functions and trains the linear output weights. The selection procedure eliminates redundant basis functions using an objective function based on the Minimum Description Length (MDL) principle. By iterating these two procedures we obtain a controlled way of training and modifying RBF networks that balances accuracy, training time, and the complexity of the resulting network. We test the proposed method on function approximation and classification tasks and compare it with several recently proposed methods.
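The abstract only outlines the procedure, so below is a minimal, illustrative sketch of such an iterated adaptation/selection loop for a Gaussian RBF network. It is not the authors' algorithm: the gradient updates, the toy two-part description-length score, and all names and parameters (design_matrix, fit_weights, description_length, bits_per_param, learning rate, etc.) are assumptions introduced here for illustration only.

```python
# Sketch of an adaptation/selection loop for an RBF network with an MDL-style
# pruning criterion. Illustrative reconstruction, NOT the paper's method: the
# gradients, the description-length formula, and all names are assumptions.
import numpy as np

def design_matrix(X, centers, widths):
    """Gaussian RBF activations, shape (n_samples, n_basis)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * widths[None, :] ** 2))

def fit_weights(Phi, y, ridge=1e-6):
    """Linear output weights by regularized least squares."""
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

def description_length(Phi, y, w, bits_per_param=8.0):
    """Toy two-part MDL score: residual code length + model code length."""
    n = len(y)
    rss = np.sum((y - Phi @ w) ** 2)
    data_bits = 0.5 * n * np.log2(rss / n + 1e-12)   # cost of coding residuals
    model_bits = bits_per_param * Phi.shape[1]       # cost of storing the units
    return data_bits + model_bits

def train_rbf_mdl(X, y, n_init=20, n_outer=5, n_adapt=50, lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_init, replace=False)].copy()
    widths = np.full(n_init, X.std())
    for _ in range(n_outer):
        # Adaptation: gradient steps on centers/widths, least squares for weights.
        for _ in range(n_adapt):
            Phi = design_matrix(X, centers, widths)
            w = fit_weights(Phi, y)
            err = Phi @ w - y                               # (n,)
            diff = X[:, None, :] - centers[None, :, :]      # (n, m, d)
            g = (err[:, None] * w[None, :]) * Phi           # (n, m)
            centers -= lr * (g[:, :, None] * diff / widths[None, :, None] ** 2).sum(0)
            widths -= lr * (g * (diff ** 2).sum(2) / widths[None, :] ** 3).sum(0)
            widths = np.clip(widths, 1e-3, None)
        # Selection: greedily remove a unit while doing so lowers the MDL score.
        improved = True
        while improved and len(widths) > 1:
            Phi = design_matrix(X, centers, widths)
            w = fit_weights(Phi, y)
            best = description_length(Phi, y, w)
            improved = False
            for j in range(len(widths)):
                keep = np.delete(np.arange(len(widths)), j)
                w_j = fit_weights(Phi[:, keep], y)
                if description_length(Phi[:, keep], y, w_j) < best:
                    centers, widths = centers[keep], widths[keep]
                    improved = True
                    break
    Phi = design_matrix(X, centers, widths)
    return centers, widths, fit_weights(Phi, y)
```

In this sketch the selection step is greedy backward elimination: a basis function is dropped whenever removing it (and refitting the output weights) lowers the description-length score, which here stands in for the MDL-based objective described in the abstract.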
