Circular backpropagation networks for classification

Mapping networks form a general family of tools for a wide variety of tasks. This paper presents a standardized, uniform representation for this class of networks and introduces a simple modification of the multilayer perceptron whose practical properties make it especially well suited to pattern classification. The proposed model unifies the two main representation paradigms found in mapping networks for classification, namely the surface-based and the prototype-based schemes, while remaining trainable by backpropagation. The gains in representation properties and generalization performance are assessed through results on the worst-case number of hidden units required and on the Vapnik-Chervonenkis dimension and Cover capacity. The theoretical properties of the network also suggest that the proposed modification to the multilayer perceptron is in many senses optimal. Experiments confirm the theoretical results on the model's improved performance compared with the multilayer perceptron and the Gaussian radial basis function network.
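The abstract does not spell out the modification itself. As a purely illustrative sketch, the code below assumes it amounts to feeding each sigmoid unit one extra input, the sum of the squared components of the pattern. Under that assumption a single unit can represent either a hyperplane (surface-based) or a hypersphere around a prototype (prototype-based). The names `cbp_unit`, `prototype_weights`, `w_sq`, and `gain` are hypothetical and do not come from the paper.

```python
import math


def sigmoid(s):
    """Standard logistic activation."""
    return 1.0 / (1.0 + math.exp(-s))


def cbp_unit(x, w, bias, w_sq):
    """Sigmoid unit with one assumed extra input: the squared norm of x.

    w_sq is the weight on that extra input (hypothetical name).
    With w_sq == 0 this reduces to an ordinary perceptron unit.
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    squared_norm = sum(xi * xi for xi in x)
    return sigmoid(bias + linear + w_sq * squared_norm)


def prototype_weights(center, radius, gain=1.0):
    """Weights that make the unit respond inside a hypersphere.

    Expands gain * (radius^2 - ||x - c||^2) into
    gain * (radius^2 - ||c||^2) + 2 * gain * (c . x) - gain * ||x||^2.
    """
    w = [2.0 * gain * c for c in center]
    bias = gain * (radius ** 2 - sum(c * c for c in center))
    w_sq = -gain
    return w, bias, w_sq


# Surface-based behaviour: zero weight on the squared-norm input.
plane = cbp_unit([1.0, 2.0], w=[1.0, -1.0], bias=0.5, w_sq=0.0)

# Prototype-based behaviour: unit centred at (1, 1) with radius 1.
w, b, q = prototype_weights(center=[1.0, 1.0], radius=1.0, gain=4.0)
inside = cbp_unit([1.0, 1.0], w, b, q)    # at the prototype, output above 0.5
boundary = cbp_unit([2.0, 1.0], w, b, q)  # on the sphere, output equals 0.5
outside = cbp_unit([3.0, 1.0], w, b, q)   # outside the sphere, output below 0.5
```

Because the extra input is just another weighted term in the stimulus, gradients with respect to `w_sq` follow from the usual chain rule, which is consistent with the abstract's claim that the model stays trainable by backpropagation.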
