Efficient perceptron learning using constrained steepest descent

An algorithm is proposed for training the single-layer perceptron. The algorithm follows successive steepest descent directions with respect to the perceptron cost function, while ensuring that the number of misclassified patterns never increases. The problem of finding these directions is stated as a quadratic programming task, for which a fast and effective solution is proposed. The resulting algorithm has no free parameters, so no heuristics are involved in its application. It is proved that the algorithm always converges in a finite number of steps. For linearly separable problems, it always finds a hyperplane that completely separates patterns belonging to different categories. If the algorithm terminates without separating all given patterns, the presented set of patterns is linearly inseparable. Thus the algorithm provides a natural criterion for linear separability. Compared with other state-of-the-art algorithms, the proposed method is substantially faster, as demonstrated on a number of demanding benchmark classification tasks.
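To make the terms in the abstract concrete, the following is a minimal sketch of the classical perceptron criterion and its unconstrained steepest descent direction. It is not the proposed algorithm: the quadratic programming step that keeps the misclassification count from increasing is not specified in the abstract and is not reproduced here. The function names, the toy data, the unit step, and the convention that a pattern with zero margin counts as misclassified are all assumptions made for illustration.

```python
import numpy as np

def perceptron_cost(w, X, t):
    """Classical perceptron criterion (assumed form): sum of -t_i * (w . x_i)
    over misclassified patterns. Returns (cost, number misclassified)."""
    margins = t * (X @ w)
    mis = margins <= 0  # assumption: zero margin counts as misclassified
    return float(-margins[mis].sum()), int(mis.sum())

def steepest_descent_direction(w, X, t):
    """Negative gradient of the criterion above: sum of t_i * x_i over
    misclassified patterns. Unconstrained; no QP step is applied."""
    margins = t * (X @ w)
    mis = margins <= 0
    return (t[mis, None] * X[mis]).sum(axis=0)

# Hypothetical toy data: last column is a constant bias input.
X = np.array([[ 1.0,  2.0, 1.0],
              [ 2.0,  1.0, 1.0],
              [-1.0, -1.0, 1.0],
              [-2.0, -1.5, 1.0]])
t = np.array([1.0, 1.0, -1.0, -1.0])

w0 = np.zeros(3)
cost0, n_mis0 = perceptron_cost(w0, X, t)
d = steepest_descent_direction(w0, X, t)
w1 = w0 + d  # unit step, chosen only for this illustration
cost1, n_mis1 = perceptron_cost(w1, X, t)
print(n_mis0, d, n_mis1)
```

On this toy set a single unconstrained step happens to separate the patterns. In general an unconstrained step can increase the number of misclassified patterns, which is the situation the constrained directions described in the abstract are designed to avoid.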
