Backpropagation separates when perceptrons do

This paper considers the least-squares problem that arises when training a feedforward net with no hidden neurons, under the assumption that the net has monotonic nonlinear output units. If the training set is separable, that is, if there is a set of achievable outputs for which the error is zero, the authors show that there are no nonglobal minima. More precisely, they assume that the error is of threshold least-mean-square (LMS) type, in that the error function is zero for output values beyond the target value. The authors' proof also yields the following stronger result: under the continuous gradient adjustment procedure, a separating set of weights is obtained in finite time from any initial weight configuration. This is a precise analog of the perceptron learning theorem. The authors contrast their results with the more classical pattern recognition problem of threshold LMS with linear output units.
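To make the setting concrete, here is a minimal sketch of threshold LMS training for a single monotonic output unit on a separable training set. It is an illustration under stated assumptions, not the authors' construction: the tanh output unit, the target magnitude `TARGET = 0.8`, the logical-AND toy data, the starting weights, and the names `residual`, `error`, `separates`, and `train` are all hypothetical. It also uses discrete gradient steps with a small step size as a stand-in for the continuous gradient adjustment procedure discussed in the abstract.

```python
import math

# Hypothetical toy data: logical AND, with a constant 1.0 appended as the bias input.
# Labels are +1 / -1; the set is linearly separable.
PATTERNS = [
    ((0.0, 0.0, 1.0), -1),
    ((0.0, 1.0, 1.0), -1),
    ((1.0, 0.0, 1.0), -1),
    ((1.0, 1.0, 1.0), +1),
]
TARGET = 0.8  # assumed target magnitude, strictly inside the range of tanh


def output(w, x):
    """Monotonic nonlinear output unit (tanh is an assumption of this sketch)."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))


def residual(y, label):
    """Threshold LMS residual: zero once the output is at or beyond its target."""
    if label > 0:
        return min(0.0, y - TARGET)  # penalise only outputs below +TARGET
    return max(0.0, y + TARGET)      # penalise only outputs above -TARGET


def error(w):
    """Sum of squared threshold residuals over the training set."""
    return sum(residual(output(w, x), c) ** 2 for x, c in PATTERNS)


def separates(w):
    """True when every pattern falls on the correct side of the decision boundary."""
    return all(output(w, x) * c > 0 for x, c in PATTERNS)


def train(w, lr=0.02, max_steps=200_000):
    """Plain gradient descent on the threshold LMS error; stops at first separation."""
    w = list(w)
    for step in range(max_steps):
        if separates(w):
            return w, step
        grad = [0.0] * len(w)
        for x, c in PATTERNS:
            y = output(w, x)
            # Derivative of residual^2 with respect to the unit's net input.
            g = 2.0 * residual(y, c) * (1.0 - y * y)
            for j, xj in enumerate(x):
                grad[j] += g * xj
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w, max_steps


if __name__ == "__main__":
    start = [-1.0, -1.0, 1.5]  # misclassifies all four patterns
    weights, steps = train(start)
    print("separated:", separates(weights), "after", steps, "steps")
    print("remaining threshold LMS error:", error(weights))
```

The loop stops as soon as the weights separate the data, which mirrors the finite-time statement; the threshold LMS error may still be positive at that point, since reaching zero error additionally requires every output to lie beyond its target.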
