Robust Trainability of Single Neurons

It is well known that (McCulloch-Pitts) neurons are efficiently trainable to learn an unknown halfspace from examples, using linear-programming methods. We analyze how the learning performance degrades when the representational power of the neuron is overstrained, i.e., when the target concepts are more complex than halfspaces. We show that the problem of learning a probably almost optimal weight vector for a neuron is so difficult that the minimum error cannot even be approximated to within a constant factor in polynomial time (unless RP = NP); we obtain the same hardness result for several variants of this problem. We considerably strengthen these negative results for neurons with binary weights 0 or 1. We also show that neither heuristic learning nor learning by sigmoidal neurons with a constant reject rate is efficiently possible (unless RP = NP).
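To make the positive side of this statement concrete, the following is a minimal sketch (not taken from the paper) of how learning a halfspace consistent with a sample reduces to a linear-programming feasibility problem. The function name fit_halfspace and the use of SciPy's linprog solver are illustrative assumptions, not part of the original work.

```python
import numpy as np
from scipy.optimize import linprog

def fit_halfspace(X, y):
    """Find a halfspace sign(w.x + b) consistent with the labeled
    sample (X, y), y in {-1, +1}, by solving the LP feasibility
    problem y_i * (w.x_i + b) >= 1 for all i (the margin 1 is a
    harmless normalization, since w and b can be rescaled).
    Returns (w, b), or None if the sample is not linearly separable."""
    n, d = X.shape
    # Variables are [w_1, ..., w_d, b]; the objective is irrelevant
    # for a pure feasibility question, so it is set to zero.
    c = np.zeros(d + 1)
    # Each constraint is rewritten in <= form: -y_i*(w.x_i + b) <= -1.
    A_ub = -y[:, None] * np.hstack([X, np.ones((n, 1))])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (d + 1), method="highs")
    if not res.success:
        return None
    return res.x[:d], res.x[d]

# Toy usage on a linearly separable sample in the plane.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]])
y = np.array([-1, -1, 1, 1])
wb = fit_halfspace(X, y)
if wb is not None:
    w, b = wb
    assert np.all(y * (X @ w + b) > 0)  # consistent with every example
```

The hardness results of the paper say precisely that no such efficient procedure can exist (unless RP = NP) once the sample need not be realizable by a halfspace and one asks instead for a weight vector whose error approximates the minimum.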