Neural net algorithms that learn in polynomial time from examples and queries

An algorithm that trains networks using examples and queries is proposed. In a query, the algorithm supplies an input y and is told the target value t(y) by an oracle. Queries appear to be available in practice for most problems of interest, e.g. by appeal to a human expert. The author's algorithm is proved to PAC learn, in polynomial time, the class of target functions defined by layered, depth-two threshold nets having n inputs connected to k hidden threshold units, which are in turn connected to one or more output units, provided k ≤ 4. Although target functions and input distributions can be constructed on which the algorithm fails for larger k, it appears likely to work well in practice. In tests, a variant of the algorithm has consistently and rapidly learned random nets of this type; computational efficiency figures are given. The algorithm can also be proved to learn intersections of k half-spaces in R^n in time polynomial in both n and k. A variant of the algorithm can learn arbitrary-depth layered threshold networks with n inputs and k units in the first hidden layer in time polynomial in the larger of n and k, but exponential in the smaller of the two.
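To make the target class and the query protocol concrete, the following is a minimal Python sketch, not the paper's algorithm: it only sets up a random layered, depth-two threshold net t and a membership-query oracle that answers t(y) for any y the learner supplies. All names, the weight initialization, and the oracle interface are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 10, 4  # n inputs, k hidden threshold units (the proof covers k <= 4)

# Random weights and thresholds for a depth-two threshold net:
# n inputs -> k hidden threshold units -> one output threshold unit.
W_hidden = rng.standard_normal((k, n))
b_hidden = rng.standard_normal(k)
w_out = rng.standard_normal(k)
b_out = rng.standard_normal()

def heaviside(z):
    """Threshold activation: 1 where z >= 0, else 0."""
    return (np.asarray(z) >= 0).astype(float)

def target(y):
    """t(y): evaluate the depth-two threshold net on input y."""
    hidden = heaviside(W_hidden @ y + b_hidden)
    return float(heaviside(w_out @ hidden + b_out))

def query_oracle(y):
    """Membership query: the learner supplies y and is told t(y)."""
    return target(y)

# One round of the query protocol described in the abstract:
y = rng.standard_normal(n)
print("t(y) =", query_oracle(y))
```

An intersection of k half-spaces, the other class the abstract mentions, is the special case in which the output unit computes the AND of the hidden units (e.g. w_out all ones and b_out = -k + 0.5 in this sketch).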
