Generalization by Neural Networks

Neural networks have traditionally been applied to recognition problems, and most learning algorithms are tailored to those problems. We discuss the requirements of learning for generalization, where the traditional methods based on gradient descent have limited success. We present a new stochastic learning algorithm based on simulated annealing in weight space. We verify the convergence properties and feasibility of the algorithm. We also describe an implementation of the algorithm and validation experiments.

Index Terms: Learning, generalization, neural networks.
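
To make the idea of "simulated annealing in weight space" concrete, below is a minimal sketch, not the authors' implementation: the weights of a small network are treated as the annealing state, the training error as the energy, and candidate weights are proposed by random perturbation and accepted with the Metropolis criterion under a geometric cooling schedule. The 2-2-1 architecture, XOR task, perturbation scale, and cooling parameters are illustrative assumptions.

```python
# Sketch only: simulated annealing over network weights (not the paper's exact algorithm).
import numpy as np

rng = np.random.default_rng(0)

# Toy task (assumed): XOR, a classic non-linearly-separable problem.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

def forward(weights, X):
    """2-2-1 network; weights packed as (W1: 2x2, b1: 2, w2: 2, b2: 1)."""
    W1 = weights[:4].reshape(2, 2)
    b1 = weights[4:6]
    w2 = weights[6:8]
    b2 = weights[8]
    h = np.tanh(X @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))

def error(weights):
    """Sum-of-squares training error: the 'energy' being annealed."""
    return float(np.sum((forward(weights, X) - y) ** 2))

# Annealing loop: random moves in weight space with Metropolis acceptance.
w = rng.normal(scale=0.5, size=9)
e = error(w)
T = 1.0          # initial temperature (assumed)
alpha = 0.999    # geometric cooling rate (assumed)
for step in range(20000):
    candidate = w + rng.normal(scale=0.1, size=w.shape)  # random perturbation
    e_new = error(candidate)
    # Always accept improvements; accept uphill moves with probability exp(-dE/T).
    if e_new < e or rng.random() < np.exp(-(e_new - e) / max(T, 1e-12)):
        w, e = candidate, e_new
    T *= alpha

print("final training error:", e)
print("network outputs:", np.round(forward(w, X), 2))
```

Unlike gradient descent, this search needs no error gradient and can accept temporarily worse weight settings early on, which is the property the stochastic approach relies on to escape poor local minima.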
