Learning and generalization in feed-forward neural networks

Aspects of learning and generalization in feed-forward neural networks are studied. The networks are trained using the backpropagation learning algorithm, whose performance is evaluated on a training set whose difficulty can be varied; on this basis, improvements and modifications to the algorithm are suggested. A simple classification of problem-domain types is made, and one class is argued to be the most appropriate for a three-layer feed-forward network to learn. This class is characterized by underlying regularities among the training-set members, such that the mapping required for each pattern is consistent with the mappings required for all the other patterns. The suitability of this class of training sets is demonstrated by observing the emergent properties of the network, both in the speed and character of learning and in the generalization displayed after learning an incomplete training set. This behaviour is contrasted with that on training sets lacking such underlying regularities, from which it is concluded that this type of network is used more effectively for extracting salient information from a training set, provided underlying regularities exist, than for other classes of mappings. For such problem domains, the generalization of the network is studied as a function of hidden-layer size. It is shown that the number of distinct solutions available in the algorithm's search space generally grows rapidly with hidden-layer size; despite this, generalization performance does not degrade correspondingly but remains at a steady, high level. This observation suggests that during learning the network is more likely to extract the salient information about the training set than merely to map the patterns independently (which would account for a large set of the other possible solutions), and that this information is stored in a distributed manner throughout all the weights of the network.
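
As a rough illustration of the kind of experiment described above (a sketch under stated assumptions, not the paper's actual code), the following Python/NumPy fragment trains a three-layer network by backpropagation on an incomplete training set drawn from a mapping with an underlying regularity, then reports accuracy on the withheld patterns as the hidden layer is enlarged. The task (a 6-bit majority function), the 75% training split, the network sizes, and all hyperparameters are illustrative assumptions:

# Sketch only: a 3-layer (input-hidden-output) network trained by plain
# gradient-descent backpropagation on squared error. The task and all
# settings below are assumptions chosen to give a mapping in which every
# pattern is consistent with a single underlying rule.
import itertools
import numpy as np

rng = np.random.default_rng(0)

# All 64 six-bit patterns; target = 1 when more than half the bits are set.
X = np.array(list(itertools.product([0.0, 1.0], repeat=6)))
y = (X.sum(axis=1) > 3).astype(float).reshape(-1, 1)

# Incomplete training set: train on 75% of patterns, hold out the rest.
perm = rng.permutation(len(X))
n_train = int(0.75 * len(X))
train_idx, test_idx = perm[:n_train], perm[n_train:]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run(n_hidden, epochs=3000, lr=1.0):
    # Small random initial weights; one hidden layer of n_hidden units.
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    Xtr, ytr = X[train_idx], y[train_idx]
    for _ in range(epochs):
        # Forward pass through both layers.
        h = sigmoid(Xtr @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass: error deltas propagated from output to hidden layer.
        d_out = (out - ytr) * out * (1.0 - out)
        d_h = (d_out @ W2.T) * h * (1.0 - h)
        W2 -= lr * (h.T @ d_out) / len(Xtr)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (Xtr.T @ d_h) / len(Xtr)
        b1 -= lr * d_h.mean(axis=0)
    # Generalization: accuracy on the patterns withheld from training.
    pred = sigmoid(sigmoid(X[test_idx] @ W1 + b1) @ W2 + b2) > 0.5
    return float((pred == (y[test_idx] > 0.5)).mean())

for n_hidden in (2, 4, 8, 16, 32):
    print(f"{n_hidden:3d} hidden units -> held-out accuracy {run(n_hidden):.2f}")

On a mapping of this kind one would expect, in line with the abstract's observation, the held-out accuracy to stay roughly level as the hidden layer grows, even though larger hidden layers admit many more weight configurations that fit the training patterns.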
