Direct Transfer of Learned Information Among Neural Networks

A touted advantage of symbolic representations is the ease of transferring learned information from one intelligent agent to another. This paper investigates an analogous problem: how to use information from one neural network to help a second network learn a related task. Rather than translate such information into symbolic form (in which it may not be readily expressible), we investigate the direct transfer of information encoded as weights. Here, we focus on how transfer can be used to address the important problem of improving neural network learning speed. First, we present an exploratory study of the somewhat surprising effects of pre-setting network weights on subsequent learning. Guided by hypotheses from this study, we sped up back-propagation learning for two speech recognition tasks. By transferring weights from smaller networks trained on subtasks, we achieved speedups of up to an order of magnitude relative to training from random initial weights, even accounting for the time to train the smaller networks. We include results on how transfer scales to a large phoneme recognition problem.
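The core mechanism can be illustrated with a minimal sketch. The snippet below is a hypothetical illustration, not the authors' code: the layer sizes, variable names, and the choice to transfer only input-to-hidden weights are assumptions. It shows the general idea of initializing part of a larger network from a smaller network already trained on a subtask, then continuing back-propagation on the full task from that starting point rather than from purely random weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hidden, n_out):
    """Randomly initialize a one-hidden-layer network, as in
    plain back-propagation training from scratch."""
    return {
        "W1": rng.uniform(-0.5, 0.5, (n_hidden, n_in)),   # input -> hidden
        "W2": rng.uniform(-0.5, 0.5, (n_out, n_hidden)),  # hidden -> output
    }

# Small network assumed already trained on a subtask
# (e.g., a subset of the phoneme classes).
small = init_mlp(n_in=16, n_hidden=4, n_out=3)

# Larger network for the full task, initially all random.
large = init_mlp(n_in=16, n_hidden=12, n_out=10)

# Direct transfer: copy the small network's hidden units into the
# first hidden units of the large network. The remaining hidden
# units, and the output weights (the output layers differ in size),
# stay random.
n_transfer = small["W1"].shape[0]
large["W1"][:n_transfer, :] = small["W1"]

# Back-propagation on the full task would now start from `large`
# instead of from a purely random initialization.
```

Only the input-to-hidden weights are copied in this sketch, since the subtask and full-task output layers need not match in size; the transferred weights serve purely as an initialization that subsequent back-propagation is free to refine.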
