Induced specialization of context units for temporal pattern recognition and reproduction

Additional inputs derived from the outputs of the hidden layer neurons allow a feedforward network to handle temporal pattern recognition and reproduction tasks. These 'network-derived' or 'context' inputs augment the 'true' inputs to the network and let it retain the past information needed for temporal sequence processing. Choosing which hidden neurons should provide the context inputs is difficult: using all of them enlarges the overall network and degrades generalization, and the problem is compounded by the difficulty of choosing the number of hidden layer neurons in the first place. In this paper, we propose adding regularization terms to the sum-of-squared error cost function. Assuming the hidden layer neurons are indexed 1, 2, ..., m, the regularization terms force the differentiation of hidden neurons 1 through m_1 and m_2 through m (where 1 < m_1 < m_2 < m). Both m_1 and m_2 are controllable, so the fringe neurons can be used to provide the context inputs when the number of context units is known. When the number of context units cannot be determined in advance, the regularization terms minimize m_1 and maximize m_2, while hidden neurons m_1 through m_2 are penalized for differentiation. An amplitude detection simulation is used to evaluate the efficacy of the proposed paradigm.
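
The abstract does not give the exact form of the regularization terms, so the following minimal Python/NumPy sketch is only illustrative: it assumes a unit's "differentiation" can be measured by the variance of its activation over a training batch, and that a single weight lam controls the penalty on the interior units m_1 through m_2. The function name, the variance-based measure, and lam are assumptions for illustration, not the authors' formulation.

    import numpy as np

    def regularized_cost(targets, outputs, hidden_acts, m1, m2, lam=0.1):
        # targets, outputs : arrays of shape (batch, n_out)
        # hidden_acts      : array of shape (batch, m) of hidden-layer activations
        # m1, m2           : fringe boundaries, 1 < m1 < m2 < m (1-based, as in the abstract)
        # lam              : assumed regularization weight (hyperparameter)

        # Standard sum-of-squared error term.
        sse = 0.5 * np.sum((targets - outputs) ** 2)

        # "Differentiation" of a hidden unit is approximated here by the variance
        # of its activation over the batch; interior units m1..m2 are penalized
        # so that only the fringe units specialize and serve as context inputs.
        interior = hidden_acts[:, m1 - 1:m2]
        penalty = lam * np.sum(np.var(interior, axis=0))

        return sse + penalty

Under this sketch, the penalty pushes the interior units toward near-constant activations, so any information the sequence task needs must be carried by the fringe units 1..m_1 and m_2..m, which are the ones fed back as context inputs.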
