Global Reinforcement Learning in Neural Networks with Stochastic Synapses

We have found a more general formulation of the REINFORCE learning principle, which R. J. Williams proposed for artificial neural networks with stochastic cells ("Boltzmann machines"). This formulation has enabled us to apply the principle to global reinforcement learning in networks with deterministic neural cells but stochastic synapses, and to suggest two groups of new learning rules for such networks, including simple local rules. Numerical simulations have shown that, for at least several popular benchmark problems, one of the new learning rules may provide results on a par with the best known global reinforcement techniques.
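
To make the setting concrete, here is a minimal sketch of a REINFORCE-style update for a single deterministic threshold cell whose synaptic weights are random. It is not one of the learning rules proposed in the paper, which the abstract does not spell out. It assumes each weight is drawn from a Gaussian with a learnable mean and a fixed standard deviation, and that a single global reward of +1 or -1 is the only feedback. The function names, the toy OR task, and all parameter values are hypothetical choices made for illustration.

```python
import random


def sample_weights(mu, sigma, rng):
    """Draw one stochastic realization of each synaptic weight (assumed Gaussian)."""
    return [rng.gauss(m, sigma) for m in mu]


def forward(weights, x):
    """Deterministic threshold cell: outputs 1 if the weighted sum is positive."""
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else 0


def reinforce_update(mu, weights, reward, baseline, sigma, eta):
    """Shift each mean along (reward - baseline) * d log p(w | mu) / d mu.

    For a Gaussian with fixed sigma, d log p / d mu = (w - mu) / sigma**2.
    """
    return [m + eta * (reward - baseline) * (w - m) / sigma ** 2
            for m, w in zip(mu, weights)]


def train(samples, epochs=2000, sigma=0.5, eta=0.05, seed=0):
    """Train the weight means using only a scalar global reward per trial."""
    rng = random.Random(seed)
    mu = [0.0] * len(samples[0][0])
    baseline = 0.0
    for _ in range(epochs):
        x, target = rng.choice(samples)
        w = sample_weights(mu, sigma, rng)
        # Global reinforcement signal: no per-weight error information is used.
        reward = 1.0 if forward(w, x) == target else -1.0
        mu = reinforce_update(mu, w, reward, baseline, sigma, eta)
        baseline += 0.05 * (reward - baseline)  # running average of the reward
    return mu


# Hypothetical toy task: logical OR, with a constant bias input of 1.
OR_SAMPLES = [((0, 0, 1), 0), ((0, 1, 1), 1), ((1, 0, 1), 1), ((1, 1, 1), 1)]

if __name__ == "__main__":
    print(train(OR_SAMPLES))
```

Each weight mean is updated from quantities available at that synapse (its sampled weight and its mean) plus the broadcast reward, which is the sense in which such a rule can be called local. The running-average baseline is a common variance-reduction choice, not something taken from the abstract.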
