Neural Plasticity and Minimal Topologies for Reward-Based Learning

Artificial neural networks for online learning problems are often implemented with synaptic plasticity to achieve adaptive behaviour. A common problem is that the overall learning dynamics are emergent properties that depend strongly on the right combination of neural architecture, plasticity rules and environmental features. It is not clear how much complexity in architectures and learning rules is required to solve specific control and learning problems. Here a set of homosynaptic plasticity rules is applied to topologically unconstrained neural controllers that operate and evolve in dynamic reward-based scenarios. Performance is monitored in simulations of bee foraging problems and T-maze navigation. Varying reward locations compel the neural controllers to adapt their foraging strategies over time, fostering online reward-based learning. In contrast to previous studies, the results indicate that reward-based learning in complex dynamic scenarios can be achieved with basic plasticity rules and minimal topologies.
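
To make the term "homosynaptic plasticity rule" concrete, the sketch below shows one weight update that depends only on the activity of the two neurons joined by the synapse. The abstract does not state the rule's form, so the generalized Hebbian expression, the function name `hebbian_update`, and the parameters `eta`, `A`, `B`, `C`, `D` and `w_max` are all illustrative assumptions, not the authors' implementation.

```python
def hebbian_update(w, pre, post, eta=0.1, A=1.0, B=0.0, C=0.0, D=0.0, w_max=1.0):
    """Return the new weight of a single synapse after one plasticity step.

    Assumed rule (hypothetical, not taken from the abstract):
        dw = eta * (A * pre * post + B * pre + C * post + D)
    The update uses only the pre- and postsynaptic activity of this synapse,
    which is what makes it homosynaptic; no third, modulatory signal is used.
    """
    dw = eta * (A * pre * post + B * pre + C * post + D)
    # Clamp the weight to [-w_max, w_max] so it cannot grow without bound.
    return max(-w_max, min(w_max, w + dw))


if __name__ == "__main__":
    w = 0.0
    # Repeated co-activation of both neurons strengthens the synapse.
    for _ in range(5):
        w = hebbian_update(w, pre=1.0, post=1.0)
    print(w)
```

With the default parameters only the correlation term is active, so the weight changes only when both neurons are active together. In an evolutionary setting the coefficients would presumably be among the evolved parameters, but that too is an assumption here.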
