Neuromodulated Learning in Deep Neural Networks

In the brain, learning signals vary over time and across synaptic locations, and are applied according to the learning history at each synapse, through the complex process of neuromodulation. Learning in artificial neural networks, by contrast, is shaped by hyperparameters that are set before training starts, remain static throughout training, and are uniform across the entire network. In this work, we propose a method of deep artificial neuromodulation that applies the concepts of biological neuromodulation to stochastic gradient descent. Evolved neuromodulatory dynamics modify the learning parameters at each layer of a deep neural network over the course of the network's training. We show that the same neuromodulatory dynamics can be applied to different models and can scale to new problems not encountered during evolution. Finally, we examine the evolved neuromodulation and show that evolution found dynamic, location-specific learning strategies.
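The idea of layer-wise, history-dependent learning parameters can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `modulate` rule is hand-written rather than evolved, the names `modulate` and `train` are hypothetical, and the "network" is a toy quadratic objective with two weight groups standing in for layers. It shows only the mechanism of a per-layer learning rate that is updated during training from that layer's own gradient history, not the method proposed in this work.

```python
import math

def modulate(lr, grad_norm, prev_grad_norm, layer_idx):
    """Hypothetical hand-written rule standing in for an evolved modulator."""
    # History-dependent: grow the rate while the gradient norm is shrinking,
    # otherwise cut it back.
    factor = 1.1 if grad_norm < prev_grad_norm else 0.5
    # Location-dependent: later layers get a lower ceiling on their rate.
    cap = 0.5 / (1 + layer_idx)
    floor = 1e-4
    return min(max(lr * factor, floor), cap)

def loss(layers, targets):
    """Toy objective: 0.5 * squared distance of each weight to its target."""
    return sum(0.5 * (w - t) ** 2
               for ws, ts in zip(layers, targets)
               for w, t in zip(ws, ts))

def train(layers, targets, steps=50, base_lr=0.05):
    """Plain gradient descent where each layer keeps its own, changing rate."""
    lrs = [base_lr] * len(layers)
    prev_norms = [float("inf")] * len(layers)
    history = []  # per-step snapshot of the per-layer learning rates
    for _ in range(steps):
        for i, (ws, ts) in enumerate(zip(layers, targets)):
            # Gradient of the toy objective with respect to this layer.
            grad = [w - t for w, t in zip(ws, ts)]
            norm = math.sqrt(sum(g * g for g in grad))
            # The learning rate is modulated before every update.
            lrs[i] = modulate(lrs[i], norm, prev_norms[i], i)
            prev_norms[i] = norm
            for j, g in enumerate(grad):
                ws[j] -= lrs[i] * g
        history.append(list(lrs))
    return history

layers = [[1.0, -2.0], [0.5, 3.0]]
targets = [[0.0, 0.0], [1.0, 1.0]]
initial_loss = loss(layers, targets)
history = train(layers, targets)
final_loss = loss(layers, targets)
```

In this sketch the learning rates start equal, change over the course of training, and end at different values for the two layers, which is the kind of dynamic, location-specific behavior described above. In the proposed method the modulation rule is found by evolution rather than written by hand.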
