Real-time Hebbian Learning from Autoencoder Features for Control Tasks

Neural plasticity, and in particular Hebbian learning, plays an important role in many research areas related to artificial life. By allowing artificial neural networks (ANNs) to adjust their weights in real time, Hebbian ANNs can adapt over their lifetime. However, even as researchers improve and extend Hebbian learning, a fundamental limitation of such systems is that they learn correlations between preexisting static features and network outputs. A Hebbian ANN could in principle achieve significantly more if it could accumulate new features over its lifetime from which to learn correlations. Interestingly, autoencoders, which have recently gained prominence in deep learning, are themselves in effect a kind of feature accumulator that extracts meaningful features from its inputs. The insight in this paper is that if an autoencoder is connected to a Hebbian learning layer, then the resulting Real-time Autoencoder-Augmented Hebbian Network (RAAHN) can learn new features (with the autoencoder) while simultaneously learning control policies from those new features (with the Hebbian layer) in real time as an agent experiences its environment. In this paper, the RAAHN is shown in a simulated robot maze navigation experiment to enable a controller to learn the perfect navigation strategy significantly more often than several Hebbian-based variant approaches that lack the autoencoder. In the long run, this approach opens up the intriguing possibility of real-time deep learning for control.
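The coupling the abstract describes can be sketched in a few lines: an autoencoder refines its features with one online gradient step per sensor reading, while a modulated Hebbian layer simultaneously correlates those features with control outputs. The following is a minimal toy sketch of that idea, not the authors' implementation; all names, dimensions, learning rates, and the specific modulated Hebbian rule are illustrative assumptions.

```python
import numpy as np

class RAAHNSketch:
    """Toy sketch: an online autoencoder feeding a modulated Hebbian layer."""

    def __init__(self, n_in, n_feat, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(0.0, 0.1, (n_feat, n_in))   # encoder weights
        self.W_dec = rng.normal(0.0, 0.1, (n_in, n_feat))   # decoder weights
        self.W_heb = rng.normal(0.0, 0.1, (n_out, n_feat))  # Hebbian weights

    @staticmethod
    def _sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def step(self, x, modulation, lr_ae=0.05, lr_heb=0.01):
        """One real-time update from a single sensor vector x.

        modulation is a scalar reward-like signal that scales (and can
        flip the sign of) the Hebbian correlation update.
        """
        # Autoencoder: one online gradient step on reconstruction error,
        # so the learned features keep improving as inputs arrive.
        h = self._sigmoid(self.W_enc @ x)          # learned features
        x_hat = self.W_dec @ h                     # reconstruction
        err = x_hat - x
        grad_h = (self.W_dec.T @ err) * h * (1.0 - h)
        self.W_dec -= lr_ae * np.outer(err, h)
        self.W_enc -= lr_ae * np.outer(grad_h, x)
        # Hebbian layer: correlate the *current* features with the control
        # outputs, modulated by the reward signal.
        y = np.tanh(self.W_heb @ h)                # control outputs
        self.W_heb += lr_heb * modulation * np.outer(y, h)
        return y
```

In use, an agent would call `step` once per sensor reading, so feature learning and policy learning proceed together rather than in separate offline phases.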
