A dual-memory architecture for reinforcement learning on neuromorphic platforms

Reinforcement learning (RL) is foundational to learning in biological systems and provides a framework for addressing numerous challenges in real-world artificial intelligence applications. Efficient implementations of RL techniques could enable agents deployed at the edge to gain new abilities, such as improved navigation, understanding of complex situations, and critical decision making. Toward this goal, we describe a flexible architecture for carrying out RL on neuromorphic platforms. We implemented this architecture on an Intel neuromorphic processor and demonstrated it solving a variety of tasks using spiking dynamics. Our study proposes a usable solution for real-world RL applications and demonstrates the applicability of neuromorphic platforms to RL problems.
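The dual-memory idea named in the title can be illustrated with a minimal sketch: a fast episodic memory stores recent transitions, while a slow value memory is updated both online and by replaying the buffer. This is a hypothetical tabular toy (a short 1-D chain task with hedged names such as `buffer` and `td_update`), not the spiking implementation described in the paper.

```python
import random

random.seed(0)

N_STATES = 5          # chain of states 0..4; reward at the right end
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}  # slow memory
buffer = []                                                  # fast episodic memory

def step(s, a):
    """Move along the chain; reward 1 on reaching the rightmost state."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def td_update(s, a, r, s2):
    """Standard one-step Q-learning update on the slow memory."""
    target = r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += ALPHA * (target - q[(s, a)])

for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action selection
        a = random.choice(ACTIONS) if random.random() < EPS else \
            max(ACTIONS, key=lambda b: q[(s, b)])
        s2, r = step(s, a)
        td_update(s, a, r, s2)          # online (fast) learning
        buffer.append((s, a, r, s2))    # store episode in fast memory
        s = s2
    # offline consolidation: replay a sample of stored transitions
    for tr in random.sample(buffer, min(32, len(buffer))):
        td_update(*tr)

# The learned greedy policy should move right from every non-terminal state.
policy = {s: max(ACTIONS, key=lambda b: q[(s, b)]) for s in range(N_STATES - 1)}
print(policy)
```

The replay loop plays the role of consolidation in complementary-learning-systems accounts: experiences held briefly in the fast store are gradually absorbed into the slow value estimate.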
