Paper information - A 55-nm, 1.0–0.4V, 1.25-pJ/MAC Time-Domain Mixed-Signal Neuromorphic Accelerator With Stochastic Synapses for Reinforcement Learning in Autonomous Mobile Robots

Reinforcement learning (RL) is a bio-mimetic learning approach in which agents learn about an environment by performing specific tasks without human supervision. RL is inspired by behavioral psychology: agents take actions to maximize a cumulative reward. In this paper, we present an RL neuromorphic accelerator capable of performing obstacle avoidance in a mobile robot at the edge of the cloud. We propose an energy-efficient time-domain mixed-signal (TD-MS) computational framework and demonstrate that, in TD-MS computation, the energy to compute is proportional to the importance of the computation. The proposed RL implementation leverages the unique properties of stochastic networks and recent advances in Q-learning. The 55-nm test chip implements RL using a three-layer fully connected neural network and consumes a peak power of 690 $\mu\text{W}$.
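For readers unfamiliar with Q-learning, the idea of an agent acting to maximize cumulative reward can be illustrated with a minimal tabular sketch. This is not the chip's method: the paper uses a three-layer neural network in mixed-signal hardware, whereas the code below is a generic software illustration. The corridor environment, the function names `q_update` and `train_corridor`, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for this example.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Moves Q(s, a) toward the target r + gamma * max_a' Q(s', a').
    alpha (learning rate) and gamma (discount) are illustrative values.
    """
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def train_corridor(n_states=5, episodes=200, epsilon=0.2, seed=0):
    """Train on a hypothetical 1-D corridor (not the paper's robot task).

    Actions: 0 = step left, 1 = step right. The agent starts in cell 0 and
    receives reward 1.0 on entering the last cell, which ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection; ties are broken toward "right".
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            q_update(q, s, a, r, s_next)
            s = s_next
    return q


if __name__ == "__main__":
    table = train_corridor()
    for cell, (left, right) in enumerate(table):
        print(f"cell {cell}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

In this toy setting the reward is only given at the goal, so the value of stepping right in the cell next to the goal approaches 1.0 and earlier cells learn discounted versions of it through repeated updates.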
