Tianjic: A Unified and Scalable Chip Bridging Spike-Based and Continuous Neural Computation

Toward the long-standing dream of artificial intelligence, two successful solution paths have been paved: 1) neuromorphic computing and 2) deep learning. Recently, the two fields have begun to interact in pursuit of simultaneously achieving biological plausibility and high accuracy. However, models from these two domains must run on distinct substrates, i.e., neuromorphic platforms and deep learning accelerators, respectively. This architectural incompatibility greatly compromises modeling flexibility and hinders promising interdisciplinary research. To address this issue, we build a unified model description framework and a unified processing architecture (Tianjic), which covers the full stack from software to hardware. By implementing a set of integration and transformation operations, Tianjic is able to support spiking neural networks, biologically inspired dynamic neural networks, multilayer perceptrons, convolutional neural networks, recurrent neural networks, and more. A compatible routing infrastructure enables homogeneous and heterogeneous scalability on a decentralized many-core network. Several optimization methods are incorporated, such as resource and data sharing, near-memory processing, compute/access skipping, and intra-/inter-core pipelining, to improve performance and efficiency. We further design streaming mapping schemes for efficient network deployment with a flexible tradeoff between execution throughput and resource overhead. A 28-nm prototype chip is fabricated with more than 610 GB/s of internal memory bandwidth. A variety of benchmarks are evaluated and compared with GPUs and several existing specialized platforms. In summary, the fully unfolded mapping achieves significantly higher throughput and power efficiency, while the semi-folded mapping saves about 30x in resources while still delivering comparable performance on average. Finally, two hybrid-paradigm examples, a multimodal unmanned bicycle and a hybrid neural network, are demonstrated to show the potential of our unified architecture. This article paves a new way to explore neural computing.
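
As a minimal illustration of the cross-paradigm idea described above, the Python sketch below models a single functional core that shares one integration stage (a weighted sum) and switches only the transformation stage between an ANN-style continuous activation and an SNN-style leaky integrate-and-fire with threshold and reset. This is a conceptual toy under stated assumptions, not the chip's actual microarchitecture or dataflow; all names (UnifiedCore, v_mem, threshold, leak) are hypothetical.

```python
import numpy as np

class UnifiedCore:
    """Toy model of a unified core: one shared integration stage and
    two selectable transformation stages (ANN vs. SNN mode).
    Hypothetical sketch; not the actual Tianjic microarchitecture."""

    def __init__(self, weights, mode="ann", threshold=1.0, leak=0.9):
        assert mode in ("ann", "snn")
        self.w = np.asarray(weights, dtype=np.float32)
        self.mode = mode
        self.threshold = threshold  # firing threshold (SNN mode only)
        self.leak = leak            # membrane leak factor (SNN mode only)
        self.v_mem = np.zeros(self.w.shape[0], dtype=np.float32)

    def step(self, x):
        # Shared integration stage: weighted-sum (dendritic) integration.
        i_syn = self.w @ np.asarray(x, dtype=np.float32)
        if self.mode == "ann":
            # Continuous transformation: stateless ReLU activation.
            return np.maximum(i_syn, 0.0)
        # Spiking transformation: leaky integrate-and-fire with reset.
        self.v_mem = self.leak * self.v_mem + i_syn
        spikes = (self.v_mem >= self.threshold).astype(np.float32)
        self.v_mem[spikes > 0] = 0.0  # reset neurons that fired
        return spikes

rng = np.random.default_rng(0)
w = rng.normal(scale=0.5, size=(4, 8))
ann_core = UnifiedCore(w, mode="ann")
snn_core = UnifiedCore(w, mode="snn", threshold=0.8)
x = rng.random(8)
print("ANN output:", ann_core.step(x))
for t in range(3):  # SNN mode is stateful and runs over time steps
    print(f"SNN spikes at t={t}:", snn_core.step(x))
```

In this sketch the two paradigms differ only in the transformation step while reusing the same integration hardware and state memory, which is the property that allows a single substrate to host spike-based and continuous models side by side.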
