Benchmarking Delay and Energy of Neural Inference Circuits

Neural network circuits and architectures are currently under active research for applications in artificial intelligence and machine learning. This work estimates their physical performance metrics: area, time, and energy. Various types of neural networks (artificial, cellular, spiking, and oscillator) are implemented with multiple CMOS and beyond-CMOS (spintronic, ferroelectric, and resistive-memory) devices. A consistent and transparent methodology is proposed and used to benchmark this comprehensive set of options across several application cases, and promising architecture/device combinations are identified.
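
To make the kind of estimate the abstract describes concrete, below is a minimal first-order sketch, not the paper's actual benchmarking model. It assumes each device technology can be summarized by per-multiply-accumulate (MAC) energy and delay figures; the `Device` class, the `layer_cost` helper, and all numeric values are illustrative assumptions introduced here, not figures from the paper.

```python
# Illustrative sketch (assumed model, not the paper's methodology):
# first-order delay/energy estimate for one dense neural-network layer.

from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical per-operation figures of merit for a device technology."""
    name: str
    e_mac: float  # energy per multiply-accumulate, joules (assumed)
    t_mac: float  # time per multiply-accumulate, seconds (assumed)


def layer_cost(n_in: int, n_out: int, dev: Device, parallelism: int = 1):
    """A dense layer performs n_in * n_out MACs.

    Energy scales with the total MAC count; delay scales with the number
    of sequential MAC steps given `parallelism` concurrent units.
    """
    macs = n_in * n_out
    energy = macs * dev.e_mac
    delay = (macs / parallelism) * dev.t_mac
    return energy, delay


# Example: compare two hypothetical device options on a 784x100 layer
# (all figures of merit below are placeholder assumptions).
cmos = Device("CMOS", e_mac=1e-13, t_mac=1e-9)
reram = Device("ReRAM crossbar", e_mac=2e-14, t_mac=5e-9)

for dev in (cmos, reram):
    e, t = layer_cost(784, 100, dev, parallelism=100)
    print(f"{dev.name}: energy = {e:.2e} J, delay = {t:.2e} s")
```

In the paper's full methodology, such per-operation figures would come from device-level analysis of each CMOS and beyond-CMOS option; the sketch only shows how delay and energy trade off once those figures are fixed.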
