An Artificial Neural Network Processor With a Custom Instruction Set Architecture for Embedded Applications

This article presents the design and implementation of an embedded programmable processor with a custom instruction set architecture for the efficient realization of artificial neural networks (ANNs). The ANN processor architecture is scalable, supporting an arbitrary number of layers and an arbitrary number of artificial neurons (ANs) per layer. Moreover, the processor supports ANNs with arbitrary interconnect structures among ANs, so that both feed-forward and dynamic recurrent networks can be realized. The architecture is also customizable: the numerical representation of inputs, outputs, and signals among ANs can be parameterized to an arbitrary fixed-point format. An ASIC implementation of the designed programmable ANN processor for networks with up to 512 ANs and 262,000 interconnects is presented and is estimated to occupy 2.23 mm² of silicon area and consume 1.25 mW of power from a 1.6 V supply while operating at 74 MHz in a standard 32-nm CMOS technology. To assess and compare the efficiency of the designed ANN processor, we have also designed and implemented a dedicated reconfigurable hardware architecture for the direct realization of ANNs. Characteristics and implementation results of the designed programmable ANN processor and the dedicated ANN hardware on a Xilinx Artix-7 field-programmable gate array (FPGA) are presented and compared using two benchmarks: the MNIST benchmark using a feed-forward ANN and a movie review sentiment analysis benchmark using a recurrent neural network.
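To make the two parameterizable ideas above concrete (an arbitrary fixed-point format and an arbitrary interconnect among ANs), the following is a minimal software sketch, not the processor's actual datapath or instruction set. The function names (`to_fixed`, `to_real`, `step`), the edge-list weight layout, the ReLU activation, and the truncating right shift are all assumptions made for illustration; the abstract does not specify any of them.

```python
# Illustrative sketch only: names, data layout, and activation are assumptions,
# not details taken from the article.

def to_fixed(x, word_bits, frac_bits):
    """Quantize a real value to a signed fixed-point integer with saturation.

    word_bits is the total width including the sign bit; frac_bits is the
    number of fractional bits.
    """
    raw = round(x * (1 << frac_bits))
    hi = (1 << (word_bits - 1)) - 1
    lo = -(1 << (word_bits - 1))
    return max(lo, min(hi, raw))


def to_real(q, frac_bits):
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)


def step(state, weights, held, word_bits, frac_bits):
    """Advance every neuron by one time step.

    state   : list of fixed-point neuron outputs from the previous step
    weights : dict mapping (src, dst) -> fixed-point weight; any edge is
              allowed, so self-loops and backward edges give recurrence
    held    : dict mapping neuron index -> fixed-point value for neurons
              driven externally (network inputs)
    """
    hi = (1 << (word_bits - 1)) - 1
    new_state = []
    for dst in range(len(state)):
        if dst in held:
            new_state.append(held[dst])
            continue
        # Accumulate products at double fractional precision.
        acc = sum(w * state[src] for (src, d), w in weights.items() if d == dst)
        acc >>= frac_bits              # rescale back to frac_bits (truncating)
        acc = max(0, acc)              # assumed ReLU activation
        new_state.append(min(hi, acc)) # saturate to the word width
    return new_state


# Two neurons in a 16-bit format with 8 fractional bits:
# neuron 0 is an input held at 1.0; neuron 1 has a feed-forward edge from
# neuron 0 and a recurrent self-loop.
W, F = 16, 8
held = {0: to_fixed(1.0, W, F)}
weights = {(0, 1): to_fixed(0.5, W, F), (1, 1): to_fixed(0.25, W, F)}
state = [held[0], 0]
state = step(state, weights, held, W, F)
state = step(state, weights, held, W, F)
print(to_real(state[1], F))
```

Changing `W` and `F` changes the numerical format without touching the network description, which is the kind of parameterization the abstract describes; a hardware design would fix these at synthesis or configuration time rather than pass them as arguments.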
