LSTM-type Neural Network Implementation on a Processor Based on Neuromatrix and RISC Cores for Resource-Limited Applications

In this paper, we consider the implementation of a long short-term memory (LSTM) recurrent neural network on the NM6407 digital signal processor (DSP), which is optimized for vector and matrix calculations. The processor contains two NeuroMatrix NMC4 cores, each of which includes a RISC processor and a vector coprocessor. We describe the architectural features and resources of the processor and assess its performance when running an LSTM network on a typical classification problem. Implementing this type of network on the NM6407 tensor core accelerated computations by a factor of 15 to 350 compared with a scalar processor.
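To illustrate why an LSTM maps naturally onto hardware built for vector and matrix work, the following is a minimal sketch of one LSTM time step in the standard formulation (input, forget, and output gates plus a candidate cell update). It is not the paper's NM6407 code: the function names `matvec` and `lstm_step`, the dictionary layout of the weights, and the pure-Python scalar loops are all assumptions made for illustration. Almost all of the arithmetic sits in the eight matrix-vector products per step, which is the kind of operation a vector coprocessor can execute in bulk.

```python
import math


def sigmoid(x):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Scalar matrix-vector product; W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step (hypothetical helper, standard equations).

    W, U, b are dicts keyed by gate name: 'i' (input), 'f' (forget),
    'o' (output), 'g' (candidate cell value).
    """
    pre = {}
    for gate in ("i", "f", "o", "g"):
        wx = matvec(W[gate], x)        # input contribution
        uh = matvec(U[gate], h_prev)   # recurrent contribution
        pre[gate] = [a + r + bias for a, r, bias in zip(wx, uh, b[gate])]

    i = [sigmoid(v) for v in pre["i"]]
    f = [sigmoid(v) for v in pre["f"]]
    o = [sigmoid(v) for v in pre["o"]]
    g = [math.tanh(v) for v in pre["g"]]

    # New cell state: keep part of the old state, add gated candidate.
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    # New hidden state: gated, squashed cell state.
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c
```

Calling `lstm_step` once per input sample and feeding `h` and `c` back in as `h_prev` and `c_prev` processes a sequence; a classifier would typically read the final `h`.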
