Survey of deployment locations and underlying hardware architectures for contemporary deep neural networks

This article surveys the emerging use of deep neural networks in data analytics and examines which underlying hardware and architectural approach best suits each deployment location. Three deployment locations are discussed: cloud, fog, and dew computing (dew computing is performed by end devices). The architectural approaches covered are multicore processors (central processing units), manycore processors (graphics processing units), field-programmable gate arrays, and application-specific integrated circuits. The proposed classification organizes existing solutions along these two dimensions, deployment location and architectural approach, yielding 12 categories. It allows a comparison of existing architectures, which are predominantly cloud-based, with anticipated future architectures, which are expected to be hybrid cloud-fog-dew architectures for applications in the Internet of Things and wireless sensor networks. Researchers studying trade-offs among data processing bandwidth, data processing latency, and processing power consumption would benefit from this classification.
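The 12 categories follow from crossing the three deployment locations with the four architectural approaches named above. A minimal sketch of that grid is given below; the identifiers `LOCATIONS`, `ARCHITECTURES`, and `classification_grid` are illustrative names chosen here, not terminology from the article.

```python
from itertools import product

# The two dimensions named in the abstract (labels abbreviated for brevity).
LOCATIONS = ("cloud", "fog", "dew")
ARCHITECTURES = ("CPU", "GPU", "FPGA", "ASIC")


def classification_grid():
    """Return every (location, architecture) pair, i.e. one entry per category."""
    return list(product(LOCATIONS, ARCHITECTURES))


if __name__ == "__main__":
    # Print one line per category in the 3 x 4 grid.
    for location, architecture in classification_grid():
        print(f"{location:>5} x {architecture}")
```

Each pair stands for one cell of the classification, for example a GPU-based solution deployed in the fog.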
