Partitioning Convolutional Neural Networks to Maximize the Inference Rate on Constrained IoT Devices

Billions of devices are expected to join the Internet of Things (IoT) in the next few years, generating a huge amount of data. Fog computing can process these data while reducing the risk of overloading the network toward the cloud. In this context, deep learning is well suited to analyzing these data, but the memory requirements of deep neural networks may prevent them from executing on a single resource-constrained device, and their computational requirements may lead to infeasible execution times. In this work, we propose Deep Neural Networks Partitioning for Constrained IoT Devices, a new algorithm that partitions neural networks for efficient distributed execution. The algorithm can optimize either the inference rate of the neural network or the number of communications among devices, and it accounts correctly for the shared parameters and biases of Convolutional Neural Networks. We investigate inference rate maximization for the LeNet model in constrained setups. We show that the partitionings produced by popular machine learning frameworks such as TensorFlow, or by the general-purpose framework METIS, may be invalid for very constrained setups. The results show that our algorithm can partition LeNet for all the proposed setups, yielding up to 38% more inferences per second than METIS.
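
To make the notions of a valid partitioning and of inference rate concrete, the following is a minimal sketch and not the algorithm proposed in this work. It assumes a simple chain of layers assigned to devices at layer granularity, a single shared link bandwidth, and a pipelined execution in which the slowest device or transfer bounds the rate. All names (`Layer`, `Device`, `inference_rate`) and all numbers are hypothetical, and shared parameters are simply counted once per layer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Layer:
    flops: float      # operations needed for one inference (hypothetical figure)
    mem_bytes: int    # memory for parameters and activations (hypothetical figure)
    out_bytes: int    # size of the output sent to the next layer


@dataclass
class Device:
    memory_bytes: int   # memory available on the device
    flops_per_s: float  # sustained compute throughput


def inference_rate(layers: List[Layer], assignment: List[int],
                   devices: List[Device],
                   link_bytes_per_s: float) -> Optional[float]:
    """Return inferences per second, or None if the partitioning is invalid.

    Assumption: in a pipeline, the rate is the inverse of the longest
    stage time, where a stage is a device's computation or a transfer
    between two consecutive layers placed on different devices.
    """
    mem_used = [0] * len(devices)
    compute_time = [0.0] * len(devices)
    for layer, dev in zip(layers, assignment):
        mem_used[dev] += layer.mem_bytes
        compute_time[dev] += layer.flops / devices[dev].flops_per_s

    # A partitioning is invalid if any device exceeds its memory.
    if any(used > d.memory_bytes for used, d in zip(mem_used, devices)):
        return None

    # Only outputs that cross a device boundary are transferred.
    transfer_times = [
        layers[i].out_bytes / link_bytes_per_s
        for i in range(len(layers) - 1)
        if assignment[i] != assignment[i + 1]
    ]
    bottleneck = max(compute_time + transfer_times)
    return 1.0 / bottleneck


# Hypothetical three-layer network on two constrained devices.
layers = [Layer(2e6, 4000, 1000), Layer(4e6, 6000, 500), Layer(1e6, 3000, 40)]
devices = [Device(8000, 1e8), Device(10000, 1e8)]

valid_rate = inference_rate(layers, [0, 1, 1], devices, link_bytes_per_s=1e5)
invalid_rate = inference_rate(layers, [0, 0, 1], devices, link_bytes_per_s=1e5)
```

In this toy setup, the assignment `[0, 1, 1]` fits in memory and is bounded by the computation on the second device, while `[0, 0, 1]` exceeds the first device's memory and is rejected, which is the sense in which a partitioning can be invalid for a very constrained setup.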
