PANCODE: Multilevel Partitioning of Neural Networks for Constrained Internet-of-Things Devices

The growing number of Internet-of-Things (IoT) devices will generate unprecedented volumes of data in the coming years. Fog computing can help prevent saturation of the network infrastructure by processing data at the edge or within the devices themselves. Consequently, machine intelligence that has been built almost exclusively in the cloud can be distributed to edge devices. Although deep learning techniques can adequately process the massive data volumes that IoT produces, their high resource demands pose a trade-off when executing on resource-constrained devices. This paper proposes and evaluates PArtitioning Networks for COnstrained DEvices (PANCODE), a novel algorithm that employs a multilevel approach to partition large convolutional neural networks for distributed execution on constrained IoT devices. Experimental results with the LeNet and AlexNet models show that our algorithm produces partitionings that achieve up to 2173.53 times more inferences per second than the Best Fit algorithm and up to 1.37 times less communication than the second-best approach. We also show that the state-of-the-art METIS framework produces only invalid partitionings in the more constrained setups. The results indicate that our algorithm achieves higher inference rates and low communication costs for convolutional neural networks distributed among constrained and very constrained devices.
