Accelerating Linux and Android applications on low‐power devices through remote GPGPU offloading

Low‐power devices are usually highly constrained in CPU computing power, memory, and GPGPU resources, which limits their ability to run real‐time applications. In this paper, we describe RAPID, a complete framework suite for computation offloading that helps low‐power devices overcome these limitations. RAPID supports CPU and GPGPU computation offloading on Linux and Android devices. Moreover, the framework implements lightweight secure data transmission for the offloading operations. We present the architecture of the framework, showing how the CPU and GPGPU offloading modules are integrated. We show through extensive experiments that the overhead introduced by the security layer is negligible. We present the first benchmark results showing that Java/Android GPGPU code offloading is possible. Finally, we show the adoption of GPGPU offloading in BioSurveillance, a commercial real‐time face recognition application. The results show that, thanks to RAPID, BioSurveillance is being successfully adapted to run on low‐power devices. The proposed framework is highly modular and exposes a rich application programming interface to developers, making it highly versatile while hiding the complexity of the underlying networking layer.
