A DNN-based object detection system on mobile cloud computing

With the growth of big data and computing power, deep learning has achieved major breakthroughs in computer vision. However, running a Deep Neural Network (DNN) for video processing on a mobile device incurs an extremely high computational overhead. To address this problem, this paper combines smartphones with the cloud to realize a DNN-based object detection system. The contributions are three-fold. (i) A model scheduling algorithm is proposed that adaptively selects the execution environment (cloud or mobile) according to the conditions of the network and the mobile device. (ii) To meet the hardware constraints of mobile devices, compact model variants are trained and generated with only a small loss of precision. (iii) To reduce latency, the outputs of the DNN models are used directly to process the video, adding bounding boxes and annotations. Runtime and precision tests show that our system outperforms the state of the art in both detection accuracy and running speed.
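
The scheduling idea in contribution (i) can be illustrated with a minimal sketch. The abstract does not specify the actual algorithm, so everything below is an assumption made for illustration: the `Conditions` fields, the latency estimate, the default inference times, and the battery threshold are all hypothetical. The sketch simply compares an estimated cloud round trip against an assumed on-device inference time.

```python
from dataclasses import dataclass


@dataclass
class Conditions:
    """Hypothetical snapshot of network and device state (not from the paper)."""
    bandwidth_mbps: float  # measured uplink bandwidth in Mbit/s
    rtt_ms: float          # round-trip time to the cloud server
    battery_pct: float     # remaining battery, 0-100
    frame_kbytes: float    # size of one encoded frame to upload


def estimate_cloud_latency_ms(c: Conditions, cloud_infer_ms: float) -> float:
    """Upload time + round trip + cloud inference, in milliseconds."""
    if c.bandwidth_mbps <= 0:
        return float("inf")  # no connectivity: the cloud path is unusable
    # kilobits divided by Mbit/s gives milliseconds (1 Mbit = 1000 kbit).
    upload_ms = c.frame_kbytes * 8.0 / c.bandwidth_mbps
    return upload_ms + c.rtt_ms + cloud_infer_ms


def select_environment(
    c: Conditions,
    mobile_infer_ms: float = 400.0,  # assumed latency of the compact on-device model
    cloud_infer_ms: float = 50.0,    # assumed latency of the full cloud model
    low_battery_pct: float = 15.0,   # assumed threshold for offloading to save power
) -> str:
    """Return "cloud" or "mobile" for the next frame (illustrative policy only)."""
    cloud_ms = estimate_cloud_latency_ms(c, cloud_infer_ms)
    if cloud_ms == float("inf"):
        return "mobile"  # offline: only the on-device model can run
    if c.battery_pct < low_battery_pct:
        return "cloud"   # preserve battery whenever the cloud is reachable
    return "cloud" if cloud_ms < mobile_infer_ms else "mobile"


if __name__ == "__main__":
    good = Conditions(bandwidth_mbps=10.0, rtt_ms=40.0, battery_pct=80.0, frame_kbytes=100.0)
    poor = Conditions(bandwidth_mbps=0.5, rtt_ms=200.0, battery_pct=80.0, frame_kbytes=100.0)
    print(select_environment(good), select_environment(poor))
```

A real scheduler would likely smooth the bandwidth and latency measurements over time and account for the precision gap between the compact and full models; this sketch looks only at a single per-frame latency estimate.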
