EdgeDuet: Tiling Small Object Detection for Edge Assisted Autonomous Mobile Vision

Accurate, real-time object detection on resource-constrained devices enables autonomous mobile vision applications such as traffic surveillance, situational awareness, and safety inspection, where it is crucial to detect both small and large objects in crowded scenes. Prior studies either perform object detection locally on board or offload the task to the edge or cloud. Local object detection yields low accuracy on small objects because it operates on low-resolution video to fit in mobile memory. Offloaded object detection incurs high latency because high-resolution video must be uploaded to the edge or cloud. Rather than relying purely on local processing or purely on offloading, we propose to detect large objects locally while offloading small object detection to the edge. The key challenge is to reduce the latency of small object detection. Accordingly, we develop EdgeDuet, the first edge-device collaborative framework for enhancing small object detection with tile-level parallelism. It optimizes the offloaded detection pipeline at the granularity of tiles rather than entire frames to achieve high accuracy and low latency. Evaluations on drone vision datasets under LTE, 2.4 GHz WiFi, and 5 GHz WiFi show that EdgeDuet outperforms local object detection in small object detection accuracy by 233.0%. Compared with state-of-the-art offloading schemes, it also improves detection accuracy by 44.7% and reduces latency by 34.2%.
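To make the idea of tile-level parallelism concrete, the following is a minimal sketch and not EdgeDuet's actual pipeline. It assumes a frame is covered by a fixed grid of non-overlapping tiles, that each tile is handed to a detector independently, and that per-tile boxes are mapped back to frame coordinates. The names `tile_grid`, `to_frame_coords`, and `detect_in_tiles`, as well as the thread-pool dispatch, are hypothetical choices made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# A box is (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def tile_grid(frame_w: int, frame_h: int, tile_w: int, tile_h: int) -> List[Box]:
    """Cover the frame with non-overlapping tiles in row-major order.

    Tiles on the right and bottom edges are clipped to the frame.
    """
    tiles = []
    for y in range(0, frame_h, tile_h):
        for x in range(0, frame_w, tile_w):
            tiles.append((x, y, min(tile_w, frame_w - x), min(tile_h, frame_h - y)))
    return tiles


def to_frame_coords(tile: Box, box_in_tile: Box) -> Box:
    """Shift a box from tile-local coordinates to frame coordinates."""
    tx, ty, _, _ = tile
    bx, by, bw, bh = box_in_tile
    return (tx + bx, ty + by, bw, bh)


def detect_in_tiles(
    tiles: List[Box],
    detect: Callable[[Box], List[Box]],
    workers: int = 4,
) -> List[Box]:
    """Run a (hypothetical) per-tile detector on all tiles concurrently.

    `detect` receives a tile and returns boxes in tile-local coordinates.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_tile = list(pool.map(detect, tiles))  # results keep tile order
    return [
        to_frame_coords(tile, box)
        for tile, boxes in zip(tiles, per_tile)
        for box in boxes
    ]
```

In this sketch a real detector would crop the frame to each tile before inference. A practical system would also need to handle objects that straddle tile boundaries, which this simplified grid ignores.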
