Distributed and Efficient Object Detection via Interactions Among Devices, Edge, and Cloud

With the rapid development of Internet-of-Things and communication technologies, media transmission in surveillance applications increasingly relies on wireless networks. Meanwhile, the emergence of edge computing has pushed media data analysis from the cloud to the edge of the network to achieve fast responses for delay-sensitive media processing tasks. Object detection is a representative delay-sensitive image processing task in surveillance applications, but it faces significant challenges in this context: for example, how to compress images for transmission in a wireless environment without compromising detection accuracy, and how to integrate and update local inference models online in an edge computing-based object detection system. In this paper, we propose an object detection architecture based on edge computing to achieve distributed and efficient object detection for surveillance applications. Under this architecture, we develop an adaptive Region-of-Interest-based image compression scheme that lets end devices efficiently compress their captured images for wireless transmission without sacrificing the object detection accuracy of edge servers. Furthermore, we carefully design distributed and communication-efficient interactions among end devices, edge servers, and the cloud to dynamically optimize the object detection accuracy online. Extensive simulation results demonstrate that our proposed architecture not only achieves detection accuracy competitive with the traditional cloud-based object detection solution at reduced response delay, but also significantly improves image transmission efficiency through an adaptive image compression ratio.
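The abstract does not specify how the Region-of-Interest-based compression operates, so the following is only a minimal illustrative sketch of the general idea, not the scheme proposed in the paper. It assumes a grayscale image and a single rectangular ROI that is already known. The function name `roi_quantize` and the parameter `bg_step` are hypothetical. Pixels outside the ROI are coarsely quantized so that a downstream encoder has less information to store, while pixels inside the ROI are left untouched.

```python
import numpy as np

def roi_quantize(image, roi, bg_step=32):
    """Coarsely quantize the background and keep the ROI unchanged.

    image   : 2-D uint8 array (grayscale), an assumption for simplicity
    roi     : (top, left, bottom, right), half-open pixel bounds (hypothetical format)
    bg_step : quantization step for background pixels (hypothetical parameter)
    """
    top, left, bottom, right = roi
    # Snap every pixel down to a multiple of bg_step: fewer distinct values,
    # hence a more compressible background.
    out = ((image // bg_step) * bg_step).astype(np.uint8)
    # Restore the original pixel values inside the region of interest.
    out[top:bottom, left:right] = image[top:bottom, left:right]
    return out

# Usage: an 8x8 ramp image with a 4x4 ROI in the centre.
img = np.arange(64, dtype=np.uint8).reshape(8, 8)
compressed = roi_quantize(img, roi=(2, 2, 6, 6), bg_step=32)
```

An adaptive variant could choose `bg_step` per image, for instance from the available bandwidth, but that policy is likewise an assumption rather than something stated in the abstract.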
