Dynamic prioritization of surveillance video data in real-time automated detection systems

Abstract: Automated object detection systems are a key component of modern surveillance applications. These systems rely on computationally expensive computer vision algorithms that perform object detection on visual data recorded by surveillance cameras. Because of the security and safety implications of these systems, this visual data must be processed accurately and in real time. However, many of the frames produced by surveillance cameras may be of low importance, providing little or no useful information to the object detection system. Sub-sampling surveillance data by prioritizing important camera frames can greatly reduce unnecessary computation. Consequently, several works have explored dynamic visual data sub-sampling using various modalities of information (e.g., spatial or temporal information) for prioritization. Few works, however, have combined and evaluated different modalities of information together for real-time prioritization of visual surveillance data. This work evaluates several individual and combined prioritization metrics derived from different modalities of information for use with a modern deep learning-based object detection algorithm. Both processing time and object detection rate are measured and used to rank the prioritization metrics. A novel approach that uses the historical detection confidences produced by the object detection algorithm is shown to be the best standalone prioritization metric. Additionally, a novel ensemble method is presented that uses a KNN regressor to combine the best of the evaluated metrics into a dynamic prioritization method. On three publicly available datasets, this ensemble approach increases the object detection rate by up to 60% compared to a static sub-sampling baseline, while still meeting the real-time constraints of the automated object detection system.
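As a rough illustration of the ensemble idea summarized above, the following minimal sketch uses a k-nearest-neighbor regressor to map several per-frame prioritization metrics to a single priority score, then keeps only the highest-scoring frames within a fixed processing budget. It is not the authors' implementation. The two features (a motion score and a historical detection confidence), the training values, the frame identifiers, and the function names knn_predict and prioritize are all hypothetical, and plain Python is used only to keep the example dependency-free.

```python
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to `query`."""
    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def prioritize(frames, train_x, train_y, budget, k=3):
    """Return the ids of the `budget` frames with the highest predicted priority."""
    scored = [(knn_predict(train_x, train_y, feats, k), fid) for fid, feats in frames]
    scored.sort(reverse=True)
    return [fid for _, fid in scored[:budget]]


# Hypothetical training data (assumption, not from the paper):
# features are (motion score, historical detection confidence),
# targets are the number of objects later detected in that frame.
train_x = [(0.0, 0.1), (0.1, 0.0), (0.9, 0.8), (0.8, 0.9), (0.5, 0.5), (0.4, 0.6)]
train_y = [0.0, 0.0, 3.0, 3.0, 1.0, 1.0]

# One candidate frame per camera, each with its current metric values.
frames = [
    ("cam_a", (0.05, 0.05)),
    ("cam_b", (0.85, 0.85)),
    ("cam_c", (0.45, 0.55)),
]

# Only two frames fit in the processing budget for this time step.
selected = prioritize(frames, train_x, train_y, budget=2, k=2)
print(selected)
```

In this toy setting the frame from cam_a, whose metrics resemble past frames with no detections, is the one skipped. A static sub-sampling baseline would instead drop frames on a fixed schedule regardless of their content.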
