A hybrid framework combining background subtraction and deep neural networks for rapid person detection

The number of surveillance cameras is increasing rapidly in response to security concerns. However, building an intelligent detection system is difficult because it demands high computing performance. This study aims to build a real-world video surveillance system that can effectively detect moving persons using limited resources. To this end, we propose a simple framework that detects and recognizes moving objects in outdoor CCTV footage by combining background subtraction with Convolutional Neural Networks (CNNs). A background subtraction algorithm is first applied to each video frame to find regions of interest (ROIs). A CNN then classifies each ROI into one of the predefined classes. Our approach substantially reduces computational complexity compared with other object detection algorithms. For the experiments, new datasets were constructed by filming alleys and playgrounds, places where crimes are likely to occur. Different image sizes and experimental settings were tested to build the best classifier for detecting people. The best classification accuracy was 0.85 on a test set from the same camera as the training set and 0.82 on a test set from different cameras.
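The two-stage idea (background subtraction to propose ROIs, then a classifier applied only to those ROIs) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a static grayscale background image and plain frame differencing in place of whichever background subtraction algorithm the study used, and `toy_classifier` is a hypothetical stand-in for the trained CNN. All function names, the threshold, and the minimum blob area are illustrative assumptions.

```python
from collections import deque


def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def extract_rois(mask, min_area=4):
    """Return boxes (top, left, bottom, right) of 4-connected foreground blobs.

    bottom and right are exclusive, so a box can be used directly for slicing.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one connected blob.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right, area = y, x, y, x, 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:  # discard tiny blobs, which are likely noise
                boxes.append((top, left, bottom + 1, right + 1))
    return boxes


def detect(frame, background, classify, threshold=30, min_area=4):
    """Classify only the ROIs proposed by background subtraction."""
    mask = foreground_mask(frame, background, threshold)
    results = []
    for top, left, bottom, right in extract_rois(mask, min_area):
        crop = [row[left:right] for row in frame[top:bottom]]
        results.append(((top, left, bottom, right), classify(crop)))
    return results


def toy_classifier(crop):
    """Hypothetical stand-in for the CNN: label tall blobs as 'person'."""
    height, width = len(crop), len(crop[0])
    return "person" if height > width else "other"


# Synthetic 8x8 scene: uniform background plus one tall bright blob.
background = [[10] * 8 for _ in range(8)]
frame = [row[:] for row in background]
for y in range(1, 6):
    for x in range(2, 4):
        frame[y][x] = 200

detections = detect(frame, background, toy_classifier)
print(detections)
```

The classifier runs once per ROI rather than over the whole frame or a dense set of candidate windows, which is the source of the computational saving the abstract describes. In a real system the `classify` argument would wrap a trained CNN applied to the resized crop.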
