Adding context information to video analysis for surveillance applications

Smart surveillance systems become more useful when they grow in reliability and robustness while also offering a higher semantic level of understanding. To reach that level of scene understanding, objects and their actions have to be interpreted in the given context, which requires the extraction of contextual information. This chapter explores several techniques for extracting contextual information, such as spatial, motion, depth and co-occurrence context, depending on the application. The chapter then presents case studies that evaluate the usefulness of context information, based on: (1) region labeling of the surroundings of objects, (2) motion analysis of the water for moving ships, (3) traffic sign recognition for safety event evaluation and (4) the use of depth signals for obstacle detection. These cases can be solved with improved robustness and semantic understanding when context is used. The case studies indicate an improvement of up to 6.8% in reliable, correct object understanding, as well as the novel possibility of labeling scene events as safe or unsafe depending on the object behavior and the detected surrounding context. Overall, the chapter shows that contextual information improves automated video surveillance analysis: it not only improves the reliability of moving object detection, but also enables scene understanding that goes far beyond object understanding.
