Object detection in video sequences by a temporal modular self-adaptive SOM

This paper presents a video segmentation algorithm that uses either a background subtraction (BS) model with a low learning rate (LLR) or a BS model with a high learning rate (HLR), depending on the dynamics of the video scene. Both BS models are based on a neural network architecture, the self-organized map (SOM), and the algorithm is termed temporal modular self-adaptive SOM (TMSA_SOM). Based on an initial sequence analysis, TMSA_SOM automatically classifies each video by scenario type and processes it with one of four specialized modules. Unlike state-of-the-art (SoA) models, the proposed model therefore handles each of the different situations that may occur in a video scene (severe dynamic background, initial frames with dynamic objects, static background, stationary objects, etc.) with a specialized module. Furthermore, TMSA_SOM automatically identifies when the scene has changed drastically (e.g., stationary objects of interest become dynamic, or drastic illumination changes occur), detects when the scene has become stable again, and uses this information to update the background model quickly. The proposed model was validated on three video databases: Change Detection, BMC, and Wallflower. The results show very competitive performance on the metrics commonly used in the literature to compare SoA models. TMSA_SOM also achieved the best results on two perceptual metrics, Ssim and D-Score, and the best performance on the global quality measure FSD (based on F-Measure, Ssim, and D-Score), demonstrating its robustness across varied and complicated non-controlled scenarios. TMSA_SOM was also compared against SoA neural network approaches and obtained the best average performance on Re, Pr, and F-Measure.
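The idea of switching between a low and a high learning rate can be illustrated with a much simpler background model than the one described above. The following is a minimal sketch only, assuming a per-pixel running-average background in place of the SOM; the function names, the threshold, the learning-rate values, and the `change_ratio` rule are all hypothetical and are not taken from the paper.

```python
def update_background(background, frame, learning_rate):
    """Blend the new frame into the background, pixel by pixel.

    A low learning rate keeps the model stable; a high one lets it
    absorb a changed scene quickly. (Running average, not a SOM.)
    """
    return [(1.0 - learning_rate) * b + learning_rate * f
            for b, f in zip(background, frame)]


def foreground_mask(background, frame, threshold=30.0):
    """Mark a pixel as foreground when it differs enough from the background."""
    return [abs(f - b) > threshold for b, f in zip(background, frame)]


def choose_learning_rate(mask, llr=0.01, hlr=0.5, change_ratio=0.6):
    """Pick the high rate when most of the frame changed at once.

    Hypothetical rule: a large foreground fraction is treated as a
    drastic scene change (for example, an illumination change).
    """
    changed = sum(mask) / len(mask)
    return hlr if changed > change_ratio else llr


# Usage on a toy 4-pixel grayscale "frame".
background = [100.0, 100.0, 100.0, 100.0]

# One pixel changes: treated as a moving object, so the low rate is used.
frame_object = [100.0, 100.0, 200.0, 100.0]
rate = choose_learning_rate(foreground_mask(background, frame_object))
slow_update = update_background(background, frame_object, rate)

# Every pixel changes: treated as a drastic scene change, so the high rate is used.
frame_scene = [200.0, 200.0, 200.0, 200.0]
rate_scene = choose_learning_rate(foreground_mask(background, frame_scene))
fast_update = update_background(background, frame_scene, rate_scene)
```

With the low rate the background barely moves toward the changed pixel, so the object stays detectable in later frames; with the high rate the model moves halfway to the new scene in a single update.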
