3DCD: Scene Independent End-to-End Spatiotemporal Feature Learning Framework for Change Detection in Unseen Videos

Change detection is an elementary task in computer vision and video processing applications. Recently, a number of supervised methods based on convolutional neural networks have reported high performance on benchmark datasets. However, their success depends on the availability of a certain proportion of annotated frames from the test videos during training. Thus, their performance on completely unseen videos, i.e. in a scene independent setup, is undocumented in the literature. In this work, we present a scene independent evaluation (SIE) framework to test supervised methods on completely unseen videos and to obtain generalized models for change detection. In addition, a scene dependent evaluation (SDE) is performed to enable a comparative analysis with existing approaches. We propose a fast (25 fps) and lightweight (0.13 million parameters, 1.16 MB model size) end-to-end 3D-CNN based change detection network (3DCD) with multiple spatiotemporal learning blocks. The proposed 3DCD consists of a gradual reductionist block for background estimation from past temporal history. It also performs motion saliency estimation, multi-schematic feature encoding-decoding, and finally foreground segmentation through several modular blocks. The proposed 3DCD outperforms existing state-of-the-art approaches in both the SIE and SDE setups on the benchmark CDnet 2014, LASIESTA and SBMI2015 datasets. To the best of our knowledge, this is the first attempt to present results in clearly defined SDE and SIE setups on three change detection datasets.
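The difference between the two evaluation setups can be illustrated with a minimal sketch. It is not the paper's actual protocol: the function names `sde_split` and `sie_split`, the video names, the frame counts, and the 50% training fraction are all hypothetical placeholders chosen for illustration. The only property it is meant to show is that under SDE the training and test frames share scenes, whereas under SIE the test videos contribute no frames to training.

```python
import random
from typing import Dict, List, Set, Tuple

Frame = Tuple[str, int]  # (video name, frame index)


def sde_split(videos: Dict[str, int], train_fraction: float,
              seed: int = 0) -> Tuple[List[Frame], List[Frame]]:
    """Scene dependent: every video contributes frames to both sets."""
    rng = random.Random(seed)
    train: List[Frame] = []
    test: List[Frame] = []
    for name, n_frames in sorted(videos.items()):
        indices = list(range(n_frames))
        rng.shuffle(indices)
        cut = int(n_frames * train_fraction)
        train += [(name, i) for i in indices[:cut]]
        test += [(name, i) for i in indices[cut:]]
    return train, test


def sie_split(videos: Dict[str, int],
              test_videos: Set[str]) -> Tuple[List[Frame], List[Frame]]:
    """Scene independent: held-out videos contribute no training frames."""
    train: List[Frame] = []
    test: List[Frame] = []
    for name, n_frames in sorted(videos.items()):
        target = test if name in test_videos else train
        target.extend((name, i) for i in range(n_frames))
    return train, test


def scenes(frames: List[Frame]) -> Set[str]:
    """Set of video names that appear in a list of frames."""
    return {name for name, _ in frames}


# Hypothetical videos and frame counts, for illustration only.
videos = {"video_a": 10, "video_b": 8, "video_c": 6}

sde_train, sde_test = sde_split(videos, train_fraction=0.5)
sie_train, sie_test = sie_split(videos, test_videos={"video_c"})

# Under SDE the test scenes were seen in training; under SIE they were not.
sde_overlap = scenes(sde_train) & scenes(sde_test)
sie_overlap = scenes(sie_train) & scenes(sie_test)
```

In this sketch `sde_overlap` contains all three video names, while `sie_overlap` is empty, which is the sense in which the SIE test videos are completely unseen.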
