Cross-scene foreground segmentation with supervised and unsupervised model communication

Abstract. In this paper, we investigate cross-scene video foreground segmentation via supervised and unsupervised model communication. Traditional unsupervised background subtraction methods often face the challenging problem of updating the statistical background model online. In contrast, supervised foreground segmentation methods, such as those that are based on deep learning, rely on large amounts of training data, thereby limiting their cross-scene performance. Our method leverages segmented masks from a cross-scene trained deep model (spatio-temporal attention model (STAM), pyramid scene parsing network (PSPNet), or DeepLabV3+) to seed online updates for the statistical background model (CPB), thereby refining the foreground segmentation. More flexible than methods that require scene-specific training and more data-efficient than unsupervised models, our method outperforms state-of-the-art approaches on CDNet2014, WallFlower, and LIMU according to our experimental results. The proposed framework can be integrated into a video surveillance system in a plug-and-play form to realize cross-scene foreground segmentation.
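
The idea of letting a deep model's mask seed the online update of a statistical background model can be illustrated with a minimal sketch. This is not the authors' implementation: a per-pixel running average stands in for CPB, and a hand-made boolean mask stands in for the output of STAM, PSPNet, or DeepLabV3+. The function names and the `rate` and `threshold` parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def update_background(background, frame, fg_mask, rate=0.05):
    """Blend the frame into the background, but only at pixels that the
    (assumed) deep-model mask labels as background (fg_mask == False)."""
    background = background.astype(np.float64)
    frame = frame.astype(np.float64)
    blended = (1.0 - rate) * background + rate * frame
    # Pixels flagged as foreground keep their old background value, so
    # moving objects are not absorbed into the background model.
    return np.where(fg_mask, background, blended)

def segment(background, frame, threshold=30.0):
    """Refined foreground: pixels that differ strongly from the background."""
    return np.abs(frame.astype(np.float64) - background) > threshold

# Toy grayscale scene: background at intensity 100, a slight global
# illumination change to 110, and a bright 2x2 object at intensity 200.
background = np.full((4, 4), 100.0)
frame = np.full((4, 4), 110.0)
frame[1:3, 1:3] = 200.0

# Stand-in for the deep model's segmentation mask (an assumption).
deep_mask = np.zeros((4, 4), dtype=bool)
deep_mask[1:3, 1:3] = True

background = update_background(background, frame, deep_mask)
refined = segment(background, frame)
```

In this toy scene the illumination change is absorbed gradually at background pixels, while the pixels under the mask are left out of the update and are still reported as foreground after thresholding.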
