Joint Motion Segmentation and Background Estimation in Dynamic Scenes

We propose a joint foreground-background mixture model (FBM) that simultaneously performs background estimation and motion segmentation in complex dynamic scenes. Our FBM consists of a set of location-specific dynamic texture (DT) components, for modeling local background motion, and a set of global DT components, for modeling consistent foreground motion. We derive an EM algorithm for estimating the parameters of the FBM. We also apply spatial constraints to the FBM using a Markov random field grid, and derive a corresponding variational approximation for inference. Unlike existing approaches to background subtraction, our FBM does not require a manually selected threshold or a separate training video. Unlike existing motion segmentation techniques, our FBM can segment foreground motions over complex backgrounds with mixed motions, and can detect stopped objects. Since most dynamic scene datasets contain only videos with a single foreground object over a simple background, we develop a new, challenging dataset with multiple foreground objects over complex dynamic backgrounds. In experiments, we show that jointly modeling the background and foreground segments with the FBM yields significant improvements in accuracy on both background estimation and motion segmentation, compared to state-of-the-art methods.
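To make the mixture structure concrete, the following is a minimal sketch of the assignment (E-step-like) computation for a model with one location-specific background component per site and K shared foreground components. It is an illustration under stated assumptions, not the paper's implementation: scalar Gaussians stand in for the dynamic texture components, the MRF prior and the parameter updates are omitted, and all names (`fbm_responsibilities`, `bg_prior`, `fg_priors`, and so on) are hypothetical.

```python
import numpy as np


def gaussian_logpdf(x, mean, var):
    """Log-density of a scalar Gaussian, evaluated elementwise."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def fbm_responsibilities(x, bg_mean, bg_var, fg_means, fg_vars,
                         bg_prior, fg_priors):
    """Posterior assignment of each location to background or foreground.

    ASSUMPTION: scalar Gaussians replace the DT components of the paper.

    x         : (N,) one observation per location
    bg_mean   : (N,) location-specific background means
    bg_var    : (N,) location-specific background variances
    fg_means  : (K,) global foreground means, shared by all locations
    fg_vars   : (K,) global foreground variances
    bg_prior  : scalar prior weight of the background component
    fg_priors : (K,) prior weights of the foreground components

    Returns an (N, K + 1) array; column 0 is the background.
    """
    # Background term: each location is scored by its own component.
    log_bg = np.log(bg_prior) + gaussian_logpdf(x, bg_mean, bg_var)

    # Foreground terms: every location is scored by every global component.
    log_fg = np.log(fg_priors)[None, :] + gaussian_logpdf(
        x[:, None], fg_means[None, :], fg_vars[None, :])

    # Normalise in the log domain for numerical stability.
    log_joint = np.concatenate([log_bg[:, None], log_fg], axis=1)
    log_joint -= log_joint.max(axis=1, keepdims=True)
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    x = np.array([0.0, 5.0])
    post = fbm_responsibilities(
        x,
        bg_mean=np.array([0.0, 0.0]), bg_var=np.array([1.0, 1.0]),
        fg_means=np.array([5.0]), fg_vars=np.array([1.0]),
        bg_prior=0.5, fg_priors=np.array([0.5]))
    # Label 0 means background; labels 1..K index foreground components.
    print(post.argmax(axis=1))
```

In this sketch each location is labeled by its most probable component, so background and foreground compete directly and no separate detection threshold appears; whether the paper labels pixels in exactly this way is an assumption.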
