Learning color and locality cues for moving object detection and segmentation

This paper presents an unsupervised algorithm for automatically detecting and segmenting a moving object in a monocular video. The task is challenging when object motion is limited: existing automatic algorithms rely on motion to detect the moving object, so they perform poorly when that motion is sparse or insufficient. Our algorithm learns object color and locality cues from the sparse motion information. We first detect key frames with reliable motion cues and then estimate moving sub-objects from these cues using a Markov Random Field (MRF) framework. From these sub-objects, we learn an appearance model in the form of a color Gaussian Mixture Model (GMM). To avoid misclassifying background pixels whose color is similar to that of the moving object, the locations of the sub-objects are propagated to neighboring frames as locality cues. Finally, robust moving object segmentation is achieved by combining the learned color and locality cues with motion cues in an MRF framework. Experiments on videos with a variety of object and camera motion demonstrate the effectiveness of the algorithm.
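As a rough illustration of how a color cue and a locality cue might be combined for a single pixel, the following minimal Python sketch scores a pixel as foreground or background. It is not the paper's formulation: the diagonal-covariance mixtures, the isotropic Gaussian locality prior, the additive log-domain combination, and every name and numeric value here (`fg_gmm`, `bg_gmm`, `centre`, `sigma`, `motion`) are assumptions chosen for illustration, and the pairwise smoothness term of an MRF is omitted entirely.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at colour x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(x, components):
    """Log-likelihood of x under a mixture given as (weight, mean, var) triples."""
    logs = [math.log(w) + gaussian_logpdf(x, m, v) for w, m, v in components]
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

def locality_prior(pixel, centre, sigma):
    """Spatial weight in (0, 1] that peaks at the propagated sub-object centre."""
    d2 = (pixel[0] - centre[0]) ** 2 + (pixel[1] - centre[1]) ** 2
    return math.exp(-0.5 * d2 / sigma ** 2)

def foreground_score(colour, pixel, fg_gmm, bg_gmm, centre, sigma, motion=0.0):
    """Per-pixel score; positive values favour the foreground label.

    motion is a hypothetical motion log-odds term (0.0 means no motion evidence).
    """
    colour_term = gmm_loglik(colour, fg_gmm) - gmm_loglik(colour, bg_gmm)
    # The small floor keeps the log finite far away from the sub-object.
    locality_term = math.log(locality_prior(pixel, centre, sigma) + 1e-6)
    return colour_term + locality_term + motion

# Hypothetical models: a reddish object and a background of similar colour.
fg_gmm = [(1.0, (200.0, 30.0, 30.0), (100.0, 100.0, 100.0))]
bg_gmm = [(1.0, (180.0, 50.0, 40.0), (400.0, 400.0, 400.0))]
centre, sigma = (50.0, 50.0), 15.0   # assumed propagated sub-object location
colour = (195.0, 35.0, 28.0)         # the same colour observed at two pixels

near_score = foreground_score(colour, (52.0, 48.0), fg_gmm, bg_gmm, centre, sigma)
far_score = foreground_score(colour, (300.0, 200.0), fg_gmm, bg_gmm, centre, sigma)
print(near_score > 0, far_score < 0)
```

In this toy setting the two pixels have identical color, so the color term alone cannot separate them. The locality term is what pushes the distant pixel toward the background label, which mirrors the role the abstract assigns to locality cues.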
