Real-Time Motion Segmentation of Sparse Feature Points at Any Speed

We present a real-time incremental approach to motion segmentation operating on sparse feature points. In contrast to previous work, the algorithm allows for a variable number of image frames to affect the segmentation process, thus enabling an arbitrary number of objects traveling at different relative speeds to be detected. Feature points are detected and tracked throughout an image sequence, and the features are grouped using a spatially constrained expectation-maximization (EM) algorithm that models the interactions between neighboring features using the Markov assumption. The primary parameter used by the algorithm is the amount of evidence that must accumulate before features are grouped. A statistical goodness-of-fit test monitors the change in the motion parameters of a group over time in order to automatically update the reference frame. Experimental results on a number of challenging image sequences demonstrate the effectiveness and computational efficiency of the technique.
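The grouping step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several simplifying assumptions that do not come from the paper: each group is modeled by a pure translation, the number of groups and their initial motions are given, the noise scale `sigma` is fixed, and the neighbor interaction is approximated by using the average responsibility of a feature's neighbors from the previous iteration as its prior. The function name `spatially_constrained_em` and all parameter names are hypothetical.

```python
import math


def spatially_constrained_em(displacements, neighbors, init_means,
                             sigma=1.0, iters=20):
    """Group features by displacement using a neighbor-smoothed EM.

    displacements: list of (dx, dy) per feature, measured from a reference frame.
    neighbors:     list of index lists; neighbors[i] are the spatial neighbors of i.
    init_means:    initial (dx, dy) guess for each group (translation-only model).
    Returns (labels, means).
    """
    k = len(init_means)
    n = len(displacements)
    means = [tuple(m) for m in init_means]
    # Start with uniform responsibilities for every feature.
    resp = [[1.0 / k] * k for _ in range(n)]

    for _ in range(iters):
        # E-step: Gaussian likelihood under each group's translation,
        # weighted by a prior taken from the neighbors' previous responsibilities.
        new_resp = []
        for i, (dx, dy) in enumerate(displacements):
            weights = []
            for j, (mx, my) in enumerate(means):
                d2 = (dx - mx) ** 2 + (dy - my) ** 2
                lik = math.exp(-d2 / (2.0 * sigma ** 2))
                nbrs = neighbors[i]
                if nbrs:
                    prior = sum(resp[q][j] for q in nbrs) / len(nbrs)
                else:
                    prior = 1.0 / k
                # Small floor keeps the normalization well defined.
                weights.append(lik * prior + 1e-12)
            total = sum(weights)
            new_resp.append([w / total for w in weights])
        resp = new_resp

        # M-step: responsibility-weighted mean displacement per group.
        new_means = []
        for j in range(k):
            wsum = sum(r[j] for r in resp)
            mx = sum(r[j] * d[0] for r, d in zip(resp, displacements)) / wsum
            my = sum(r[j] * d[1] for r, d in zip(resp, displacements)) / wsum
            new_means.append((mx, my))
        means = new_means

    # Hard assignment: each feature takes its most responsible group.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return labels, means
```

In this sketch the displacements are measured against a fixed reference frame, so a slowly moving object only separates from the background once enough frames have passed. That loosely mirrors the role of accumulated evidence in the abstract, although the paper's actual evidence threshold and goodness-of-fit test are not modeled here.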
