Layered motion segmentation and depth ordering by tracking edges

This paper presents a new Bayesian framework for motion segmentation (dividing a frame from an image sequence into layers representing different moving objects) by tracking edges between frames. Edges are found using the Canny edge detector, and the expectation-maximization algorithm is then used to fit motion models to these edges and to calculate the probability of each edge obeying each motion model. The edges are also used to segment the image into regions of similar color. The most likely labeling for these regions is then calculated from the edge probabilities, in association with a Markov random field-style prior. The relative depth ordering of the different motion layers is also determined, as an integral part of the process. An efficient implementation of this framework is presented for segmenting two motions (foreground and background) using two frames. It is then demonstrated how, by tracking the edges into further frames, the probabilities may be accumulated to provide a more accurate and robust estimate and to segment an entire sequence. Further extensions are then presented to address the segmentation of more than two motions. Here, a hierarchical method of initializing the expectation-maximization algorithm is described, and it is demonstrated that the minimum description length principle may be used to automatically select the best number of motion layers. Results from over 30 sequences (demonstrating both two and three motions) are presented and discussed.
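The expectation-maximization step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each edge has been reduced to a single 2D displacement between two frames, that each motion model is a pure translation, and that residuals follow an isotropic Gaussian with a fixed `sigma`. The function name `em_two_motions` and all parameters are hypothetical.

```python
import math


def em_two_motions(displacements, init_motions, sigma=1.0, iterations=20):
    """Fit two translational motion models to edge displacements with EM.

    Assumptions (not from the paper): translational models, isotropic
    Gaussian residuals with fixed sigma, one displacement per edge.
    Returns (motions, resp), where resp[i][k] is the posterior
    probability that edge i obeys motion k.
    """
    motions = [tuple(m) for m in init_motions]
    priors = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: probability of each edge obeying each motion model.
        resp = []
        for dx, dy in displacements:
            weights = []
            for (mx, my), prior in zip(motions, priors):
                sq_err = (dx - mx) ** 2 + (dy - my) ** 2
                weights.append(prior * math.exp(-sq_err / (2 * sigma ** 2)))
            total = sum(weights)
            if total == 0.0:
                # Both likelihoods underflowed; fall back to uniform.
                resp.append([0.5, 0.5])
            else:
                resp.append([w / total for w in weights])
        # M-step: re-estimate each motion as a responsibility-weighted mean.
        for k in range(2):
            mass = sum(r[k] for r in resp)
            if mass > 0:
                motions[k] = (
                    sum(r[k] * d[0] for r, d in zip(resp, displacements)) / mass,
                    sum(r[k] * d[1] for r, d in zip(resp, displacements)) / mass,
                )
            priors[k] = mass / len(displacements)
    return motions, resp


# Toy data: three nearly static "background" edges and three
# "foreground" edges moving about five pixels to the right.
edges = [(0.1, 0.0), (-0.1, 0.1), (0.0, -0.1),
         (5.0, 0.1), (4.9, -0.1), (5.1, 0.0)]
motions, resp = em_two_motions(edges, init_motions=[(0.0, 0.0), (1.0, 0.0)])
```

On this toy input the two estimated translations settle near the two displacement clusters, and each row of `resp` gives the per-edge probabilities that a later region-labeling stage could consume.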
