Layered segmentation and optical flow estimation over time

Layered models provide a compelling approach for estimating image motion and segmenting moving scenes. Previous methods, however, have failed to capture the structure of complex scenes, provide precise object boundaries, effectively estimate the number of layers in a scene, or robustly determine the depth order of the layers. Furthermore, previous methods have focused on optical flow between pairs of frames rather than longer sequences. We show that image sequences with more frames are needed to resolve ambiguities in depth ordering at occlusion boundaries; temporal layer constancy makes this feasible. Our generative model of image sequences is rich but difficult to optimize with traditional gradient descent methods. We propose a novel discrete approximation of the continuous objective in terms of a sequence of depth-ordered MRFs and extend graph-cut optimization methods with new “moves” that make joint layer segmentation and motion estimation feasible. Our optimizer, which mixes discrete and continuous optimization, automatically determines the number of layers and reasons about their depth ordering. We demonstrate the value of layered models, our optimization strategy, and the use of more than two frames on both the Middlebury optical flow benchmark and the MIT layer segmentation benchmark.
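To make the role of depth ordering concrete, the following minimal sketch shows how a stack of per-layer binary support masks, listed front to back, could be collapsed into a single layer segmentation. It is an illustration only, not the model or optimizer described above: the function name `compose_layers`, the boolean-mask representation, and the convention that the last layer acts as a catch-all background are all assumptions made for this example.

```python
import numpy as np


def compose_layers(supports):
    """Collapse depth-ordered support masks into a per-pixel layer label.

    supports: boolean array of shape (K, H, W), ordered front (index 0)
    to back (index K-1). Hypothetical representation for illustration.
    Returns an (H, W) integer array holding, for each pixel, the index of
    the frontmost layer whose mask is on at that pixel.
    """
    masks = np.asarray(supports, dtype=bool).copy()
    # Assumption: the rearmost layer is a background that covers every
    # pixel, so each pixel is guaranteed to receive a label.
    masks[-1] = True
    # argmax over the layer axis returns the first True entry,
    # i.e. the frontmost layer that claims the pixel.
    return np.argmax(masks, axis=0)


# Three layers on a 2x3 image: front, middle, and background.
front = np.array([[1, 1, 0], [0, 0, 0]], dtype=bool)
middle = np.array([[0, 1, 1], [0, 1, 0]], dtype=bool)
background = np.zeros((2, 3), dtype=bool)

labels = compose_layers(np.stack([front, middle, background]))
print(labels)
```

Where the front and middle masks overlap, the front layer wins, so swapping the order of two layers changes the segmentation only at such occlusion regions. This is one way to see why the depth order is ambiguous unless occlusions are actually observed.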
