LIVEcut: Learning-based interactive video segmentation by evaluation of multiple propagated cues

Video sequences contain many cues that can be used to segment the objects in them, such as color, gradient, color adjacency, shape, temporal coherence, camera and object motion, and easily trackable points. This paper introduces LIVEcut, a novel method for interactively selecting objects in video sequences by extracting and leveraging as much of this information as possible. Using a graph-cut optimization framework, LIVEcut propagates the selection forward frame by frame, allowing the user to correct mistakes along the way as needed. Enhanced methods for extracting many of these features are provided. To draw on the most accurate information among the various, potentially conflicting features, each feature is automatically weighted locally according to its estimated accuracy on the previous, implicitly validated frame. Feature weights are further updated by learning from the user corrections required in the previous frame. The effectiveness of LIVEcut is shown quantitatively through timing comparisons with other interactive methods and accuracy comparisons with unsupervised methods, and qualitatively through selections on various video sequences.
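The idea of weighting each feature locally by its estimated accuracy on the previous frame can be illustrated with a minimal sketch. This is not the paper's formulation. The accuracy measure (one minus the absolute error against the validated mask, averaged over a square window), the window radius, the normalization, and the function names `box_mean`, `local_cue_weights`, and `fuse_cues` are all assumptions made for illustration. The graph-cut optimization and the learning from user corrections are omitted; the sketch only shows how per-pixel weights might be derived and used to blend cue probabilities.

```python
import numpy as np


def box_mean(img, radius):
    """Mean of img over a (2*radius+1)^2 window, using edge padding."""
    padded = np.pad(img, radius, mode="edge")
    # Integral image with a leading row and column of zeros.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    k = 2 * radius + 1
    h, w = img.shape
    total = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
             - integral[k:k + h, :w] + integral[:h, :w])
    return total / (k * k)


def local_cue_weights(prev_cue_probs, prev_mask, radius=2, eps=1e-6):
    """Per-pixel weights for K cues (assumed scheme, not the paper's).

    prev_cue_probs: (K, H, W) foreground probabilities each cue
                    predicted for the previous frame.
    prev_mask:      (H, W) boolean mask of the previous frame, treated
                    as validated ground truth.
    """
    target = prev_mask.astype(float)
    # Per-pixel correctness of each cue: 1 when it matched the mask.
    correctness = 1.0 - np.abs(prev_cue_probs - target)
    # Local accuracy: correctness averaged over a neighborhood.
    accuracy = np.stack([box_mean(c, radius) for c in correctness])
    # Normalize so the weights at each pixel sum to (almost) one.
    return accuracy / (accuracy.sum(axis=0, keepdims=True) + eps)


def fuse_cues(cue_probs, weights):
    """Blend the current frame's cue probabilities with the local weights."""
    return (weights * cue_probs).sum(axis=0)
```

In a full system, the fused map would presumably feed the data term of the graph-cut energy rather than being thresholded directly; that connection is likewise an assumption of this sketch.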
